ProteinPaint txt for bundled coding SSMs #3

Open
wants to merge 1 commit into
base: main

@vladimirsouza (Contributor) commented Mar 22, 2024

The files added by this PR were created using the ssm_to_proteinpaint function from this GAMBLR.utils PR.

pp_37_g <- ssm_to_proteinpaint(
maf_data = maf_37_g,
this_seq_type = "genome",
sample_type = "time_point",

Contributor

Can you use pairing_status instead of time_point?

Contributor Author

Sure! However, since this is the bundled data, the pairing_status of every sample is matched.

> library(GAMBLR.data)
> meta <- get_gambl_metadata()
Using the bundled metadata in GAMBLR.data...
> table(meta$pairing_status)
matched 
    300 

Contributor

Is this because you're only generating these files for genome data (not capture)?

Contributor

Doesn't this return quite a few rows?

meta <- get_gambl_metadata(seq_type_filter="capture")

Contributor Author

It does return a few rows, but all of them are matched. So, if we use sample_type = "pairing_status" we are not going to separate anything, right?

> library(GAMBLR.data)
> meta <- get_gambl_metadata(seq_type_filter = "capture")
Using the bundled metadata in GAMBLR.data...
> table(meta$pairing_status)

matched 
     13 

Maybe you were thinking of the restricted data:

> meta <- GAMBLR.results::get_gambl_metadata(seq_type_filter = "capture")
> table(meta$pairing_status)

  matched unmatched 
      527      2759 
> meta <- GAMBLR.results::get_gambl_metadata(seq_type_filter = "genome")
> table(meta$pairing_status)

  matched unmatched 
     1717       214 
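
The question of whether a column "separates anything" can be checked directly before choosing it as sample_type. A minimal sketch in base R, assuming a toy data frame in place of the real metadata; the sample values and the helper name column_separates are hypothetical and not part of GAMBLR:

```r
# Toy metadata standing in for get_gambl_metadata() output; all values are made up.
meta <- data.frame(
  sample_id = paste0("S", 1:6),
  seq_type = c("genome", "genome", "genome", "capture", "capture", "capture"),
  pairing_status = c("matched", "matched", "matched",
                     "matched", "unmatched", "unmatched"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when `column` takes more than one distinct value
# among the rows of the given seq_type, i.e. splitting on it would do something.
column_separates <- function(meta, column, seq_type) {
  values <- meta[[column]][meta$seq_type == seq_type]
  length(unique(values)) > 1
}

column_separates(meta, "pairing_status", "genome")   # FALSE: every genome row is "matched"
column_separates(meta, "pairing_status", "capture")  # TRUE: both values are present
```

If the check returns FALSE for a given seq_type, a split on that column would presumably put all of that seq_type's samples in a single group.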

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the Reddy cohort in the bundled results? All of those are unmatched.
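
One way to answer this kind of question is to cross-tabulate cohort against pairing_status. A rough sketch in base R on toy data; the cohort names and counts are invented, and the presence of a cohort column in the metadata is an assumption here:

```r
# Toy metadata; cohort names and counts are invented for illustration.
meta <- data.frame(
  cohort = c("cohort_A", "cohort_A", "cohort_B", "cohort_B", "cohort_B"),
  pairing_status = c("matched", "matched", "unmatched", "unmatched", "unmatched"),
  stringsAsFactors = FALSE
)

# Rows are cohorts, columns are pairing_status values.
counts <- table(meta$cohort, meta$pairing_status)
print(counts)

# Cohorts that contain at least one unmatched sample.
unmatched_cohorts <- rownames(counts)[counts[, "unmatched"] > 0]
print(unmatched_cohorts)
```

Running the same two-way table on the bundled metadata would show whether any cohort contributes unmatched samples.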
