Work on getting Mutect algorithm up and running #167

Open
wants to merge 63 commits into
base: master

jstjohn
Contributor

@jstjohn jstjohn commented May 18, 2015

This PR is to track progress and collaborate on an implementation of the MuTect algorithm referenced in #114.

tdanford and others added 17 commits April 25, 2015 10:07
…at filter reads to have a consistent interface.

Using this interface, I added filters to eliminate reads with high levels of soft clipping, and reads that are marked as duplicates.
…er string is set.

Further, refactored depth filter to provide generalized attribute based genotype filter.
Using generalized filter, added strand bias filter and multiallelic site filter.
@AmplabJenkins

Can one of the admins verify this patch?

@fnothaft
Member

Jenkins, add to whitelist and test this please.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/avocado-prb/99/
Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/avocado-prb/100/
Test PASSed.

*/
object MfmModel extends LikelihoodModel {

def e(q: Int): Double = pow(10.0, -0.1 * q.toDouble)
Member

Contributor Author

Weird, for some reason when I switch from `val ei = e(obs.phred)` to `val ei = phredToSuccessProbability(obs.phred)` my tests start failing! Not sure what is going on there; your code looks correct at first glance. Leaving it as is for now.

Contributor Author

Ah, it should have been `phredToErrorProbability(obs.phred)`; I just switched the code to that.
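
For anyone following along, here is a minimal sketch of why the first swap broke the tests. The two functions below are local stand-ins written for illustration (they only borrow the names used in this thread, and are not copied from the library): `e(q)` in `MfmModel` computes the error probability 10^(-q/10), while a success probability is its complement, so substituting one for the other inverts the per-base likelihood.

```scala
object PhredSketch {
  // Probability that a base call with Phred score q is wrong: 10^(-q/10).
  // This matches the e(q) helper shown in the diff above.
  def phredToErrorProbability(q: Int): Double =
    math.pow(10.0, -0.1 * q.toDouble)

  // Probability that the base call is right: the complement of the error.
  // Illustrative stand-in only; assumed to mirror the helper named in the thread.
  def phredToSuccessProbability(q: Int): Double =
    1.0 - phredToErrorProbability(q)
}
```

With a Phred score of 20, the error probability is about 0.01 and the success probability is about 0.99, so a model expecting the former would be off by roughly two orders of magnitude if handed the latter.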

@jstjohn
Contributor Author

jstjohn commented Aug 1, 2015

Woot! Super close. OK, so Frank, the only thing I need to change is the known sites filter. The thing is that this is not a binary keep/throw-away kind of situation. Basically, if the site is a known variant, then you have a different prior probability of the mutation being germline than if it is a site that has never been seen before. See lines 167-169 of MutectGenotyper.scala for how this is used.

As you can see, right now I have the placeholder `val dbSNPsite = false` in my code. Ideally this would be the real value, or I could just apply this filter later, during the postprocessing step, as you have implemented in your latest PR for example. So you don't necessarily throw out all sites that overlap dbSNP; you just need more evidence in the normal that the site is not simply a heterozygous germline site. The burden of proof is higher for the site being unique to the tumor.

One way to get this is to pass this information through with the variant, and decide which cutoff to use for the log odds score during the post processing step. If we go in this direction, what is the best place to stick this arbitrary bit of data, the normal log odds of being germline? Thanks for your time and thoughts on this @fnothaft!
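
A rough sketch of the post-processing idea described above, under stated assumptions: the case class, field names, and function names are hypothetical and do not exist in this branch, and the two cutoffs (2.2 for novel sites, 5.5 for dbSNP sites) are assumed from the published MuTect defaults rather than read from MutectGenotyper.scala. Here `normalLod` is taken to mean the log odds that the normal sample is reference rather than a germline heterozygote, so a higher value means stronger evidence that the site is not germline.

```scala
object KnownSiteCutoffSketch {
  // Assumed cutoffs on the normal log odds; not taken from this PR's code.
  val NovelSiteNormalLodCutoff: Double = 2.2
  val KnownSiteNormalLodCutoff: Double = 5.5

  // Hypothetical carrier for the two values passed through with the variant.
  final case class SomaticCandidate(normalLod: Double, inDbSnp: Boolean)

  // Known sites need more evidence in the normal before being called somatic.
  def normalLodCutoff(inDbSnp: Boolean): Double =
    if (inDbSnp) KnownSiteNormalLodCutoff else NovelSiteNormalLodCutoff

  // Keep the candidate only if the normal evidence clears the chosen cutoff.
  def passesNormalFilter(c: SomaticCandidate): Boolean =
    c.normalLod >= normalLodCutoff(c.inDbSnp)
}
```

The point of the design is that overlap with dbSNP never rejects a site by itself; it only raises the bar that the normal log odds has to clear, which is why the score needs to travel with the variant into post-processing.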

@fnothaft
Member

fnothaft commented Aug 2, 2015

@jstjohn oh, hah! Goofy mistake on my side. I think that is an easy change to make, actually! I will have a PR against this branch with the fix tomorrow.

@jstjohn
Contributor Author

jstjohn commented Aug 2, 2015

Thanks Frank!!


@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/avocado-prb/133/
Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/avocado-prb/137/
Test PASSed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/avocado-prb/138/

Build result: FAILURE

GitHub pull request #167 of commit f7ad48b automatically merged.
[EnvInject] - Loading node environment variables.
Building remotely on amp-jenkins-worker-02 (centos spark-test spark-compile) in workspace /home/jenkins/workspace/avocado-prb
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/bigdatagenomics/avocado.git # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/avocado.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/bigdatagenomics/avocado.git +refs/pull/:refs/remotes/origin/pr/
 > git rev-parse origin/pr/167/merge^{commit} # timeout=10
 > git branch -a --contains 239bf46 # timeout=10
 > git rev-parse remotes/origin/pr/167/merge^{commit} # timeout=10
Checking out Revision 239bf46 (origin/pr/167/merge)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 239bf462ff06fe881648bce0eb0c26fbe37e8618
First time build. Skipping changelog.
Triggering avocado-prb ? 2.3.0,centos
Triggering avocado-prb ? 1.0.4,centos
Triggering avocado-prb ? 2.2.0,centos
avocado-prb ? 2.3.0,centos completed with result FAILURE
avocado-prb ? 1.0.4,centos completed with result FAILURE
avocado-prb ? 2.2.0,centos completed with result SUCCESS
Test FAILed.

5 participants