Work on getting Mutect algorithm up and running #167
base: master
…at filter reads to have a consistent interface. Using this interface, I added filters to eliminate reads with high levels of soft clipping, and reads that are marked as duplicates.
…er string is set. Further, refactored depth filter to provide generalized attribute based genotype filter. Using generalized filter, added strand bias filter and multiallelic site filter.
…version of adam/avocado
…tests which will be filled in
Can one of the admins verify this patch?
Jenkins, add to whitelist and test this please.
Test PASSed.
Test PASSed.
 */
object MfmModel extends LikelihoodModel {

  def e(q: Int): Double = pow(10.0, -0.1 * q.toDouble)
I'd prefer to use https://github.com/bigdatagenomics/adam/blob/master/adam-core/src/main/scala/org/bdgenomics/adam/util/PhredUtils.scala#L32 which precomputes the phred-to-log conversion.
Weird, for some reason when I switch from val ei = e(obs.phred)
to val ei = phredToSuccessProbability(obs.phred)
my tests start failing! Not sure what is going on there; your code looks correct at first glance. Leaving it as is for now.
Ah, it should have been phredToErrorProbability(obs.phred). Just switched the code to that.
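For anyone following the exchange above, here is a minimal sketch of why the two conversions are not interchangeable. The object and method names are hypothetical and written only for illustration; they are not ADAM's PhredUtils API, and the sketch skips the precomputed lookup table mentioned in the review. It assumes the standard Phred definition, where a score q corresponds to an error probability of 10^(-q/10), which is what e(q) in the diff computes.

```scala
import scala.math.pow

// Hypothetical helper for illustration only; not the ADAM PhredUtils implementation.
object PhredSketch {
  // Probability that a base call with Phred score q is wrong: 10^(-q/10).
  // This matches the e(q) function in the diff under review.
  def errorProbability(q: Int): Double = pow(10.0, -0.1 * q.toDouble)

  // Probability that the base call is right: the complement of the error probability.
  def successProbability(q: Int): Double = 1.0 - errorProbability(q)
}
```

For q = 20 the error probability is 0.01 and the success probability is 0.99, so swapping one for the other changes every likelihood term, which is consistent with the tests failing after the first substitution.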
…erly tested, and fixed bug in filters so they work now
…Observation object and parsed from read MD tags by ReadExplorer
Adding known-sites variant filter.
Woot! Super close. Ok so Frank, the only thing I need to change is the known sites filter. The thing is that this is not a binary keep/throw away kind of situation. Basically, if the site is a known variant, then you have a different prior probability of the mutation being germline than if it is a site that has never been seen before. See lines 167-169 of MutectGenotyper.scala for how this is used. As you can see, right now I have a placeholder in my code. One way to get this is to pass this information through with the variant, and decide which cutoff to use for the log odds score during the post-processing step. If we go in this direction, what is the best place to stick this arbitrary bit of data, the normal log odds of being germline? Thanks for your time and thoughts on this @fnothaft!
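A minimal sketch of the post-processing idea described above, assuming the normal log odds and a known-site flag travel with each candidate. The case class, field names, and method are hypothetical and do not come from MutectGenotyper.scala. The sketch also assumes the score is oriented so that a higher value means stronger evidence that the normal sample is not a germline variant, and it takes both cutoffs as parameters rather than fixing any values.

```scala
// Hypothetical container for illustration: a candidate site carrying its
// normal-sample log odds and whether the site appears in a known-sites set.
final case class CandidateSite(normalLogOdds: Double, isKnownSite: Boolean)

object GermlineCutoffSketch {
  // Pick the cutoff by site class, then compare. Known sites have a higher
  // prior of being germline, so callers would normally pass a stricter
  // (larger) knownCutoff than novelCutoff. Both cutoffs are assumptions
  // supplied by the caller, not values taken from this PR.
  def passesNormalFilter(site: CandidateSite,
                         novelCutoff: Double,
                         knownCutoff: Double): Boolean = {
    val cutoff = if (site.isKnownSite) knownCutoff else novelCutoff
    site.normalLogOdds >= cutoff
  }
}
```

With this shape the known-sites step only annotates the candidate, and the keep/drop decision is deferred until the log odds score is available, which is the non-binary behaviour asked for above.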
@jstjohn oh, hah! Goofy mistake on my side. I think that is an easy change to make, actually! I will have a PR against this branch with the fix tomorrow.
Thanks Frank!!
…genotyper use broadcast.
Add dbsnp to mutect caller
…ring as discussed in the MuTect paper
Test PASSed.
Updating from bigdatagenomics/avocado
Update ADAM to 0.18.2
…label reads. Runs with MuTect, but there are bugs within the program
Test PASSed.
Test FAILed.
Build result: FAILURE
GitHub pull request #167 of commit f7ad48b automatically merged.
[EnvInject] - Loading node environment variables.
Building remotely on amp-jenkins-worker-02 (centos spark-test spark-compile) in workspace /home/jenkins/workspace/avocado-prb
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/bigdatagenomics/avocado.git # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/avocado.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/bigdatagenomics/avocado.git +refs/pull/:refs/remotes/origin/pr/
 > git rev-parse origin/pr/167/merge^{commit} # timeout=10
 > git branch -a --contains 239bf46 # timeout=10
 > git rev-parse remotes/origin/pr/167/merge^{commit} # timeout=10
Checking out Revision 239bf46 (origin/pr/167/merge)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 239bf462ff06fe881648bce0eb0c26fbe37e8618
First time build. Skipping changelog.
Triggering avocado-prb » 2.3.0,centos
Triggering avocado-prb » 1.0.4,centos
Triggering avocado-prb » 2.2.0,centos
avocado-prb » 2.3.0,centos completed with result FAILURE
avocado-prb » 1.0.4,centos completed with result FAILURE
avocado-prb » 2.2.0,centos completed with result SUCCESS
Test FAILed.
This PR is to track progress and collaborate on an implementation of the MuTect algorithm referenced in #114.