SNP calling and known genotypes (question + feature suggestion) #16

Open
jdidion opened this issue Feb 16, 2017 · 2 comments
jdidion commented Feb 16, 2017

I have genotype array, WGS, and WGBS data for my samples, and I am using this information to detect sample swaps. I find that Biscuit genotype calls are highly concordant with WGS genotype calls, except when the reference is 'C' and the true genotype is 'TT'. I understand that it is not possible to genotype accurately in this case, but I am curious about Biscuit's behavior. For example:

From the pileup, there is no evidence of a C allele:
chr1 852875 C 60 TTTTTTTTTTTTtttttttTTTTTTTtttTTTTttTTTttTTttTTTTtttTTTTTtttt

However, in the VCF, the allele support for this position shows 33 Cs and 26 Ts:
chr1 852875 . C T,A,G 34 PASS . DP:GT:GP:GQ:SP:CV:BT 60:0/1:84,4,115:99:C33,T26,A0:.:
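
To make the mismatch concrete, here is a minimal Python sketch that tallies the pileup bases and parses the SP (allele support) field from the VCF record above. The helper names are hypothetical and not part of Biscuit. It assumes the samtools pileup convention that letter case encodes strand, and it only reports the two sets of counts; it says nothing about how Biscuit derives them.

```python
from collections import Counter

# Pileup bases and VCF FORMAT/sample columns copied from the example above.
pileup_bases = "TTTTTTTTTTTTtttttttTTTTTTTtttTTTTttTTTttTTttTTTTtttTTTTTtttt"
vcf_format = "DP:GT:GP:GQ:SP:CV:BT"
vcf_sample = "60:0/1:84,4,115:99:C33,T26,A0:.:"


def count_pileup_bases(bases):
    """Tally bases ignoring case (case is assumed to encode strand only)."""
    return Counter(bases.upper())


def parse_allele_support(fmt, sample):
    """Return the SP field as a dict, e.g. 'C33,T26' -> {'C': 33, 'T': 26}."""
    fields = dict(zip(fmt.split(":"), sample.split(":")))
    support = {}
    for item in fields["SP"].split(","):
        support[item[0]] = int(item[1:])
    return support


pileup_counts = count_pileup_bases(pileup_bases)
support = parse_allele_support(vcf_format, vcf_sample)

print("pileup:", dict(pileup_counts))
print("VCF SP:", support)
```

The pileup tally contains no C at all, while the SP field reports C33.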

Question: In this case, does Biscuit just generate the 'C' count from an expected distribution?

Suggestion: a nice feature would be detecting sample swaps when genotype information is known. Basically just a script that compares a VCF of known genotypes to the Biscuit-generated VCF, ignores sites where it is difficult or impossible to genotype correctly from WGBS, and outputs a likelihood score that the two VCFs were generated from the same individual.
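
A rough Python sketch of what such a comparison could look like. Everything here is an assumption for illustration: the function names are hypothetical, the "known" records are made up, a site is treated as unreliable for WGBS when the reference plus known genotype contain both C and T (or both G and A), and the output is a plain concordance count rather than a real likelihood score.

```python
def parse_genotypes(vcf_lines):
    """Map (chrom, pos) -> (ref, genotype bases) for the first sample."""
    calls = {}
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.split()
        chrom, pos, ref, alt = cols[0], int(cols[1]), cols[3], cols[4]
        # GT is looked up by name because it is not always the first key.
        sample = dict(zip(cols[8].split(":"), cols[9].split(":")))
        alleles = sample.get("GT", "./.").replace("|", "/").split("/")
        if "." in alleles:
            continue  # skip missing calls
        bases = [ref] + alt.split(",")
        calls[(chrom, pos)] = (ref, tuple(sorted(bases[int(a)] for a in alleles)))
    return calls


def is_bisulfite_ambiguous(ref, genotype):
    """Assumed rule: C/T and G/A differences are confounded by conversion."""
    seen = {ref, *genotype}
    return {"C", "T"} <= seen or {"G", "A"} <= seen


def concordance(known, query):
    """Return (matches, compared, skipped) over sites present in both sets."""
    matches = compared = skipped = 0
    for site, (ref, known_gt) in known.items():
        if site not in query:
            continue
        if is_bisulfite_ambiguous(ref, known_gt):
            skipped += 1
            continue
        compared += 1
        matches += known_gt == query[site][1]
    return matches, compared, skipped


# Hypothetical known genotypes (e.g. from array or WGS); not real data.
known_lines = [
    "chr1 852875 . C T 50 PASS . GT 1/1",
    "chr1 900000 . A C 50 PASS . GT 0/1",
    "chr1 910000 . T A 50 PASS . GT 1/1",
]
# First record is the Biscuit call from this issue; the rest are made up.
query_lines = [
    "chr1 852875 . C T,A,G 34 PASS . DP:GT:GP:GQ:SP:CV:BT 60:0/1:84,4,115:99:C33,T26,A0:.:",
    "chr1 900000 . A C 40 PASS . DP:GT 30:0/1",
    "chr1 910000 . T A 40 PASS . DP:GT 30:0/1",
]

result = concordance(parse_genotypes(known_lines), parse_genotypes(query_lines))
print("matches, compared, skipped:", result)
```

Splitting on whitespace is a shortcut for this sketch; a real script would split on tabs or use a proper VCF parser, and would turn the counts into an actual per-pair score.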

ttriche commented Feb 22, 2017

bcftools gtcheck will do this, if the VCFs are valid VCF v4.1. I'm going to take a whack at that.

ttriche commented Feb 28, 2017

6fc5d23 fixes the VCF issue (don't look at how trivial the fix was, you will feel bad, I did). bcftools csq now works on the generated VCF files; bcftools gtcheck should too.
