I have genotype array, WGS, and WGBS data for my samples, and I am using them to detect sample swaps. I find that Biscuit genotype calls are highly concordant with WGS genotype calls, except when the reference is 'C' and the true genotype is 'TT'. I understand that it is not possible to genotype accurately in this case, but I am curious about Biscuit's behavior. For example:
From the pileup, there is no evidence of a C allele:
chr1 852875 C 60 TTTTTTTTTTTTtttttttTTTTTTTtttTTTTttTTTttTTttTTTTtttTTTTTtttt
However, in the VCF, the allele support for this position shows 33 Cs and 26 Ts:
chr1 852875 . C T,A,G 34 PASS . DP:GT:GP:GQ:SP:CV:BT 60:0/1:84,4,115:99:C33,T26,A0:.:
Question: In this case, does Biscuit just generate the 'C' count from an expected distribution?
Suggestion: a nice feature would be detecting sample swaps when genotype information is known. This could be a script that compares a VCF of known genotypes to the Biscuit-generated VCF, ignores sites that are difficult or impossible to genotype correctly from WGBS, and outputs a likelihood score that the two VCFs were generated from the same individual.
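To make the suggestion concrete, here is a rough sketch of the kind of comparison I have in mind. It is not part of Biscuit, and the function names (`parse_genotypes`, `genotype_bases`, `concordance`) are made up for illustration. It assumes single-sample, uncompressed VCF text, looks only at SNVs, and treats any site involving a C/T or G/A allele pair as not genotypable from WGBS, which is a deliberately conservative assumption. It reports a plain match count rather than a proper likelihood score.

```python
from typing import Dict, Iterable, List, Tuple

Site = Tuple[str, int]
Call = Tuple[str, List[str], Tuple[int, ...]]  # (REF, ALT alleles, GT allele indices)

# Assumption: allele pairs that bisulfite conversion can mimic
# (C>T on one strand, G>A on the other) are skipped entirely.
AMBIGUOUS_PAIRS = (frozenset("CT"), frozenset("GA"))


def parse_genotypes(lines: Iterable[str]) -> Dict[Site, Call]:
    """Read the first sample's GT from VCF text lines, keyed by (chrom, pos)."""
    calls: Dict[Site, Call] = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no FORMAT/sample columns
        keys = fields[8].split(":")
        values = fields[9].split(":")
        if "GT" not in keys or keys.index("GT") >= len(values):
            continue
        gt = values[keys.index("GT")].replace("|", "/")
        if "." in gt:
            continue  # missing call
        indices = tuple(int(a) for a in gt.split("/"))
        calls[(fields[0], int(fields[1]))] = (fields[3], fields[4].split(","), indices)
    return calls


def genotype_bases(call: Call) -> Tuple[str, ...]:
    """Turn GT indices into sorted bases so ALT ordering does not matter."""
    ref, alts, indices = call
    alleles = [ref] + alts
    return tuple(sorted(alleles[i] for i in indices))


def concordance(known: Dict[Site, Call], test: Dict[Site, Call]) -> Tuple[int, int]:
    """Return (matching sites, compared sites) over shared, unambiguous SNVs."""
    matches = compared = 0
    for site in known.keys() & test.keys():
        g_known = genotype_bases(known[site])
        g_test = genotype_bases(test[site])
        involved = set(g_known) | set(g_test) | {known[site][0]}
        if any(len(allele) != 1 for allele in involved):
            continue  # skip indels and other non-SNV alleles
        if any(pair <= involved for pair in AMBIGUOUS_PAIRS):
            continue  # skip sites WGBS cannot genotype reliably
        compared += 1
        matches += g_known == g_test
    return matches, compared
```

Calling `concordance(parse_genotypes(open("known.vcf")), parse_genotypes(open("biscuit.vcf")))` (both file names are placeholders) gives a match count and a denominator. A low ratio across many compared sites would point to a swap.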
6fc5d23 fixes the VCF issue (don't look at how trivial the fix was, you will feel bad, I did). bcftools csq now works on the generated VCF files; bcftools gtcheck should too.