I ran an older version of minicaller (version 6d7e78c) over a BAM file, and it output a VCF file containing the multi-allelic variant shown below:
cusRef 24 . G A,C,T . . AC=1,1,1;AF=0.250,0.250,0.250;AN=4;DP=164410 GT:DP:DP4:DPG 0/3/2/1:164410:161365,259,2785,1:161624,1356,977,452
From the DPG field above, I calculated the alt allele frequency (AF) this way:
Alt allele T AF = 1356/(161624+1356+977+452)
Alt allele C AF = 977/(161624+1356+977+452)
Alt allele A AF = 452/(161624+1356+977+452)
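For clarity, the calculation above can be sketched in Python. The mapping of DPG counts to alleles (G, T, C, A, following the genotype string 0/3/2/1 rather than the ALT column order) is my assumption about how the old version orders the field, not something I have confirmed.

```python
# DPG counts from the old-version record above.
# Assumption: counts follow the genotype order 0/3/2/1, i.e. G, T, C, A.
dpg = {"G": 161624, "T": 1356, "C": 977, "A": 452}
ref_allele = "G"

# Denominator: all reads counted in DPG (reference plus every alt allele).
total = sum(dpg.values())

# Fraction of reads supporting each alt allele.
alt_af = {
    allele: count / total
    for allele, count in dpg.items()
    if allele != ref_allele
}

for allele, af in alt_af.items():
    print(f"Alt allele {allele} AF = {af:.6f}")
```

Note that the DPG counts sum to 164409, one less than the reported DP of 164410.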
I then ran the latest version of minicaller (version 69ca18e) over the same bam file. (I was able to resolve the "too many open files" error message by increasing the value in maxRecordsInRam. Thanks so much!)
In order to obtain all the variants, I turned off two filters by setting:
--bad-ad-ratio 1
As I understand it, the filter condition then becomes 1 < ALT/(REF+ALT) < 0, which can never be true, so no genotypes will be filtered.
--gt-fraction 0
Here the condition becomes ALT/(REF+ALT) < 0, which can never be true either, so again no genotypes will be ignored.
In addition I set --min-gt-allele-depth 10 and --min-gt-depth 10.
I then looked at the same position, and here is the variant reported by the new version:
cusRef 24 . G T 38 . AC=1;AF=0.500;AN=2;DP=324605 GT:AD:DP:FT:GQ 0/1:161624,1356:162981:LowQual:38
My questions are:
(1) The record from the new version of minicaller has only one alt allele (G->T), whereas the record from the old version has three (G->A,C,T). It looks like the new version just selected the alt allele with the highest read count. Can you please explain why?
(2) For the record from the new version, is it okay to calculate the alt allele frequency from the AD field this way?
Alt allele T AF = 1356/(161624+1356)
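To make question (2) concrete, here is a minimal Python sketch of the AD-based calculation. It pairs the FORMAT keys with the sample values copied from the new-version record above. It assumes AD lists the reference depth first and the alt depth second, which is how I read the field.

```python
# FORMAT keys and sample values copied from the new-version record above.
fmt = "GT:AD:DP:FT:GQ"
sample = "0/1:161624,1356:162981:LowQual:38"

# Pair each FORMAT key with its sample value.
fields = dict(zip(fmt.split(":"), sample.split(":")))

# Assumption: AD is "ref_depth,alt_depth" for the alleles G and T.
ref_depth, alt_depth = (int(x) for x in fields["AD"].split(","))

# AD-based allele fraction for T.
af_t = alt_depth / (ref_depth + alt_depth)
print(f"Alt allele T AF = {af_t:.6f}")
```

The denominator here is 162980, compared with 164409 in the DPG-based calculation. The difference, 1429, equals 977 + 452, the C and A counts in the old record, so the two values for T are not computed over the same set of reads.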
Our environment
I apologize for the long post, and thank you so much for your attention!
Best,
Ting