v2.0.0-beta.3 #360
nebfield announced in Announcements
Replies: 1 comment
Just a note, it's important to update any |
Changelog
Important fix: Fix splitting duplicated variant IDs across multiple scoring files
Background
The MATCH_COMBINE step writes new scoring files for input to plink2 --score
Example
When using PGS000039, PGS000040, and PGS000041 in parallel, some variants have different effect alleles at the same coordinates, for example:
22:40682469:T:C with effect allele T (PGS000041_hmPOS_GRCh38)
22:40682469:T:C with effect allele C (PGS000039_hmPOS_GRCh38)
Impact
In versions v2.0.0-beta, beta.1, and beta.2 the duplicated variant is written to the same scoring file and ignored by plink2. The duplicated variant doesn't contribute to the final calculated PGS.
In all v2.0.0-alpha versions and beta.3 a second scoring file is correctly written containing the other allele (additional alleles create extra scoring files automatically within the updated MATCH_COMBINE process). We have also updated the software tests to ensure this error doesn't occur in future releases.
This problem is more likely to happen when larger scores are calculated in parallel. As more scores are calculated in parallel, it's more likely that variant IDs with different effect alleles will duplicate and be ignored during the score calculation stage.
While the overall impact on the final score is likely to be small, we encourage users to upgrade to beta.3, especially if they calculate larger scores in parallel.
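To illustrate the idea of splitting duplicated variant IDs across scoring files, here is a minimal Python sketch. It is not the MATCH_COMBINE implementation: the function name `split_by_effect_allele`, the row layout, and the weights are all hypothetical, and the third variant is made up for the example. It only assumes that each output file must contain a given variant ID at most once.

```python
from collections import defaultdict


def split_by_effect_allele(rows):
    """Split rows into files so that no variant ID repeats within a file.

    rows: iterable of (variant_id, effect_allele, score_name, weight).
    Returns a list of dicts, one per output file, mapping
    (variant_id, effect_allele) -> {score_name: weight}.
    """
    # variant_id -> effect alleles in the order they were first seen
    alleles_seen = defaultdict(list)
    # file index -> {(variant_id, effect_allele): {score_name: weight}}
    files = defaultdict(dict)

    for variant_id, effect_allele, score_name, weight in rows:
        alleles = alleles_seen[variant_id]
        if effect_allele not in alleles:
            alleles.append(effect_allele)
        # The first allele seen for a variant goes to file 0,
        # the second distinct allele to file 1, and so on.
        file_index = alleles.index(effect_allele)
        key = (variant_id, effect_allele)
        files[file_index].setdefault(key, {})[score_name] = weight

    return [files[i] for i in sorted(files)]


# Hypothetical weights; only the variant ID and effect alleles of the
# first two rows come from the example above.
rows = [
    ("22:40682469:T:C", "T", "PGS000041_hmPOS_GRCh38", 0.1),
    ("22:40682469:T:C", "C", "PGS000039_hmPOS_GRCh38", 0.2),
    ("1:1000:A:G", "A", "PGS000040_hmPOS_GRCh38", 0.3),
]

scoring_files = split_by_effect_allele(rows)
for index, content in enumerate(scoring_files):
    print(index, sorted(content))
```

With this input the duplicated variant ends up in two separate files, one per effect allele, so neither copy has to be dropped at the scoring stage.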
How do I know if my data are affected?
One missing variant appears in the output. This check is now included in the scoring module.
Other fixes
--keep_ambiguous parameter: Issue with '--keep_ambiguous' Option and Possible Bug #346
This discussion was created from the release v2.0.0-beta.3.