The mehari wrapper for new versions (>=0.25) needs to always run bcftools norm #504

Open
Nicolai-vKuegelgen opened this issue Apr 29, 2024 · 2 comments

Comments

@Nicolai-vKuegelgen
Contributor

Describe the bug
Mehari generally requires VCF files to be free of multi-allelic variants, which can be split into bi-allelic records by a bcftools norm call.
The current wrapper runs bcftools norm only when the input VCF file is also restricted to a given BED file. The mehari version currently in use (0.21.1) does not seem to fail on all multi-allelic variants, so this works for now. We will soon need to update the mehari version, though.

To Reproduce
Steps to reproduce the behavior:

  1. (Manually) update mehari version to >= 0.25.5
  2. Run varfish-export step without using a bed file

Expected behavior
The mehari wrapper should always run bcftools norm.
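
As a rough illustration of the expected behavior, the sketch below builds a bcftools norm command that is always emitted, with the BED restriction added only when a BED file is given. The function name `build_norm_command`, the file names, and the use of `--regions-file` for the BED restriction are assumptions made for this example; the actual wrapper may restrict regions differently.

```python
from typing import List, Optional


def build_norm_command(
    input_vcf: str,
    output_vcf: str,
    bed_file: Optional[str] = None,
) -> List[str]:
    """Build a bcftools norm call that always splits multi-allelic records.

    Hypothetical helper, not the wrapper's actual code.
    """
    # "-m-" is the form quoted in this issue: split multi-allelic sites.
    cmd = ["bcftools", "norm", "-m-"]
    if bed_file is not None:
        # Assumption: the BED restriction is passed via --regions-file,
        # which requires an indexed input file.
        cmd += ["--regions-file", bed_file]
    # Write bgzip-compressed VCF to output_vcf.
    cmd += ["-O", "z", "-o", output_vcf, input_vcf]
    return cmd


if __name__ == "__main__":
    print(" ".join(build_norm_command("in.vcf.gz", "out.vcf.gz")))
    print(" ".join(build_norm_command("in.vcf.gz", "out.vcf.gz", "targets.bed")))
```

The point of the design is that normalization no longer depends on whether a BED file is supplied; only the region restriction is conditional.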

@Nicolai-vKuegelgen
Contributor Author

Note: the current mehari wrapper actually does not split multi-allelic variants at all (that would require bcftools norm -m-).

@Nicolai-vKuegelgen
Contributor Author

Nicolai-vKuegelgen commented Apr 29, 2024

Addendum: the snappy (g)vcf files contain the "AS_UNIQ_ALT_READ_COUNT" INFO field, which causes problems when running bcftools norm -m-.

See also: related issues and field description. It might be best to disable or remove this field from the snappy output.
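
One possible workaround, sketched here under assumptions, is to drop the field with bcftools annotate before splitting. The helper name `build_strip_and_norm_pipeline` and the file names are hypothetical, and whether discarding this field is acceptable downstream is not settled in this issue; disabling it in the snappy output, as suggested above, would make this step unnecessary.

```python
import shlex

# INFO field named in this issue as causing problems with "bcftools norm -m-".
FIELD = "INFO/AS_UNIQ_ALT_READ_COUNT"


def build_strip_and_norm_pipeline(input_vcf: str, output_vcf: str) -> str:
    """Return a shell pipeline that removes FIELD, then splits multi-allelics.

    Hypothetical helper for illustration only.
    """
    # "annotate -x" removes the listed annotation; "-O u" streams
    # uncompressed BCF to the next command.
    annotate = ["bcftools", "annotate", "-x", FIELD, "-O", "u", input_vcf]
    # "-" makes bcftools norm read from standard input.
    norm = ["bcftools", "norm", "-m-", "-O", "z", "-o", output_vcf, "-"]
    return f"{shlex.join(annotate)} | {shlex.join(norm)}"


if __name__ == "__main__":
    print(build_strip_and_norm_pipeline("in.vcf.gz", "out.vcf.gz"))
```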
