Clarification on haplotagging in modkit pileup output #380

Open
K1999ban opened this issue Feb 21, 2025 · 1 comment
Labels
question Looking for clarification on inputs and/or outputs

Comments

@K1999ban

I'm using modkit pileup to generate bedmethyl files for methylation analysis. I'm interested in understanding how haplotypes are handled in the output.

Specifically, I'd like to know:

  1. Does modkit pileup generate haplotagged bedmethyl files by default? If so, is there a way to disable this and generate a single, non-haplotagged file?
  2. If the output is haplotagged, how are unaligned fragments handled? Are they included in any of the output files or discarded?

My goal is to analyze all reads together, including unaligned fragments, without considering haplotypes. If modkit pileup doesn't currently support this, I'd be grateful for any suggestions or workarounds.

Thank you for your time and clarification!

@ArtRand
Contributor

ArtRand commented Feb 24, 2025

Hello @K1999ban,

> Does modkit pileup generate haplotagged bedmethyl files by default? If so, is there a way to disable this and generate a single, non-haplotagged file?

Modkit does not generate haplotagged bedMethyl files by default. There is a general way to partition reads in a modBAM by tag, using the --partition-tag <TAG> flag. For haplotypes this is usually --partition-tag HP. When you use this option, you'll want the output to be a directory instead of a file. There is additional documentation online.
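
As a rough sketch of such a partitioned run (reads.bam and pileup_by_hp are placeholder names, and the run helper is only an illustration that prints the command unless DRY_RUN=0 is set):

```shell
# Hypothetical helper: print a command, and execute it only when DRY_RUN=0.
run() {
  echo "$*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# reads.bam: placeholder for an aligned, haplotagged modBAM.
# pileup_by_hp: output directory, since one bedMethyl is written per HP value.
run modkit pileup reads.bam pileup_by_hp --partition-tag HP
```

Reads without an HP tag would then end up in the "ungrouped" bedMethyl inside that directory.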

> If the output is haplotagged, how are unaligned fragments handled? Are they included in any of the output files or discarded?

Do you mean "untagged" instead of "unaligned"? The untagged fragments are partitioned into an "ungrouped" bedMethyl file. Unaligned reads aren't used in pileup.

> My goal is to analyze all reads together, including unaligned fragments, without considering haplotypes. If modkit pileup doesn't currently support this, I'd be grateful for any suggestions or workarounds.

If you want to have all of the reads together, simply omit the --partition-tag option (i.e., use the default behavior). If you expect that you'll want the partitioned bedMethyls at some point, you could run pileup with --partition-tag and then use modkit bedmethyl merge to merge the results, so that you have both the combined and the separated tables.
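
A minimal sketch of both routes, under some assumptions: all file names are placeholders, the per-partition names (1.bed, 2.bed, ungrouped.bed) are assumed rather than guaranteed, and the merge options shown (--out-bed, --genome-sizes) should be checked against modkit bedmethyl merge --help for your version. The run helper is hypothetical and only prints the commands unless DRY_RUN=0 is set.

```shell
# Hypothetical helper: print a command, and execute it only when DRY_RUN=0.
run() {
  echo "$*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Route 1: default behavior, all aligned reads in a single bedMethyl.
run modkit pileup reads.bam combined.bed

# Route 2: partition by haplotype first, then merge the pieces afterwards.
run modkit pileup reads.bam pileup_by_hp --partition-tag HP
# Assumed file names and merge options; sizes.tsv is a placeholder
# table of contig names and lengths.
run modkit bedmethyl merge \
  pileup_by_hp/1.bed pileup_by_hp/2.bed pileup_by_hp/ungrouped.bed \
  --out-bed merged.bed --genome-sizes sizes.tsv
```

Route 1 is enough if haplotypes are never needed; route 2 costs an extra step but leaves both the per-haplotype and the combined tables on disk.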

Hope this helps!

@ArtRand ArtRand added the question Looking for clarification on inputs and/or outputs label Feb 24, 2025