How to annotate bedmethyl file with cpg name instead of chr name and position and How can I find the pvalue for association of the cpg with the phenotype? #317

Open
ralanany opened this issue Dec 15, 2024 · 4 comments
Labels
question Looking for clarification on inputs and/or outputs

Comments

@ralanany

No description provided.

@ArtRand
Contributor

ArtRand commented Dec 19, 2024

Hello @ralanany,

Are you looking to join the chrom name, start, stop coordinates with a table of CpG "names"? Modkit doesn't have that exact functionality, but most dataframe and spreadsheet programs should be able to handle it. We're actively working on "phenotype" scores, but I don't have a solution for you right now.
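
For illustration, the join described above could be sketched in plain Python as follows. This is a minimal sketch, not modkit functionality: the function names `load_names` and `annotate` are hypothetical, and it assumes the name table is tab-separated with columns chrom, start, end, name on the same 0-based coordinates and reference build as the bedMethyl file.

```python
import csv

def load_names(names_tsv):
    """Map (chrom, start, end) -> CpG name.

    Assumes a tab-separated table with exactly four columns:
    chrom, start, end, name (hypothetical layout).
    """
    names = {}
    for chrom, start, end, name in csv.reader(names_tsv, delimiter="\t"):
        names[(chrom, int(start), int(end))] = name
    return names

def annotate(bedmethyl, names):
    """Yield each bedMethyl row with the CpG name appended ('.' if no match).

    Only the first three columns (chrom, start, end) are used as the key;
    the remaining columns are passed through unchanged.
    """
    for row in csv.reader(bedmethyl, delimiter="\t"):
        key = (row[0], int(row[1]), int(row[2]))
        yield row + [names.get(key, ".")]
```

Both functions accept any iterable of lines, so an open file handle works directly. Positions with no entry in the name table are kept and marked with `.` rather than dropped.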

@ArtRand ArtRand added the question Looking for clarification on inputs and/or outputs label Dec 19, 2024
@ralanany
Author

ralanany commented Dec 26, 2024

Hi @ArtRand
Thanks for your reply. Actually, I don't have a table with the CpG names. What I am looking for is how to find the CpG name (cg000...) for each position.
I have the bedMethyl file with the chromosome, position, and modification, but the CpG name is not included. I would like help finding the CpG names, as below:

cg13869341 chr1 [15865, 15865] *
cg14008030 chr1 [18827, 18827] *
cg12045430 chr1 [29407, 29407] *

thanks in advance

@ArtRand
Contributor

ArtRand commented Dec 30, 2024

Hello @ralanany,

From a quick search on your CpG names, it looks like these identifiers come from an Illumina probe set. If this is the case, I would download these tables and transform them into a BED6 file (chrom, start, stop, name, score, strand; you can set score=0 for all of the records). Then use this file to run bedtools intersect -a $bedmethyl -b $probes_bed -wb.

Your question makes me think it would be convenient if, when using --include-bed in modkit pileup, the name (when present) were carried along in the bedMethyl output. I'll consider adding this enhancement.
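
The BED6 conversion step could look roughly like the sketch below. It is not a definitive recipe: `manifest_to_bed6` is a hypothetical helper, and the column names `IlmnID`, `CHR`, and `MAPINFO` are assumptions about the probe table that should be checked against the file you actually download. It also assumes the position column is 1-based and that the probe coordinates are on the same reference build as your alignments.

```python
def manifest_to_bed6(records, name_col="IlmnID", chrom_col="CHR",
                     pos_col="MAPINFO"):
    """Yield BED6 lines from an iterable of probe records (dicts).

    Column names are assumptions; pass different ones if your table differs.
    """
    for rec in records:
        chrom = rec.get(chrom_col, "")
        pos = rec.get(pos_col, "")
        if not chrom or not pos:
            continue  # skip records that carry no coordinates
        if not chrom.startswith("chr"):
            chrom = "chr" + chrom  # match "chr1"-style contig names
        end = int(pos)    # assumed 1-based position of the CpG
        start = end - 1   # BED intervals are 0-based, half-open
        # score is fixed at 0; strand is left as "." (unknown)
        yield "\t".join([chrom, str(start), str(end), rec[name_col], "0", "."])
```

The records can come from `csv.DictReader` once any preamble lines before the column header have been skipped. Writing the yielded lines to a file gives the `$probes_bed` input for the bedtools command above.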
