Abundance profile for multiple chromosomes per species? #17

Open
andromedia33 opened this issue Aug 25, 2017 · 0 comments

Hi,
I'd like to simulate reads generated from human samples, and I have a question about the correct way to assign abundance ratios in this case.
For the simplest case of one individual, my reference genome consists of 23 chromosomes, so my "multi-fasta genomes" file would look like a regular reference genome FASTA file:

>chr1
ACGT
>chr2
ACGT
...
>chrX
ACGT

Now my question is, what ratio should be assigned to each chromosome? Should the abundance ratio be equal for each chromosome, or should it be proportional to the chromosome's length? That is, should my abundance file look like this?

chr1          1/23
chr2          1/23
...
chrX          1/23

OR, should it be like this?

chr1          249250621/sum_of_lengths
chr2          243199373/sum_of_lengths
...
chrX          155270560/sum_of_lengths

I'm assuming the number of reads coming from each chromosome should scale with its length, so I guess I should go with the second option? Any clarification is greatly appreciated!
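
To make the second option concrete, here is a minimal Python sketch of how the length-proportional ratios could be computed. It assumes the abundance file is a tab-separated list of sequence name and fraction, as in the examples above, and that the simulator reads abundance as the fraction of reads drawn from each sequence; neither assumption has been confirmed by the maintainers. The function names (`fasta_lengths`, `length_proportional`, `format_abundance`) are made up for illustration and are not part of the tool.

```python
from typing import Dict, Iterable


def fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return the sequence length of each FASTA record, keyed by the
    first word of its header line."""
    lengths: Dict[str, int] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def length_proportional(lengths: Dict[str, int]) -> Dict[str, float]:
    """Turn lengths into fractions that sum to 1 (the second option)."""
    total = sum(lengths.values())
    return {name: n / total for name, n in lengths.items()}


def format_abundance(abundances: Dict[str, float]) -> str:
    """Render one 'name<TAB>fraction' row per sequence (assumed format)."""
    return "\n".join(f"{name}\t{value:.6f}" for name, value in abundances.items())


# Only the three lengths quoted in this issue; a real run would pass an open
# FASTA file to fasta_lengths() to cover every chromosome.
example = {"chr1": 249250621, "chr2": 243199373, "chrX": 155270560}
abund = length_proportional(example)
print(format_abundance(abund))
```

If the simulator instead treats abundance as the relative number of copies of each sequence and scales by length internally, equal values (the first option) would be the ones that give uniform coverage, so it is worth checking which interpretation applies before using either file.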
