Abundance profile for multiple chromosomes per species? #17

andromedia33 · 2017-08-25T07:24:48Z

Hi,
I'd like to simulate reads generated from human samples, and I have a question about the correct way to assign abundance ratios in this case.
For the simplest case of one individual, my reference genome consists of 23 chromosomes, so my "multi-fasta genomes" file would look like a regular reference genome FASTA file:

>chr1
ACGT
>chr2
ACGT
...
>chrX
ACGT

Now my question is: what ratio should be assigned to each chromosome? Should the abundance be equal for every chromosome, or proportional to chromosome length? That is, should my abundance file look like this?

chr1          1/23
chr2          1/23
...
chrX          1/23

OR, should it be like this?

chr1          249250621/sum_of_lengths
chr2          243199373/sum_of_lengths
...
chrX          155270560/sum_of_lengths

I'm assuming the number of reads coming from each chromosome should scale with its length, so I guess I should go with the second option? Any clarification is greatly appreciated!
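
To make the second option concrete, here is a minimal sketch of how the length-proportional ratios could be computed from the multi-FASTA itself. It assumes the abundance file is two tab-separated columns (sequence name, fraction) and that the simulator interprets the value as the share of reads drawn from that sequence; neither assumption is confirmed here. The function names are hypothetical and are not part of the simulator.

```python
from typing import Dict, Iterable


def fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence name: total bases} for the lines of a multi-FASTA."""
    lengths: Dict[str, int] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            lengths[name] = 0
        elif name is not None:
            # Sequence lines may be wrapped, so accumulate across lines.
            lengths[name] += len(line)
    return lengths


def length_proportional_abundance(lengths: Dict[str, int]) -> Dict[str, float]:
    """Option 2: each sequence's fraction is its length / sum of all lengths."""
    total = sum(lengths.values())
    if total == 0:
        raise ValueError("no sequence data found")
    return {name: n / total for name, n in lengths.items()}


def format_abundance(abundance: Dict[str, float]) -> str:
    """Render one 'name<TAB>fraction' row per sequence (assumed file layout)."""
    return "\n".join(f"{name}\t{frac:.6f}" for name, frac in abundance.items())


# Toy input standing in for a real reference; a real run would pass an open
# file handle instead of this list.
toy_fasta = [">chr1", "ACGTAC", ">chr2", "AC"]
print(format_abundance(length_proportional_abundance(fasta_lengths(toy_fasta))))
```

If the simulator instead treats the abundance as a per-genome copy number and scales by sequence length internally, equal values (the first option) would already give uniform coverage, so it is worth checking which interpretation applies before using these numbers.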
