unmatched chromosome name get error #5

Open
worker000000 opened this issue Nov 6, 2020 · 6 comments

@worker000000

worker000000 commented Nov 6, 2020

Dear professor,
thanks for such accurate software.

When I use it, it raises errors like the following, maybe caused by chrM versus chrMT.

What is more, your genome.dict has many patch chromosome names, like

SN:GL000207.1
SN:GL000226.1
SN:GL000229.1
SN:GL000231.1
But when users supply a BAM file, they may have used a different version of the genome. Even for hg19, the patch chromosomes seem to differ. So why is the input a BAM file rather than FASTQ? Can you give me some suggestions?

[image]

[image]

The error screenshot is as follows:
[image]

@worker000000
Author

Another question: we know WGS usually yields large CNV segments, so why is the window size in the configure file 500? People seem to use 1 Mb rather than 500 bp.

@fanxinping
Collaborator

We recommend remapping your samples to the reference genome provided by us to avoid unexpected behaviour. Alternatively, you can generate your own reference data according to https://www.yfish.org/display/PUB/Accucopy#Accucopy-3.7Makeyourownreferencegenomepackage
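Before remapping, it can help to confirm which contig names actually differ. The sketch below is not part of Accucopy. It assumes you have the BAM header as text (for example from `samtools view -H sample.bam`) and that genome.dict is a SAM-style sequence dictionary with `@SQ` lines, as the `SN:` entries quoted above suggest. The function names and the sample headers are made up for illustration.

```python
def contig_names(header_text):
    """Return the set of SN: names found on @SQ lines of a SAM-style header."""
    names = set()
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Fields are tab-separated; the contig name is the one tagged SN:
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.add(field[3:])
    return names


def compare_contigs(bam_header, ref_dict):
    """Return (names only in the BAM, names only in the reference), both sorted."""
    bam = contig_names(bam_header)
    ref = contig_names(ref_dict)
    return sorted(bam - ref), sorted(ref - bam)


# Illustrative headers only; real ones come from your BAM and your genome.dict.
bam_header = (
    "@HD\tVN:1.6\n"
    "@SQ\tSN:chr1\tLN:249250621\n"
    "@SQ\tSN:chrM\tLN:16571\n"
)
ref_dict = (
    "@HD\tVN:1.6\n"
    "@SQ\tSN:chr1\tLN:249250621\n"
    "@SQ\tSN:chrMT\tLN:16569\n"
    "@SQ\tSN:GL000207.1\tLN:4262\n"
)

only_in_bam, only_in_ref = compare_contigs(bam_header, ref_dict)
print("Only in BAM:", only_in_bam)
print("Only in reference:", only_in_ref)
```

Any name listed as present on only one side is a candidate cause of the mismatch error. If the lists are not empty, remapping or building a matching reference package is the safer route.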

@polyactis
Owner

You probably need to watch some videos or read some reviews/tutorials to understand how DNA is extracted from a cell, fragmented, and PCR-amplified before it is put on a DNA sequencing machine. 500bp is NOT the CNA length. It is the average length of DNA fragments to be sequenced by a high-throughput DNA sequencer, e.g. Illumina HiSeq or NovaSeq.

These so-called next-gen sequencers can only sequence 100-150bp for one fragment, not from start to end of a chromosome. Anyhow, you need to get familiar with what a next-gen sequencer can and cannot do.

> Another question: we know WGS usually yields large CNV segments, so why is the window size in the configure file 500? People seem to use 1 Mb rather than 500 bp.

@worker000000
Author

> We recommend remapping your samples to the reference genome provided by us to avoid unexpected behaviour. Alternatively, you can generate your own reference data according to https://www.yfish.org/display/PUB/Accucopy#Accucopy-3.7Makeyourownreferencegenomepackage

Thanks a lot. So can this tool accept a FASTQ file instead of a BAM file?

@worker000000
Author

> You probably need to watch some videos or read some reviews/tutorials to understand how DNA is extracted from a cell, fragmented, and PCR-amplified before it is put on a DNA sequencing machine. 500bp is NOT the CNA length. It is the average length of DNA fragments to be sequenced by a high-throughput DNA sequencer, e.g. Illumina HiSeq or NovaSeq.
>
> These so-called next-gen sequencers can only sequence 100-150bp for one fragment, not from start to end of a chromosome. Anyhow, you need to get familiar with what a next-gen sequencer can and cannot do.
>
> Another question: we know WGS usually yields large CNV segments, so why is the window size in the configure file 500? People seem to use 1 Mb rather than 500 bp.

Thanks a lot. So how should I understand the value of 500 used here for segmentation? The documentation says:

> window_size: the window size in base pairs for segmentation. The segmentation program (GADA) first calculates the number of reads for each window and then performs segmentation over the genome. A small window size often leads to a large number of small segments. The recommended window size is 500bp.

@fanxinping
Collaborator

> We recommend remapping your samples to the reference genome provided by us to avoid unexpected behaviour. Alternatively, you can generate your own reference data according to https://www.yfish.org/display/PUB/Accucopy#Accucopy-3.7Makeyourownreferencegenomepackage
>
> Thanks a lot. So can this tool accept a FASTQ file instead of a BAM file?

No, Accucopy accepts BAM files only.

> You probably need to watch some videos or read some reviews/tutorials to understand how DNA is extracted from a cell, fragmented, and PCR-amplified before it is put on a DNA sequencing machine. 500bp is NOT the CNA length. It is the average length of DNA fragments to be sequenced by a high-throughput DNA sequencer, e.g. Illumina HiSeq or NovaSeq.
>
> These so-called next-gen sequencers can only sequence 100-150bp for one fragment, not from start to end of a chromosome. Anyhow, you need to get familiar with what a next-gen sequencer can and cannot do.
>
> Another question: we know WGS usually yields large CNV segments, so why is the window size in the configure file 500? People seem to use 1 Mb rather than 500 bp.
>
> Thanks a lot. So how should I understand the value of 500 used here for segmentation?
>
> window_size: the window size in base pairs for segmentation. The segmentation program (GADA) first calculates the number of reads for each window and then performs segmentation over the genome. A small window size often leads to a large number of small segments. The recommended window size is 500bp.

The 500bp window for segmentation is simply a value that worked well in our testing; you can set a different value.
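To make the role of window_size concrete, here is a minimal sketch of the per-window read counting step that the quoted documentation describes. It is not GADA or Accucopy code. The function name, the 0-based start positions, and the sample data are assumptions made for illustration only.

```python
from collections import Counter


def count_reads_per_window(read_starts, window_size=500):
    """Count reads per fixed-size window, keyed by window index.

    read_starts: 0-based start positions of reads on one chromosome.
    Window i covers positions [i * window_size, (i + 1) * window_size).
    Windows with no reads are simply absent from the result.
    """
    counts = Counter(pos // window_size for pos in read_starts)
    return dict(sorted(counts.items()))


# Made-up read start positions on a single chromosome.
starts = [10, 120, 499, 500, 730, 1600, 1999, 2000]

counts_500 = count_reads_per_window(starts, window_size=500)
counts_1000 = count_reads_per_window(starts, window_size=1000)

print(counts_500)
print(counts_1000)
```

A smaller window gives the segmentation step more, noisier data points, which is consistent with the documentation's note that small windows tend to produce many small segments. A larger window smooths the counts at the cost of resolution.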
