Chromosome path not found in graph or haplotypes index #4411
Comments
I think the correct solution in your case would be to either 1) only convert the non-reference sequence into W lines, or 2) add the I should also note that the GFF for a given haplotype is expected to follow the Pan-SN naming specification, which has semantics |
I know that. I set it up this way solely to better represent my samples for now.
I tried these solutions.
It seems that the GFA format requirements are strict: the file must include the reference path as a P line along with the haplotype paths, and the name of the reference path cannot follow the Pan-SN naming specification. In fact, for this species I have multiple genomes and corresponding annotation files, so any haplotype could be treated as the reference. I tried running with two annotation files (MHChr01 gff3 and ZSChr01 gff3).
I got the result from
The I used the same GFA file to run
The paths in the new GFA file.
I got the result from
It seems that the results are the same, but |
If you have annotations for the other haplotypes, then that's what the Can you specify what the |
This is the GFA file includes
Maybe I confused something. Before I figure it out, the GFA file
Based on this GFA files, I have no way to use I don't know how to use Now I tried to use
However, it may be difficult to identify haplotype-specific transcripts clearly. When I read the article When I have multiple genomes and their corresponding annotations, I don't know how to build them correctly. Because I have multiple annotation files, the same gene may have different start and end positions and different lengths on the two haplotypes. Some transcripts also appear only on specific haplotypes. These are the two types of haplotype-specific transcripts in my ideal situation. Don't worry, I'm just organizing my thoughts and ideas. I really look forward to your suggestions. |
I'm sorry, I had misunderstood: transcripts don’t all have the same length. But I have a question: when I project the reference annotation onto the haplotype, not all transcript annotations are projected. In what situations would that happen? |
I think to use the GFA as input for With the Regarding the |
Hi, I have the same issue here. I generated the graph using minigraph Cactus with different reference samples (see below).
I’ve tried to match the gene annotation file with the chromosome names, but I’m still encountering the same error message. How can I resolve this? Could you please provide me with an example of the gene annotation file? My script looks like this:
vg rna -p --threads $SLURM_CPUS_PER_TASK \
-q --transcripts test.gtf --gbz-format \
--write-gbwt ${chr}.pantranscriptome.gbwt \
--use-hap-ref \
--write-info ${chr}.pantranscriptome.txt \
-f ${chr}.pantranscriptome.fa \
minigraph.chr22.gbz \
--gbwt-bidirectional > ${chr}.spliced_graph.pg
@jeizenga Sorry for the late response, I was on vacation for a week. It would be great if you could still take some time to take a look. Here are the GFA file and annotation file I used. Two annotation files: I can't get the following commands to work with this GFA file.
Maybe I wasn’t clear before, but I find this a bit confusing. When I specify multiple annotation files without enabling projection, |
@jeizenga I think I have figured out how
The GFA file and GTF file used in
The GFA file and GTF file used in
Although |
1. What were you trying to do?
I used vg rna to build a spliced pangenome and a pantranscriptome from a PGGB GFA file and a gff3 file. Similarly, I used vg autoindex to build indexes for mpmap and rpvg.
2. What did you want to happen?
Get a spliced pangenome and a pantranscriptome from vg rna. Get index files from vg autoindex.
3. What actually happened?
I got an error
ERROR: Chromosome path "MHChr01" not found in graph or haplotypes index (line 2)
from vg rna and vg autoindex.
4. If you got a line like Stack trace path: /somewhere/on/your/computer/stacktrace.txt, please copy-paste the contents of that file here:
5. What data and command can the vg dev team use to make the problem happen?
This is a small graph from PGGB. Two haplotypes in the graph are named MHChr01#0#chr01 and ZSChr01#1#chr01.
sub.gfa.gz
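For readers unfamiliar with names like MHChr01#0#chr01, here is a minimal sketch of how a Pan-SN style name breaks into sample, haplotype, and contig fields. `pansn_split` is a hypothetical helper written for illustration, not a vg command, and it assumes the name contains exactly two `#` separators.

```shell
# Split a Pan-SN style name (sample#haplotype#contig) into its three fields.
# pansn_split is a hypothetical helper, not part of vg.
# Assumes the name contains exactly two '#' separators.
pansn_split() {
    name=$1
    sample=${name%%#*}        # text before the first '#'
    rest=${name#*#}           # text after the first '#'
    haplotype=${rest%%#*}     # text between the first and second '#'
    contig=${rest#*#}         # text after the second '#'
    printf '%s\t%s\t%s\n' "$sample" "$haplotype" "$contig"
}

# The two haplotype names from this issue.
pansn_split 'MHChr01#0#chr01'
pansn_split 'ZSChr01#1#chr01'
```

Under this split, the name MHChr01 quoted in the error message corresponds to the sample field only, while the contig field is chr01.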
Then I converted it to GFA with W lines.
sub_w.gfa.gz
command:
vg convert -f sub.gfa > sub_w.gfa
This is a gff3 file.
mh_chr1.gff3.gz
commands for vg rna and vg autoindex
The paths in sub_w.gfa
The gff3 file
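A rough way to compare the two listings is to check each sequence ID in the annotation against the names in the GFA. The sketch below is only a plain string comparison and is not vg's own lookup logic, which may apply different matching rules. `missing_seqids` is a hypothetical helper; it assumes tab-separated GFA 1.1 records, where a P line carries its name in column 2 and a W line carries its sequence ID in column 4.

```shell
# Print annotation sequence IDs that match no P-line name and no W-line
# sequence ID in a GFA file. This is a plain string comparison, not vg's
# own path lookup. missing_seqids is a hypothetical helper.
# Usage: missing_seqids graph.gfa annotation.gff3
missing_seqids() {
    awk -F'\t' '
        # First file (GFA): remember P-line names and W-line sequence IDs.
        FNR == NR {
            if ($1 == "P") names[$2] = 1
            else if ($1 == "W") names[$4] = 1
            next
        }
        # Second file (GFF3/GTF): report each unmatched seqid once.
        !/^#/ && NF >= 9 && !($1 in names) && !seen[$1]++ { print $1 }
    ' "$1" "$2"
}
```

Running it on the files from this issue would look like `missing_seqids sub_w.gfa mh_chr1.gff3` (after decompressing them); any ID it prints is one that has no exact counterpart among the graph's path names.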
I don't know why it can't find the chromosome paths. I tried running vg autoindex with gfa_with_w_lines.gfa and gfa_with_w_lines.gtf from the vg tests, and it works.
Then I tried renaming the two haplotypes to MHChr01 and ZSChr01, so that the GFA file contains only P lines.
Now vg rna works.
But vg autoindex still doesn't work.
Could you help me see what the problem is? Thanks for your help.
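The renaming step described above can be reproduced with a short awk filter. This is a sketch of one way to do it, not necessarily how it was done here: `strip_pansn_from_p_lines` is a hypothetical helper that shortens each P-line name to the text before the first `#`, and it assumes each sample contributes a single path so the shortened names stay unique. It only reproduces the renaming; as reported above, vg autoindex still failed on the renamed file.

```shell
# Rewrite P-line names of the form sample#hap#contig to just the sample field.
# strip_pansn_from_p_lines is a hypothetical helper, not a vg command.
# Assumes one path per sample, so the shortened names remain unique.
# Usage: strip_pansn_from_p_lines sub.gfa > sub_renamed.gfa
strip_pansn_from_p_lines() {
    awk 'BEGIN { FS = OFS = "\t" }
         # Drop everything from the first "#" onward in the path name.
         $1 == "P" { sub("#.*", "", $2) }
         { print }' "$1"
}
```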
6. What does running vg version say?