Header error in variant_calls.snps.phased.vcf.gz #36
Comments
Hi, can you check if there are any intermediate files in
Hello @umahsn, thank you for your reply. And yes, there are a ... My input is a BAM file (555.1 KB) and my reference is a FASTA file (3.6 KB).
Hi, I think there might be a problem with passing the filenames internally within NanoCaller for haploid genomes. Let me check this and get back to you.
Can you tell me if /home/aziz/mapping/SRR23337893/variant_calls.snps.phased.vcf.gz or refsequenceID.snps.phased.vcf.gz files are empty and if they have a header?
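One way to answer that question without extra tools is a short Python check. This is only a sketch: the helper name `inspect_vcf_gz` is hypothetical and not part of NanoCaller, and it assumes a valid VCF starts with a `##fileformat=VCF` line, as the VCF specification requires.

```python
import gzip
import os


def inspect_vcf_gz(path):
    """Return (size_in_bytes, has_vcf_header) for a gzip/bgzip-compressed VCF.

    Hypothetical helper for illustration; not part of NanoCaller.
    """
    size = os.path.getsize(path)
    if size == 0:
        # A zero-byte file cannot contain a header.
        return size, False
    try:
        # bgzip output is gzip-compatible, so the gzip module can read it.
        with gzip.open(path, "rt") as handle:
            first_line = handle.readline()
    except (OSError, EOFError):
        # Not gzip data at all, or a truncated archive.
        return size, False
    # A VCF file must begin with the ##fileformat meta-information line.
    return size, first_line.startswith("##fileformat=VCF")
```

Calling `inspect_vcf_gz("/home/aziz/mapping/SRR23337893/variant_calls.snps.phased.vcf.gz")` would report the file size and whether a VCF header was found.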
Hello @umahsn, thank you for your response.
Hi, I checked the issue, and it turns out that the presence of the colon symbol ":" in the names of the reference sequences is causing the problem. NanoCaller uses Linux system commands to run WhatsHap for phasing and bcftools for VCF file manipulation. As a result, if a VCF file is named after a reference sequence that has a colon in its name, Linux is not able to resolve the path to the file correctly. Once I replace the colon with some other symbol in the reference and BAM files, it runs correctly.
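For anyone hitting the same problem, the renaming step on the reference side could look roughly like the sketch below. The function name `sanitize_fasta_names` and the choice of "_" as the replacement character are assumptions made for illustration; NanoCaller does not provide this helper.

```python
def sanitize_fasta_names(src_path, dst_path, bad_char=":", replacement="_"):
    """Copy a FASTA file, replacing bad_char in each sequence name.

    Hypothetical helper for illustration; not part of NanoCaller.
    Returns a dict mapping each changed name to its new name.
    """
    renamed = {}
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            if not line.startswith(">"):
                # Sequence lines are copied unchanged.
                dst.write(line)
                continue
            # The sequence name is the first whitespace-delimited token;
            # anything after it is a free-text description.
            parts = line[1:].rstrip("\n").split(None, 1)
            name = parts[0] if parts else ""
            new_name = name.replace(bad_char, replacement)
            if new_name != name:
                renamed[name] = new_name
            description = " " + parts[1] if len(parts) > 1 else ""
            dst.write(">" + new_name + description + "\n")
    return renamed
```

The sequence names in the BAM header have to match the renamed reference. Realigning the reads against the sanitized FASTA is the simplest way to get there; rewriting the header with `samtools reheader` is another option.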
Hello, I ran this command to detect variants in my ONT reads (mapped with minimap2):
NanoCaller --mode all --sequencing ont --haploid_genome --bam sorted_mapped_reads.bam --ref genes.fna
I got this as a result:
2023-06-23 12:27:16.562651: Starting NanoCaller.
NanoCaller command and arguments are saved in the following file: /home/aziz/mapping/SRR23337893/args
2023-06-23 12:27:16.947255: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
SNP Calling Progress: 100%|███████████████████████| 2/2 [00:00<00:00, 6.89it/s]
2023-06-23 12:27:18.763662: Combining SNP calls.
2023-06-23 12:27:18.764897: Compressing and indexing SNP calls.
Writing to /tmp/bcftools.dkVQT8
Merging 1 temporary files
Cleaning
Done
2023-06-23 12:27:18.824115: SNP calling completed. Time taken= 0.4034
Indel Calling Progress: 100%|█████████████████████| 2/2 [00:00<00:00, 3.99it/s]
2023-06-23 12:27:19.487620: Compressing and indexing indel calls.
Checking the headers and starting positions of 2 files
[E::bcf_hdr_read] Input is not detected as bcf or vcf format
Failed to parse header: /home/aziz/mapping/SRR23337893/variant_calls.snps.phased.vcf.gz
2023-06-23 12:27:20.501190: Indel calling completed. Time taken= 1.6770
2023-06-23 12:27:20.501373: Total Time Elapsed: 3.94 seconds
Everything seems to run fine, except for a header problem in the file variant_calls.snps.phased.vcf.gz:
2023-06-23 12:27:19.487620: Compressing and indexing indel calls.
Checking the headers and starting positions of 2 files
[E::bcf_hdr_read] Input is not detected as bcf or vcf format
Failed to parse header: /home/aziz/mapping/SRR23337893/variant_calls.snps.phased.vcf.gz
Can this error affect my results? Does anyone have an idea about it? Thanks in advance.