Phased bam file not generated using "all" mode #54

Open
salhajh opened this issue Feb 6, 2025 · 7 comments

Comments

@salhajh

salhajh commented Feb 6, 2025

Hello.

I would like to run NanoCaller using the "all" mode which, as I understand it, should generate the phased bam file after calling the SNPs and automatically feed it through to the indel function (I have ONT sequencing data which has been aligned to a reference genome). However, when I run this command:

python NanoCaller/NanoCaller --bam $target.bam --ref .ref.fa --sequencing ont --output ${patientID}-nanoCaller_output

it produces a SNP file, but when it gets to the indel function, this error message comes up: "[E::hts_open_format] Failed to open file "nanoCaller_output/intermediate_phase_files/chr1.phased.bam" : No such file or directory"

There are phased vcf files generated, however those are empty and no bam files are generated. The input bam file is correctly formatted and the index file is also available (I have been able to use this bam file with a SV caller). I am not sure if there is something missing from my code or input files, however I would appreciate assistance with fixing this issue. Otherwise, I guess the alternative is to generate the SNPs and indels separately, using a phased bam file for the indels.

Thank you!

@umahsn
Collaborator

umahsn commented Feb 6, 2025

Hi,

Can you run NanoCaller with the --verbose parameter? It will help us figure out any errors in running the phasing commands here.

Additionally, can you check whether the intermediate_phase_files folder has non-empty VCF files ending with snps.unphased.vcf or snps.lowq.unphased.vcf.gz? It may be that the QUAL scores in the VCF file are too low, so no variants end up in the snps.unphased.vcf files that are used for phasing.
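
For anyone wanting to run this check quickly, here is a minimal Python sketch that counts variant records (non-header lines) in every VCF in a folder. The function names and the folder path are illustrative assumptions and are not part of NanoCaller; adjust the path to your own output directory.

```python
import gzip
from pathlib import Path


def count_vcf_records(vcf_path):
    """Count non-header, non-blank lines in a plain or gzip-compressed VCF."""
    path = Path(vcf_path)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt") as handle:
        return sum(1 for line in handle if line.strip() and not line.startswith("#"))


def summarize_vcf_folder(folder):
    """Map each .vcf / .vcf.gz file name in `folder` to its record count."""
    counts = {}
    for path in sorted(Path(folder).iterdir()):
        if path.name.endswith((".vcf", ".vcf.gz")):
            counts[path.name] = count_vcf_records(path)
    return counts


if __name__ == "__main__":
    # Hypothetical location: replace with your own output directory.
    folder = Path("nanoCaller_output/intermediate_phase_files")
    if folder.is_dir():
        for name, n in summarize_vcf_folder(folder).items():
            print(f"{n:>10}  {name}")
    else:
        print(f"{folder} not found")
```

A count of 0 for a file means it holds only header lines (or nothing at all), which is the "empty VCF" situation described in this thread.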

@salhajh
Author

salhajh commented Feb 7, 2025

Thank you for your quick response. It seems that there is a problem with the software versions on the HPC I am running NanoCaller on, as I was able to run the same bam files on a separate computer and generate the combined SNP and indel vcf using "all" mode. I am currently working through the issues on the HPC to see if it can be solved.

Thank you once again!

@lilypeck

Hello

I am having the same problem as @salhajh

I am using NanoCaller v3.6.0. In October I ran it on another genome of the same species, which worked fine. However, it is now failing on both the original genome and new ones.

My command is

apptainer exec --bind $PWD:$PWD \
        ../nanocaller_3.6.0.sif NanoCaller \
        --bam ${barcode}_haploid.final.sorted.bam \
        --ref ${barcode}_haploid.final.fa \
        --mode all \
        --sequencing ont \
        --cpu 10 \
        --prefix ${barcode}_haploid.final \
        --enable_whatshap

All of the vcfs are empty:

Image

and at the end of the jobscript are the errors

2025-02-28 13:23:09.517815: Compressing and indexing SNP calls.
mkdtemp(/work/7757745.2.pod_smp.q/bcftools.zNrZ6v) failed: No such file or directory

Failed to read from /u/project/vlsork/ldpeck/longreads/flye/samOD/barcode05/barcode05_haploid.final.unfiltered.snps.vcf.gz: unknown file type

2025-02-28 13:23:09.864276: SNP calling completed. Time taken= 1746.1362

  0%|          | 0/11831 [00:00<?, ?it/s]
Indel Calling Progress:   0%|          | 0/11831 [00:00<?, ?it/s]
[E::hts_open_format] Failed to open file "/u/project/vlsork/ldpeck/longreads/flye/samOD/barcode05/intermediate_phase_files/NC_044910.1_RagTag.phased.bam" : No such file or directory

Do you know why this is failing / why the temporary or phased bams aren't being created?

Thank you in advance!

Lily

joblog.7757745.txt
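
One thing the log above suggests checking: the `mkdtemp(...) failed: No such file or directory` line means a temporary directory could not be created under `/work/7757745.2.pod_smp.q`. A possible explanation (an assumption, not a confirmed diagnosis) is that this path comes from the `TMPDIR` environment variable set on the host and is not visible inside the container, since the command binds only `$PWD`. The sketch below, with a hypothetical helper name, tests whether the directory named by `TMPDIR` exists and accepts a new scratch directory; it would need to be run inside the same container and job environment.

```python
import os
import tempfile


def check_temp_dir(path):
    """Return (ok, message) after trying to create a scratch directory in `path`."""
    if not path:
        return False, "no directory given (is TMPDIR unset?)"
    if not os.path.isdir(path):
        return False, f"{path} does not exist or is not a directory"
    try:
        # mkdtemp is the same kind of call that failed in the log.
        scratch = tempfile.mkdtemp(prefix="tmpcheck.", dir=path)
    except OSError as exc:
        return False, f"cannot create a directory in {path}: {exc}"
    os.rmdir(scratch)  # clean up the empty scratch directory
    return True, f"{path} is usable"


if __name__ == "__main__":
    # Read TMPDIR directly: tempfile.gettempdir() falls back to other
    # locations when TMPDIR is unusable, which would hide the problem.
    ok, message = check_temp_dir(os.environ.get("TMPDIR"))
    print("OK" if ok else "PROBLEM", "-", message)
```

If this reports a problem inside the container, one thing to try is pointing `TMPDIR` at a directory that is bound into the container (for example one under `$PWD`) before launching the job.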

@salhajh
Author

salhajh commented Mar 3, 2025

Hi Lily,

I just thought I would jump in to mention what the issue turned out to be for me. Although I had loaded the correct modules within the conda environment, the HPC system I use was reverting to different versions that are not compatible with NanoCaller, particularly tensorflow. So I would suggest confirming which modules are not only loaded but actually used by the system. In particular, if you create a conda environment with a specific Python version, it might revert to the defaults for that version.

That was the main issue for me, however I am not sure if something else is not working, resulting in the error message you have provided. I just thought I would explain what was the solution for me just in case.

Good luck!
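
A minimal sketch of the version check described above: it reports which interpreter is actually running and which package versions that interpreter sees, without importing the packages themselves. The function name and the package list are illustrative assumptions; tensorflow is the package named in this comment, and the others are examples to replace with whatever your environment is meant to pin.

```python
import sys
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if not found."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("python:", sys.version.split()[0])
    # Illustrative package list; edit to match your environment.
    for name, version in installed_versions(["tensorflow", "numpy", "pysam"]).items():
        print(f"{name}: {version or 'not installed in this environment'}")
```

Running it from inside the job script, rather than from a login shell, shows what the scheduler environment really resolves to.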

@lilypeck

lilypeck commented Mar 5, 2025

Hi @salhajh

Thank you! I think that versions shouldn't be a problem because I'm using the apptainer image, but hopefully @umahsn can advise.

@umahsn
Collaborator

umahsn commented Mar 5, 2025

Hi @lilypeck, can you check if the intermediate_snp_files folder has any VCF files that are not empty, especially intermediate_snp_files/combined.snps.vcf?

@lilypeck

lilypeck commented Mar 5, 2025

Hi @umahsn

Thanks for your reply. Yes there are lots of non-empty VCFs:

657462399 Feb 28 09:45 combined.snps.vcf
 76080654 Feb 28 09:45 barcode02_haploid.final.9.snps.vcf
 69209178 Feb 28 09:45 barcode02_haploid.final.12.snps.vcf
 68215396 Feb 28 09:45 barcode02_haploid.final.8.snps.vcf
 73511019 Feb 28 09:45 barcode02_haploid.final.4.snps.vcf
 70836117 Feb 28 09:45 barcode02_haploid.final.7.snps.vcf
 76859744 Feb 28 09:45 barcode02_haploid.final.5.snps.vcf

The quality scores in combined.snps.vcf range from 3 to 99, with an average of 33. I checked the quality scores from the previous run, which completed successfully, and they looked about the same.

Thanks

Lily
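
For comparing runs, the QUAL range and average quoted above can be reproduced with a short Python sketch like the one below. It reads the sixth tab-separated column of each record, which is where the VCF format puts QUAL. The function name is hypothetical and the file path in the usage note is taken from this thread as an example.

```python
import gzip
from statistics import mean


def qual_summary(vcf_path):
    """Return (count, min, max, mean) of QUAL values, or None if no records."""
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    quals = []
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            # QUAL is the sixth column of a VCF record; "." means missing.
            if len(fields) > 5 and fields[5] != ".":
                quals.append(float(fields[5]))
    if not quals:
        return None
    return len(quals), min(quals), max(quals), mean(quals)
```

For example, `qual_summary("intermediate_snp_files/combined.snps.vcf")` on the failing run and on the earlier successful run would give directly comparable numbers.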
