Markdup doesn't accept unpaired reads aligned with minimap2 #55

Open
charlenelawdes opened this issue Jun 5, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@charlenelawdes

charlenelawdes commented Jun 5, 2024

Description of the bug

I have Nanopore reads aligned to the GRCh38_hmf reference using minimap2, as it's the recommended aligner for Nanopore reads. The run fails at the process NFCORE_ONCOANALYSER:WGTS:READ_PROCESSING:MARKDUPS with the error `java.lang.IllegalStateException: Inappropriate call if not paired read`.

Command used and terminal output

nextflow run nf-core/oncoanalyser \
  -resume \
  -r 0.4.6 \
  -profile singularity \
  --mode wgts \
  --genome GRCh38_hmf \
  --input $SMPSHEET \
  --outdir $OUT \
  -c $APPTAINER_CONFIG

Relevant files

Jun.-05 11:43:16.852 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[jobId: 30075009; id: 4; name: NFCORE_ONCOANALYSER:WGTS:READ_PROCESSING:MARKDUPS (PPT45_SIGN1048); status: COMPLETED; exit: 1; error: -; workDir: /PPT45/oncoanalyser_test/scripts/work/d0/6d687313e1fcffb2700d85157ebf1f started: 1717602196849; exited: 2024-06-05T15:42:45Z; ]
Jun.-05 11:43:16.874 [Task monitor] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for
  task: name=NFCORE_ONCOANALYSER:WGTS:READ_PROCESSING:MARKDUPS (PPT45_SIGN1048); work-dir=/PPT45/oncoanalyser_test/scripts/work/d0/6d687313e1fcffb2700d85157ebf1f
  error [nextflow.exception.ProcessFailedException]: Process `NFCORE_ONCOANALYSER:WGTS:READ_PROCESSING:MARKDUPS (PPT45_SIGN1048)` terminated with an error exit status (1)
Jun.-05 11:43:16.905 [Task monitor] ERROR nextflow.processor.TaskProcessor - Error executing process > 'NFCORE_ONCOANALYSER:WGTS:READ_PROCESSING:MARKDUPS (PPT45_SIGN1048)'

Caused by:
  Process `NFCORE_ONCOANALYSER:WGTS:READ_PROCESSING:MARKDUPS (PPT45_SIGN1048)` terminated with an error exit status (1)

Command executed:

  markdups \
      -Xmx36721970381 \
      \
      -samtools $(which samtools) \
      -sambamba $(which sambamba) \
      \
      -sample SIGN1048 \
      -input_bam SIGN1048_aligned_sorted_RG.bam \
      \
      -form_consensus \
       \
      \
      -unmap_regions unmap_regions.38.tsv \
      -ref_genome GCA_000001405.15_GRCh38_no_alt_analysis_set.fna \
      -ref_genome_version 38 \
      \
      -write_stats \
      -threads 6 \
      \
      -output_bam SIGN1048.markdups.bam
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_ONCOANALYSER:WGTS:READ_PROCESSING:MARKDUPS":
      markdups: $(markdups -version | awk '{ print $NF }')
      sambamba: $(sambamba --version 2>&1 | egrep '^sambamba' | head -n 1 | awk '{ print $NF }')
      samtools: $(samtools --version 2>&1 | egrep '^samtools\s' | head -n 1 | sed 's/^.* //')
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  INFO:    Environment variable SINGULARITYENV_TMPDIR is set, but APPTAINERENV_TMPDIR is preferred
  INFO:    Environment variable SINGULARITYENV_NXF_DEBUG is set, but APPTAINERENV_NXF_DEBUG is preferred
  INFO:    Environment variable SINGULARITYENV_NXF_TASK_WORKDIR is set, but APPTAINERENV_NXF_TASK_WORKDIR is preferred
  /usr/local/bin/markdups: line 6: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8): No such file or directory
  15:42:44.608 [main] [INFO ] MarkDups version 1.1.5
  15:42:44.772 [main] [INFO ] output(./)
  15:42:45.038 [main] [INFO ] loaded 80309 unmapping regions from unmap_regions.38.tsv
  15:42:45.038 [main] [INFO ] duplicate logic: consensus
  15:42:45.039 [main] [INFO ] sample(SIGN1048) starting mark duplicates
  15:42:45.536 [Thread-0] [ERROR] read(id(c68f9a3b-05e1-438f-9daa-bf6ec949f9b0) coords(chr1:10001-41941) cigar(3302S24M1D83M1I28M2I10M1D30M1I41M1I19M1I9M2D6M1D29M1I12M1I5M3I5M1I3M1I18M5I62M1D6M1D4M1D25M1I6M1I31M1I44M2D116M7I1M1I71M26I104M2I5M3D8M4I4M4I6M8I34M105I106M3I5M3I39M1D7M1I43M1I78M1I1050M1D412M2D93M1D57M1I1M1I328M1D46M1I523M2D281M1I18M1D207M1D91M12D522M1I424M2D326M2I34M1I326M1I52M2I15M1D3M3D1M1D122M3D76M2I133M1I200M1D236M1D718M1D22M2I130M1D273M3I2M2I154M1D376M3D85M1I67M1I176M2D1511M2I2M1D466M1I7M2I282M1D5M1D3M1I120M1D220M1D11M2D374M2I117M1I118M1I239M1D2M1I3M1D1508M1I111M1I510M2D545M3D510M1I58M2D438M1D121M1I2M1D196M1I46M1I26M1I177M1I230M1I769M1D62M1D2M1I1015M3D180M3I204M1I506M1D172M2I5M1D241M5D3M3D543M1I161M1D250M2D1227M1I9M1D8M1I40M4D1362M1D9M1D575M2I2M1D1443M2D335M2D11M1D67M2I3M1I378M1I515M1I11M2D11M1D1204M1I10M1D141M2I44M1I44M3I228M1I622M1I1286M1I9M2D157M2I87M1D1214M9S) mate(*:0) flags(0)) exception: java.lang.IllegalStateException: Inappropriate call if not paired read
  java.lang.IllegalStateException: Inappropriate call if not paired read
  	at htsjdk.samtools.SAMRecord.requireReadPaired(SAMRecord.java:892)
  	at htsjdk.samtools.SAMRecord.getMateUnmappedFlag(SAMRecord.java:919)
  	at com.hartwig.hmftools.markdups.ReadPositionsCache.processRead(ReadPositionsCache.java:105)
  	at com.hartwig.hmftools.markdups.PartitionReader.processSamRecord(PartitionReader.java:207)
  	at com.hartwig.hmftools.markdups.BamReader.sliceRegion(BamReader.java:58)
  	at com.hartwig.hmftools.markdups.PartitionReader.processRegion(PartitionReader.java:123)
  	at com.hartwig.hmftools.markdups.PartitionThread.run(PartitionThread.java:61)

Work dir:
  /PPT45/oncoanalyser_test/scripts/work/d0/6d687313e1fcffb2700d85157ebf1f

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
Jun.-05 11:43:16.917 [Task monitor] INFO  nextflow.Session - Execution cancelled -- Finishing pending tasks before exit

System information

Nextflow version : 23.10.0
Hardware: HPC
Executor: slurm
Container engine: Apptainer
OS: CentOS
nf-core/oncoanalyser version: 0.4.6
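
For context on the failing read: the log above reports it with flags(0) and mate(*:0), so the SAM paired bit (0x1) is unset, which is expected for single-end Nanopore alignments. A minimal sketch for checking how many records lack that bit follows. The function name `count_paired_flags` is hypothetical, and the sketch assumes SAM text on stdin (for example from `samtools view -h`).

```shell
# Hypothetical helper: count SAM records with and without the paired bit (0x1).
# Reads SAM text on stdin; header lines (starting with '@') are skipped.
count_paired_flags() {
  awk -F '\t' '
    /^@/          { next }               # skip header lines
    ($2 % 2) == 1 { paired++; next }     # FLAG is column 2; an odd value means 0x1 is set
                  { unpaired++ }
    END           { printf "paired=%d unpaired=%d\n", paired, unpaired }
  '
}
```

One way to feed it, assuming samtools is on the PATH, is `samtools view -h SIGN1048_aligned_sorted_RG.bam | count_paired_flags`. Running `samtools view -c -F 1` on the same BAM should report the unpaired count directly.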

@charlenelawdes charlenelawdes added the bug Something isn't working label Jun 5, 2024
@scwatts
Collaborator

scwatts commented Jun 6, 2024

Thanks for reporting this. There have been several bug fixes in the latest MarkDups release (v1.1.7). Would you be able to first see whether this data works with that release?

To do that, I'd recommend navigating to the MarkDups work directory from your oncoanalyser analysis, then downloading and running MarkDups 1.1.7. Something like the following should work:

cd /PPT45/oncoanalyser_test/scripts/work/d0/6d687313e1fcffb2700d85157ebf1f/

wget https://github.com/hartwigmedical/hmftools/releases/download/mark-dups-v1.1.7/mark-dups_v1.1.7.jar

java -Xmx36721970381 -jar mark-dups_v1.1.7.jar \
  -samtools $(which samtools) \
  -sambamba $(which sambamba) \
  -sample SIGN1048 \
  -input_bam SIGN1048_aligned_sorted_RG.bam \
  -form_consensus \
  -unmap_regions unmap_regions.38.tsv \
  -ref_genome GCA_000001405.15_GRCh38_no_alt_analysis_set.fna \
  -ref_genome_version 38 \
  -write_stats \
  -threads 6 \
  -output_bam SIGN1048.markdups.bam

@scwatts
Collaborator

scwatts commented Jul 12, 2024

I'll close this issue for now - if you'd like to continue discussing/debugging, please reopen!

@scwatts scwatts closed this as completed Jul 12, 2024
@bwbioinfo

I've just tried with the updated version and am still getting the same issue.

Some extra information: these are ONT reads mapped with minimap2. The command and output were:

java -Xmx36721970381 -jar mark-dups_v1.1.7.jar \
     \
    \
    -samtools $(which samtools) \
    -sambamba $(which sambamba) \
    \
    -sample test \
    -input_bam test_merged.sorted.reheader.bam \
    \
    -form_consensus \
    -umi_enabled -umi_duplex -umi_duplex_delim + \
    \
    -unmap_regions unmap_regions.38.tsv \
    -ref_genome hg38.fa \
    -ref_genome_version 38 \
    \
    -write_stats \
    -threads 6 \
    \
    -output_bam test.markdups.bam
15:56:45.919 [main] [INFO ] MarkDups version 1.1.7
15:56:45.968 [main] [INFO ] output(./)
15:56:46.079 [main] [INFO ] loaded 80309 unmapping regions from unmap_regions.38.tsv
15:56:46.079 [main] [INFO ] duplicate logic: UMIs
15:56:46.080 [main] [INFO ] sample(test) starting mark duplicates
java.lang.IllegalStateException: Inappropriate call if not paired read
        at htsjdk.samtools.SAMRecord.requireReadPaired(SAMRecord.java:892)
        at htsjdk.samtools.SAMRecord.getMateUnmappedFlag(SAMRecord.java:919)
        at com.hartwig.hmftools.markdups.PartitionReader.processSamRecord(PartitionReader.java:204)
        at com.hartwig.hmftools.markdups.BamReader.sliceRegion(BamReader.java:58)
        at com.hartwig.hmftools.markdups.PartitionReader.processRegion(PartitionReader.java:125)
        at com.hartwig.hmftools.markdups.PartitionThread.run(PartitionThread.java:61)

@charlenelawdes
Author

I'm unable to re-open the issue, but the problem is still ongoing.

@scwatts
Collaborator

scwatts commented Jan 14, 2025

Thanks for the updates; I've now re-opened the issue. However, since the error still occurs with the latest release, this appears to be a bug that would need to be fixed in MarkDups itself.

Can you please open an issue in hartwigmedical/hmftools, where MarkDups is developed, for the maintainers to review? I'll keep this issue open until the problem is resolved over there.
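
For anyone writing that upstream report, the stack traces in this thread show `SAMRecord.getMateUnmappedFlag` going through `requireReadPaired`, which throws when the read is not flagged as paired. The SAM specification likewise says the mate bits carry no meaning when 0x1 is unset. The sketch below only illustrates that rule on raw FLAG values. It is not the MarkDups code or a proposed patch, and the function name `mate_unmapped` is hypothetical.

```shell
# Hypothetical illustration: consult the mate-unmapped bit (0x8)
# only when the paired bit (0x1) is set on the FLAG value.
mate_unmapped() {
  if [ $(( $1 & 1 )) -eq 0 ]; then
    echo "n/a"       # unpaired read: mate bits are undefined
  elif [ $(( $1 & 8 )) -ne 0 ]; then
    echo "true"      # paired, and the mate is flagged as unmapped
  else
    echo "false"     # paired, and the mate is mapped
  fi
}
```

With the read from the original report, `mate_unmapped 0` takes the first branch, which is the case that raises the exception in the traces above.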

@scwatts scwatts reopened this Jan 14, 2025