Run modkit once + general changes to methylation subworkflow (#451)
* Only running modkit once (depending on phasing)

* Moved methylation subworkflow to own directory

* Added tests

* Fixed test failures

* Updated tests

* Added input to methylation call site

* Renamed input channel 0

* Fixed alignment

* Fixed workflow dependencies

* Updated pipeline test snapshot

* Only importing modkit once, setting args in config

* Merged calls to METHYLATION

* Corrected process names

* Merge changelog

* Added parameters to methylation tests

* Removed input phased

* Added files to workflow output

* Moved parameter check to methylation callsite

* Fixed test config

* Fixed WhatsHap stats stub

* Fixed process selectors in test config

* Fixed join in test

* Removed dumps

* Updated pipeline test snaps

* Fixed call to PREPARE_GENOME in test

* Added methylation test to CI workflow

* Fixed formatting

* Removed unused channel

* Removed unused config

* Minor indentation and comment fixes

* Updated docs

* Added missing join

* Updates snapshots

* Changed formatting to appease the pre-commit gods
Schmytzi authored Nov 5, 2024
1 parent b5cb868 commit 9953afa
Showing 16 changed files with 979 additions and 219 deletions.
1 change: 1 addition & 0 deletions .github/workflows/ci.yml
@@ -40,6 +40,7 @@ jobs:
- "ANNOTATE_SVS"
- "RANK_VARIANTS"
- "CALL_REPEAT_EXPANSIONS"
+ - "METHYLATION"
profile:
- "docker"

3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -32,6 +32,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [#445](https://github.com/genomic-medicine-sweden/nallo/pull/445) - Added FOUND_IN tag and nf-test to rank variants
- [#446](https://github.com/genomic-medicine-sweden/nallo/pull/446) - Added the vcfstatsreport from DeepVariant to snv calling
- [#450](https://github.com/genomic-medicine-sweden/nallo/pull/450) - Added ranking of SVs (and CNVs)
+ - [#451](https://github.com/genomic-medicine-sweden/nallo/pull/451) - Added support for running methylation subworkflow without phasing
+ - [#451](https://github.com/genomic-medicine-sweden/nallo/pull/451) - Added nf-test to methylation

### `Changed`

@@ -75,6 +77,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [#441](https://github.com/genomic-medicine-sweden/nallo/pull/441) - Changed the minimap2 preset for hifi reads back to `map-hifi`
- [#443](https://github.com/genomic-medicine-sweden/nallo/pull/443) - Refactored reference channel assignments
- [#443](https://github.com/genomic-medicine-sweden/nallo/pull/443) - Updated schemas for `vep_plugin_files` and `snp_db`
+ - [#451](https://github.com/genomic-medicine-sweden/nallo/pull/451) - Simplified methylation subworkflow
- [#474](https://github.com/genomic-medicine-sweden/nallo/pull/474) - Updated VEP and CADD channels to fix bugs introduced in [#443](https://github.com/genomic-medicine-sweden/nallo/pull/443)
- [#479](https://github.com/genomic-medicine-sweden/nallo/pull/479) - Replaced bgzip tabix with bcftools sort in rank variants to fix [#457](https://github.com/genomic-medicine-sweden/nallo/issues/457)
- [#484](https://github.com/genomic-medicine-sweden/nallo/pull/484) - Updated metro map and added SVG version
34 changes: 5 additions & 29 deletions conf/modules/methylation.config
@@ -24,51 +24,27 @@ process {
]
}

- withName: '.*:METHYLATION:MODKIT_PILEUP_UNPHASED' {
+ withName: '.*:METHYLATION:MODKIT_PILEUP' {
ext.args = { [
"${params.extra_modkit_options}",
'--combine-mods',
'--cpg',
'--combine-strands',
+ !params.skip_phasing_wf ? '--partition-tag HP' : '',
].join(' ') }
ext.prefix = { "${meta.id}_modkit_pileup" }
publishDir = [
- path: { "${params.outdir}/methylation/modkit/pileup/unphased/${meta.id}" },
+ path: { "${params.outdir}/methylation/modkit/pileup/${meta.id}" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.endsWith('.log') ? filename : null }
]
}

- withName: '.*:METHYLATION:MODKIT_PILEUP_PHASED' {
- ext.args = { [
- "${params.extra_modkit_options}",
- '--combine-mods',
- '--cpg',
- '--combine-strands',
- '--partition-tag HP'
- ].join(' ') }
- ext.prefix = { "${meta.id}_modkit_pileup_phased" }
- publishDir = [
- path: { "${params.outdir}/methylation/modkit/pileup/phased/${meta.id}" },
- mode: params.publish_dir_mode,
- saveAs: { filename -> filename.endsWith('.log') ? filename : null }
- ]
-
- }
-
- withName: '.*:METHYLATION:BGZIP_MODKIT_PILEUP_UNPHASED' {
- ext.prefix = { "${input.simpleName}" }
- publishDir = [
- path: { "${params.outdir}/methylation/modkit/pileup/unphased/${meta.id}" },
- mode: params.publish_dir_mode,
- saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
- ]
- }

- withName: '.*:METHYLATION:BGZIP_MODKIT_PILEUP_PHASED' {
+ withName: '.*:METHYLATION:TABIX_BGZIPTABIX' {
ext.prefix = { "${input.simpleName}" }
publishDir = [
- path: { "${params.outdir}/methylation/modkit/pileup/phased/${meta.id}" },
+ path: { "${params.outdir}/methylation/modkit/pileup/${meta.id}" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
16 changes: 8 additions & 8 deletions docs/output.md
@@ -36,14 +36,14 @@ If the pipeline is run with phasing, the aligned reads will be happlotagged usin

## Methylation pileups

- [Modkit](https://github.com/nanoporetech/modkit) is used to create methylation pileups, producing bedMethyl files for both haplotagged and ungrouped reads. Additionaly, methylation information can be viewed in the BAM files, for example in IGV.
-
- | Path | Description |
- | ----------------------------------------------------------------------------------- | --------------------------------------------------------- |
- | `methylation/modkit/pileup/phased/{sample}/*.modkit_pileup_phased_*.bed.gz` | bedMethyl file with summary counts from haplotagged reads |
- | `methylation/modkit/pileup/phased/{sample}/*.modkit_pileup_phased_ungrouped.bed.gz` | bedMethyl file for ungrouped reads |
- | `methylation/modkit/pileup/unphased/{sample}/*.modkit_pileup.bed.gz` | bedMethyl file with summary counts from all reads |
- | `methylation/modkit/pileup/unphased/{sample}/*.bed.gz.tbi` | Index of the corresponding bedMethyl file |
+ [Modkit](https://github.com/nanoporetech/modkit) is used to create methylation pileups, producing bedMethyl files for both haplotagged and ungrouped reads. Additionally, methylation information can be viewed in the BAM files, for example in IGV.
+
+ | Path | Description |
+ | ---------------------------------------------------------------------------- | ----------------------------------------------------------------------------------- |
+ | `methylation/modkit/pileup/{sample}/*.modkit_pileup_phased_*.bed.gz` | bedMethyl file with summary counts from haplotagged reads (if phasing is turned on) |
+ | `methylation/modkit/pileup/{sample}/*.modkit_pileup_phased_ungrouped.bed.gz` | bedMethyl file for ungrouped reads (if phasing is turned on) |
+ | `methylation/modkit/pileup/{sample}/*.modkit_pileup.bed.gz` | bedMethyl file with summary counts from all reads (if phasing is turned off) |
+ | `methylation/modkit/pileup/{sample}/*.bed.gz.tbi` | Index of the corresponding bedMethyl file |

## MultiQC

2 changes: 1 addition & 1 deletion docs/usage.md
@@ -200,7 +200,7 @@ Turned off with `--skip_phasing_wf`.

### Methylation

- This subworkflow relies on mapping, short variant calling and phasing subworkflows, but requires no additional files.
+ This subworkflow relies on mapping and short variant calling subworkflows, but requires no additional files.

Turned off with `--skip_methylation_wf`.

2 changes: 1 addition & 1 deletion modules/local/whatshap/stats/main.nf
@@ -38,7 +38,7 @@ process WHATSHAP_STATS {
stub:
def prefix = task.ext.prefix ?: "${meta.id}"
"""
- touch ${prefix}.stats.tsv.gz
+ touch ${prefix}.stats.tsv
touch ${prefix}.blocks.tsv
cat <<-END_VERSIONS > versions.yml
38 changes: 0 additions & 38 deletions subworkflows/local/methylation.nf

This file was deleted.

32 changes: 32 additions & 0 deletions subworkflows/local/methylation/main.nf
@@ -0,0 +1,32 @@
+ include { MODKIT_PILEUP } from '../../../modules/nf-core/modkit/pileup/main'
+ include { TABIX_BGZIPTABIX } from '../../../modules/nf-core/tabix/bgziptabix/main'
+
+ workflow METHYLATION {
+
+ take:
+ ch_bam_bai // channel: [ val(meta), bam, bai ]
+ ch_fasta // channel: [ val(meta), fasta ]
+ ch_fai // channel: [ val(meta), fai ]
+ ch_bed // channel: [ val(meta), bed ]
+
+ main:
+ ch_versions = Channel.empty()
+
+ // Performs pileups per haplotype if the phasing workflow is on, set in config
+ MODKIT_PILEUP (ch_bam_bai, ch_fasta, ch_bed)
+ ch_versions = ch_versions.mix(MODKIT_PILEUP.out.versions)
+
+
+ MODKIT_PILEUP.out.bed
+ .transpose()
+ .set { ch_bgzip_modkit_pileup_in }
+
+ TABIX_BGZIPTABIX ( ch_bgzip_modkit_pileup_in )
+ ch_versions = ch_versions.mix(TABIX_BGZIPTABIX.out.versions)
+
+ emit:
+ bed = TABIX_BGZIPTABIX.out.gz_tbi.map { meta, bed, tbi -> [ meta, bed ] } // channel: [ val(meta), path(bed) ]
+ tbi = TABIX_BGZIPTABIX.out.gz_tbi.map { meta, bed, tbi -> [ meta, tbi ] } // channel: [ val(meta), path(tbi) ]
+ versions = ch_versions // channel: [ versions.yml ]
+ }
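
Note on the consolidated `ext.args` closure for `MODKIT_PILEUP` in `conf/modules/methylation.config`: its effect can be sketched in plain Groovy. This is a minimal illustration under stated assumptions, not pipeline code. `modkitArgs` is a hypothetical helper, and the `params` map is a plain stand-in for Nextflow's `params` object. It shows that `--partition-tag HP` is only added to the modkit arguments when phasing is not skipped.

```groovy
// Hypothetical helper (not part of the pipeline) that mirrors the
// ext.args closure configured for MODKIT_PILEUP.
// `params` is a plain Map standing in for Nextflow's params object.
def modkitArgs(Map params) {
    [
        "${params.extra_modkit_options}",
        '--combine-mods',
        '--cpg',
        '--combine-strands',
        // Partition the pileup by the HP tag only when phasing runs
        !params.skip_phasing_wf ? '--partition-tag HP' : '',
    ].join(' ')
}

// Phasing enabled: the HP partition flag is included
println(modkitArgs(extra_modkit_options: '', skip_phasing_wf: false).trim())

// Phasing skipped: the flag is replaced by an empty string
println(modkitArgs(extra_modkit_options: '', skip_phasing_wf: true).trim())
```

Empty entries in the list leave extra spaces after `join(' ')`, which is why the sketch trims the result before printing; on a command line the extra whitespace should be harmless.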
