review suggestions
fellen31 committed Nov 14, 2024
1 parent 04cff8a commit 42694e9
Showing 9 changed files with 71 additions and 71 deletions.
44 changes: 22 additions & 22 deletions CHANGELOG.md
@@ -62,7 +62,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [#411](https://github.com/genomic-medicine-sweden/nallo/pull/411) - Updated longphase module to most recent version. ([#409](https://github.com/genomic-medicine-sweden/nallo/issues/409)).
- [#416](https://github.com/genomic-medicine-sweden/nallo/pull/416) - Updated WhatsHap to 2.3 and added the `--use-supplementary` flag to use supplementary reads for phasing by default. Changed modules to use biocontainers instead of custom containers. ([#296](https://github.com/genomic-medicine-sweden/nallo/issues/296))
- [#417](https://github.com/genomic-medicine-sweden/nallo/pull/417) - Updated SNV annotation tests to use correct configuration, and snapshot the md5sum, and summary of the variants
- [#418](https://github.com/genomic-medicine-sweden/nallo/pull/418) - Changed the default value of `--parallel_alignments` from 1 to 8, meaning the pipeline will perform parallel alignment by default
- [#418](https://github.com/genomic-medicine-sweden/nallo/pull/418) - Changed the default value of `--alignment_processes` from 1 to 8, meaning the pipeline will perform parallel alignment by default
- [#422](https://github.com/genomic-medicine-sweden/nallo/pull/422) - Updated nf-core/tools template to v3.0.1
- [#423](https://github.com/genomic-medicine-sweden/nallo/pull/423) - Updated metro map
- [#428](https://github.com/genomic-medicine-sweden/nallo/pull/428) - Changed from using bcftools to SVDB for SV merging
@@ -125,25 +125,25 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
| `--validationSkipDuplicateCheck` | |
| `--validationS3PathCheck` | |
| `--monochromeLogs` | `--monochrome_logs` |
| `skip_short_variant_calling` | `skip_snv_calling` |
| `skip_assembly_wf` | `skip_genome_assembly` |
| `skip_mapping_wf` | `skip_alignment` |
| `skip_methylation_wf` | `skip_methylation_analysis` |
| `skip_phasing_wf` | `skip_phasing` |
| `variant_caller` | `snv_caller` |
| `parallel_snv` | `snv_calling_processes` |
| `cadd_prescored` | `cadd_prescored_indels` |
| `snp_db` | `echtvar_snv_databases` |
| `variant_catalog` | `stranger_repeat_catalog` |
| `bed` | `target_bed` |
| `hificnv_xy` | `hificnv_expected_xy_cn` |
| `hificnv_xx` | `hificnv_expected_xx_cn` |
| `hificnv_exclude` | `hificnv_excluded_regions` |
| `reduced_penetrance` | `genmod_reduced_penetrance` |
| `score_config_snv` | `genmod_score_config_snvs` |
| `score_config_sv` | `genmod_score_config_svs` |
| `parallel_alignments` | `alignment_processes` |
| `svdb_dbs` | `svdb_sv_databases` |
| `--skip_short_variant_calling` | `--skip_snv_calling` |
| `--skip_assembly_wf` | `--skip_genome_assembly` |
| `--skip_mapping_wf` | `--skip_alignment` |
| `--skip_methylation_wf` | `--skip_methylation_pileups` |
| `--skip_phasing_wf` | `--skip_phasing` |
| `--variant_caller` | `--snv_caller` |
| `--parallel_snv` | `--snv_calling_processes` |
| `--cadd_prescored` | `--cadd_prescored_indels` |
| `--snp_db` | `--echtvar_snv_databases` |
| `--variant_catalog` | `--stranger_repeat_catalog` |
| `--bed` | `--target_regions` |
| `--hificnv_xy` | `--hificnv_expected_xy_cn` |
| `--hificnv_xx` | `--hificnv_expected_xx_cn` |
| `--hificnv_exclude` | `--hificnv_excluded_regions` |
| `--reduced_penetrance` | `--genmod_reduced_penetrance` |
| `--score_config_snv` | `--genmod_score_config_snvs` |
| `--score_config_sv` | `--genmod_score_config_svs` |
| `--alignment_processes` | `--alignment_processes` |
| `--svdb_dbs` | `--svdb_sv_databases` |

> [!NOTE]
> Parameter has been updated if both old and new parameter information is present.
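
The renames in the table above can be applied mechanically to an existing set of parameters. The following is a minimal sketch and not part of the commit: `RENAMED_PARAMS` copies the old and new names from the table, `migrate_params` is a hypothetical helper name, and the old name for `alignment_processes` is taken from the removed row (`parallel_alignments`), since the updated row lists `--alignment_processes` in both columns.

```python
# Old -> new parameter names, copied from the rename table above.
# "parallel_alignments" comes from the removed row of the table (assumption:
# the updated row, which repeats "alignment_processes", is a slip).
RENAMED_PARAMS = {
    "skip_short_variant_calling": "skip_snv_calling",
    "skip_assembly_wf": "skip_genome_assembly",
    "skip_mapping_wf": "skip_alignment",
    "skip_methylation_wf": "skip_methylation_pileups",
    "skip_phasing_wf": "skip_phasing",
    "variant_caller": "snv_caller",
    "parallel_snv": "snv_calling_processes",
    "cadd_prescored": "cadd_prescored_indels",
    "snp_db": "echtvar_snv_databases",
    "variant_catalog": "stranger_repeat_catalog",
    "bed": "target_regions",
    "hificnv_xy": "hificnv_expected_xy_cn",
    "hificnv_xx": "hificnv_expected_xx_cn",
    "hificnv_exclude": "hificnv_excluded_regions",
    "reduced_penetrance": "genmod_reduced_penetrance",
    "score_config_snv": "genmod_score_config_snvs",
    "score_config_sv": "genmod_score_config_svs",
    "parallel_alignments": "alignment_processes",
    "svdb_dbs": "svdb_sv_databases",
}


def migrate_params(params: dict) -> dict:
    """Return a copy of `params` with old parameter names replaced by new ones.

    Hypothetical helper for illustration; unknown names pass through unchanged.
    """
    migrated = {}
    for name, value in params.items():
        new_name = RENAMED_PARAMS.get(name, name)
        if new_name in migrated:
            # Both the old and the new spelling were supplied; refuse to guess.
            raise ValueError(f"parameter {new_name!r} given more than once")
        migrated[new_name] = value
    return migrated


if __name__ == "__main__":
    old = {"bed": "regions.bed", "parallel_snv": 13, "outdir": "results"}
    print(migrate_params(old))
```

Raising on a duplicate rather than silently picking one value is a design choice for the sketch; a real migration might prefer a warning.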
@@ -247,7 +247,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [#317](https://github.com/genomic-medicine-sweden/nallo/pull/317) - Changed so that `--reduced_penetrance` and `--score_config_snv` is required by rank variants and not SNV annotation
- [#318](https://github.com/genomic-medicine-sweden/nallo/pull/318) - Updated docs and schema to clarify pipeline usage
- [#321](https://github.com/genomic-medicine-sweden/nallo/pull/321) - Changed the input to BUILD_INTERVALS to have `meta.id` when building intervals from reference
- [#323](https://github.com/genomic-medicine-sweden/nallo/pull/323) - Changed `parallel_alignment` to `parallel_alignments` in CI tests as well
- [#323](https://github.com/genomic-medicine-sweden/nallo/pull/323) - Changed `parallel_alignment` to `alignment_processes` in CI tests as well
- [#330](https://github.com/genomic-medicine-sweden/nallo/pull/330) - Updated README and version bump
- [#332](https://github.com/genomic-medicine-sweden/nallo/pull/332) - Changed the PED file input to genmod to include inferred sex from somalier
- [#333](https://github.com/genomic-medicine-sweden/nallo/pull/333) - Updated TRGT to 0.7.0 and added `meta.id` as output sample name
@@ -282,7 +282,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
| | `--skip_aligned_read_qc` |
| | `--cadd_resources` |
| | `--cadd_prescored` |
| `--split_fastq` | `--parallel_alignments` |
| `--split_fastq` | `--alignment_processes` |
| `--extra_gvcfs` | |
| `--extra_snfs` | |
| `--dipcall_par` | `--par_regions` |
2 changes: 1 addition & 1 deletion conf/modules/qc_aligned_reads.config
@@ -41,7 +41,7 @@ process {
ext.args = { [
'--fast-mode',
'--no-per-base',
params.target_bed ? '' : '--by 500'
params.target_regions ? '' : '--by 500'
].join(' ') }
publishDir = [
path: { "${params.outdir}/qc/mosdepth/${meta.id}" },
4 changes: 2 additions & 2 deletions conf/test.config
@@ -23,7 +23,7 @@ params {
// References
fasta = params.pipelines_testdata_base_path + 'reference/hg38.test.fa.gz'
input = params.pipelines_testdata_base_path + 'testdata/samplesheet.csv'
target_bed = params.pipelines_testdata_base_path + 'reference/test_data.bed'
target_regions = params.pipelines_testdata_base_path + 'reference/test_data.bed'
hificnv_expected_xy_cn = params.pipelines_testdata_base_path + 'reference/expected_cn.hg38.XY.bed'
hificnv_expected_xx_cn = params.pipelines_testdata_base_path + 'reference/expected_cn.hg38.XX.bed'
hificnv_excluded_regions = params.pipelines_testdata_base_path + 'reference/empty.bed'
@@ -37,7 +37,7 @@ params {
genmod_reduced_penetrance = params.pipelines_testdata_base_path + 'reference/reduced_penetrance.tsv'
genmod_score_config_snvs = params.pipelines_testdata_base_path + 'reference/rank_model_snv.ini'
genmod_score_config_svs = params.pipelines_testdata_base_path + 'reference/rank_model_svs.ini'
variant_consequences_snv = params.pipelines_testdata_base_path + 'reference/variant_consequences_v2.txt'
variant_consequences_snvs = params.pipelines_testdata_base_path + 'reference/variant_consequences_v2.txt'
variant_consequences_svs = params.pipelines_testdata_base_path + 'reference/variant_consequences_v2.txt'
somalier_sites = params.pipelines_testdata_base_path + 'reference/somalier_sites.vcf.gz'

6 changes: 3 additions & 3 deletions docs/parameters.md
@@ -12,7 +12,7 @@ Allows skipping certain parts of the pipeline
| `skip_snv_calling` | Skip short variant calling | `boolean` | False | | |
| `skip_genome_assembly` | Skip genome assembly and assembly variant calling | `boolean` | False | | |
| `skip_alignment` | Skip read mapping (alignment) | `boolean` | False | | |
| `skip_methylation_analysis` | Skip generation of methylation pileups | `boolean` | False | | |
| `skip_methylation_pileups` | Skip generation of methylation pileups | `boolean` | False | | |
| `skip_repeat_calling` | Skip tandem repeat calling | `boolean` | False | | |
| `skip_repeat_annotation` | Skip tandem repeat annotation | `boolean` | False | | |
| `skip_phasing` | Skip phasing of variants and haplotagging of reads | `boolean` | False | | |
@@ -40,10 +40,10 @@ Define where the pipeline should find input data and save output data.
| `echtvar_snv_databases` | A csv file with echtvar databases to annotate SNVs with | `string` | | | |
| `svdb_sv_databases` | Databases used for structural variant annotation in vcf format. <details><summary>Help</summary><small>Path to comma-separated file containing information about the databases used for structural variant annotation.</small></details>| `string` | | | |
| `stranger_repeat_catalog` | A variant catalog json-file for stranger | `string` | | | |
| `variant_consequences_snv` | File containing list of SO terms listed in the order of severity from most severe to lease severe for annotating genomic SNVs. For more information check https://ensembl.org/info/genome/variation/prediction/predicted_data.html | `string` | | | |
| `variant_consequences_snvs` | File containing list of SO terms listed in the order of severity from most severe to lease severe for annotating genomic SNVs. For more information check https://ensembl.org/info/genome/variation/prediction/predicted_data.html | `string` | | | |
| `variant_consequences_svs` | File containing list of SO terms listed in the order of severity from most severe to lease severe for annotating genomic SVs. For more information check https://ensembl.org/info/genome/variation/prediction/predicted_data.html | `string` | | | |
| `vep_cache` | A path to the VEP cache location | `string` | | | |
| `bed` | A BED file with regions of interest, used to limit short variant calling. | `string` | | | |
| `target_regions` | A BED file with regions of interest, used to limit variant calling. | `string` | | | |
| `hificnv_expected_xy_cn` | A BED file containing expected copy number regions for XY samples. | `string` | | | |
| `hificnv_expected_xx_cn` | A BED file containing expected copy number regions for XX samples. | `string` | | | |
| `hificnv_excluded_regions` | A BED file specifying regions to exclude with HiFiCNV, such as centromeres. | `string` | | | |
12 changes: 6 additions & 6 deletions docs/usage.md
@@ -89,7 +89,7 @@ This pipeline comes with three different presets that should be set with the `--
The selected preset will turn off subworkflows:

- `--skip_genome_assembly` and `--skip_repeat_wf` will be set to `true` for `ONT_R10`
- `--skip_methylation_analysis` will be set to `true` for `pacbio`
- `--skip_methylation_pileups` will be set to `true` for `pacbio`

## Subworkflows

@@ -117,7 +117,7 @@ For example, `nextflow run genomic-medicine-sweden/nallo -profile docker --outdi

```
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--skip_alignment is active, the pipeline has to be run with: --skip_qc --skip_genome_assembly --skip_call_paralogs --skip_snv_calling --skip_snv_annotation --skip_cnv_calling --skip_phasing --skip_rank_variants --skip_repeat_calling --skip_repeat_annotation --skip_methylation_analysis
--skip_alignment is active, the pipeline has to be run with: --skip_qc --skip_genome_assembly --skip_call_paralogs --skip_snv_calling --skip_snv_annotation --skip_cnv_calling --skip_phasing --skip_rank_variants --skip_repeat_calling --skip_repeat_annotation --skip_methylation_pileups
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
@@ -202,7 +202,7 @@ Turned off with `--skip_phasing`.

This subworkflow relies on mapping and short variant calling subworkflows, but requires no additional files.

Turned off with `--skip_methylation_analysis`.
Turned off with `--skip_methylation_pileups`.

### Repeat calling

@@ -228,14 +228,14 @@ Turned off with `--skip_repeat_annotation`.

This subworkflow relies on the mapping and short variant calling, and requires the following additional files:

<!-- TODO: genmod_score_config_snvs, genmod_reduced_penetrance and variant_consequences_snv should link to real examples -->
<!-- TODO: genmod_score_config_snvs, genmod_reduced_penetrance and variant_consequences_snvs should link to real examples -->

| Parameter | Description |
| ------------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `vep_cache` | VEP cache matching your reference genome, either as a `.tar.gz` archive or path to a directory (e.g. [homo_sapiens_vep_110_GRCh38.tar.gz](https://ftp.ensembl.org/pub/release-110/variation/vep/homo_sapiens_vep_110_GRCh38.tar.gz)) |
| `vep_plugin_files` <sup>1</sup> | A csv file with VEP plugin files, pLI and LoFtool are required. Example provided below. |
| `echtvar_snv_databases` <sup>2</sup> |  A csv file with annotation databases from ([`echtvar encode`](https://github.com/brentp/echtvar)) |
| `variant_consequences_snv` | A list of SO terms listed in the order of severity from most severe to lease severe for annotating genomic and mitochondrial SNVs. Sample file [here](https://github.com/nf-core/test-datasets/blob/raredisease/reference/variant_consequences_v2.txt). You can learn more about these terms [here](https://ensembl.org/info/genome/variation/prediction/predicted_data.html) |
| `variant_consequences_snvs` | A list of SO terms listed in the order of severity from most severe to lease severe for annotating genomic and mitochondrial SNVs. Sample file [here](https://github.com/nf-core/test-datasets/blob/raredisease/reference/variant_consequences_v2.txt). You can learn more about these terms [here](https://ensembl.org/info/genome/variation/prediction/predicted_data.html) |

<sup>1</sup> Example file for input with `--vep_plugin_files`

@@ -311,7 +311,7 @@ This subworkflow ranks SVs, and relies on the mapping, SV calling and SV annotat

## Other highlighted parameters

- Limit SNV calling to regions in BED file (`--bed`).
- Limit SNV calling to regions in BED file (`--target_bed`).
- By default SNV-calling is split into 13 parallel processes, this speeds up the variant calling significantly. Limit this by setting `--snv_calling_processes` to a different number.
- By default the pipeline splits the input files into eight pieces, performs parallel alignment and then merges the files. This can be changed to a different number with `--alignment_processes`, or turned off by supplying a value of 1. Parallel alignment comes with some additional overhead, but can speed up the pipeline significantly.
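
As an illustration of the three options above, the sketch below assembles a `nextflow run` command line. It is not part of the commit: `build_command` is a hypothetical helper, the file paths are placeholders, other required options such as `--input` are left out, and it assumes the BED option is spelled `--target_regions`, the name this commit uses in `nextflow.config` and `nextflow_schema.json`.

```python
import shlex


def build_command(outdir, target_regions=None,
                  snv_calling_processes=None, alignment_processes=None):
    """Build an argument list for a nallo run (hypothetical helper).

    Options left as None are omitted so the pipeline defaults apply.
    """
    cmd = [
        "nextflow", "run", "genomic-medicine-sweden/nallo",
        "-profile", "docker",
        "--outdir", outdir,
    ]
    optional = {
        "--target_regions": target_regions,  # assumed flag name, see lead-in
        "--snv_calling_processes": snv_calling_processes,
        "--alignment_processes": alignment_processes,
    }
    for flag, value in optional.items():
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


if __name__ == "__main__":
    # alignment_processes=1 turns parallel alignment off, as described above.
    args = build_command("results", target_regions="regions.bed",
                         alignment_processes=1)
    print(shlex.join(args))
```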

6 changes: 3 additions & 3 deletions nextflow.config
@@ -11,7 +11,7 @@ params {

// Input options
input = null
target_bed = null
target_regions = null
cadd_resources = null
cadd_prescored_indels = null
par_regions = null
@@ -23,7 +23,7 @@ params {
genmod_score_config_svs = null
echtvar_snv_databases = null
svdb_sv_databases = null
variant_consequences_snv = null
variant_consequences_snvs = null
variant_consequences_svs = null
vep_cache = null
vep_plugin_files = null
@@ -37,7 +37,7 @@ params {
skip_call_paralogs = false
skip_cnv_calling = false
skip_alignment = false
skip_methylation_analysis = params.preset == 'pacbio' ? true : false
skip_methylation_pileups = params.preset == 'pacbio' ? true : false
skip_phasing = false
skip_qc = false
skip_rank_variants = false
8 changes: 4 additions & 4 deletions nextflow_schema.json
@@ -35,7 +35,7 @@
"fa_icon": "fas fa-fast-forward",
"default": false
},
"skip_methylation_analysis": {
"skip_methylation_pileups": {
"type": "boolean",
"description": "Skip generation of methylation pileups",
"fa_icon": "fas fa-fast-forward",
@@ -184,7 +184,7 @@
"format": "file-path",
"exists": true
},
"variant_consequences_snv": {
"variant_consequences_snvs": {
"type": "string",
"description": "File containing list of SO terms listed in the order of severity from most severe to lease severe for annotating genomic SNVs. For more information check https://ensembl.org/info/genome/variation/prediction/predicted_data.html",
"fa_icon": "fas fa-file-csv"
@@ -200,7 +200,7 @@
"format": "path",
"exists": true
},
"target_bed": {
"target_regions": {
"type": "string",
"pattern": "^\\S+\\.bed$",
"format": "file-path",
@@ -465,7 +465,7 @@
"vep_plugin_files": {
"type": "string",
"mimetype": "text/csv",
"description": "A csv file with vep_plugin_files as header, and then paths to vep plugin files. Paths to pLI_values.txt and LoFtool_scores.txt are required.",
"description": "A csv file with vep_files as header, and then paths to vep plugin files. Paths to pLI_values.txt and LoFtool_scores.txt are required.",
"schema": "assets/vep_plugin_files_schema.json"
},
"deepvariant_model_type": {