v2.0.0-alpha.3 (#187)
* Allow VCF inputs with variable chromosome fields, but filter the variants to canonical or user-specified chromosomes.

* Fix test

* Documentation edits for VCF import

* typo

* Clarification on sampleset naming conventions (also enforced in check_samplesheet within utils)

* set up offline testing

* fix singularity url

* update samplesheets

* fail when join goes bad

* make meta key for vmiss match converted vcf

* fix --only_projection

* add vcf ancestry test

* fix plink call

* what's wrong with my yaml :(

* arf

* double arf

* I hate yaml

* fix test name

* activate ancestry vcf singularity test

* fix test name

* add retry for score and combine (often runs out of RAM on larger datasets)

* Bump utils to v0.4.2 (#185)

* bump v0.4.1 -> v0.4.2

* remove whitespace

* fix tag

* Fix conda action on publish (#184)

* fix finding mamba

* test ancestry with conda too

* fix mamba profile

* remove matrix from ancestry

* fix conda channel message

* Fix plink2_score options (#181)

* Add mean-imputation back to scoring when samples aren't using a reference panel

* Ignore allele frequency calculation ONLY when we have a reference, else just calculate allele frequencies on scoring file variants (with --extract)

* Only use --error-on-freq-calc when non-mean-imputation is applied. Use extract to consider only variants from the scoring file.

* Better comment placement

* update test to match behaviour

* fix test

* add token to avoid rate limit

* check plink log more thoroughly

---------

Co-authored-by: Benjamin Wingfield <[email protected]>

* Note about score precision (Closes #162)

* Handle new reference naming (#173)

* More default memory for plink, needed for a bigger reference (gnomAD).

* More generalized file pattern for relatedness files in a reference ("GRCh3#_*.king.cutoff.out.id") with a single meta ID.

ToDo: rename the deg2 relatedness files in setup_resource.nf / bootstrap_ancestry.nf

* Update report.qmd to handle other colours

* give king cutoff files consistent names

* fix test

* add version check to extract_database

* add version check to extract_database

---------

Co-authored-by: Benjamin Wingfield <[email protected]>

* Update changelog.rst

* Version reference database with parameter (#189)

* version reference db separately

* Update RELEASE-CHECKLIST.md

Information about the ancestry version

---------

Co-authored-by: Sam Lambert <[email protected]>

---------

Co-authored-by: smlmbrt <[email protected]>
nebfield and smlmbrt authored Oct 3, 2023
1 parent ba8e03c commit f2ed0cc
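
Note on the first item in the commit message (VCF inputs with variable chromosome fields, filtered to canonical or user-specified chromosomes): the idea can be sketched roughly as below. This is a minimal illustration, not the pipeline's code. The canonical set (1-22, X, Y, MT), the handling of a "chr" prefix, and the names `normalise_chrom` and `filter_variants` are all assumptions made for the example.

```python
# Hypothetical sketch: the canonical set and helper names are assumptions,
# not taken from the pipeline.
CANONICAL_CHROMS = {str(n) for n in range(1, 23)} | {"X", "Y", "MT"}


def normalise_chrom(chrom):
    """Strip an optional 'chr' prefix, upper-case the name, and map 'M' to 'MT'."""
    name = chrom.strip()
    if name.lower().startswith("chr"):
        name = name[3:]
    name = name.upper()
    return "MT" if name == "M" else name


def filter_variants(records, keep=None):
    """Keep (chrom, pos) records on canonical or user-specified chromosomes."""
    wanted = CANONICAL_CHROMS if keep is None else {normalise_chrom(c) for c in keep}
    return [rec for rec in records if normalise_chrom(rec[0]) in wanted]


records = [("chr1", 100), ("22", 200), ("chrUn_KI270442v1", 5), ("chrM", 1)]
canonical_only = filter_variants(records)
chr22_only = filter_variants(records, keep=["22"])
print(canonical_only)
print(chr22_only)
```

Records on unplaced contigs are dropped rather than rejected, which matches the "allow but filter" wording of the commit message.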
Showing 47 changed files with 528 additions and 126 deletions.
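
Note on the "Fix plink2_score options (#181)" items in the commit message: one possible reading of the described option logic is sketched below. It is an assumption-laden illustration, not the module's actual code. `score_options` and its parameters are hypothetical; the plink2 pieces it emits (`--score` with the `no-mean-imputation` modifier, `--error-on-freq-calc`, `--extract`) are the ones named in the commit message.

```python
# Hypothetical helper: the function name, parameters, and exact combination of
# options are assumptions based on the commit message, not the module's code.
def score_options(scorefile, variant_ids, has_reference):
    """Build plink2 scoring arguments for the two cases the commit describes."""
    args = ["--score", scorefile]
    if has_reference:
        # Reference panel present: skip mean imputation and make plink2 stop
        # if it would have to calculate allele frequencies itself.
        args += ["no-mean-imputation", "--error-on-freq-calc"]
    else:
        # No reference: keep plink2's default mean imputation, and restrict
        # the run (and so the frequency calculation) to scoring-file variants.
        args += ["--extract", variant_ids]
    return args


with_ref = score_options("scores.txt", "ids.txt", has_reference=True)
without_ref = score_options("scores.txt", "ids.txt", has_reference=False)
print(" ".join(with_ref))
print(" ".join(without_ref))
```

Keeping the decision in one place makes it clear that `--error-on-freq-calc` is only ever paired with `no-mean-imputation`, as the third item of #181 requires.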
74 changes: 74 additions & 0 deletions .github/workflows/ancestry-conda.yml
@@ -0,0 +1,74 @@
name: Run ancestry test with mamba profile

on:
  workflow_call:
    inputs:
      ancestry-cache-key:
        type: string
        required: true

jobs:
  test_mamba_ancestry:
    runs-on: ubuntu-latest
    defaults:
      run:
        shell: bash -el {0}

    steps:
      - name: Check out pipeline code
        uses: actions/checkout@v3

      - name: Set environment variables
        run: |
          echo "ANCESTRY_REF_DIR=$RUNNER_TEMP" >> $GITHUB_ENV
          echo "ANCESTRY_TARGET_DIR=$RUNNER_TEMP" >> $GITHUB_ENV
      - name: Restore reference data
        uses: actions/cache/restore@v3
        with:
          path: |
            ${{ env.ANCESTRY_TARGET_DIR }}/GRCh38_HAPNEST_TARGET_ALL.pgen
            ${{ env.ANCESTRY_TARGET_DIR }}/GRCh38_HAPNEST_TARGET_ALL.psam
            ${{ env.ANCESTRY_TARGET_DIR }}/GRCh38_HAPNEST_TARGET_ALL.pvar.zst
            ${{ env.ANCESTRY_REF_DIR }}/GRCh38_HAPNEST_reference.tar.zst
          key: ${{ inputs.ancestry-cache-key }}
          fail-on-cache-miss: true

      - uses: conda-incubator/setup-miniconda@v2
        with:
          channels: conda-forge,bioconda,defaults
          miniforge-variant: Mambaforge
          miniforge-version: latest
          python-version: "3.10"

      - uses: actions/setup-java@v3
        with:
          distribution: 'corretto'
          java-version: '17'

      - name: install nxf
        run: |
          wget -qO- get.nextflow.io | bash
          sudo mv nextflow /usr/local/bin/
      - name: Set up test requirements
        uses: actions/setup-python@v3
        with:
          python-version: '3.10'
          cache: 'pip'

      - run: pip install -r ${{ github.workspace }}/tests/requirements.txt

      - name: Run ancestry test
        run: TMPDIR=~ PROFILE=mamba pytest --kwdof --symlink --git-aware --wt 2 --tag "ancestry" --ignore tests/bin

      - name: Upload logs on failure
        if: failure()
        uses: actions/upload-artifact@v3
        with:
          name: logs-conda-ancestry
          path: |
            /home/runner/pytest_workflow_*/*/.nextflow.log
            /home/runner/pytest_workflow_*/*/log.out
            /home/runner/pytest_workflow_*/*/log.err
            /home/runner/pytest_workflow_*/*/output/*
160 changes: 160 additions & 0 deletions .github/workflows/ancestry-vcf.yml
@@ -0,0 +1,160 @@
name: Run ancestry test with singularity or docker profiles with VCF input

on:
  workflow_call:
    inputs:
      container-cache-key:
        type: string
        required: true
      ancestry-cache-key:
        type: string
        required: true
      docker:
        type: boolean
      singularity:
        type: boolean

env:
  NXF_SINGULARITY_CACHEDIR: ${{ github.workspace }}/singularity
  SINGULARITY_VERSION: 3.8.3

jobs:
  docker:
    if: ${{ inputs.docker }}
    runs-on: ubuntu-latest

    steps:
      - name: Set environment variables
        run: |
          echo "ANCESTRY_REF_DIR=$RUNNER_TEMP" >> $GITHUB_ENV
          echo "ANCESTRY_TARGET_DIR=$RUNNER_TEMP" >> $GITHUB_ENV
      - name: Check out pipeline code
        uses: actions/checkout@v3

      - uses: nf-core/setup-nextflow@v1

      - name: Restore docker images
        id: restore-docker
        uses: actions/cache/restore@v3
        with:
          path: ${{ runner.temp }}/docker
          key: ${{ inputs.container-cache-key }}
          fail-on-cache-miss: true

      - name: Load docker images from cache
        run: |
          find $HOME -name '*.tar'
          find ${{ runner.temp }}/docker/ -name '*.tar' -exec sh -c 'docker load < {}' \;
      - name: Restore reference data
        uses: actions/cache/restore@v3
        with:
          path: |
            ${{ env.ANCESTRY_TARGET_DIR }}/GRCh38_HAPNEST_TARGET_ALL.pgen
            ${{ env.ANCESTRY_TARGET_DIR }}/GRCh38_HAPNEST_TARGET_ALL.psam
            ${{ env.ANCESTRY_TARGET_DIR }}/GRCh38_HAPNEST_TARGET_ALL.pvar.zst
            ${{ env.ANCESTRY_REF_DIR }}/GRCh38_HAPNEST_reference.tar.zst
          key: ${{ inputs.ancestry-cache-key }}
          fail-on-cache-miss: true

      - name: Install plink2 to recode
        run: sudo apt-get install -y plink2

      - name: Recode VCF
        run: plink2 --pfile ${ANCESTRY_TARGET_DIR}/GRCh38_HAPNEST_TARGET_ALL vzs --export vcf bgz --out ${ANCESTRY_TARGET_DIR}/GRCh38_HAPNEST_TARGET_ALL

      - name: Set up test requirements
        uses: actions/setup-python@v3
        with:
          python-version: '3.10'
          cache: 'pip'

      - run: pip install -r ${{ github.workspace }}/tests/requirements.txt

      - name: Run ancestry test
        run: TMPDIR=~ PROFILE=docker pytest --kwdof --symlink --git-aware --wt 2 --tag "ancestry vcf" --ignore tests/bin

      - name: Upload logs on failure
        if: failure()
        uses: actions/upload-artifact@v3
        with:
          name: logs-singularity-ancestry
          path: |
            /home/runner/pytest_workflow_*/*/.nextflow.log
            /home/runner/pytest_workflow_*/*/log.out
            /home/runner/pytest_workflow_*/*/log.err
            /home/runner/pytest_workflow_*/*/output/*
  singularity:
    if: ${{ inputs.singularity }}
    runs-on: ubuntu-latest

    steps:
      - name: Set environment variables
        run: |
          echo "ANCESTRY_REF_DIR=$RUNNER_TEMP" >> $GITHUB_ENV
          echo "ANCESTRY_TARGET_DIR=$RUNNER_TEMP" >> $GITHUB_ENV
      - name: Check out pipeline code
        uses: actions/checkout@v3

      - uses: nf-core/setup-nextflow@v1

      - name: Restore singularity setup
        id: restore-singularity-setup
        uses: actions/cache@v3
        with:
          path: /opt/hostedtoolcache/singularity/${{ env.SINGULARITY_VERSION }}/x64
          key: ${{ runner.os }}-singularity-${{ env.SINGULARITY_VERSION }}
          fail-on-cache-miss: true

      - name: Add singularity to path
        run: |
          echo "/opt/hostedtoolcache/singularity/${{ env.SINGULARITY_VERSION }}/x64/bin" >> $GITHUB_PATH
      - name: Restore singularity container images
        id: restore-singularity
        uses: actions/cache@v3
        with:
          path: ${{ env.NXF_SINGULARITY_CACHEDIR }}
          key: ${{ inputs.container-cache-key }}

      - name: Restore reference data
        uses: actions/cache/restore@v3
        with:
          path: |
            ${{ env.ANCESTRY_TARGET_DIR }}/GRCh38_HAPNEST_TARGET_ALL.pgen
            ${{ env.ANCESTRY_TARGET_DIR }}/GRCh38_HAPNEST_TARGET_ALL.psam
            ${{ env.ANCESTRY_TARGET_DIR }}/GRCh38_HAPNEST_TARGET_ALL.pvar.zst
            ${{ env.ANCESTRY_REF_DIR }}/GRCh38_HAPNEST_reference.tar.zst
          key: ${{ inputs.ancestry-cache-key }}
          fail-on-cache-miss: true

      - name: Install plink2 to recode
        run: sudo apt-get install -y plink2

      - name: Recode VCF
        run: plink2 --pfile ${ANCESTRY_TARGET_DIR}/GRCh38_HAPNEST_TARGET_ALL vzs --export vcf bgz --out ${ANCESTRY_TARGET_DIR}/GRCh38_HAPNEST_TARGET_ALL

      - name: Set up test requirements
        uses: actions/setup-python@v3
        with:
          python-version: '3.10'
          cache: 'pip'

      - run: pip install -r ${{ github.workspace }}/tests/requirements.txt

      - name: Run ancestry test
        run: TMPDIR=~ PROFILE=singularity pytest --kwdof --symlink --git-aware --wt 2 --tag "ancestry vcf" --ignore tests/bin

      - name: Upload logs on failure
        if: failure()
        uses: actions/upload-artifact@v3
        with:
          name: logs-singularity-ancestry
          path: |
            /home/runner/pytest_workflow_*/*/.nextflow.log
            /home/runner/pytest_workflow_*/*/log.out
            /home/runner/pytest_workflow_*/*/log.err
            /home/runner/pytest_workflow_*/*/output/*
17 changes: 17 additions & 0 deletions .github/workflows/ci.yml
@@ -9,6 +9,7 @@ on:
    branches:
      - dev
      - main
      - fix_vcf
  release:
    types: [published]

@@ -123,3 +124,19 @@ jobs:
      container-cache-key: ${{ needs.preload_singularity.outputs.cache-key }}
      ancestry-cache-key: ${{ needs.preload_ancestry.outputs.cache-key }}
      singularity: true

  ancestry_vcf_docker:
    needs: [preload_ancestry, preload_docker]
    uses: ./.github/workflows/ancestry-vcf.yml
    with:
      container-cache-key: ${{ needs.preload_docker.outputs.cache-key }}
      ancestry-cache-key: ${{ needs.preload_ancestry.outputs.cache-key }}
      docker: true

  ancestry_vcf_singularity:
    needs: [preload_ancestry, preload_singularity]
    uses: ./.github/workflows/ancestry.yml
    with:
      container-cache-key: ${{ needs.preload_singularity.outputs.cache-key }}
      ancestry-cache-key: ${{ needs.preload_ancestry.outputs.cache-key }}
      singularity: true
39 changes: 28 additions & 11 deletions .github/workflows/conda.yml
@@ -1,38 +1,55 @@
name: test conda on publish
name: test conda profiles on demand and on publish

on:
  release:
    types: [published]
  workflow_dispatch:

jobs:
  preload_ancestry:
    uses: ./.github/workflows/preload-reference.yml

  test_mamba_ancestry:
    uses: ./.github/workflows/ancestry-conda.yml
    needs: [preload_ancestry]
    with:
      ancestry-cache-key: ${{ needs.preload_ancestry.outputs.cache-key }}

  test_mamba_standard:
    runs-on: ubuntu-latest
    defaults:
      run:
        shell: bash -el {0}
    strategy:
      fail-fast: false
      matrix:
        test_profile: ["test"]
        profile: ["mamba"]
        nxf_ver: ["22.10.0", "latest"]
        nxf_ver: ["22.10.0", ""]

    env:
      NXF_VER: ${{ matrix.nxf_ver }}

    steps:
      - name: Check out pipeline code
        uses: actions/checkout@v3

      - uses: conda-incubator/setup-miniconda@v2
        with:
          channels: conda-forge,bioconda,defaults
          miniforge-variant: Mambaforge
          miniforge-version: latest
          python-version: "3.10"

      - uses: actions/setup-java@v3
        with:
          distribution: 'corretto'
          java-version: '17'

      - uses: nf-core/setup-nextflow@v1
        with:
          version: ${{ matrix.nxf_ver }}

      - uses: conda-incubator/setup-miniconda@v2
        with:
          miniforge-variant: Mambaforge
          miniforge-version: latest
          channels: conda-forge,bioconda,defaults
      - name: install nxf
        run: |
          wget -qO- get.nextflow.io | bash
          sudo mv nextflow /usr/local/bin/
      - name: Run pipeline with test data
        run: |
1 change: 1 addition & 0 deletions .github/workflows/standard-test.yml
@@ -14,6 +14,7 @@ on:
env:
  NXF_SINGULARITY_CACHEDIR: ${{ github.workspace }}/singularity
  SINGULARITY_VERSION: 3.8.3
  GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

jobs:
  docker:
4 changes: 4 additions & 0 deletions RELEASE-CHECKLIST.md
@@ -31,6 +31,10 @@
- [ ] Has the changelog been updated?
- [ ] Update the nextflow schema

# Reference panels
- [ ] Did anything change to the modules for creating the reference panel? Bump ref_format_version in nextflow.config
- [ ] Publish new reference panels to FTP, update any documentation.

# Tests

- [ ] Make sure unit tests pass on singularity, docker, and conda (CI)
2 changes: 1 addition & 1 deletion assets/examples/samplesheet.csv
@@ -1,2 +1,2 @@
sampleset,path_prefix,chrom,format
cineca,assets/examples/target_genomes/cineca_synthetic_subset,22,pfile
cineca,target_genomes/cineca_synthetic_subset,22,pfile
2 changes: 1 addition & 1 deletion assets/examples/samplesheet_bfile.csv
@@ -1,2 +1,2 @@
sampleset,path_prefix,chrom,format
cineca,assets/examples/target_genomes/cineca_synthetic_subset,22,bfile
cineca,target_genomes/cineca_synthetic_subset,22,bfile
2 changes: 1 addition & 1 deletion assets/examples/samplesheet_vcf.csv
@@ -1,2 +1,2 @@
sampleset,path_prefix,chrom,format
cineca,assets/examples/target_genomes/cineca_synthetic_subset,22,vcf
cineca,target_genomes/cineca_synthetic_subset,22,vcf
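
Note on the three samplesheet changes above: the `path_prefix` column appears to move from a repository-rooted path (`assets/examples/target_genomes/...`) to a shorter one (`target_genomes/...`). The diff does not say how the shorter prefix is resolved. One plausible reading, which is purely an assumption here, is that relative prefixes are interpreted against the directory that holds the samplesheet. A minimal sketch under that assumption, with a hypothetical `read_samplesheet` helper:

```python
import csv
import io
from pathlib import Path


def read_samplesheet(handle, base_dir):
    """Parse samplesheet rows and resolve relative path_prefix values.

    Hypothetical helper: resolving against base_dir is an assumption,
    not documented behaviour of the pipeline.
    """
    rows = []
    for row in csv.DictReader(handle):
        prefix = Path(row["path_prefix"])
        if not prefix.is_absolute():
            prefix = Path(base_dir) / prefix
        rows.append({**row, "path_prefix": prefix})
    return rows


# Same header and row as the updated assets/examples/samplesheet.csv
example = io.StringIO(
    "sampleset,path_prefix,chrom,format\n"
    "cineca,target_genomes/cineca_synthetic_subset,22,pfile\n"
)
rows = read_samplesheet(example, base_dir="assets/examples")
print(rows[0]["path_prefix"])
```

Absolute prefixes pass through unchanged, so a samplesheet that already uses full paths would behave the same under this sketch.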