Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update CADD to version 1.6.post1 and include Smk environments in containers #4610

Merged
merged 10 commits into from
May 15, 2024

Conversation

fa2k
Copy link
Contributor

@fa2k fa2k commented Dec 14, 2023

The commit makes two changes:

  • For containers - uses a new container image from BioContainers. The main change is related to the fact that CADD will download conda environments when it's being run. The current container images are straight conversions from conda, and will download the environments into the conda envs path in the container filesystem. The new container in this PR already comes with the environments pre-bundled, which may have advantages for compatibility, efficiency and for running off-line. The new container image is not in Galaxyproject because it's not based on a conda package - so I use the biocontainers name for all container systems.
  • It uses a new CADD-scripts version: 1.6.post1. This contains some bugfixes in the code, but uses the same CADD annotation data (https://github.com/kircherlab/CADD-scripts/releases). A new bioconda package is just available for this CADD-scripts version, so both conda and containers use the same version.

PR checklist

Closes #4345

  • This comment contains a description of changes (with reason).
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • PROFILE=docker pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware
    • PROFILE=singularity pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware
    • PROFILE=conda pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware

All tests skipped successfully.

@fa2k fa2k marked this pull request as ready for review December 14, 2023 20:44
@fa2k fa2k requested a review from ramprasadn as a code owner December 14, 2023 20:44
@fa2k
Copy link
Contributor Author

fa2k commented Dec 14, 2023

Sorry this is not ready yet. The tests are failing, not skipping. Will check it.

fa2k added 2 commits December 15, 2023 11:31
Registry quay.io is the default and it only contains packages built directly from conda
@fa2k
Copy link
Contributor Author

fa2k commented Dec 15, 2023

The updated conda package isn't in bioconda, so it doesn't work right now. Maybe we can wait until the package is in bioconda. The pull request in bioconda was merged, but maybe it takes a bit of time? bioconda/bioconda-recipes#44775

@fa2k
Copy link
Contributor Author

fa2k commented Dec 18, 2023

I changed the environment file so it gets conda from conda-forge instead of anaconda. This was based on trial an error, getting the conda environment to build. I can't test it properly with conda, but the tests do pass.

It's ready for review.

@ramprasadn
Copy link
Contributor

@fa2k Biocontainers are now available on both quay and galaxyproject now. Could you update the links in your main script?

Could you also elaborate a bit on what issues were you having with the default conda environment and why you changed it to the one from conda-forge? Did you mean you were having some issues locally, and they went away with the switch to conda-forge version?

@fa2k
Copy link
Contributor Author

fa2k commented Jan 26, 2024

Thanks for the review @ramprasadn . As far as I can tell, the containers built from dockerfiles are not in quay.io; only the ones that are built directly from conda packages. That was confirmed by mboudet ( BioContainers/containers#547 (comment) ) in December. I don't know if it's the exact same situation with galaxyproject or it's a bug, but just looking at the huge list, I can see that the "cadd-scripts-with-envs:1.6.post1_cv1" package is not available there (https://depot.galaxyproject.org/singularity/ ).

As for the conda issue: The test PROFILE=conda pytest --tag cadd --symlink --keep-workflow-wd --git-aware is failing when I use anaconda::conda=4.14.0. I can try to look for the actual cause. Will send another update if I figure it out.

@fa2k
Copy link
Contributor Author

fa2k commented Jan 26, 2024

It turns out that I had made a change to the mamba version, and that's why I couldn't use the same conda version as before. I don't remember the reasoning for this, and I've reverted the mamba and conda versions so they are the same as before.

@fa2k
Copy link
Contributor Author

fa2k commented Apr 17, 2024

I have merged the latest changes from master. This is ready for review.

@SPPearce
Copy link
Contributor

Sorry that this has taken so long for anyone to actually review. You can add the "Ready for Review" tag, or even better ask on the nf-core slack request-review channel to get eyeballs on PRs.

@SPPearce SPPearce added this pull request to the merge queue May 15, 2024
Merged via the queue into nf-core:master with commit 35a1935 May 15, 2024
12 checks passed
@fa2k
Copy link
Contributor Author

fa2k commented May 15, 2024

Thanks for the review and explanation, I'll make sure to do one of these next time :)

@fa2k fa2k deleted the new-cadd-containers branch May 15, 2024 14:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEATURE] Container image for CADD with dependencies included
3 participants