01_perform_dda.snakefile shuts down without error message on rule download_query_genome
#8
Comments
A Titus hot-take solution is to use
actually, most sourmash commands take picklists, so you don't need to use extract specifically - you can use an exclusion picklist with gather directly. You can also do clever things with manifests, e.g. this link
problem cropping up elsewhere for us, too:
Temporary solution for this issue and the beginnings of something for this issue as well
When running 01_perform_dda.snakefile, the dominating set differential abundance rule download_query_genome tries to download the query genome for Muribaculaceae bacterium Z82 from GenBank. At that point the workflow simply shut down. The interesting thing was that there was no error message. So, I dug into the workflow and found that the genome URL below returns a 404 error:
https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/009/911/635/GCA_009911635.1_ASM991163v1/GCA_009911635.1_ASM991163v1_genomic.fna.gz
It turns out that the reference was removed and suppressed in GenBank.
I could use the urllib package that Taylor has used in this rule to catch the failed download and print an error message, with an if/else branch that removes the query genome and moves on.
https://docs.python.org/3/howto/urllib2.html#urlerror
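A minimal sketch of that idea, not the code currently in the rule: the function name `try_download` and its `opener` parameter are hypothetical, introduced only for illustration. It catches the urllib errors described in the HOWTO above, prints a message to stderr, and returns False so the caller can drop that genome and continue. How Snakemake should treat the resulting missing output file is not addressed here.

```python
import shutil
import sys
import urllib.error
import urllib.request


def try_download(url, dest, opener=urllib.request.urlopen):
    """Download url to dest. Return True on success, False if the URL fails.

    `opener` is a hypothetical hook (defaults to urllib.request.urlopen)
    so the function can be exercised without network access.
    """
    try:
        # opener(url) is evaluated first, so dest is not created on a 404.
        with opener(url) as response, open(dest, "wb") as out:
            shutil.copyfileobj(response, out)
    except urllib.error.HTTPError as e:
        # HTTPError subclasses URLError, so it has to be caught first.
        print(f"Skipping {url}: HTTP {e.code} ({e.reason})", file=sys.stderr)
        return False
    except urllib.error.URLError as e:
        # Covers failures with no HTTP status, e.g. DNS or connection errors.
        print(f"Skipping {url}: {e.reason}", file=sys.stderr)
        return False
    return True
```

The caller would check the return value in an if/else and remove the query genome from its list when the result is False, which at least makes the suppressed GenBank record visible in the log.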
@ctb