01_perform_dda.snakefile shuts down without error message on rule download_query_genome #8

Open
ccbaumler opened this issue Apr 6, 2023 · 5 comments

Comments

@ccbaumler
Collaborator

When running 01_perform_dda.snakefile, the rule download_query_genome tried to download a query genome for Muribaculaceae bacterium Z82 from GenBank as part of the dominating set differential abundance analysis. The snakefile simply shut down with:

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message

The interesting thing was that no error message appeared above it. So I dug into the workflow and found that the genome URL below returns a 404 error:
https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/009/911/635/GCA_009911635.1_ASM991163v1/GCA_009911635.1_ASM991163v1_genomic.fna.gz
It turns out that the reference was removed and suppressed in GenBank.

I could use the urllib package that Taylor already uses in this rule to produce an error message, plus an if/else check that drops the failing query genome and moves on to the next one.
https://docs.python.org/3/howto/urllib2.html#urlerror
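
A minimal sketch of that idea, assuming the rule fetches the genome with urllib. The function name probe_genome_url and its opener parameter are hypothetical and are not taken from the snakefile. It only shows how an HTTPError or URLError could be turned into a readable message so the caller can skip the genome:

```python
import urllib.error
import urllib.request


def probe_genome_url(url, opener=urllib.request.urlopen, timeout=30):
    """Return (ok, message) for a genome URL instead of failing silently.

    `opener` is a hypothetical hook so the check can be exercised
    without network access; by default it is urllib's urlopen.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return True, f"OK ({response.status}): {url}"
    except urllib.error.HTTPError as err:
        # HTTPError subclasses URLError, so it has to be caught first.
        return False, f"HTTP {err.code} ({err.reason}) for {url}"
    except urllib.error.URLError as err:
        # Raised for DNS failures, refused connections, and similar.
        return False, f"Could not reach {url}: {err.reason}"
```

Inside the rule, the caller could print the message and skip the genome when ok is False, rather than letting the job fail without explanation.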

@ctb

@ccbaumler
Collaborator Author

A Titus hot-take solution is to use sourmash sig extract --picklist to remove all the deleted genomes from the database prior to gather.
Follow this example -> sourmash-bio/sourmash-examples#4 (comment)

@ctb
Member

ctb commented Apr 18, 2023

actually, most sourmash commands take picklists, so you don't need to use extract specifically - you can use an exclusion picklist with gather directly.

picklist docs

You can also do clever things with manifests, e.g. this link
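
As a rough sketch of how such an exclusion picklist might be prepared: the helper below writes the identifiers of suppressed genomes to a one-column CSV. The function name, the file name, and the `ident` column header are illustrative choices, and the picklist argument shown in the comment is my reading of the picklist docs, so check it there before relying on it:

```python
import csv


def write_exclusion_picklist(idents, path):
    """Write unique, non-empty identifiers to a one-column CSV picklist."""
    seen = set()
    rows = []
    for ident in idents:
        ident = ident.strip()
        if ident and ident not in seen:
            seen.add(ident)
            rows.append(ident)

    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["ident"])  # column name referenced by the picklist
        for ident in rows:
            writer.writerow([ident])
    return rows


# Assumed usage (verify against the sourmash picklist docs):
#   --picklist removed.csv:ident:ident:exclude
# where removed.csv was produced by, for example:
#   write_exclusion_picklist(["GCA_009911635.1"], "removed.csv")
```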

@ctb
Member

ctb commented Apr 20, 2023

problem cropping up elsewhere for us, too:

dib-lab/genome-grist#277
dib-lab/charcoal#235
