`index` doesn't work with a text file list of manifests #347
Comments
You are exactly right... they are not yet supported but rather desperately needed (see #266 and #235). There are a few issues that are likely to take priority over upgrading this behavior (in particular, #322 and #331 are top of my mind right now), but your use case is really important functionality that we hope to implement soon.
(And yes, I think the documentation is also broken around this behavior. To quote Napoleon, “You can ask me for anything you like, except time” 😭)
#364 "fixes" the documentation by commenting out the manifest CSV recommendations until we can support them.
#430 is merged and released in v0.9.8, and this now works! Per the revised documentation for
Hello, hope you are well!

I am very excited to try out the low-memory and fast searches created by RocksDB :) (Also, I will definitely be making use of `pairwise`!)

On my way there, I encountered some unexpected behavior. I had an enormous sequence file (e.g. UniRef50, 65M protein sequences) and cut it up into chunks of 100k sequences to do `sourmash scripts manysketch -p protein,scaled=1,k=10,abund` without running out of resources.

Then, I wanted to index these many files before searching them, but `sourmash scripts index` didn't work on a list of manifest files.

Here's a minimal reproduction, using the data in `src/python/tests/test-data`:

Then, `sourmash scripts index` fails

I'm realizing now that `short.zip` are manifests and not sigs, but I was confused that `sourmash scripts index` wasn't able to work with them, because all the parameters matched when doing `sourmash sig describe`:

The workaround is using `sourmash sig cat` to combine the signatures into one file, but I was hoping not to do this until index creation since the input files are so big.

Let me know if I'm not thinking about this problem correctly and there's a better way to do it.
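For anyone hitting the same thing, the workaround can be sketched roughly as below. This is only an illustration, not the plugin's recommended workflow: the `sketches/chunk_*.zip` layout and the output names are hypothetical, and the `-m`/`-k`/`-s` flags on `sourmash scripts index` are my assumption of how to match the `manysketch` parameters above, so please check `sourmash scripts index --help` before relying on them. The script only prints the two commands so they can be reviewed first.

```python
import shlex
from pathlib import Path


def sig_cat_command(chunk_files, combined):
    """Build the `sourmash sig cat` call that merges many sketch files into one."""
    return ["sourmash", "sig", "cat", *map(str, chunk_files), "-o", str(combined)]


def index_command(combined, database):
    """Build the `sourmash scripts index` call for the combined sketch file.

    The -m/-k/-s values mirror the manysketch parameters from this issue
    (protein, k=10, scaled=1); the flag names themselves are an assumption.
    """
    return [
        "sourmash", "scripts", "index", str(combined),
        "-o", str(database),
        "-m", "protein", "-k", "10", "-s", "1",
    ]


if __name__ == "__main__":
    # Hypothetical layout: one sketch zip per 100k-sequence chunk.
    chunks = sorted(Path("sketches").glob("chunk_*.zip"))
    for cmd in (
        sig_cat_command(chunks, "combined.zip"),
        index_command("combined.zip", "combined.rocksdb"),
    ):
        # Print rather than run, so the commands can be checked before use.
        print(shlex.join(cmd))
```

The downside is the one described above: `sig cat` writes a second full copy of every sketch, which is exactly the cost I was hoping to avoid for inputs this large.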
Hope this was informative! Thank you!