differentiate between mutable and immutable `MinHash` objects? #1494
Comments
Probably #1045? But overall +1000 on this. We can unlock a lot of cool optimizations if we don't need to mutate the

this was done in #1508, which continues to be a mostly innocuous change so far! I'm going to leave this open for a bit tho, to see if there's more to be done.

closing - #1508 hasn't really caused any problems.
As I write more/deeper code around search and prefetch and so on, I am starting to worry about accidentally modifying `MinHash` objects.

In brief -

- `MinHash` objects loaded from signature files can and should not be changed in any way - my suspicion is that in most of the code, immutable `MinHash` objects are probably the right thing to be used!
- `flatten` and `downsample`, which could similarly return immutable `MinHash` objects.
- `MinHash` - in addition to preventing/discovering bugs, presumably there are potential performance improvements on the Python side, and I'm also guessing that defaulting to immutable `MinHash` objects over on the rust side could yield massive performance benefits.

related to the idea of diversifying `MinHash` objects to differentiate between `num` and `scaled` objects #1354

there's another issue out there about the two different rust implementations of hash storage that I can't find at the moment, that could factor into this.