[WIP] Fix several issues with the hashing method #2049
* dict items are sorted _after_ hashing, fixing the behavior for keys which can not be sorted
* unicode/str and str/bytes are explicitly handled, with a different type salt
* in general, the combination between two hashes is done via the method of boost::hash_combine, not by concatenating the hexdigest
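
For readers who want to see the three points together, here is a minimal Python 3 sketch. It is not the code in this PR: the names `hash_combine`, `_digest` and `make_hash`, the salt strings, the 64-bit truncation and the use of SHA-224 as the base hash are all assumptions made for illustration. The mixing step is the classic `boost::hash_combine` formula with the constant `0x9e3779b9`.

```python
import hashlib

MASK = (1 << 64) - 1  # assumption: keep the running seed at 64 bits


def hash_combine(seed, value):
    """Classic boost::hash_combine mixing step, truncated to 64 bits."""
    mixed = value + 0x9E3779B9 + ((seed << 6) & MASK) + (seed >> 2)
    return (seed ^ mixed) & MASK


def _digest(salt, payload):
    """Hash the payload with a per-type salt; return a 64-bit integer."""
    raw = hashlib.sha224(salt + payload).digest()
    return int.from_bytes(raw[:8], "big")


def make_hash(obj):
    """Hypothetical recursive hash for a few basic types."""
    if isinstance(obj, bytes):
        return _digest(b"bytes:", obj)
    if isinstance(obj, str):
        # Different salt than bytes, so "a" and b"a" do not collide.
        return _digest(b"str:", obj.encode("utf-8"))
    if isinstance(obj, int):
        return _digest(b"int:", str(obj).encode("ascii"))
    if isinstance(obj, dict):
        # Hash every item first, then sort the integer hashes. The keys
        # themselves are never compared, so mixed key types are fine.
        item_hashes = sorted(
            hash_combine(make_hash(key), make_hash(value))
            for key, value in obj.items()
        )
        seed = _digest(b"dict:", b"")
        for item_hash in item_hashes:
            seed = hash_combine(seed, item_hash)
        return seed
    raise TypeError("unsupported type: {}".format(type(obj).__name__))
```

With this sketch, `make_hash({1: "a", "b": 2})` works even though `sorted([1, "b"])` raises a `TypeError` in Python 3, and the result does not depend on insertion order.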
Please check the |
@giovannipizzi @dev-zero There is indeed a problem with unicode / bytes when going through the database: The keys of the |
general question: what was the motivation to use SHA-224 instead of SHA-256? |
I'm not sure, maybe @lekah could know. |
wrt the hash combine, there is more to the story: boostorg/functional@0471fb7#diff-005c326126e7e14df58f94a00efdbeb5R208 |
ok, I've tracked the boost hash mixing down to here: https://github.com/aappleby/smhasher/wiki/MurmurHash3 |
The problem with these is that none of them is in Another option I found is that |
ok, I think BLAKE2 is the way to go, especially because of this: https://docs.python.org/3/library/hashlib.html#personalization and, since the hashing seems to be called quite often, being fast is also not bad ;-) Wrt using |
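
To illustrate the personalization feature linked above, here is a small Python 3 sketch using `hashlib.blake2b` with its `person` parameter (at most 16 bytes for BLAKE2b). The helper name `typed_digest`, the tag values and the 32-byte digest size are assumptions for illustration, not something decided in this thread.

```python
import hashlib


def typed_digest(type_tag, payload):
    """Hash payload with BLAKE2b, using type_tag as the personalization.

    type_tag must be at most 16 bytes (hashlib.blake2b.PERSON_SIZE).
    """
    hasher = hashlib.blake2b(payload, digest_size=32, person=type_tag)
    return hasher.hexdigest()


# Same payload, different type tag: the digests differ, so the tag
# plays the role of the "type salt" without being prepended to the data.
str_digest = typed_digest(b"str", "abc".encode("utf-8"))
bytes_digest = typed_digest(b"bytes", b"abc")
```

The design point is that the type information is mixed into the hash state by the function itself, so it cannot be confused with a prefix of the payload.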
Ok, using the backport (or, I guess that was the original library) seems like a good solution for Python2. As for the tree hashing, do you know if it's required to know the offset of each node within a specific depth level? If so, I guess that would be a bit hard to do, at least if we want to keep this recursive implementation. |
That's what I'm trying to understand. Currently I think that if you have an unlimited |
@dev-zero I don't think I'll have time to look into this closer before next week, would you mind if I assign these issues to you? |
Sure, no problem. |
Thanks! |
Any hashing function is fine, I don't remember why we chose 224. |
obsoleted by #2110 |
Fixes #2008, #2009.
TODO: Add migration that deletes all stale hashes.