
Consider further optimisation on mod_function. #50

Open
sh1boot opened this issue Jan 13, 2024 · 1 comment
sh1boot commented Jan 13, 2024

Supposing your hash values are uniformly distributed "random" values in the range of 0..2**64-1, you can skip the modulo and instead use random-number scaling techniques to distribute the result (the bucket number) evenly throughout that space. I don't think there's any reason to stick with modulo, is there?

The quickest-and-dirtiest of these would be:

size_t bucket(uint64_t hash) {
    // K is the number of buckets; __uint128_t (a GCC/Clang extension)
    // holds the 128-bit product so the top 64 bits select the bucket.
    return uint64_t(__uint128_t(hash) * K >> 64);
}

But if the incoming hashes are not already uniform then try this one weird trick:

size_t bucket(uint64_t hash) {
    // widen to 64 bits before multiplying so crc32(hash) * K can't overflow
    return uint64_t(crc32(hash)) * K >> 32;
}

(This only works when K is less than 2**32, and you'll have to find the right intrinsic for the target to get the hardware-accelerated CRC instruction.)

This one is more thorough:

static inline uint64_t murmurmix64(uint64_t h) {
    h ^= h >> 33;
    h *= 0xff51afd7ed558ccdULL;
    h ^= h >> 33;
    h *= 0xc4ceb9fe1a85ec53ULL;
    h ^= h >> 33;

    return h;
}

size_t bucket(uint64_t hash) {
    hash = murmurmix64(hash);
    return uint64_t(__uint128_t(hash) * K >> 64);
}

And this one is probably fine, covers K up to the full 64-bit range, and is likely still faster than mod by a constant:

size_t bucket(uint64_t hash) {
    hash *= 0xc4ceb9fe1a85ec53ULL;  // cheap multiplicative mix
    return uint64_t(__uint128_t(hash) * K >> 64);
}

It probably also makes no measurable difference when K is not a compile-time constant, so rather than dispatching through a function pointer you can just load the table size into a register and do the multiply by it inline.

sh1boot commented Jan 13, 2024

Actually, let's take this a step further:

size_t bucket(uint64_t& hash, size_t nbuckets) {
    // widen *before* multiplying, or the product is truncated to 64 bits
    __uint128_t longhash = __uint128_t(hash) * nbuckets;
    hash = uint64_t(longhash);  // leftover entropy for the next call
    return size_t(longhash >> 64);
}

This returns a bucket number for the current table, and if things go wrong and you feel like trying another table -- even one of a different size -- just call it again! It also updates the input hash so that the unused entropy from the original hash is properly distributed across the next range.

Constraints are:

  • Original input hashes must be uniformly distributed across all 2**64 possible values.
  • The product of the nbuckets values across all calls on the same hash must be less than 2**64.
  • If all nbuckets values are odd, the function will continue to hallucinate viable bucket choices even after it runs out of entropy (once the product exceeds 2**64).

If you don't trust the caller to provide uniformly distributed hashes (e.g., where the STL will just pass raw ints through, or whatever), you can precondition the input with one of the mixing operations in the original post here.

Here, I tried to explain it all:
https://www.tīkōuka.dev/hash-map-reduction-operations/
