Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[hashset feature] Replace dict with hashset for keys and expire #1178

Closed
wants to merge 28 commits into from

Conversation

zuiderkwast
Copy link
Contributor

@zuiderkwast zuiderkwast commented Oct 16, 2024

This PR is based on #1186. When it is merged, this PR will change target to unstable.

Keys and expire are embedded in robj.

Dict is replaced by hashset for the keys and expire tables. Hashset is a hash table that doesn't have a dict entry. Pointers point directly to robj, where also the expire is stored, if any. Also pubsub channels are changed, since they use kvstore. Kvstore is changed from dict to hashset.

Some db.c functions like dbAdd, setKey and setExpire are changed to return the reallocated object after embedding the key and optional expire. setKey does not increment the reference counter, since it would require duplicating the object.

An robj value with an embedded key can be called valkey (val + key). A valkey typedef is added for robj and it's used when the object has an embedded key, to distinguish it from an object from other usages of robj (such as a string object from the command parser in c->argv).

keys                                      expire
hashset            robj                   hashset
+---+         +-------------------+        +---+
| 1 --------->| type, enc, lru    |<-------- 1 |
| 2 |         | refcount, flags   |        | 2 |
| 3 |         | ptr               |        | 3 |
| . |         | expire?           |        | . |
| . |         | embedded key      |        | . |
| . |         | embedded value?   |        | . |
+---+         +-------------------+        +---+

? = optional

A quick benchmark using valkey-benchmark showed small improvements for all types, but more for get and set than complex types.

Rebarding memory usage, DEBUG POPULATE 10000000 (10M) resulted in memory usage (used_memory_human) of 669M, compared to 872M on unstable. That's a 23% drop in memory usage and 21.28 bytes per key. (DEBUG POPULATE created keys of length 11 and string values of length 13. Both of these fit in an robj allocation of size 32 bytes.)

Fixes #992.

TODO:

  • Implement memory prefetching (for io threading, to be ported from the dict implementation).

Copy link
Contributor

@SoftlyRaining SoftlyRaining left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't look at db.c yet

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think we'll implement something, or will we end up completely removing prefetch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes... Help wanted. :)

Copy link
Contributor Author

@zuiderkwast zuiderkwast Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@touitou-dan

The prefetch implementation for dict is accessing dict internals to do incremental lookup, one memory access at a time. It's breaking the abstraction and adds duplicated logic. I don't like that part.

For hashset, I have two ideas about how to fetch multiple keys in parallel:

  1. A function that looks up multiple keys in parallel. It can return a bitmap with a bit for each key, whether it was found or not.

    unsigned hashsetBatchFind(int count, hashset **sets, const void **keys, void **found);
  2. A function that incrementally looks up one key. It needs to be called repeatedly until it's done. The caller provides an (opaque) state that the implementation can use.

    int hashsetIncrementalFind(hashset *s, const void *key, void **found, hashsetIncrementalFindState *state);

    The prefetching code can run multiple such lookups in round-robin.

Signed-off-by: Viktor Söderqvist <[email protected]>
This changes the type of command tables from dict to hashset. Command
table lookup takes ~3% of overall CPU time in benchmarks, so it is a
good candidate for optimization.

My initial SET benchmark comparison suggests that hashset is about 4.5
times faster than dict and this replacement reduced overall CPU time by
2.79% 🥳

---------

Signed-off-by: Rain Valentine <[email protected]>
Signed-off-by: Rain Valentine <[email protected]>
Co-authored-by: Rain Valentine <[email protected]>
SoftlyRaining and others added 9 commits October 18, 2024 10:33
… rehashed buckets can be skipped (valkey-io#1147)

Fixes 2 bugs in hashsetNext():

- null dereference when iterating over new unused hashset
- change order of iteration so skipping past already-rehashed buckets
works correctly and won't miss elements

Minor optimization in hashsetScan and some renamed variables.

Signed-off-by: Rain Valentine <[email protected]>
Co-authored-by: Madelyn Olson <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Use variable name `s` for hashset instead of `t`.

Use variable name `element` instead of `elem`.

Remove the hashsetType.userdata field.

Some more fixes.

Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Copy link

codecov bot commented Oct 22, 2024

Codecov Report

Attention: Patch coverage is 93.85965% with 49 lines in your changes missing coverage. Please review.

Project coverage is 70.94%. Comparing base (5129254) to head (159bc70).

Files with missing lines Patch % Lines
src/kvstore.c 95.21% 12 Missing ⚠️
src/module.c 0.00% 10 Missing ⚠️
src/object.c 94.44% 7 Missing ⚠️
src/db.c 95.80% 6 Missing ⚠️
src/defrag.c 88.67% 6 Missing ⚠️
src/debug.c 63.63% 4 Missing ⚠️
src/memory_prefetch.c 40.00% 3 Missing ⚠️
src/replication.c 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           hashset    #1178      +/-   ##
===========================================
+ Coverage    70.77%   70.94%   +0.16%     
===========================================
  Files          115      115              
  Lines        64714    63760     -954     
===========================================
- Hits         45804    45233     -571     
+ Misses       18910    18527     -383     
Files with missing lines Coverage Δ
src/aof.c 80.11% <100.00%> (-0.02%) ⬇️
src/bitops.c 93.76% <100.00%> (-0.01%) ⬇️
src/cluster.c 88.35% <100.00%> (-0.02%) ⬇️
src/cluster_legacy.c 86.30% <100.00%> (ø)
src/evict.c 97.70% <100.00%> (-0.06%) ⬇️
src/expire.c 96.59% <100.00%> (+0.02%) ⬆️
src/geo.c 93.58% <100.00%> (-0.02%) ⬇️
src/hyperloglog.c 91.23% <100.00%> (ø)
src/lazyfree.c 86.11% <100.00%> (ø)
src/pubsub.c 96.27% <100.00%> (+0.04%) ⬆️
... and 19 more

... and 6 files with indirect coverage changes

@zuiderkwast zuiderkwast requested a review from hpatro October 22, 2024 16:04
@zuiderkwast zuiderkwast linked an issue Oct 22, 2024 that may be closed by this pull request
@hpatro
Copy link
Collaborator

hpatro commented Oct 23, 2024

This is great progress. Will take a look this week.

zuiderkwast and others added 8 commits October 25, 2024 21:18
Signed-off-by: Viktor Söderqvist <[email protected]>
…ns (valkey-io#1218)

This is targeting the hashset branch and will update that PR if merged.

Here's the UT I promised to replace a tcl test I deleted in another PR 😅

---------

Signed-off-by: Rain Valentine <[email protected]>
A new type hashsetPosition is added and the prototypes of the
following functions are changed:

* hashsetFindPositionForInsert
* hashsetInsertAtPosition
* hashsetTwoPhasePopFindRef
* hashsetTwoPhasePopDelete

Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
…ization (valkey-io#1231)

Original test violated strict aliasing rules. I first wrote verbose
comments about how to get away with storing non-pointers, but have now
updated it to simply avoid storing non-pointers. Much cleaner and safer.

Technically we could work around this and still store longs directly in
a hashset, but currently we don't have a need for that.

---------

Signed-off-by: Rain Valentine <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Co-authored-by: Viktor Söderqvist <[email protected]>
Co-authored-by: Madelyn Olson <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Instead of open addressing, we now use bucket chaining, with
allocated child buckets when a bucket gets full.

The scan and defrag APIs are changed, because we now need to be able
to defrag the allocated child buckets.

Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
…ables

Rename hashsetDefragInternals to hashsetDefragTables, because it doesn't defrag
the allocated child buckets, which as also internals.

Add trackMemUsage callback in hashsetType, to track memory allocations of child
buckets.

Edit code comments about the hashset design.

Signed-off-by: Viktor Söderqvist <[email protected]>
* Add functions for embedding key and expire in robj (object.c)
* Replace dict in kvstore
* defrag
* Handle dbAdd and dbReplaceValue return value
* Fix some things including MOVE, RENAME
* Fix test case "expire scan should skip dictionaries with lot's of empty buckets"
* Fix resize allowed maxmemory check
* Fix kvstore unit tests
* Fix maxmemory test (increase size)
* Delete test cases about shared integers
* Scan single step in active expire
* Seed hashset hash function on startup
* Set resize policy when a fork is active, and fix test case
* Fix rehashing test case in unit/info suite
* Fix more resize test cases in unit/other
* Fix scan testcases not expecting duplicates
* Another run at the unit/expire test suite
* Fix typo in defrag pubsub channels
* Make dbAddRDBLoad return the (maybe reallocated) object
* Add unit test file test_object.c
* Account for scan duplicates in valkey-cli test

Signed-off-by: Viktor Söderqvist <[email protected]>
* Embed expire in robj, only if needed or if there is unused space in the allocation.
* Always embed key in robj.
* dbAdd, setKey and setExpire return the reallocated object with embedded key.
* setKey does not increment the reference counter, since it would require duplicating the object.

Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
When key is embedded in object, all objects in the database need
to be unique so we can't use shared objects as values.

Signed-off-by: Viktor Söderqvist <[email protected]>
@zuiderkwast
Copy link
Contributor Author

Included in #1186.

@zuiderkwast zuiderkwast deleted the robj-hack branch December 10, 2024 22:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Embed key and TTL in robj
4 participants