[hashset feature] Replace dict with hashset for keys and expire #1178

zuiderkwast · 2024-10-16T08:47:01Z

This PR is based on #1186. When it is merged, this PR will change target to unstable.

Keys and expire are embedded in robj.

Dict is replaced by hashset for the keys and expire tables. Hashset is a hash table that doesn't have a dict entry. Pointers point directly to robj, where also the expire is stored, if any. Also pubsub channels are changed, since they use kvstore. Kvstore is changed from dict to hashset.

Some db.c functions like dbAdd, setKey and setExpire are changed to return the reallocated object after embedding the key and optional expire. setKey does not increment the reference counter, since it would require duplicating the object.

An robj value with an embedded key can be called valkey (val + key). A valkey typedef is added for robj and it's used when the object has an embedded key, to distinguish it from an object from other usages of robj (such as a string object from the command parser in c->argv).

keys                                      expire
hashset            robj                   hashset
+---+         +-------------------+        +---+
| 1 --------->| type, enc, lru    |<-------- 1 |
| 2 |         | refcount, flags   |        | 2 |
| 3 |         | ptr               |        | 3 |
| . |         | expire?           |        | . |
| . |         | embedded key      |        | . |
| . |         | embedded value?   |        | . |
+---+         +-------------------+        +---+

? = optional

A quick benchmark using valkey-benchmark showed small improvements for all types, but more for get and set than complex types.

Rebarding memory usage, DEBUG POPULATE 10000000 (10M) resulted in memory usage (used_memory_human) of 669M, compared to 872M on unstable. That's a 23% drop in memory usage and 21.28 bytes per key. (DEBUG POPULATE created keys of length 11 and string values of length 13. Both of these fit in an robj allocation of size 32 bytes.)

Fixes #992.

TODO:

Implement memory prefetching (for io threading, to be ported from the dict implementation).

src/kvstore.h

SoftlyRaining

Didn't look at db.c yet

src/server.h

src/evict.c

src/object.c

SoftlyRaining · 2024-10-17T00:22:22Z

src/memory_prefetch.c

Think we'll implement something, or will we end up completely removing prefetch?

Yes... Help wanted. :)

@touitou-dan

The prefetch implementation for dict is accessing dict internals to do incremental lookup, one memory access at a time. It's breaking the abstraction and adds duplicated logic. I don't like that part.

For hashset, I have two ideas about how to fetch multiple keys in parallel:

A function that looks up multiple keys in parallel. It can return a bitmap with a bit for each key, whether it was found or not.

unsigned hashsetBatchFind(int count, hashset **sets, const void **keys, void **found);

A function that incrementally looks up one key. It needs to be called repeatedly until it's done. The caller provides an (opaque) state that the implementation can use.

int hashsetIncrementalFind(hashset *s, const void *key, void **found, hashsetIncrementalFindState *state);

The prefetching code can run multiple such lookups in round-robin.

src/defrag.c

Signed-off-by: Viktor Söderqvist <[email protected]>

This changes the type of command tables from dict to hashset. Command table lookup takes ~3% of overall CPU time in benchmarks, so it is a good candidate for optimization. My initial SET benchmark comparison suggests that hashset is about 4.5 times faster than dict and this replacement reduced overall CPU time by 2.79% 🥳 --------- Signed-off-by: Rain Valentine <[email protected]> Signed-off-by: Rain Valentine <[email protected]> Co-authored-by: Rain Valentine <[email protected]>

… rehashed buckets can be skipped (valkey-io#1147) Fixes 2 bugs in hashsetNext(): - null dereference when iterating over new unused hashset - change order of iteration so skipping past already-rehashed buckets works correctly and won't miss elements Minor optimization in hashsetScan and some renamed variables. Signed-off-by: Rain Valentine <[email protected]>

Co-authored-by: Madelyn Olson <[email protected]> Signed-off-by: Viktor Söderqvist <[email protected]>

Use variable name `s` for hashset instead of `t`. Use variable name `element` instead of `elem`. Remove the hashsetType.userdata field. Some more fixes. Signed-off-by: Viktor Söderqvist <[email protected]>

Signed-off-by: Viktor Söderqvist <[email protected]>

src/hashset.c

codecov · 2024-10-22T14:56:46Z

Codecov Report

Attention: Patch coverage is 93.85965% with 49 lines in your changes missing coverage. Please review.

Project coverage is 70.94%. Comparing base (5129254) to head (159bc70).

Files with missing lines	Patch %	Lines
src/kvstore.c	95.21%	12 Missing ⚠️
src/module.c	0.00%	10 Missing ⚠️
src/object.c	94.44%	7 Missing ⚠️
src/db.c	95.80%	6 Missing ⚠️
src/defrag.c	88.67%	6 Missing ⚠️
src/debug.c	63.63%	4 Missing ⚠️
src/memory_prefetch.c	40.00%	3 Missing ⚠️
src/replication.c	0.00%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           hashset    #1178      +/-   ##
===========================================
+ Coverage    70.77%   70.94%   +0.16%     
===========================================
  Files          115      115              
  Lines        64714    63760     -954     
===========================================
- Hits         45804    45233     -571     
+ Misses       18910    18527     -383

Files with missing lines	Coverage Δ
src/aof.c	`80.11% <100.00%> (-0.02%)`	⬇️
src/bitops.c	`93.76% <100.00%> (-0.01%)`	⬇️
src/cluster.c	`88.35% <100.00%> (-0.02%)`	⬇️
src/cluster_legacy.c	`86.30% <100.00%> (ø)`
src/evict.c	`97.70% <100.00%> (-0.06%)`	⬇️
src/expire.c	`96.59% <100.00%> (+0.02%)`	⬆️
src/geo.c	`93.58% <100.00%> (-0.02%)`	⬇️
src/hyperloglog.c	`91.23% <100.00%> (ø)`
src/lazyfree.c	`86.11% <100.00%> (ø)`
src/pubsub.c	`96.27% <100.00%> (+0.04%)`	⬆️
... and 19 more

... and 6 files with indirect coverage changes

hpatro · 2024-10-23T17:05:58Z

This is great progress. Will take a look this week.

tests/unit/aofrw.tcl

Signed-off-by: Viktor Söderqvist <[email protected]>

…ns (valkey-io#1218) This is targeting the hashset branch and will update that PR if merged. Here's the UT I promised to replace a tcl test I deleted in another PR 😅 --------- Signed-off-by: Rain Valentine <[email protected]>

A new type hashsetPosition is added and the prototypes of the following functions are changed: * hashsetFindPositionForInsert * hashsetInsertAtPosition * hashsetTwoPhasePopFindRef * hashsetTwoPhasePopDelete Signed-off-by: Viktor Söderqvist <[email protected]>

Signed-off-by: Viktor Söderqvist <[email protected]>

…ization (valkey-io#1231) Original test violated strict aliasing rules. I first wrote verbose comments about how to get away with storing non-pointers, but have now updated it to simply avoid storing non-pointers. Much cleaner and safer. Technically we could work around this and still store longs directly in a hashset, but currently we don't have a need for that. --------- Signed-off-by: Rain Valentine <[email protected]> Signed-off-by: Viktor Söderqvist <[email protected]> Co-authored-by: Viktor Söderqvist <[email protected]>

Co-authored-by: Madelyn Olson <[email protected]> Signed-off-by: Viktor Söderqvist <[email protected]>

Instead of open addressing, we now use bucket chaining, with allocated child buckets when a bucket gets full. The scan and defrag APIs are changed, because we now need to be able to defrag the allocated child buckets. Signed-off-by: Viktor Söderqvist <[email protected]>

Signed-off-by: Viktor Söderqvist <[email protected]>

…ables Rename hashsetDefragInternals to hashsetDefragTables, because it doesn't defrag the allocated child buckets, which as also internals. Add trackMemUsage callback in hashsetType, to track memory allocations of child buckets. Edit code comments about the hashset design. Signed-off-by: Viktor Söderqvist <[email protected]>

Signed-off-by: Viktor Söderqvist <[email protected]>

* Add functions for embedding key and expire in robj (object.c) * Replace dict in kvstore * defrag * Handle dbAdd and dbReplaceValue return value * Fix some things including MOVE, RENAME * Fix test case "expire scan should skip dictionaries with lot's of empty buckets" * Fix resize allowed maxmemory check * Fix kvstore unit tests * Fix maxmemory test (increase size) * Delete test cases about shared integers * Scan single step in active expire * Seed hashset hash function on startup * Set resize policy when a fork is active, and fix test case * Fix rehashing test case in unit/info suite * Fix more resize test cases in unit/other * Fix scan testcases not expecting duplicates * Another run at the unit/expire test suite * Fix typo in defrag pubsub channels * Make dbAddRDBLoad return the (maybe reallocated) object * Add unit test file test_object.c * Account for scan duplicates in valkey-cli test Signed-off-by: Viktor Söderqvist <[email protected]>

* Embed expire in robj, only if needed or if there is unused space in the allocation. * Always embed key in robj. * dbAdd, setKey and setExpire return the reallocated object with embedded key. * setKey does not increment the reference counter, since it would require duplicating the object. Signed-off-by: Viktor Söderqvist <[email protected]>

Signed-off-by: Viktor Söderqvist <[email protected]>

When key is embedded in object, all objects in the database need to be unique so we can't use shared objects as values. Signed-off-by: Viktor Söderqvist <[email protected]>

zuiderkwast · 2024-11-18T17:08:14Z

Included in #1186.

madolson reviewed Oct 16, 2024

View reviewed changes

src/kvstore.h Show resolved Hide resolved

SoftlyRaining reviewed Oct 17, 2024

View reviewed changes

Hashset implementation (squashed)

02221d1

Signed-off-by: Viktor Söderqvist <[email protected]>

zuiderkwast force-pushed the hashset branch from 19576b5 to 8fe59b3 Compare October 17, 2024 13:35

zuiderkwast force-pushed the hashset branch from 8fe59b3 to 3038293 Compare October 18, 2024 07:22

SoftlyRaining and others added 9 commits October 18, 2024 10:33

Apply suggestions from code review

a7f34b8

Co-authored-by: Madelyn Olson <[email protected]> Signed-off-by: Viktor Söderqvist <[email protected]>

More code review suggestions

69ec76a

Use variable name `s` for hashset instead of `t`. Use variable name `element` instead of `elem`. Remove the hashsetType.userdata field. Some more fixes. Signed-off-by: Viktor Söderqvist <[email protected]>

Fix comment and delete unused code

bd1ecfb

Signed-off-by: Viktor Söderqvist <[email protected]>

Mistake in previous incomplete variable renaming

10d3abf

Signed-off-by: Viktor Söderqvist <[email protected]>

Fix trailing whitespace in comment

2204dd6

Signed-off-by: Viktor Söderqvist <[email protected]>

Fix more review comments

5eb3e17

Signed-off-by: Viktor Söderqvist <[email protected]>

Use snake_case for struct fields + make stats struct opaque

b5f61df

Signed-off-by: Viktor Söderqvist <[email protected]>

Fixup: test case reference to renamed field

5129254

Signed-off-by: Viktor Söderqvist <[email protected]>

SoftlyRaining reviewed Oct 21, 2024

View reviewed changes

src/hashset.c Outdated Show resolved Hide resolved

zuiderkwast force-pushed the robj-hack branch from cc16018 to fdb5022 Compare October 22, 2024 14:42

zuiderkwast requested a review from hpatro October 22, 2024 16:04

zuiderkwast linked an issue Oct 22, 2024 that may be closed by this pull request

Embed key and TTL in robj #992

Closed

zuiderkwast commented Oct 24, 2024

View reviewed changes

tests/unit/aofrw.tcl Outdated Show resolved Hide resolved

zuiderkwast and others added 8 commits October 25, 2024 21:18

Address more review comments

f59ada4

Signed-off-by: Viktor Söderqvist <[email protected]>

Spelling + build warning

8e9bbdb

Signed-off-by: Viktor Söderqvist <[email protected]>

update comment about sub-command set

1702aff

Co-authored-by: Madelyn Olson <[email protected]> Signed-off-by: Viktor Söderqvist <[email protected]>

Formatting and typos

4e99476

Signed-off-by: Viktor Söderqvist <[email protected]>

zuiderkwast added 7 commits November 8, 2024 00:09

Remove debug code in unit test

905de31

Signed-off-by: Viktor Söderqvist <[email protected]>

Improve hashset random fairness with reservoir sampling

8c66eb3

Signed-off-by: Viktor Söderqvist <[email protected]>

clang-format

a7c6f2c

Signed-off-by: Viktor Söderqvist <[email protected]>

Optimize INCR (can't use shared object)

bb2517a

Signed-off-by: Viktor Söderqvist <[email protected]>

zuiderkwast force-pushed the robj-hack branch from 159bc70 to 970ef65 Compare November 9, 2024 23:22

zuiderkwast added 2 commits November 10, 2024 00:29

Adjust to bucket-chaining hashset implementation

0bfdce1

Signed-off-by: Viktor Söderqvist <[email protected]>

Remove code for shared integer objects

00ff495

When key is embedded in object, all objects in the database need to be unique so we can't use shared objects as values. Signed-off-by: Viktor Söderqvist <[email protected]>

zuiderkwast force-pushed the robj-hack branch from 970ef65 to 00ff495 Compare November 11, 2024 10:59

zuiderkwast mentioned this pull request Nov 15, 2024

New hash table #1186

Merged

zuiderkwast force-pushed the hashset branch 2 times, most recently from 40f6624 to 5ada386 Compare November 18, 2024 15:57

zuiderkwast closed this Nov 18, 2024

zuiderkwast deleted the robj-hack branch December 10, 2024 22:48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[hashset feature] Replace dict with hashset for keys and expire #1178

[hashset feature] Replace dict with hashset for keys and expire #1178

zuiderkwast commented Oct 16, 2024 •

edited

Loading

SoftlyRaining left a comment

SoftlyRaining Oct 17, 2024

zuiderkwast Oct 17, 2024

zuiderkwast Nov 14, 2024 •

edited

Loading

codecov bot commented Oct 22, 2024 •

edited

Loading

hpatro commented Oct 23, 2024

zuiderkwast commented Nov 18, 2024

[hashset feature] Replace dict with hashset for keys and expire #1178

[hashset feature] Replace dict with hashset for keys and expire #1178

Conversation

zuiderkwast commented Oct 16, 2024 • edited Loading

SoftlyRaining left a comment

Choose a reason for hiding this comment

SoftlyRaining Oct 17, 2024

Choose a reason for hiding this comment

zuiderkwast Oct 17, 2024

Choose a reason for hiding this comment

zuiderkwast Nov 14, 2024 • edited Loading

Choose a reason for hiding this comment

codecov bot commented Oct 22, 2024 • edited Loading

Codecov Report

hpatro commented Oct 23, 2024

zuiderkwast commented Nov 18, 2024

zuiderkwast commented Oct 16, 2024 •

edited

Loading

zuiderkwast Nov 14, 2024 •

edited

Loading

codecov bot commented Oct 22, 2024 •

edited

Loading