New hash table #1186
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

```
@@             Coverage Diff              @@
##           unstable    #1186      +/-   ##
============================================
+ Coverage     70.85%   70.90%    +0.04%
============================================
  Files           118      119        +1
  Lines         63612    64652     +1040
============================================
+ Hits          45075    45839      +764
- Misses        18537    18813      +276
```
For command lookup, since the command table is rarely updated, why not build an efficient trie? Or would it be even slower?
Just a partial review (everything but hashset.c, which I guess is the interesting stuff) and tests. I've blocked off time tomorrow to read through :) Exciting stuff!
@xingbowang For the command table, yes, a perfect hash function would be even better. The main point of this implementation is to use it for the database keys though, which will come in another PR. (A trie is possibly slower because there might be more memory accesses.)
excellent work!
To be honest, I didn't like open-addressing hash tables before, because the tombstones generated by deletions have a very negative impact on both performance and space. I felt they were not suitable for online services with a large number of deletion operations.
However, this design is quite clever: deleting an element does not create a tombstone. Only a bucket that has been full at some point is considered to have a tombstone, which greatly reduces the impact of deletion operations.
A bucket's length is 7, so in terms of chained hashing, a tombstone effect would only occur when there are 7 collisions, which intuitively seems to be a relatively rare situation.
In theory, this should save memory. Compared to the previous chained hashing, it saves two pointers (the previous and next dictEntry) and adds one hash value, saving about 15 bytes per element (two 8-byte pointers minus a one-byte hash). However, the frequency of resizing might be higher, as resizing is triggered not only by the fill factor but also by the proportion of tombstones. Do we have reliable tests for space and performance?
@soloestoy I'm glad you like the design. In my WIP branch for using this for keys, I get 10-15% better performance for GET compared to unstable. I set the fill factor to 77% currently, and with 7/8 of the space used for pointers (1/8 in each bucket is metadata bits), it means that 67% of the allocation is pointers. Rehashing triggered by the proportion of tombstones is a corner case. I didn't implement it at first, but I realized in the defrag tests that the test code tried to create a lot of fragmentation: it added many keys, and eviction deleted many of them. In this case, the table became full of tombstones. It does not happen very often and I think it will not affect the performance. There is a potential problem with CoW during fork though. The dict can avoid rehashing until it reaches a 500% fill factor. This is not possible with open addressing; we have to rehash at some point. I set the hard limit to 90% currently. I hope it will not be a major problem. In the "resize avoid" mode, resizing is allowed but incremental rehashing is paused, so only new keys are added to the new table. This also avoids destroying the CoW.
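For readers following along, here is a minimal sketch of a cache-line-sized bucket consistent with the numbers above (1/8 metadata, 7/8 entry pointers). The field names are illustrative assumptions, not the PR's code; the authoritative layout is in the PR's hashset.c (later hashtable.c).

```c
#include <stdint.h>

/* Illustrative sketch only, not the PR's actual definitions. */
#define ENTRIES_PER_BUCKET 7

typedef struct bucket {
    uint8_t presence;   /* bit i set => entries[i] is in use; one spare bit
                         * can mark an "everfull" (tombstoned) bucket */
    uint8_t hashes[ENTRIES_PER_BUCKET]; /* one byte of each entry's hash, to
                                         * reject slots without dereferencing */
    void *entries[ENTRIES_PER_BUCKET];  /* user-defined key-value entries */
} bucket; /* 1 + 7 + 7*8 = 64 bytes: exactly one cache line on 64-bit */
```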
Like 75% through a review of hashset.c, and my brain is done with reviewing. Overall it looks really solid though.
The one thing that occurred to me while I was reviewing is that instead of using the everfull bit to start probing, we could use it to indicate that the last position is a pointer to another bucket and do linked-list resolution. This does a couple of things:
- Prevents significant chain lengths, since collisions won't cause issues.
- Keeps the overall scanning implementation much more similar to the current implementation.
- We get most of the performance benefit from the cache lines, and the next cursor call is likely almost as random as a random DRAM lookup. (Which we'll probably prefetch anyways?)
For what it's worth, I can make nextCursor() use sequential, in-memory order for rehashing, scanning, and the iterator if we use the most significant bits instead of the least significant bits of the hash for the bucket index. The math for iterating the two different-sized tables in parallel for rehash and scan would be a little different, but not bad - basically bit shifting instead of masking. At any rate, I plan to add prefetch hints - not sure how much that would eliminate the benefit of sequential memory access.
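For context on the reverse-bit cursor being discussed, here is a sketch of the classic reverse-binary increment that dict's scan uses (and which this table inherits while the bucket index comes from the least significant hash bits); rev() and nextCursor() are written out for illustration:

```c
#include <stdint.h>

/* Reverse the bits of a 64-bit word (as in the dictScan algorithm). */
static uint64_t rev(uint64_t v) {
    uint64_t s = 64, mask = ~0ULL;
    while ((s >>= 1) > 0) {
        mask ^= (mask << s);
        v = ((v >> s) & mask) | ((v << s) & ~mask);
    }
    return v;
}

/* Advance a scan cursor by incrementing its reversed bits. This visits
 * every bucket index and stays valid across table resizes, because a
 * bucket in a table of size 2^n maps to a predictable group of buckets
 * in a table of size 2^(n+k). */
static uint64_t nextCursor(uint64_t cursor, uint64_t mask) {
    cursor |= ~mask;     /* set the bits above the table's index bits */
    cursor = rev(cursor);
    cursor++;
    cursor = rev(cursor);
    return cursor;
}
```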
@madolson This is a great idea. It reminds me of "bulk chaining" in the SegCache paper. I think it can be done without changing the API, so it can be done at any point. I believe it has many benefits, like we can postpone rehashing almost indefinitely like the dict does when there's a fork. We can also prevent scan from returning duplicates (when it wraps around zero following a probe chain).
@SoftlyRaining This sounds interesting. Memory is fetched per cache line, so I'm wondering if there are benefits to accessing it sequentially compared to the current reverse-bit increment. Systems with larger cache lines, sure, but that's only MacBook so far? Does predictive memory fetching assume that cache lines are accessed sequentially?
@zuiderkwast AFAIK you're right about cache lines. The benefit would be making the access pattern predictable so the hardware prefetcher can understand it and prefetch buckets efficiently. My guess is that it'd perform almost as well as hand-tuned prefetch hints.
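A minimal sketch of the kind of hand-tuned prefetch hint being compared against, using the GCC/Clang __builtin_prefetch builtin. The bucket type and the bucketNext/slotInUse/keysMatch helpers are the illustrative assumptions from the sketch above, not the PR's code:

```c
#include <stdbool.h>

/* Hypothetical helpers, assumed from the earlier bucket sketch: */
static bucket *bucketNext(bucket *b);                /* child bucket or NULL */
static bool slotInUse(bucket *b, int i);             /* presence bit check */
static bool keysMatch(void *entry, const void *key); /* full key compare */

/* Sketch: issue a prefetch for the next bucket before scanning the
 * current one, so the DRAM fetch overlaps the key comparisons. */
static void *lookupInChain(bucket *b, const void *key, uint8_t hashbyte) {
    while (b != NULL) {
        bucket *next = bucketNext(b);
        if (next != NULL)
            __builtin_prefetch(next, 0, 1); /* read, low temporal locality */
        for (int i = 0; i < ENTRIES_PER_BUCKET; i++) {
            if (slotInUse(b, i) && b->hashes[i] == hashbyte &&
                keysMatch(b->entries[i], key))
                return b->entries[i];
        }
        b = next;
    }
    return NULL;
}
```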
Hello. I've changed the design based on Madelyn's idea of bucket chaining. It's no longer open addressing.
Furthermore,
@touitou-dan Hello! Can you help with a prefetching implementation for this hash table?
Hey @zuiderkwast , I'd be happy to help! However, I'd first like the opportunity to hear about the new implementation. Would you be available for a chat?
I've only reviewed the hashset.[hc] code. I'll continue reviewing the remaining files later today.
@zuiderkwast finished the review, and I don't have more comments apart from the ones I sent this morning. Very nice design and implementation!
I really like that we plan to combine key/value/metadata into a single reference counted "element" and that the main dict just becomes a set of reference counted elements. Using an independent reference counted element makes it much easier to perform internal asynchronous operations.
I mostly reviewed the API definition. My main comments are:
- I don't think we should be differentiating between a "key" and an "element". An element IS A key.
- I don't think we should ever be exposing pointers into the structure. Instead, an opaque hashsetPosition can have appropriate accessor methods.
- Keep the opaque stuff opaque. Don't let stack allocation be a reason to expose internals.
Force-pushed from ed3126a to 9df0601
I did not review the changes carefully, but here you mentioned you have 10-15% better performance for GET. I think the point is used memory. If it is, could you share the QPS performance? I am not sure how the changes affect the QPS. Thanks
@hwware Thanks for looking. I did a new benchmark just now. My simple benchmark setup: Unstable:
This branch:
My laptop is Linux on a MacBook Pro M2. It is unusual, yes, but others got similar results. @SoftlyRaining @uriyage you have done more benchmarks on more normal server hardware, with/without IO threads, and with larger value sizes. Please share. The explanation of the improved speed is fewer memory lookups, especially to memory that is not already in L1 cache. This is a very slow operation, so counting memory accesses is a good way to optimize.
So the total is 2 memory accesses in most cases. For dict with the key embedded in the dictEntry (current unstable) we have a minimum of 3 memory accesses, with more if there are hash collisions.
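To spell out the counts above, here is the assumed breakdown (each dereference of a cold cache line counted as one memory access; the slot rejection via in-bucket hash bits is taken from the design described earlier in the thread):

```c
/* New hashtable, common case (2 accesses):
 *   1. the bucket: metadata + entry pointers in one cache line; the
 *      per-slot hash bits reject most non-matching slots right here
 *   2. the entry itself: the object with embedded key (and value)
 *
 * dict, minimum (3 accesses):
 *   1. the bucket array slot: a pointer to a dictEntry
 *   2. the dictEntry: embedded key, value pointer, next pointer
 *   3. the value object; plus one more access per chained collision
 */
```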
I'm through everything now; nothing is standing out as a major concern to me right now. I think it should be good to merge ASAP.
```c
        return 0;
    }

    if (ht->type->resizeAllowed) {
```
Why does the resizeAllowed callback override the hard limit check? I would expect this to be inside the conditional.
The idea here is that we want to force a resize if the fill factor is above the hard limit, i.e. it's too full and we need to expand. We want to ignore resizeAllowed in this case. Additionally, the `double` fill factor is only computed when needed.
For context, these are the lines below this one:

```c
if (ht->type->resizeAllowed) {
    double fill_factor = (double)min_capacity / ((double)numBuckets(old_exp) * ENTRIES_PER_BUCKET);
    if (fill_factor * 100 < MAX_FILL_PERCENT_HARD && !ht->type->resizeAllowed(alloc_size, fill_factor)) {
        /* Resize callback says no. */
        return 0;
    }
}
```

Maybe you think we also need to forbid the user from explicitly shrinking to a very small size, which would take us above the max fill factor? resize is an internal function, and the only way a user can try to shrink the table is with hashtableShrinkIfNeeded, so a resize can't get you above the hard max fill factor.
src/hashtable.c (outdated)

```c
 * It's allowed to insert and replace entries. Deleting entries is only allowed
 * for the entry that was just returned by hashtableNext. Deleting other
 * entries is not supported. (It can cause internal fragmentation.)
```
It seems like we are trying to handle the internal fragmentation, so is this actually a constraint?
Not sure what you mean. The code that handles internal fragmentation in iterator and scan only works for the bucket chain that was just iterated/scanned. If the user deletes from another bucket chain while iterating or from the scan callback, it will leave holes that we don't handle.
```c
     * bucket chain, before we move on to the next index. */
    if (iter->hashtable->pause_rehash == 1 &&
        iter->hashtable->used[iter->table] < iter->last_seen_size) {
        compactBucketChain(iter->hashtable, iter->index, iter->table);
```
More of a followup: maybe we should make the safe iterator have the same guarantees as SCAN, in that you have to process everything in the bucket. Defrag should be fine, since it can queue up everything into defrag later; active expire should be able to cover everything. I don't think there should be any other iteration where items can change in between invocations. If we do that, then we never need any of this bucket compaction code.
I'm not sure how we can do that. You mean that we shall allow the fillBucketHole function to run on delete also when iterating? It reorders the entries in the bucket chain. How do we make sure all the entries in the chain are returned exactly once?
Force-pushed from d6da54f to dfbea9d
I didn't review everything, but overall LGTM; it is a great job.
```diff
@@ -276,9 +276,9 @@ void restoreCommand(client *c) {
     }

     /* Create the key and set the TTL if any */
-    dbAdd(c->db, key, obj);
+    dbAdd(c->db, key, &obj);
```
Shouldn't we use setKey here with TTL as a flag to avoid reallocation?
setKey doesn't take a TTL either. I want to improve this in a follow-up to provide something like dbAddWithExpire and setKeyWithExpire to avoid reallocation. It needs some refactoring and I don't want this PR to become larger than what it already is.
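Purely as an illustration of the follow-up idea, a hypothetical shape for those APIs; the function names come from the comment above, but the signatures and types are guesses, not anything in this PR:

```c
/* Hypothetical follow-up API sketches (not in this PR). The serverDb and
 * robj names are assumed from the codebase; signatures are invented. */
typedef struct serverDb serverDb;  /* assumed opaque here */
typedef struct serverObject robj;  /* assumed opaque here */

void dbAddWithExpire(serverDb *db, robj *key, robj **valref, long long expire_ms);
void setKeyWithExpire(serverDb *db, robj *key, robj **valref, long long expire_ms);
```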
I think all my concerns are addressed, so let's get it merged :)
Finally finished reviewing all of it! I think it's looking pretty good - my comments are just a few small things I don't think are critical to address. :)
A cache-line aware hash table with a user-defined key-value entry type, supporting incremental rehashing, scan, iterator, random sampling, incremental lookup and more... Signed-off-by: Viktor Söderqvist <[email protected]>
This changes the type of command tables from dict to hashtable. Command table lookup takes ~3% of overall CPU time in benchmarks, so it is a good candidate for optimization. My initial SET benchmark comparison suggests that hashtable is about 4.5 times faster than dict and this replacement reduced overall CPU time by 2.79% 🥳 --------- Signed-off-by: Rain Valentine <[email protected]> Signed-off-by: Rain Valentine <[email protected]> Signed-off-by: Viktor Söderqvist <[email protected]> Co-authored-by: Rain Valentine <[email protected]>
Instead of a dictEntry with pointers to key and value, the hashtable has a pointer directly to the value (robj), which can hold an embedded key and acts as a key-value entry in the hashtable. This minimizes the number of pointers to follow and thus the number of memory accesses to look up a key-value pair.

```
Keys hashtable       robj
  +-------+    +-----------------------+
  |   0   |    | type, encoding, LRU   |
  |   1 ------>| refcount, expire      |
  |   2   |    | ptr                   |
  |  ...  |    | optional embedded key |
  +-------+    | optional embedded val |
               +-----------------------+
```

The expire timestamp (TTL) is also stored in the robj, if any. The expire hash table points to the same robj.

Overview of changes:

* Replace dict with hashtable in kvstore (kvstore.c)
* Add functions for embedding key and expire in robj (object.c)
  * When there's unused space, reserve an expire field to avoid reallocating it later if an expire is added.
  * Always reserve space for expire for large key names to avoid realloc if it's set later.
* Update db functions (db.c)
  * dbAdd, setKey and setExpire reallocate the object when embedding a key
  * setKey does not increment the reference counter, since it would require duplicating the object. This responsibility is moved to the caller.
* Remove logic for shared integer objects as values in the database. The keys are now embedded in the objects, so all objects in the database need to be unique. Thus, we can't use shared objects as values. Also delete test cases for shared integers.
* Adjust various commands to the changes mentioned above.
* Adjust defrag code
  * Improvement: Don't access the expires table before defrag has actually reallocated the object.
* Adjust test cases that were using hard-coded sizes for dict when realloc would happen, and some other adjustments in test cases.
* Adjust memory prefetch for the new hash table implementation in IO-threading, using the new `hashtableIncrementalFind` API
* Adjust offloading of free() to IO threads: object free is done in the main thread, while keeping obj->ptr offloading in the IO thread, since the DB object is now allocated by the main thread and not by the IO thread as it used to be.
* Let expireIfNeeded take an optional value, to avoid looking up the expires table when possible.

---------

Signed-off-by: Uri Yagelnik <[email protected]>
Signed-off-by: uriyage <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Co-authored-by: Uri Yagelnik <[email protected]>
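To make the diagram concrete, here is a hypothetical sketch of such an entry. The first five fields match the well-known robj header; the trailing part is a simplified assumption about how the embedded data is laid out (the PR's actual robj uses flag bits to record which optional fields are present):

```c
/* Hypothetical sketch only; not the PR's actual struct definition. */
struct sketchObject {
    unsigned type : 4;       /* OBJ_STRING, OBJ_LIST, ... */
    unsigned encoding : 4;   /* OBJ_ENCODING_EMBSTR, ... */
    unsigned lru : 24;       /* LRU clock or LFU data */
    int refcount;
    void *ptr;               /* the value, unless embedded below */
    /* Optional trailing fields, present depending on flags:
     *   long long expire;   TTL timestamp, shared with the expires table
     *   char embkey[];      embedded key (length-prefixed string)
     *   char embval[];      embedded string value */
};
```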
Merged. If anyone was going to post comments, please post them anyway. I can handle them as follow-ups.
The new `hashtable` provides faster lookups and uses less memory than `dict`. A TCL test case "SRANDMEMBER with a dict containing long chain" is deleted because it's covered by a hashtable unit test "test_random_entry_with_long_chain", which is already present. This change also moves some logic from dismissMemory (object.c) to zmadvise_dontneed (zmalloc.c), so the hashtable implementation which needs the dismiss functionality doesn't need to depend on object.c and server.h. This PR follows #1186. --------- Signed-off-by: Rain Valentine <[email protected]> Signed-off-by: Viktor Söderqvist <[email protected]> Co-authored-by: Viktor Söderqvist <[email protected]>
With verbose logging, this message is logged every 5 seconds. Before #1186, there were 128 slots in a database with 100 keys. Slots here means the number of entries that can be stored on the top level of a hash table, which in the dict was the same as the number of buckets. With the new hash table, there are multiple entries per bucket, so the number of buckets is not a useful metric to show. This PR multiplies the number of buckets by the number of entry positions per bucket (so 16 buckets with 7 entry positions each show up as 112 slots). With this change, the log message is instead 36236:M 09 Feb 2025 15:47:18.923 - DB 9: 100 keys (0 volatile) in 112 slots HT. Signed-off-by: Viktor Söderqvist <[email protected]>
This PR implements a new hash table and uses it for keys, command lookup and more. There are multiple commits. Do not squash-merge.
Hash table design
The hash table is cache-line optimized and implemented as outlined in #169, but changed to chaining instead of probing after an idea by Madelyn. The key-value entry is user-defined, which allows the user to embed key and value within a single allocation. The hash table supports incremental rehashing, scan, random key, etc. just like the dict, but faster and using less memory. Each bucket contains a few bits of metadata per entry. For details, see the comments in src/hashtable.{c,h}.
If a bucket is full, the last entry pointer in the bucket can be replaced by a child-bucket pointer, and we get a bucket chain.
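A sketch of that chaining rule, reusing the illustrative 7-entry bucket from earlier in the thread (not the PR's actual types):

```c
/* When a bucket's "chained" metadata bit is set, its last slot holds a
 * pointer to a child bucket instead of a key-value entry (sketch). */
union lastSlot {
    void *entry;          /* normal bucket: a key-value entry */
    struct bucket *child; /* chained bucket: next link in the chain */
};
```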
Command lookup
The 2nd commit replaces dict with hashtable for command lookup. This was implemented by @SoftlyRaining.
Keys and expire
The 3rd commit replaces dict with the new hash table in kvstore.c and all code that uses it, such as db.c.
The hashtable entry in this case is the robj struct. The key and optionally an expire timestamp are embedded in the robj struct, i.e. the key is embedded in the value. Therefore, we can call this a valkey object: val + key. This design saves roughly 20 bytes per key for short string keys.
Some db.c functions like dbAdd, setKey and setExpire now reallocate the value object to embed the key and optional expire in it. setKey does not increment the reference counter, since it would require duplicating the object.
Fixes #991
Fixes #992