[hashset feature] Scan shortcut fix #1217
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff            @@
##           hashset    #1217    +/-   ##
===========================================
- Coverage    70.77%   70.36%   -0.42%
===========================================
  Files          115      115
  Lines        64714    63812     -902
===========================================
- Hits         45804    44899     -905
- Misses       18910    18913       +3
Finally you convinced me that the current logic has a problem!
This PR solves the issue indeed. I'm inclined to merge it. I just posted some suggestions.
Letting rehash follow probing chains can add a lot of latency to individual commands though. I'm not sure that's worse than adding very little latency to each scan call.
Some context: when debugging a test case of AOF rewrite, I observed probing chains of length 6000. This happened while reading an AOF file. Actually, I believe all non-empty buckets of the table were in the same probing chain.
Reading an AOF file means reading commands and executing them. The AOF file created by AOF rewrite has the commands stored per key, and the keys are in iterator order = probing order. Do you see the problem?
All commands that hash to the same bucket in a large table come together at the beginning of the AOF file, creating huge probing chains in small tables before growing kicks in. A SET command takes 300 times longer than normal. I tried to let rehashing follow probing chains, but that caused even worse spikes, with long probe chains plus rehashing of very long chains.
Here's the test case: https://github.com/valkey-io/valkey/pull/1178/files#r1814492725
If we could let the iterator walk in some other order (such as index++), we could avoid this, but that doesn't completely solve the problem either. I guess there's simply a chance of getting very large probe chains, and this causes severe degradation.
So what can we do? Maybe keep track of not only the number of 'everfulls' but also the maximum probe chain length, and grow the table if it gets too long? This might help (a rough sketch follows below).
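To make that idea concrete, here is a minimal sketch of growth driven by probe chain length, assuming open addressing; the names (toy_table, toy_grow, toy_note_probe_chain) and the threshold are hypothetical, not taken from hashset.c:

#include <stddef.h>

/* Hypothetical sketch, not valkey's actual code: in addition to counting
 * 'everfull' buckets, remember the longest probe chain followed on insert
 * and grow the table once it exceeds a threshold. */
#define MAX_PROBE_CHAIN 64 /* illustrative threshold, not from the source */

typedef struct toy_table {
    size_t num_buckets;
    size_t max_chain_seen;
} toy_table;

static void toy_grow(toy_table *t) {
    t->num_buckets *= 2;   /* placeholder for the real grow/rehash logic */
    t->max_chain_seen = 0; /* long chains are broken up by rehashing */
}

/* Call after each insert with the length of the probe chain it followed. */
static void toy_note_probe_chain(toy_table *t, size_t chain_len) {
    if (chain_len > t->max_chain_seen) t->max_chain_seen = chain_len;
    if (t->max_chain_seen > MAX_PROBE_CHAIN) toy_grow(t);
}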
But I think the best solution is Madelyn's idea about chaining instead of probing.
I think we need to implement it. It has other benefits too, like
- we can have higher fill factor and we can avoid rehashing indefinitely when there's a fork running;
- scan doesn't need to follow chains wrapping around zero, so it won't return duplicates.
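For reference, a rough sketch of what per-bucket chaining could look like; the struct layout and names here are hypothetical, not a proposal for the actual hashset.c layout:

#include <stddef.h>

#define ELEMENTS_PER_BUCKET 7

/* Hypothetical chained bucket: overflow goes into a linked child bucket
 * instead of spilling into neighbouring buckets. */
typedef struct chained_bucket {
    unsigned char presence;               /* bitmap of used slots */
    void *elements[ELEMENTS_PER_BUCKET];
    struct chained_bucket *next;          /* overflow chain, NULL if none */
} chained_bucket;

/* Scanning one bucket only walks its own chain; it never crosses into other
 * buckets, so a scan cursor never needs to wrap around zero. */
static void scan_bucket(chained_bucket *b, void (*fn)(void *element)) {
    for (; b != NULL; b = b->next) {
        for (int i = 0; i < ELEMENTS_PER_BUCKET; i++) {
            if (b->presence & (1 << i)) fn(b->elements[i]);
        }
    }
}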
src/hashset.c
Outdated
-    /* Mask the start cursor to the bigger of the tables, so we can detect if we
+    /* Mask the start cursor to the smaller of the tables, so we can detect if we
      * come back to the start cursor and break the loop. It can happen if enough
      * tombstones (in both tables while rehashing) make us continue scanning. */
-    cursor = cursor & (expToMask(s->bucket_exp[0]) | expToMask(s->bucket_exp[1]));
+    cursor &= expToMask(s->bucket_exp[0]);
+    if (hashsetIsRehashing(s)) {
+        cursor &= expToMask(s->bucket_exp[1]);
+    }
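Assuming expToMask(exp) returns (1 << exp) - 1 (an assumption here; the helper's definition lives elsewhere in hashset.c), ANDing the two masks is equivalent to masking to the smaller table, while the old OR kept the bigger table's mask:

#include <stdint.h>
#include <assert.h>

static uint64_t expToMask(int exp) { return ((uint64_t)1 << exp) - 1; }

int main(void) {
    uint64_t big = expToMask(4);   /* 16-bucket table, mask 0x0F */
    uint64_t small = expToMask(1); /*  2-bucket table, mask 0x01 */
    assert((big | small) == 0x0F); /* old code: OR keeps the bigger mask   */
    assert((big & small) == 0x01); /* new code: AND keeps the smaller mask */
    return 0;
}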
With this change, I guess there's a small increased risk of returning duplicates.
Time 1:
- Table size: 16 buckets.
- Scan order: 0 -> 8 -> 4 -> 12 -> 2 -> 10 -> 6 -> 14 -> 1 -> ... -> 15.
- SCAN 0 returned new cursor 8.
Time 2:
- Table sizes 16 and 2. Shrinking started. Scan order: 0 -> 1.
- User continues the previous scan with SCAN 8.
- Old implementation: SCAN 8 returns elements from small table bucket 0 (masked with small table mask) and large table bucket 8 (masked with large table mask). Large table bucket 0 can safely be skipped.
- New implementation: SCAN 8 returns elements from small table bucket 0 (masked with small table mask) and large table buckets 0 and 8 (expansion of cursor masked with small table mask).
- Both implementations: return new cursor 1.
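As a side note, the scan order quoted above (0 -> 8 -> 4 -> 12 -> ...) comes from incrementing the cursor's bits in reverse order; a simplified standalone version of that increment, not the hashset.c code itself, might look like:

#include <stdint.h>
#include <stdio.h>

/* Reverse the bits of a 64-bit word (simple loop version). */
static uint64_t rev(uint64_t v) {
    uint64_t r = 0;
    for (int i = 0; i < 64; i++) {
        r = (r << 1) | (v & 1);
        v >>= 1;
    }
    return r;
}

int main(void) {
    uint64_t mask = 15; /* 16-bucket table */
    uint64_t cursor = 0;
    do {
        printf("%llu ", (unsigned long long)cursor);
        /* Increment the cursor's most significant bit first. */
        cursor |= ~mask;
        cursor = rev(cursor);
        cursor++;
        cursor = rev(cursor);
    } while (cursor != 0);
    printf("\n"); /* prints: 0 8 4 12 2 10 6 14 1 9 5 13 3 11 7 15 */
    return 0;
}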
Hmm, I see! Yes, that makes sense. I think it'll be fixed if I don't modify cursor here and instead apply the masks to start_cursor.
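A rough sketch of that alternative, using stand-in types and helpers (the real definitions in hashset.c may differ): keep the working cursor unmodified and only mask the saved start_cursor that is used to detect that the scan has wrapped.

#include <stdint.h>
#include <stdbool.h>

/* Stand-ins so the sketch compiles; not the real hashset.c definitions. */
typedef struct hashset_stub { int bucket_exp[2]; int rehash_idx; } hashset_stub;
static uint64_t expToMask(int exp) { return ((uint64_t)1 << exp) - 1; }
static bool isRehashing(hashset_stub *s) { return s->rehash_idx != -1; }

/* Compute the masked start cursor once; the cursor itself stays untouched,
 * so the expansion into the larger table is unaffected by this masking. */
static uint64_t maskedStartCursor(hashset_stub *s, uint64_t cursor) {
    uint64_t start_cursor = cursor & expToMask(s->bucket_exp[0]);
    if (isRehashing(s)) start_cursor &= expToMask(s->bucket_exp[1]);
    return start_cursor;
}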
Yes, maybe. But it doesn't matter much, especially if we want to change probing to chaining anyway... Let's not spend too much effort on this.
Force-pushed d33eec8 to 0849972: …duplicates when a shrink-rehash started since the last scan step (Signed-off-by: Rain Valentine <[email protected]>)
Force-pushed 0849972 to 8388f85
With chaining instead of probing, I guess we don't need this fix anymore. Thanks anyway!
This is targeting the hashset branch and will update the hashset PR when merged. This fixes the issue I saw where scan can miss elements when we're rehashing: #1186 (comment)
The unit test fails without my fix and passes with it 😁