-
Notifications
You must be signed in to change notification settings - Fork 729
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix LRU crash when getting too many random lua scripts #1310
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, thanks. This make sense to me, btw, can we do something in defrag.c to resume this pointer?
please fix the DCO btw
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## unstable #1310 +/- ##
============================================
- Coverage 70.73% 70.71% -0.03%
============================================
Files 115 115
Lines 63158 63158
============================================
- Hits 44674 44661 -13
- Misses 18484 18497 +13
|
6222d74
to
81e4cd1
Compare
Thanks for the comment. Are you saying about tracking references during defragmentation? if not, can you elaborate on your questions please? |
yean, something like that. |
Signed-off-by: Seungmin Lee <[email protected]>
81e4cd1
to
ca56990
Compare
Not sure why it got timeout in the test. Can we re-run the workflow again |
no worry, we can ignore irrelevant test cases, there are some flaky tests |
I looked at it briefly. We can make a change here https://github.com/valkey-io/valkey/blob/unstable/src/defrag.c#L300, so that we also have a "defragLuaScriptKey" function which would attempt to defrag the key, and if it does it can go lookup the script in the hash table, find the associated listNode in the LRU, and then update the sds in the linked list. My original preference was to decouple the two strings since I think it's generally safer to avoid sharing pointers, to avoid situations like this, when it's not going to be a big memory sink (it's 500 * 64 = 32kb worst case). I'm up to either though @enjoy-binbin. FYI, I think we'll get a test for free with #1303 when it's merged, otherwise let's add a test after that PR is merged. |
Co-authored-by: Binbin <[email protected]> Signed-off-by: Seungmin Lee <[email protected]>
This LGTM, decouple the pointer is safer, i am ok with it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question for other reviewers:
Is there any memory usage for scripts (I don't remember one) where we need to include this additional allocation?
A test failed on
PR is out to fix: #1318 |
### Problem Valkey stores scripts in a dictionary (lua_scripts) keyed by their SHA1 hashes, but it needs a way to know which scripts are least recently used. It uses an LRU list (lua_scripts_lru_list) to keep track of scripts in usage order. When the list reaches a maximum length, Valkey evicts the oldest scripts to free memory in both the list and dictionary. The problem here is that the sds from the LRU list can be pointing to already freed/moved memory by active defrag that the sds in the dictionary used to point to. It results in assertion error at [this line](https://github.com/valkey-io/valkey/blob/unstable/src/eval.c#L519) ### Solution If we duplicate the sds when adding it to the LRU list, we can create an independent copy of the script identifier (sha). This duplication ensures that the sha string in the LRU list remains stable and unaffected by any defragmentation that could alter or free the original sds. In addition, dictUnlink doesn't require exact pointer match([ref](https://github.com/valkey-io/valkey/blob/unstable/src/eval.c#L71-L78)) so this change makes sense to unlink the right dictEntry with the copy of the sds. ### Reproduce To reproduce it with tcl test: 1. Disable je_get_defrag_hint in defrag.c to trigger defrag often 2. Execute test script ``` start_server {tags {"auth external:skip"}} { test {Regression for script LRU crash} { r config set activedefrag yes r config set active-defrag-ignore-bytes 1 r config set active-defrag-threshold-lower 0 r config set active-defrag-threshold-upper 1 r config set active-defrag-cycle-min 99 r config set active-defrag-cycle-max 99 for {set i 0} {$i < 100000} {incr i} { r eval "return $i" 0 } after 5000; } } ``` ### Crash info Crash report: ``` === REDIS BUG REPORT START: Cut & paste starting from here === 14044:M 12 Nov 2024 14:51:27.054 # === ASSERTION FAILED === 14044:M 12 Nov 2024 14:51:27.054 # ==> eval.c:556 'de' is not true ------ STACK TRACE ------ Backtrace: /usr/bin/redis-server 127.0.0.1:6379 [cluster](luaDeleteFunction+0x148)[0x723708] /usr/bin/redis-server 127.0.0.1:6379 [cluster](luaCreateFunction+0x26c)[0x724450] /usr/bin/redis-server 127.0.0.1:6379 [cluster](evalCommand+0x2bc)[0x7254dc] /usr/bin/redis-server 127.0.0.1:6379 [cluster](call+0x574)[0x5b8d14] /usr/bin/redis-server 127.0.0.1:6379 [cluster](processCommand+0xc84)[0x5b9b10] /usr/bin/redis-server 127.0.0.1:6379 [cluster](processCommandAndResetClient+0x11c)[0x6db63c] /usr/bin/redis-server 127.0.0.1:6379 [cluster](processInputBuffer+0x1b0)[0x6dffd4] /usr/bin/redis-server 127.0.0.1:6379 [cluster][0x6bd968] /usr/bin/redis-server 127.0.0.1:6379 [cluster][0x659634] /usr/bin/redis-server 127.0.0.1:6379 [cluster](amzTLSEventHandler+0x194)[0x6588d8] /usr/bin/redis-server 127.0.0.1:6379 [cluster][0x750c88] /usr/bin/redis-server 127.0.0.1:6379 [cluster](aeProcessEvents+0x228)[0x757fa8] /usr/bin/redis-server 127.0.0.1:6379 [cluster](redisMain+0x478)[0x7786b8] /lib64/libc.so.6(__libc_start_main+0xe4)[0xffffa7763da4] /usr/bin/redis-server 127.0.0.1:6379 [cluster][0x5ad3b0] ``` Defrag info: ``` mem_fragmentation_ratio:1.18 mem_fragmentation_bytes:47229992 active_defrag_hits:20561 active_defrag_misses:5878518 active_defrag_key_hits:77 active_defrag_key_misses:212 total_active_defrag_time:29009 ``` ### Test: Run the test script to push 100,000 scripts to ensure the LRU list keeps 500 maximum length without any crash. ``` 27489:M 14 Nov 2024 20:56:41.583 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.583 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 [ok]: Regression for script LRU crash (6811 ms) [1/1 done]: unit/test (7 seconds) ``` --------- Signed-off-by: Seungmin Lee <[email protected]> Signed-off-by: Seungmin Lee <[email protected]> Co-authored-by: Seungmin Lee <[email protected]> Co-authored-by: Binbin <[email protected]>
### Problem Valkey stores scripts in a dictionary (lua_scripts) keyed by their SHA1 hashes, but it needs a way to know which scripts are least recently used. It uses an LRU list (lua_scripts_lru_list) to keep track of scripts in usage order. When the list reaches a maximum length, Valkey evicts the oldest scripts to free memory in both the list and dictionary. The problem here is that the sds from the LRU list can be pointing to already freed/moved memory by active defrag that the sds in the dictionary used to point to. It results in assertion error at [this line](https://github.com/valkey-io/valkey/blob/unstable/src/eval.c#L519) ### Solution If we duplicate the sds when adding it to the LRU list, we can create an independent copy of the script identifier (sha). This duplication ensures that the sha string in the LRU list remains stable and unaffected by any defragmentation that could alter or free the original sds. In addition, dictUnlink doesn't require exact pointer match([ref](https://github.com/valkey-io/valkey/blob/unstable/src/eval.c#L71-L78)) so this change makes sense to unlink the right dictEntry with the copy of the sds. ### Reproduce To reproduce it with tcl test: 1. Disable je_get_defrag_hint in defrag.c to trigger defrag often 2. Execute test script ``` start_server {tags {"auth external:skip"}} { test {Regression for script LRU crash} { r config set activedefrag yes r config set active-defrag-ignore-bytes 1 r config set active-defrag-threshold-lower 0 r config set active-defrag-threshold-upper 1 r config set active-defrag-cycle-min 99 r config set active-defrag-cycle-max 99 for {set i 0} {$i < 100000} {incr i} { r eval "return $i" 0 } after 5000; } } ``` ### Crash info Crash report: ``` === REDIS BUG REPORT START: Cut & paste starting from here === 14044:M 12 Nov 2024 14:51:27.054 # === ASSERTION FAILED === 14044:M 12 Nov 2024 14:51:27.054 # ==> eval.c:556 'de' is not true ------ STACK TRACE ------ Backtrace: /usr/bin/redis-server 127.0.0.1:6379 [cluster](luaDeleteFunction+0x148)[0x723708] /usr/bin/redis-server 127.0.0.1:6379 [cluster](luaCreateFunction+0x26c)[0x724450] /usr/bin/redis-server 127.0.0.1:6379 [cluster](evalCommand+0x2bc)[0x7254dc] /usr/bin/redis-server 127.0.0.1:6379 [cluster](call+0x574)[0x5b8d14] /usr/bin/redis-server 127.0.0.1:6379 [cluster](processCommand+0xc84)[0x5b9b10] /usr/bin/redis-server 127.0.0.1:6379 [cluster](processCommandAndResetClient+0x11c)[0x6db63c] /usr/bin/redis-server 127.0.0.1:6379 [cluster](processInputBuffer+0x1b0)[0x6dffd4] /usr/bin/redis-server 127.0.0.1:6379 [cluster][0x6bd968] /usr/bin/redis-server 127.0.0.1:6379 [cluster][0x659634] /usr/bin/redis-server 127.0.0.1:6379 [cluster](amzTLSEventHandler+0x194)[0x6588d8] /usr/bin/redis-server 127.0.0.1:6379 [cluster][0x750c88] /usr/bin/redis-server 127.0.0.1:6379 [cluster](aeProcessEvents+0x228)[0x757fa8] /usr/bin/redis-server 127.0.0.1:6379 [cluster](redisMain+0x478)[0x7786b8] /lib64/libc.so.6(__libc_start_main+0xe4)[0xffffa7763da4] /usr/bin/redis-server 127.0.0.1:6379 [cluster][0x5ad3b0] ``` Defrag info: ``` mem_fragmentation_ratio:1.18 mem_fragmentation_bytes:47229992 active_defrag_hits:20561 active_defrag_misses:5878518 active_defrag_key_hits:77 active_defrag_key_misses:212 total_active_defrag_time:29009 ``` ### Test: Run the test script to push 100,000 scripts to ensure the LRU list keeps 500 maximum length without any crash. ``` 27489:M 14 Nov 2024 20:56:41.583 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.583 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 [ok]: Regression for script LRU crash (6811 ms) [1/1 done]: unit/test (7 seconds) ``` --------- Signed-off-by: Seungmin Lee <[email protected]> Signed-off-by: Seungmin Lee <[email protected]> Co-authored-by: Seungmin Lee <[email protected]> Co-authored-by: Binbin <[email protected]>
### Problem Valkey stores scripts in a dictionary (lua_scripts) keyed by their SHA1 hashes, but it needs a way to know which scripts are least recently used. It uses an LRU list (lua_scripts_lru_list) to keep track of scripts in usage order. When the list reaches a maximum length, Valkey evicts the oldest scripts to free memory in both the list and dictionary. The problem here is that the sds from the LRU list can be pointing to already freed/moved memory by active defrag that the sds in the dictionary used to point to. It results in assertion error at [this line](https://github.com/valkey-io/valkey/blob/unstable/src/eval.c#L519) ### Solution If we duplicate the sds when adding it to the LRU list, we can create an independent copy of the script identifier (sha). This duplication ensures that the sha string in the LRU list remains stable and unaffected by any defragmentation that could alter or free the original sds. In addition, dictUnlink doesn't require exact pointer match([ref](https://github.com/valkey-io/valkey/blob/unstable/src/eval.c#L71-L78)) so this change makes sense to unlink the right dictEntry with the copy of the sds. ### Reproduce To reproduce it with tcl test: 1. Disable je_get_defrag_hint in defrag.c to trigger defrag often 2. Execute test script ``` start_server {tags {"auth external:skip"}} { test {Regression for script LRU crash} { r config set activedefrag yes r config set active-defrag-ignore-bytes 1 r config set active-defrag-threshold-lower 0 r config set active-defrag-threshold-upper 1 r config set active-defrag-cycle-min 99 r config set active-defrag-cycle-max 99 for {set i 0} {$i < 100000} {incr i} { r eval "return $i" 0 } after 5000; } } ``` ### Crash info Crash report: ``` === REDIS BUG REPORT START: Cut & paste starting from here === 14044:M 12 Nov 2024 14:51:27.054 # === ASSERTION FAILED === 14044:M 12 Nov 2024 14:51:27.054 # ==> eval.c:556 'de' is not true ------ STACK TRACE ------ Backtrace: /usr/bin/redis-server 127.0.0.1:6379 [cluster](luaDeleteFunction+0x148)[0x723708] /usr/bin/redis-server 127.0.0.1:6379 [cluster](luaCreateFunction+0x26c)[0x724450] /usr/bin/redis-server 127.0.0.1:6379 [cluster](evalCommand+0x2bc)[0x7254dc] /usr/bin/redis-server 127.0.0.1:6379 [cluster](call+0x574)[0x5b8d14] /usr/bin/redis-server 127.0.0.1:6379 [cluster](processCommand+0xc84)[0x5b9b10] /usr/bin/redis-server 127.0.0.1:6379 [cluster](processCommandAndResetClient+0x11c)[0x6db63c] /usr/bin/redis-server 127.0.0.1:6379 [cluster](processInputBuffer+0x1b0)[0x6dffd4] /usr/bin/redis-server 127.0.0.1:6379 [cluster][0x6bd968] /usr/bin/redis-server 127.0.0.1:6379 [cluster][0x659634] /usr/bin/redis-server 127.0.0.1:6379 [cluster](amzTLSEventHandler+0x194)[0x6588d8] /usr/bin/redis-server 127.0.0.1:6379 [cluster][0x750c88] /usr/bin/redis-server 127.0.0.1:6379 [cluster](aeProcessEvents+0x228)[0x757fa8] /usr/bin/redis-server 127.0.0.1:6379 [cluster](redisMain+0x478)[0x7786b8] /lib64/libc.so.6(__libc_start_main+0xe4)[0xffffa7763da4] /usr/bin/redis-server 127.0.0.1:6379 [cluster][0x5ad3b0] ``` Defrag info: ``` mem_fragmentation_ratio:1.18 mem_fragmentation_bytes:47229992 active_defrag_hits:20561 active_defrag_misses:5878518 active_defrag_key_hits:77 active_defrag_key_misses:212 total_active_defrag_time:29009 ``` ### Test: Run the test script to push 100,000 scripts to ensure the LRU list keeps 500 maximum length without any crash. ``` 27489:M 14 Nov 2024 20:56:41.583 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.583 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 27489:M 14 Nov 2024 20:56:41.584 * LRU List length: 500 [ok]: Regression for script LRU crash (6811 ms) [1/1 done]: unit/test (7 seconds) ``` --------- Signed-off-by: Seungmin Lee <[email protected]> Signed-off-by: Seungmin Lee <[email protected]> Co-authored-by: Seungmin Lee <[email protected]> Co-authored-by: Binbin <[email protected]>
Problem
Valkey stores scripts in a dictionary (lua_scripts) keyed by their SHA1 hashes, but it needs a way to know which scripts are least recently used. It uses an LRU list (lua_scripts_lru_list) to keep track of scripts in usage order. When the list reaches a maximum length, Valkey evicts the oldest scripts to free memory in both the list and dictionary. The problem here is that the sds from the LRU list can be pointing to already freed/moved memory by active defrag that the sds in the dictionary used to point to. It results in assertion error at this line
Solution
If we duplicate the sds when adding it to the LRU list, we can create an independent copy of the script identifier (sha). This duplication ensures that the sha string in the LRU list remains stable and unaffected by any defragmentation that could alter or free the original sds. In addition, dictUnlink doesn't require exact pointer match(ref) so this change makes sense to unlink the right dictEntry with the copy of the sds.
Reproduce
To reproduce it with tcl test:
Crash info
Crash report:
Defrag info:
Test:
Run the test script to push 100,000 scripts to ensure the LRU list keeps 500 maximum length without any crash.