Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce FORCE_DEFRAG compilation option to allow activedefrag run when allocator is not jemalloc #1303

Merged
merged 25 commits into from
Dec 17, 2024

Conversation

ranshid
Copy link
Member

@ranshid ranshid commented Nov 14, 2024

Introduce compile time option to force activedefrag to run even when
jemalloc is not used as the allocator.
This is in order to be able to run tests with defrag enabled
while using memory instrumentation tools.

fixes: #1241

Introduce compile time option to force activedefrag to run even when
jemalloc is not used as the allocator.
This is in order to be able to run tests with defrag enabled
while using memory instrumantation tools.

Signed-off-by: ranshid <[email protected]>
Introduce compile time option to force activedefrag to run even when
jemalloc is not used as the allocator.
This is in order to be able to run tests with defrag enabled
while using memory instrumantation tools.

Signed-off-by: ranshid <[email protected]>
Signed-off-by: ranshid <[email protected]>
@ranshid ranshid requested a review from madolson November 14, 2024 11:45
Copy link

codecov bot commented Nov 14, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 70.87%. Comparing base (1acf7f7) to head (5ddb45e).
Report is 17 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #1303      +/-   ##
============================================
+ Coverage     70.71%   70.87%   +0.15%     
============================================
  Files           119      119              
  Lines         64652    64616      -36     
============================================
+ Hits          45717    45794      +77     
+ Misses        18935    18822     -113     
Files with missing lines Coverage Δ
src/allocator_defrag.c 100.00% <100.00%> (ø)
src/config.c 77.59% <ø> (-0.06%) ⬇️
src/defrag.c 88.80% <ø> (-0.23%) ⬇️
src/server.h 100.00% <ø> (ø)

... and 45 files with indirect coverage changes

Signed-off-by: Ran Shidlansik <[email protected]>
@madolson
Copy link
Member

I think we'll want to merge this PR first btw, #1242, since it already exists. I think there will be some merge conflicts.

Copy link
Member

@madolson madolson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly makes sense to me. Some ideas for clarity but nothing to really change the direction.

src/defrag.c Outdated
@@ -755,6 +755,15 @@ void defragScanCallback(void *privdata, const dictEntry *de) {
* or not, a false detection can cause the defragmenter to waste a lot of CPU
* without the possibility of getting any results. */
float getAllocatorFragmentation(size_t *out_frag_bytes) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kind of feels like you should override computeDefragCycles instead, and have that should always set active_defrag_running to like 100%.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine with it (makes sense) I do not, however want to break tests which are checking the active_defrag_running, so maybe I can just use the active-defrag-cycle-max as the return value?
Alternatively we can have a tag to skip tests in case of DEBUG_FORCE_DEFRAG

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively we can have a tag to skip tests in case of DEBUG_FORCE_DEFRAG

This intuitively makes a bit more sense to me, since defrag shouldn't really work correctly since we are completely breaking defrag here.

Copy link
Member

@madolson madolson Nov 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the tests already are checking for jemalloc memory to enable the defragmentation, so I think that the tests should all still be skipped anyways?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my last change keeps the same logic and the code is cleaner. I find having the defrag reporting the fragmentation based on the config limits is fine but take a look and let me know.

@madolson
Copy link
Member

@ranshid Zvis change is now merged. Jim's is basically also ready to go so you should be able to resume updating this if you want.

@ranshid
Copy link
Member Author

ranshid commented Dec 8, 2024

@ranshid Zvis change is now merged. Jim's is basically also ready to go so you should be able to resume updating this if you want.

Thank you. just got time to circle back on this one. merging and continue where I left off

Signed-off-by: Ran Shidlansik <[email protected]>
Signed-off-by: Ran Shidlansik <[email protected]>
Signed-off-by: Ran Shidlansik <[email protected]>
Signed-off-by: Ran Shidlansik <[email protected]>
Signed-off-by: Ran Shidlansik <[email protected]>
Signed-off-by: Ran Shidlansik <[email protected]>
@ranshid
Copy link
Member Author

ranshid commented Dec 9, 2024

we will probably hit issues from day 1. eg:

Logged sanitizer errors (pid 53103):
=================================================================
==53103==ERROR: AddressSanitizer: heap-use-after-free on address 0x6080000026b8 at pc 0x0001006130d0 bp 0x00016fa72ec0 sp 0x00016fa72eb8
READ of size 4 at 0x6080000026b8 thread T0
    #0 0x1006130cc in defragStageKvstoreHelper+0x650 (valkey-server:arm64+0x1002870cc)
    #1 0x100610d64 in activeDefragTimeProc+0x2e0 (valkey-server:arm64+0x100284d64)
    #2 0x1003c128c in aeProcessEvents+0x630 (valkey-server:arm64+0x10003528c)
    #3 0x10040334c in main+0x6d7c (valkey-server:arm64+0x10007734c)
    #4 0x19d1ff150 in start+0x9a8 (dyld:arm64e+0xfffffffffff4d150)

0x6080000026b8 is located 24 bytes inside of 96-byte region [0x6080000026a0,0x608000002700)
freed by thread T3 here:
    #0 0x101338d40 in free+0x98 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54d40)
    #1 0x1005bed10 in lazyfreeFreeDatabase+0x54 (valkey-server:arm64+0x100232d10)
    #2 0x100578750 in bioProcessBackgroundJobs+0x4e8 (valkey-server:arm64+0x1001ec750)
    #3 0x101335858 in asan_thread_start(void*)+0x40 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x51858)
    #4 0x19d589f90 in _pthread_start+0x84 (libsystem_pthread.dylib:arm64e+0x6f90)
    #5 0x19d584d30 in thread_start+0x4 (libsystem_pthread.dylib:arm64e+0x1d30)

previously allocated by thread T0 here:
    #0 0x101338fd0 in calloc+0x9c (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54fd0)
    #1 0x1004120a0 in valkey_calloc+0x3c (valkey-server:arm64+0x1000860a0)
    #2 0x1003cd844 in kvstoreCreate+0xa8 (valkey-server:arm64+0x100041844)
    #3 0x1005bf778 in emptyDbAsync+0xbc (valkey-server:arm64+0x100233778)
    #4 0x100458ce8 in emptyData+0x354 (valkey-server:arm64+0x1000ccce8)
    #5 0x100459cd4 in flushallCommand+0x1c0 (valkey-server:arm64+0x1000cdcd4)
    #6 0x1003e9500 in call+0x3cc (valkey-server:arm64+0x10005d500)
    #7 0x1003ec7a4 in processCommand+0x17b4 (valkey-server:arm64+0x1000607a4)
    #8 0x100424358 in processInputBuffer+0x498 (valkey-server:arm64+0x100098358)
    #9 0x100422204 in readQueryFromClient+0xd4 (valkey-server:arm64+0x100096204)
    #10 0x100668688 in connSocketEventHandler+0x160 (valkey-server:arm64+0x1002dc688)
    #11 0x1003c0fb8 in aeProcessEvents+0x35c (valkey-server:arm64+0x100034fb8)
    #12 0x10040334c in main+0x6d7c (valkey-server:arm64+0x10007734c)
    #13 0x19d1ff150 in start+0x9a8 (dyld:arm64e+0xfffffffffff4d150)

Thread T3 created by T0 here:
    #0 0x1013301c8 in pthread_create+0x5c (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x4c1c8)
    #1 0x100578104 in bioInit+0x28c (valkey-server:arm64+0x1001ec104)
    #2 0x1003ff6b4 in main+0x30e4 (valkey-server:arm64+0x1000736b4)
    #3 0x19d1ff150 in start+0x9a8 (dyld:arm64e+0xfffffffffff4d150)

I think it is after the changes introduced in #1242 since the db expires can be freed in the background but is passed as target in beginDefragCycle.

I wonder if to invest time to fix it now or address them later

Signed-off-by: Ran Shidlansik <[email protected]>
Signed-off-by: Madelyn Olson <[email protected]>
Copy link
Member

@madolson madolson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Direction looks fine to me, but just a minor comment about removing one of the test permutations.

@ranshid ranshid merged commit ba25b58 into valkey-io:unstable Dec 17, 2024
49 checks passed
kronwerk pushed a commit to kronwerk/valkey that referenced this pull request Jan 27, 2025
…hen allocator is not jemalloc (valkey-io#1303)

Introduce compile time option to force activedefrag to run even when
jemalloc is not used as the allocator.
This is in order to be able to run tests with defrag enabled
while using memory instrumentation tools.

fixes: valkey-io#1241

---------

Signed-off-by: ranshid <[email protected]>
Signed-off-by: Ran Shidlansik <[email protected]>
Signed-off-by: Madelyn Olson <[email protected]>
Signed-off-by: ranshid <[email protected]>
Co-authored-by: Madelyn Olson <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[NEW]Allow activedefrag to run when jemalloc is not used
3 participants