[CORE-8161] storage: add min.cleanable.dirty.ratio, schedule compaction by dirty_ratio #24991
Conversation
This approach has the added benefit of not having to sort the entire `_logs_list`. However, we now book-keep two data structures instead of one. Pointers to `housekeeping_meta` are held, but there are no concerns about concurrent removal since no scheduling points exist while the order is being mutated.
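A minimal sketch of the bookkeeping described above, using simplified stand-ins for `_logs_list` and `housekeeping_meta` (the names and fields here are illustrative, not the PR's actual types): the primary list keeps its insertion order untouched, while a second container of non-owning pointers is rebuilt and sorted by dirty ratio before each housekeeping round.

```cpp
#include <algorithm>
#include <list>
#include <vector>

// Hypothetical per-log housekeeping record; the real housekeeping_meta holds
// more state, this keeps only what the ordering needs.
struct housekeeping_meta {
    double dirty_ratio{0.0}; // fraction of the log not yet compacted
    bool compaction_enabled{true};
};

// The primary container (stand-in for _logs_list) is never re-sorted. A
// second vector of non-owning pointers carries only the compaction order.
struct housekeeping_queue {
    std::list<housekeeping_meta> logs;              // stand-in for _logs_list
    std::vector<housekeeping_meta*> by_dirty_ratio; // compaction order only

    // Rebuild the ordering before a round of housekeeping. There are no
    // scheduling points (co_awaits) in this function, so the pointers cannot
    // be invalidated by a concurrent removal while the order is mutated.
    void refresh_order() {
        by_dirty_ratio.clear();
        for (auto& meta : logs) {
            if (meta.compaction_enabled) {
                by_dirty_ratio.push_back(&meta);
            }
        }
        std::sort(
          by_dirty_ratio.begin(),
          by_dirty_ratio.end(),
          [](const housekeeping_meta* a, const housekeeping_meta* b) {
              return a->dirty_ratio > b->dirty_ratio; // dirtiest first
          });
    }
};
```

Rebuilding the secondary order touches only the compaction-enabled logs, which is the "added benefit" mentioned above: the entire `_logs_list` never needs to be sorted.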
Yeah, that would mean disabled by default. I'm not sure what's better here (maintaining the current status quo, or imposing a new, sensible default value). I could maybe see leaving it disabled at the cluster level, but defaulting newly created topics to the Kafka value. WDYT?
maybe choose something "close" to the existing behavior. so if existing behavior is 0, then maybe something like 0.2 or whatever. then we can be sure that the code is executing in tests, but not yet have to tackle any sort of tuning project?
Yeah, 0.2 feels like a good value to start with. Ducktape doesn't actually have any complaints with it set to the current value. I'm open to whatever you feel is best.
cool. let's just leave it at 0.5 and we can discuss further when we add the max lag configuration option, and maybe consider integrating with space management.
(Force pushed to correct value in …)
just curious about this, i don't see any recent force pushes
It's a bit above our most recent conversation, sorry 🙂
Well, maybe ducktape doesn't like that value after all. I can tweak the failing tests to set their `min_cleanable_dirty_ratio` explicitly.
Most of these tests expect/need unconditional compacting of the log. Set the `min_cleanable_dirty_ratio` at the topic/cluster level in these tests to mimic the old compaction scheduling behavior before this configuration was added.
Cool. But what was this change? I don't recall it being mentioned.
That change was an amend to the commit for the new ducktape test, which was removed. It's no longer in this PR.
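To make the test adjustment discussed above concrete, here is a hedged sketch of how a per-topic `min.cleanable.dirty.ratio` override might resolve against a cluster-level `min_cleanable_dirty_ratio` default, and why forcing the threshold down reproduces the old unconditional scheduling. The function names, the exact comparison, and the idea of pinning the value to 0 in tests are assumptions for illustration, not the PR's actual code.

```cpp
#include <optional>

// Hypothetical helper: a per-topic min.cleanable.dirty.ratio override wins,
// otherwise the cluster-level min_cleanable_dirty_ratio default applies.
// The 0.5 default mirrors the value settled on in the discussion above.
double effective_min_cleanable_dirty_ratio(
  std::optional<double> topic_override, double cluster_default = 0.5) {
    return topic_override.value_or(cluster_default);
}

// Hypothetical eligibility check (the PR may compare differently): a log is
// considered cleanable once its dirty ratio reaches the threshold. Forcing
// the threshold to 0.0 at the topic or cluster level (one way the adjusted
// tests could mimic the old behavior) makes every log eligible on every
// housekeeping pass, i.e. unconditional compaction as before.
bool eligible_for_compaction(double dirty_ratio, double threshold) {
    return dirty_ratio >= threshold;
}
```

Under this model, with the 0.5 default discussed above a log is skipped until its dirty ratio reaches 0.5, while a value of 0 restores the pre-PR behavior of compacting on every round.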
Based on PR #24649.

Instead of unconditionally compacting all logs during a round of housekeeping, users may now optionally schedule log compaction in the `log_manager` using the cluster/topic property `min_cleanable_dirty_ratio`/`min.cleanable.dirty.ratio`.

As mentioned in the above PR, by setting `min.cleanable.dirty.ratio` on a per-topic basis, users can avoid unnecessary read/write amplification during compaction as the log grows in size.

A housekeeping scan will still be performed every `log_compaction_interval_ms`, and the log's `dirty_ratio` will be tested against `min.cleanable.dirty.ratio` to determine its eligibility for compaction. Additionally, logs are now compacted in descending order of dirty ratio, offering a better "bang for buck" heuristic for compaction scheduling.
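For illustration, a small self-contained sketch of the scheduling described above. It assumes the usual Kafka-style definition of dirty ratio (dirty closed-segment bytes over total closed-segment bytes); the PR's exact definition comes from PR #24649, and the `log_summary` type here is invented for the example rather than taken from the codebase.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Illustrative per-partition-log summary (not a real Redpanda type).
struct log_summary {
    std::string ntp;         // namespace/topic/partition label
    uint64_t clean_bytes{0}; // closed-segment bytes already compacted
    uint64_t dirty_bytes{0}; // closed-segment bytes written since last compaction
};

// Assumed dirty-ratio definition: dirty bytes over total closed-segment bytes.
double dirty_ratio(const log_summary& log) {
    const uint64_t total = log.clean_bytes + log.dirty_bytes;
    return total == 0 ? 0.0 : static_cast<double>(log.dirty_bytes) / total;
}

// One housekeeping pass (run every log_compaction_interval_ms): keep only the
// logs whose dirty ratio meets min.cleanable.dirty.ratio, then order them so
// the dirtiest logs are compacted first ("bang for buck" ordering).
std::vector<log_summary> compaction_candidates(
  std::vector<log_summary> logs, double min_cleanable_dirty_ratio) {
    std::erase_if(logs, [&](const log_summary& l) {
        return dirty_ratio(l) < min_cleanable_dirty_ratio;
    });
    std::sort(
      logs.begin(), logs.end(), [](const log_summary& a, const log_summary& b) {
          return dirty_ratio(a) > dirty_ratio(b);
      });
    return logs;
}
```

Filtering and then sorting candidates this way yields the descending-dirty-ratio order mentioned above, so the logs that stand to reclaim the most per unit of compaction work are handled first.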
Backports Required

Release Notes
Improvements
`min_cleanable_dirty_ratio` / `min.cleanable.dirty.ratio`: a new cluster/topic property for scheduling log compaction by a log's dirty ratio.