Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix docs: change some description in w2d3 documentation for tiered compaction trigger condition #82

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

cvt220106
Copy link

  1. compaction trigger precondition
    The original text stated that compaction triggers when the number of tiers is 'larger than' num_tiers, which could be interpreted as strictly greater than (>). This has been corrected to 'greater than or equal to (or can say reaches)' to accurately reflect the intended behavior and align with the actual implementation.
  2. min_merge_width condition
    same as above, it should be 'at least' not 'more than'
  3. reduce sorted run conditions
    accurately states the aim to reduce tiers to num_tiers - 1

@skyzh skyzh self-requested a review June 30, 2024 04:32
@cvt220106 cvt220106 closed this Jun 30, 2024
@skyzh
Copy link
Owner

skyzh commented Jul 2, 2024

Is the changes from your output? I'm currently investigating issues related to tier/level id, there might be some mismatches between the write-up and the actual code.

@cvt220106
Copy link
Author

Is the changes from your output? I'm currently investigating issues related to tier/level id, there might be some mismatches between the write-up and the actual code.

In your tutorial, we always set the tier/level id as the first sst id in this tier/level.
But when you use the compaction simulator, choose the --dump-real-id option or the default --dump-origin-id option, it will be different.
In compaction simulator, u code after apply compaction result, write like this

for file in files {
    let new_sst_id = storage.generate_sst_id();
    sst_ids.push(new_sst_id);
    storage.file_list.insert(new_sst_id, *file);
    storage.total_writes += 1;
}

It's not change the actual file in file_list.
so when we use the compaction simulator on default option (dump origin id, it will be use file, it's not change when we apply result),

println!(
    "L{level} ({}): {:?}",
    files.len(),
    files.iter().map(|x| self.file_list[x]).collect::<Vec<_>>()
);

the output will be just like current example in docs

--- After Flush ---
L3 (1): [3]
L2 (1): [2]
L1 (1): [1]
--- Compaction Task ---
--- Compaction Task ---
compaction trigger by space amplification ratio: 200
L3 [3] L2 [2] L1 [1] -> [4, 5, 6]
--- After Compaction ---
L4 (3): [3, 2, 1]

and then, when we use the '--dump-real-id` option, it will use those new generated sst ids; at this time, tier id will observe the rule.

println!("L{level} ({}): {:?}", files.len(), files);

this output maybe more readable and meet our expectations.

--- After Flush ---
L3 (1): [3]
L2 (1): [2]
L1 (1): [1]
--- Compaction Task ---
--- Compaction Task ---
compaction triggered by space amplification ratio: 200
L3 [3] L2 [2] L1 [1] -> [4, 5, 6]
--- After Compaction ---
L4 (3): [4, 5, 6]

@cvt220106 cvt220106 reopened this Jul 3, 2024
@cvt220106
Copy link
Author

So my changes in the example aren't absolutely right. It simply changes the tier id to make it observe tier id equal to the first sst id in the tier.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants