
feat(tiering): Faster small bins serialization #3340

Closed
wants to merge 1 commit

Conversation

dranikpg (Contributor)

No description provided.

@dranikpg dranikpg force-pushed the tiering-offload-opt branch 2 times, most recently from 549f0a0 to 150256b Compare July 20, 2024 11:54
Comment on lines +73 to +76
void FlushDelayed(bool force) {
// Flush pages with most records accumulated first, or all, if forced.
// It's enough just to issue reads, because they are collapsed by the tiered storage internally
while ((force && !delayed_.empty()) || delayed_.size() > kMaxPageAccum) {
dranikpg (Contributor Author)

We need to choose some kind of limit here... by the number of pages? By the total sum of entries? 🤷🏻‍♂️ We have to take into account that once we flush all of those, we can get a memory spike.
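The loop in the diff above flushes the fullest pages first, bounded by `kMaxPageAccum`. A minimal standalone sketch of that policy follows; the types are simplified (a page id mapping to its accumulated entries), and the limit-by-page-count choice is just one of the options debated here, not the PR's final answer:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Illustrative sketch of the "flush fullest pages first" policy.
// kMaxPageAccum caps how many pages may accumulate entries (an assumed
// policy choice; limiting by total entry count is the alternative).
struct DelayedPages {
  static constexpr size_t kMaxPageAccum = 2;

  std::map<int, std::vector<std::string>> pages;  // page id -> entries

  // Returns ids of flushed pages, fullest first. In the real code each
  // flushed page's entries would turn into issued reads.
  std::vector<int> FlushDelayed(bool force) {
    std::vector<int> flushed;
    while ((force && !pages.empty()) || pages.size() > kMaxPageAccum) {
      // Pick the page with the most accumulated records.
      auto best = pages.begin();
      for (auto it = pages.begin(); it != pages.end(); ++it)
        if (it->second.size() > best->second.size()) best = it;
      flushed.push_back(best->first);
      pages.erase(best);
    }
    return flushed;
  }
};
```

Flushing fullest-first maximizes how many entries each collapsed read covers, at the cost of the memory spike mentioned above when `force` drains everything at once.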

Comment on lines 437 to 441
}

// Flush tiered delayed entries to avoid reordering with journal
tiered_serializer_->FlushDelayed(true);
}
dranikpg (Contributor Author)

Ideally we pass the value with ChangeReq& and find out whether it was tiered; only if it's tiered and part of a relevant page should we flush it.

int64_t memory_margin_ = 0;
std::vector<tiering::DiskSegment> delayed_deletes_;
dranikpg (Contributor Author)

todo: better naming

@dranikpg dranikpg requested a review from romange July 20, 2024 11:57
@dranikpg dranikpg changed the title WIP feat(tiering): Faster small bins serialization feat(tiering): Faster small bins serialization Jul 20, 2024
@dranikpg (Contributor Author)

@romange Please look only at the idea; I left some open questions.

@romange (Collaborator) left a comment

Nice that you suggest doing it without changing the RDB format.
I am not sure about the effectiveness of the flush, though: the probability of keys from the same page being in the same serialization batch is low.

: db_slice_(slice), dest_(dest), compression_mode_(compression_mode) {
: db_slice_(slice),
dest_(dest),
tiered_serializer_(new TieredSerializer{}),
romange (Collaborator)

Do you want to initialize it unconditionally?

dranikpg (Contributor Author)

🤷🏻‍♂️ I don't think it's expensive

auto entries = delayed_.extract(page);
for (auto& [base, value] : entries.mapped()) {
DCHECK(value.IsExternal());
pending_.push_back(Read(std::move(base), value));
romange (Collaborator)

What guarantees that by the time you flush the delayed entries, their segments are still correct?
Maybe the pages were freed and repurposed for other values?

dranikpg (Contributor Author)

Inside tiered storage, I don't delete small bin pages while we're serializing.
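The invariant described here, that small-bin pages stay pinned for the duration of serialization so previously recorded segments remain valid, can be modeled roughly as below. This is an illustrative sketch, not the actual Dragonfly code; all names are hypothetical:

```cpp
#include <cassert>
#include <set>

// Sketch of deferring small-bin page frees while a serialization pass runs,
// so disk segments recorded earlier cannot be repurposed mid-snapshot.
class SmallBinPages {
 public:
  void Add(int page) { pages_.insert(page); }

  void StartSerialization() { serializing_ = true; }

  void EndSerialization() {
    serializing_ = false;
    for (int page : deferred_frees_) pages_.erase(page);  // apply deferred frees
    deferred_frees_.clear();
  }

  void Free(int page) {
    if (serializing_)
      deferred_frees_.insert(page);  // pin until serialization finishes
    else
      pages_.erase(page);
  }

  bool Alive(int page) const { return pages_.count(page) > 0; }

 private:
  bool serializing_ = false;
  std::set<int> pages_;
  std::set<int> deferred_frees_;
};
```

Under this model, a segment captured before `StartSerialization()` stays readable until `EndSerialization()`, which answers the reviewer's repurposing concern at the cost of delayed reclamation.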

delayed_sizes_.insert({entries.size(), page});
}
} else {
pending_.push_back(Read(std::move(base), pv));
romange (Collaborator)

nit: pending_reads_


romange commented Jul 21, 2024

I thought about this more. The master is the more fragile piece compared to a slave or an instance loading the snapshot.
That means we want to move the complexity to the load phase, and I think it is possible if we just send the segment support at the end. This suggestion tracks pending items on the master side, i.e. maintains per-object granularity. I think it's worth extending RDB support to maintain segment granularity (i.e. send 4K pages with their offsets).

@dranikpg dranikpg closed this Jul 25, 2024
@dranikpg dranikpg deleted the tiering-offload-opt branch July 25, 2024 12:59
@dranikpg dranikpg restored the tiering-offload-opt branch July 25, 2024 14:54
@dranikpg dranikpg reopened this Jul 25, 2024
@dranikpg (Contributor Author)

Closed in favour of #3396

@dranikpg dranikpg closed this Sep 28, 2024