Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DRAFT: Consume fieldcaps node response while deserializing it #120010

Draft
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

original-brownbear
Copy link
Member

@original-brownbear original-brownbear commented Jan 11, 2025

Just to illustrate the point of where we can save memory here, the concrete API could be made nicer but functionally this thing seems fine and comes with a measurable performance boost from some quick and dirty benchmarking.

The API implemented here could definitely use some extra love. But the idea seems to work quite well according to some quick testing and this approach can potentially save a massive amount of memory not just in field caps.

Instead of deserializing a full list of per-index responses before getting to processing them, we should consume each response right after deserializing it without any intermediary list. This should be safe to do since we deserialize responses on the same thread that we consume the full response on today.
Not only does this save a lot of heap by merging responses directly, it also might save quite a bit of runtime from better cache locality. Also, having a stateful deserializer allows for a couple helpful tricks in around deduplication across multiple responses for fan-out style APIs as well.

The API implemented here could definitely use some extra love. But the idea seems to work quite well
according to some quick testing and this approach can potentially save a massive amount of memory
not just in field caps.

Instead of deserializing a full list of per-index responses before getting to processing them,
we should consume each response right after deserializing it without any intermediary list.
This should be safe to do since we deserialize responses on the same thread that we consume
the full response on today.
Not only does this save a lot of heap by merging responses directly, it also might save quite
a bit of runtime from better cache locality.
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Jan 11, 2025
This field is only used (by security) for requests, having it in responses is redundant.
Also, we have a couple of responses that are singletons/quasi-enums where setting the value
needlessly might introduce some strange contention even though it's a plain store.

This isn't just a cosmetic change. It makes it clear at compile time that each response instance
is exclusively defined by the bytes that it is read from. This makes it easier to reason about the
validity of suggested optimizations like elastic#120010
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Search Foundations/Search Catch all for Search Foundations v9.0.0 WIP
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants