Update vector collection tuning tips and describe efSearch hint [AI-195] [AI-192] #1521

Merged
merged 12 commits into from
Feb 5, 2025

Conversation

k-jamroz
Contributor

@k-jamroz k-jamroz commented Feb 3, 2025

There were important changes in 6.0 that affect vector collection tuning: the introduction of the `efSearch` hint and of a default value for `partitionLimit`.

After https://hazelcast.atlassian.net/browse/AI-192, mutating operations do not fail during optimization.

@k-jamroz k-jamroz requested a review from a team as a code owner February 3, 2025 18:30

netlify bot commented Feb 3, 2025

Deploy Preview for hardcore-allen-f5257d ready!

Name Link
🔨 Latest commit 8ecff5e
🔍 Latest deploy log https://app.netlify.com/sites/hardcore-allen-f5257d/deploys/67a3775afa32ee0008587a42
😎 Deploy Preview https://deploy-preview-1521--hardcore-allen-f5257d.netlify.app

@k-jamroz k-jamroz requested a review from yuce February 3, 2025 18:30
@k-jamroz k-jamroz changed the title from "Update vector collection tuning tips and describe efSearch hint [AI-195]" to "Update vector collection tuning tips and describe efSearch hint [AI-195] [AI-192]" on Feb 3, 2025
Contributor

@yuce yuce left a comment

Some ideas...

1. For searches with small `topK` (for example, 10) it may be beneficial to artificially increase `topK`, adjust `partitionLimit` accordingly, and discard extra results. If you need 10 results, a good starting point for tuning could be `topK=100` and a `partitionLimit` between 50 and 100. While this will make the search slower, it will also improve quality, sometimes significantly. Overall, this setup can be more efficient than increasing index build parameters (`max-degree`, `ef-construction`), which results in slower index builds and searches. With a very small `topK` or `partitionLimit`, the search algorithm is less able to escape local minima and find the best results.
2. Vector deduplication does not incur significant overhead for uploads (usually less than 1%) and searches. You may consider disabling it to get slightly better performance and smaller memory usage if your dataset does not contain duplicated vectors. However, be aware that in the presence of many duplicated vectors with deduplication disabled, a similarity search may return poor quality results.
3. For a given query, each vector index partition is searched by 1 thread. The number of concurrent partition searches is configured by specifying a pool size for `hz:query` executor, which by default has 16 threads per member. If optimizing for search, we recommend setting the `hz:query` pool size to be that of the physical core count of your host machines: this will result in a good balance between search throughput and CPU utilization. Setting `hz:query` to have a pool size greater than that of the physical core count will not deliver a significant increase in throughput but it will increase total CPU utilization. The `hz:query` pool size can be changed as follows:
1. Enable Vector API.
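
The over-fetching idea in item 1 of the quoted text can be sketched independently of any client API. In the sketch below, `search` is a hypothetical callable standing in for a real vector search, and `fake_search` is a toy in-memory stand-in; neither is a Hazelcast API. It only shows the shape of the approach: request a larger `topK` than needed, then keep the best `wanted` results.

```python
from typing import Callable, List, Sequence, Tuple

Result = Tuple[str, float]  # (key, similarity score), best first


def overfetch_search(
    search: Callable[[Sequence[float], int], List[Result]],
    query: Sequence[float],
    wanted: int = 10,
    top_k: int = 100,
) -> List[Result]:
    """Run `search` with an inflated top_k and discard the extra results."""
    if top_k < wanted:
        raise ValueError("top_k must be at least the number of wanted results")
    return search(query, top_k)[:wanted]


def fake_search(query: Sequence[float], top_k: int) -> List[Result]:
    # Toy stand-in (hypothetical): 500 keys with descending scores, best first.
    ranked = [(f"key-{i}", 1.0 - i / 500) for i in range(500)]
    return ranked[:top_k]


results = overfetch_search(fake_search, query=[0.1, 0.2], wanted=10, top_k=100)
print(len(results), results[0][0])
```

The toy stand-in is exact, so it only demonstrates the truncation step. With a real approximate index, any quality gain would come from the wider search that the larger `topK` triggers.
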
Contributor

It would be good to mention how to do that, for example by using `--add-modules jdk.incubator.vector`, or to link to the appropriate document.

Contributor Author

@@ -93,12 +93,24 @@ To allow the system to return enough results, the following conditions must be s

- `partitionLimit * partitionCount >= topK`, `partitionLimit <= topK`
- `memberLimit * memberCount >= topK`, `memberLimit <= topK`
- `efSearch >= partitionLimit`; if `partitionLimit` is not configured explicitly, this applies to the default `partitionLimit` value
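
As a quick sanity check, the conditions in this list can be expressed as a small validator. This is a minimal sketch, not part of Hazelcast: the function name `check_search_limits` and its parameters are hypothetical, and it checks `efSearch` only against an explicitly configured `partitionLimit`, because the formula for the default `partitionLimit` is not given here. The partition count of 271 used below is Hazelcast's default; the other numbers are purely illustrative.

```python
from typing import List, Optional


def check_search_limits(
    top_k: int,
    partition_count: int,
    member_count: int,
    partition_limit: Optional[int] = None,
    member_limit: Optional[int] = None,
    ef_search: Optional[int] = None,
) -> List[str]:
    """Return the violated conditions (empty if every checked condition holds)."""
    problems = []
    if partition_limit is not None:
        if partition_limit * partition_count < top_k:
            problems.append("partitionLimit * partitionCount < topK")
        if partition_limit > top_k:
            problems.append("partitionLimit > topK")
        # Assumption: only an explicit partitionLimit is compared with efSearch.
        if ef_search is not None and ef_search < partition_limit:
            problems.append("efSearch < partitionLimit")
    if member_limit is not None:
        if member_limit * member_count < top_k:
            problems.append("memberLimit * memberCount < topK")
        if member_limit > top_k:
            problems.append("memberLimit > topK")
    return problems


ok = check_search_limits(top_k=100, partition_count=271, member_count=3,
                         partition_limit=50, member_limit=100, ef_search=200)
bad = check_search_limits(top_k=100, partition_count=271, member_count=3,
                          partition_limit=50, member_limit=20, ef_search=40)
print(ok)   # no violations
print(bad)  # efSearch too small, memberLimit too small for 3 members
```
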
Contributor

As far as I can see, the default value of `partitionLimit` is not mentioned anywhere.

Contributor Author

the exact formula is complicated (https://github.com/hazelcast/hazelcast-mono/pull/3258). I described it generally:

partitionLimit is calculated based on topK and cluster configuration (number of partitions)

4. Before increasing index build parameters (`max-degree`, `ef-construction`) which would result in slower index builds and searches and a larger index,
test if adjusting `efSearch` gives satisfactory results.
5. Vector deduplication does not incur significant overhead for uploads (usually less than 1%) and searches. You may consider disabling it to get slightly better performance and smaller memory usage if your dataset does not contain duplicated vectors. However, be aware that in the presence of many duplicated vectors with deduplication disabled, a similarity search may return poor quality results.
6. For a given query, each vector index partition is searched by 1 thread. The number of concurrent partition searches is configured by specifying a pool size for `hz:query` executor, which by default has 16 threads per member. If optimizing for search, we recommend setting the `hz:query` pool size to be that of the physical core count of your host machines: this will result in a good balance between search throughput and CPU utilization. Setting `hz:query` to have a pool size greater than that of the physical core count will not deliver a significant increase in throughput but it will increase total CPU utilization. The `hz:query` pool size can be changed as follows:
Contributor

Suggested change
6. For a given query, each vector index partition is searched by 1 thread. The number of concurrent partition searches is configured by specifying a pool size for `hz:query` executor, which by default has 16 threads per member. If optimizing for search, we recommend setting the `hz:query` pool size to be that of the physical core count of your host machines: this will result in a good balance between search throughput and CPU utilization. Setting `hz:query` to have a pool size greater than that of the physical core count will not deliver a significant increase in throughput but it will increase total CPU utilization. The `hz:query` pool size can be changed as follows:
6. For a given query, each vector index partition is searched by one thread. The number of concurrent partition searches is configured by specifying a pool size for `hz:query` executor, which by default has 16 threads per member. If optimizing for search, we recommend setting the `hz:query` pool size to be that of the physical core count of your host machines: this will result in a good balance between search throughput and CPU utilization. Setting `hz:query` to have a pool size greater than that of the physical core count will not deliver a significant increase in throughput but it will increase total CPU utilization. The `hz:query` pool size can be changed as follows:

Contributor

How is `hz:query` changed? It would be good to link to the relevant doc.

Contributor Author

there is an example below

@k-jamroz k-jamroz requested a review from yuce February 4, 2025 10:55
@oliverhowell oliverhowell self-assigned this Feb 4, 2025
Contributor

@oliverhowell oliverhowell left a comment

Mostly minor edits, but a few questions on the table values. I have not suggested any changes to parts of these pages that are not in this PR, but I do see a few issues that we can come back to in a later update, e.g. page name vs. title, and explicitly introducing code samples (helps with Ask AI).

docs/modules/data-structures/pages/vector-collections.adoc
@@ -718,6 +719,9 @@ You can use hints to fine-tune search precision, especially with smaller `limit`
|===
|Hint|Description

|efSearch
|Size of the list of potential candidates during search. A larger value results in better precision but slower execution.
Contributor

What is the format of this value, i.e. how should the user use this hint?
Can we add a type or default to the table overall?
Also, we should explicitly introduce the examples in general, e.g. "The following code example shows how to add search options and hints".

Contributor

If you add these hints to the example (not sure if you should combine them though) then this would suffice rather than adding detail about type of value needed.

Contributor Author

I added type in 8bd114e and example in 18288d0.

Most hints should not normally be used, except `efSearch`, which may be promoted to a full-fledged `SearchOption` in the future, and `partitionLimit` in case of skewed distribution. The others are useful for advanced tuning and benchmarking.
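
To make the table description of `efSearch` concrete, here is a toy best-first search over a small hand-built neighbor graph. It is a generic sketch of how a graph-based approximate nearest-neighbor search uses a bounded candidate list, and it is an assumption that this resembles what the hint controls; it is not Hazelcast's implementation. The function `graph_search`, its `ef` parameter, and the four-point graph are all made up for illustration.

```python
import heapq
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def graph_search(points: Dict[str, Point], edges: Dict[str, List[str]],
                 entry: str, query: Point, ef: int) -> List[str]:
    """Best-first search keeping at most `ef` best nodes; returns them nearest first."""
    def dist(node: str) -> float:
        return math.dist(points[node], query)

    visited = {entry}
    candidates = [(dist(entry), entry)]  # min-heap: closest unexpanded node first
    best = [(-dist(entry), entry)]       # negated distances: farthest kept node on top
    while candidates:
        d, node = heapq.heappop(candidates)
        if d > -best[0][0]:
            break  # the closest candidate is farther than every kept node
        for nb in edges[node]:
            if nb in visited:
                continue
            visited.add(nb)
            d_nb = dist(nb)
            if len(best) < ef or d_nb < -best[0][0]:
                heapq.heappush(candidates, (d_nb, nb))
                heapq.heappush(best, (-d_nb, nb))
                if len(best) > ef:
                    heapq.heappop(best)  # drop the farthest kept node
    return [n for _, n in sorted((-neg, n) for neg, n in best)]


# Hypothetical data: "t" is the true nearest point but is reachable only via "b".
points = {"e": (7.0, 0.0), "a": (5.0, 0.0), "b": (0.0, 6.0), "t": (0.0, 1.0)}
edges = {"e": ["a", "b"], "a": ["e"], "b": ["e", "t"], "t": ["b"]}
query = (0.0, 0.0)

narrow = graph_search(points, edges, "e", query, ef=1)
wide = graph_search(points, edges, "e", query, ef=2)
print(narrow, wide)  # ['a'] ['t', 'a']
```

With `ef=1` the search settles on `a` (distance 5) and never expands `b`, the only route to the true nearest point `t`. With `ef=2` it keeps `b` long enough to reach `t`, at the cost of visiting more nodes.
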

Contributor

@oliverhowell oliverhowell left a comment

Thanks for addressing comments - not sure if you've finished edits but approving from docs perspective

@k-jamroz k-jamroz merged commit b2e00e9 into hazelcast:main Feb 5, 2025
5 of 6 checks passed

3 participants