Update vector collection tuning tips and describe efSearch hint [AI-195] [AI-192] #1521

k-jamroz · 2025-02-03T18:30:17Z

Version 6.0 introduced important changes that affect vector collection tuning: the `efSearch` hint and a default value for `partitionLimit`.

After https://hazelcast.atlassian.net/browse/AI-192, mutating operations no longer fail during optimization.

netlify · 2025-02-03T18:30:38Z

✅ Deploy Preview for hardcore-allen-f5257d ready!

Name	Link
🔨 Latest commit	`8ecff5e`
🔍 Latest deploy log	https://app.netlify.com/sites/hardcore-allen-f5257d/deploys/67a3775afa32ee0008587a42
😎 Deploy Preview	https://deploy-preview-1521--hardcore-allen-f5257d.netlify.app

yuce

Some ideas...

yuce · 2025-02-04T07:49:09Z

docs/modules/data-structures/pages/vector-search-overview.adoc

-1. For searches with small `topK` (for example, 10) it may be beneficial to artificially increase `topK`, adjust `partitionLimit` accordingly, and discard extra results. If you need 10 results, a good starting point for tuning could be `topK=100` and a `partitionLimit` between 50 and 100. While this will make the search slower, it will also improve quality, sometimes significantly. Overall, this setup can be more efficient than increasing index build parameters (`max-degree`, `ef-construction`) which results in slower index builds and searches. With a very small `topK` or `paritionLimit`, the search algorithm is less able to escape local minima and find the best results.
-2. Vector deduplication does not incur significant overhead for uploads (usually less than 1%) and searches. You may consider disabling it to get slightly better performance and smaller memory usage if your dataset does not contain duplicated vectors. However, be aware that in the presence of many duplicated vectors with deduplication disabled, a  similarity search may return poor quality results.
-3. For a given query, each vector index partition is searched by 1 thread. The number of concurrent partition searches is configured by specifying a pool size for `hz:query` executor, which by default has 16 threads per member. If optimizing for search, we recommend setting the `hz:query` pool size to be that of the physical core count of your host machines: this will result in a good balance between search throughput and CPU utilization. Setting `hz:query` to have a pool size greater than that of the physical core count will not deliver a significant increase in throughput but it will increase total CPU utilization. The `hz:query` pool size can be changed as follows:
+1. Enable Vector API.


It would be good to mention how to do that, by using --add-modules jdk.incubator.vector or link to the appropriate document.
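
To illustrate the flag mentioned in this comment, here is a minimal Java sketch that an application could run at startup to confirm that the incubating Vector API module was resolved, assuming the JVM was launched with `--add-modules jdk.incubator.vector`. The class name `VectorApiCheck` is hypothetical and not part of Hazelcast; only standard JDK calls are used.

```java
class VectorApiCheck {
    // Name of the incubating JDK module that provides the Vector API.
    static final String VECTOR_MODULE = "jdk.incubator.vector";

    // Returns true if the module was resolved into the boot layer, which is
    // the case when the JVM is started with --add-modules jdk.incubator.vector.
    static boolean isVectorApiEnabled() {
        return ModuleLayer.boot().findModule(VECTOR_MODULE).isPresent();
    }

    public static void main(String[] args) {
        if (isVectorApiEnabled()) {
            System.out.println("Vector API module is enabled");
        } else {
            System.out.println("Vector API module is missing; add --add-modules " + VECTOR_MODULE);
        }
    }
}
```

Running it as `java --add-modules jdk.incubator.vector VectorApiCheck.java` should take the first branch; without the flag the module is normally not resolved.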

docs/modules/data-structures/pages/vector-search-overview.adoc

yuce · 2025-02-04T08:00:47Z

docs/modules/data-structures/pages/vector-search-overview.adoc

@@ -93,12 +93,24 @@ To allow the system to return enough results, the following conditions must be s

 - `partitionLimit * partitionCount >= topK`, `partitionLimit <= topK`
 - `memberLimit * memberCount >= topK`, `memberLimit <= topK`
+- `efSearch >= partitionLimit`, if `partitionLimit` is not configured explicitly this applies to the default `partitionLimit` value


AFAIS the default value of partitionLimit is not mentioned anywhere.

The exact formula is complicated (https://github.com/hazelcast/hazelcast-mono/pull/3258), so I described it in general terms:

partitionLimit is calculated based on topK and cluster configuration (number of partitions)
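
The conditions quoted in the diff above can be sketched as a small pre-flight check. This is only an illustration: the class `SearchLimitCheck` and its method are hypothetical, every value is passed in explicitly, and the default `partitionLimit` is not computed because its formula is not given in this thread.

```java
import java.util.ArrayList;
import java.util.List;

class SearchLimitCheck {
    // Returns the conditions that are violated; an empty list means all of them hold.
    static List<String> violations(int topK, int partitionLimit, int partitionCount,
                                   int memberLimit, int memberCount, int efSearch) {
        List<String> problems = new ArrayList<>();
        // Use long arithmetic so that large limits cannot overflow int.
        if ((long) partitionLimit * partitionCount < topK) {
            problems.add("partitionLimit * partitionCount < topK");
        }
        if (partitionLimit > topK) {
            problems.add("partitionLimit > topK");
        }
        if ((long) memberLimit * memberCount < topK) {
            problems.add("memberLimit * memberCount < topK");
        }
        if (memberLimit > topK) {
            problems.add("memberLimit > topK");
        }
        if (efSearch < partitionLimit) {
            problems.add("efSearch < partitionLimit");
        }
        return problems;
    }

    public static void main(String[] args) {
        // Example values only: 271 partitions (the Hazelcast default) and 3 members.
        System.out.println(violations(10, 10, 271, 10, 3, 100)); // prints []
        System.out.println(violations(10, 10, 271, 10, 3, 5));   // prints [efSearch < partitionLimit]
    }
}
```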

docs/modules/data-structures/pages/vector-search-overview.adoc

yuce · 2025-02-04T08:25:54Z

docs/modules/data-structures/pages/vector-search-overview.adoc

+4. Before increasing index build parameters (`max-degree`, `ef-construction`) which would result in slower index builds and searches and a larger index,
+test if adjusting `efSearch` gives satisfactory results.
+5. Vector deduplication does not incur significant overhead for uploads (usually less than 1%) and searches. You may consider disabling it to get slightly better performance and smaller memory usage if your dataset does not contain duplicated vectors. However, be aware that in the presence of many duplicated vectors with deduplication disabled, a similarity search may return poor quality results.
+6. For a given query, each vector index partition is searched by 1 thread. The number of concurrent partition searches is configured by specifying a pool size for `hz:query` executor, which by default has 16 threads per member. If optimizing for search, we recommend setting the `hz:query` pool size to be that of the physical core count of your host machines: this will result in a good balance between search throughput and CPU utilization. Setting `hz:query` to have a pool size greater than that of the physical core count will not deliver a significant increase in throughput but it will increase total CPU utilization. The `hz:query` pool size can be changed as follows:


Suggested change

6. For a given query, each vector index partition is searched by 1 thread. The number of concurrent partition searches is configured by specifying a pool size for `hz:query` executor, which by default has 16 threads per member. If optimizing for search, we recommend setting the `hz:query` pool size to be that of the physical core count of your host machines: this will result in a good balance between search throughput and CPU utilization. Setting `hz:query` to have a pool size greater than that of the physical core count will not deliver a significant increase in throughput but it will increase total CPU utilization. The `hz:query` pool size can be changed as follows:

6. For a given query, each vector index partition is searched by one thread. The number of concurrent partition searches is configured by specifying a pool size for `hz:query` executor, which by default has 16 threads per member. If optimizing for search, we recommend setting the `hz:query` pool size to be that of the physical core count of your host machines: this will result in a good balance between search throughput and CPU utilization. Setting `hz:query` to have a pool size greater than that of the physical core count will not deliver a significant increase in throughput but it will increase total CPU utilization. The `hz:query` pool size can be changed as follows:

How is the `hz:query` pool size changed? It would be good to link to the relevant doc.

There is an example right below that paragraph in the page.
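
Separately from the page's own configuration example, the sizing advice in tip 6 can be sketched in plain Java. Note that `Runtime.availableProcessors()` reports logical processors, not physical cores, so this sketch assumes a fixed number of hardware threads per core (2 here, which is only an assumption about the host). The class `QueryPoolSizing` is hypothetical, and the result would still have to be applied to the `hz:query` executor configuration by hand.

```java
class QueryPoolSizing {
    // Estimates the physical core count from the logical processor count,
    // ASSUMING every core exposes the same number of hardware threads.
    static int suggestedPoolSize(int logicalProcessors, int threadsPerCore) {
        if (logicalProcessors < 1 || threadsPerCore < 1) {
            throw new IllegalArgumentException("both arguments must be positive");
        }
        // Never suggest fewer than one thread.
        return Math.max(1, logicalProcessors / threadsPerCore);
    }

    public static void main(String[] args) {
        int logical = Runtime.getRuntime().availableProcessors();
        // threadsPerCore = 2 is an assumption (typical for SMT); verify it for your hardware.
        System.out.println("Suggested hz:query pool size: " + suggestedPoolSize(logical, 2));
    }
}
```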

Co-authored-by: Yüce Tekol <[email protected]>

oliverhowell

Mostly minor edits, but a few questions on the table values. I have not suggested changes to parts of these pages that are not in this PR, but I do see a few issues that we can come back to in a later update, e.g. page name vs. title, and explicitly introducing code samples (helps with Ask AI).

docs/modules/data-structures/pages/vector-collections.adoc

oliverhowell · 2025-02-05T11:32:18Z

docs/modules/data-structures/pages/vector-collections.adoc

@@ -718,6 +719,9 @@ You can use hints to fine-tune search precision, especially with smaller `limit`
 |===
 |Hint|Description

+|efSearch
+|Size of list of potential candidates during search. Larger value results in better precision but slower execution.


What is the format of this value? i.e. how should the user use this hint?
Can we add a type or default to the table overall?
Also we should explicitly introduce the examples in general e.g. The following code example shows how to add search options and hints

If you add these hints to the example (not sure if you should combine them though) then this would suffice rather than adding detail about type of value needed.

I added type in 8bd114e and example in 18288d0.

Most hints should not normally be used, except `efSearch`, which may be promoted to a full-fledged SearchOption in the future, and `partitionLimit` in case of a skewed distribution. The others are useful for advanced tuning and benchmarking.

docs/modules/data-structures/pages/vector-collections.adoc

docs/modules/data-structures/pages/vector-search-overview.adoc

Co-authored-by: Oliver Howell <[email protected]>

oliverhowell

Thanks for addressing comments - not sure if you've finished edits but approving from docs perspective

k-jamroz added 2 commits February 3, 2025 19:21

Update vector collection tuning tips and describe efSearch hint [AI-195]

8ba7a75

Add tip about metric

dd39b6d

k-jamroz requested a review from a team as a code owner February 3, 2025 18:30

k-jamroz requested a review from yuce February 3, 2025 18:30

Update docs for blocking optimize

8476db7

k-jamroz changed the title ~~Update vector collection tuning tips and describe efSearch hint [AI-195]~~ Update vector collection tuning tips and describe efSearch hint [AI-195] [AI-192] Feb 3, 2025

yuce requested changes Feb 4, 2025

k-jamroz and others added 3 commits February 4, 2025 11:25

Apply suggestions from code review

bd4ffae

Co-authored-by: Yüce Tekol <[email protected]>

Add partition count to tuning tips and link to vector api configuration

ba08fbc

Fix list

4ab11dd

k-jamroz requested a review from yuce February 4, 2025 10:55

yuce approved these changes Feb 4, 2025

oliverhowell self-assigned this Feb 4, 2025

oliverhowell requested changes Feb 5, 2025

k-jamroz and others added 5 commits February 5, 2025 13:13

Apply suggestions from code review

ee6a581

Co-authored-by: Oliver Howell <[email protected]>

Document hint types

8bd114e

Update docs/modules/data-structures/pages/vector-collections.adoc

cbc1d87

Add example with multiple hints

18288d0

Merge remote-tracking branch 'origin/vector-tuning-6.0-part' into vec…

ce86547

…tor-tuning-6.0-part

oliverhowell approved these changes Feb 5, 2025

Update docs/modules/data-structures/pages/vector-search-overview.adoc

8ecff5e

k-jamroz merged commit b2e00e9 into hazelcast:main Feb 5, 2025
5 of 6 checks passed
