Architecture for many Separate Collections #21341

gvilums · 2022-12-21T16:40:09Z

gvilums
Dec 21, 2022

Hello everyone 👋,

We're planning to use Milvus as a vector similarity search backend for a consumer service. In our model, each user only ever needs to search through the data provided by them, and will (and should) never find results related to other users.

I was wondering what the best approach for modeling something like this is in Milvus. My first idea was to create a separate Collection for each user. Then, when a query arrives, this collection can be loaded and the query can be answered, and finally the collection can be unloaded again.
However, this approach has the problem that loading a collection takes multiple seconds, whereas we would like to keep total query latency under 200ms.
Alternatively, I was thinking that it might be possible to simply let a user's collection remain loaded, so that subsequent queries are answered faster. However, I'm not sure if Milvus implements any garbage collection for collections which are loaded but inactive, and fear that this approach would simply lead to eventual memory exhaustion.

Another approach would be to create a single collection for all users, and then filter based on some user-id attribute. However, I imagine that this would be quite inefficient, as the entire collection would have to be loaded all of the time, even for users which have been inactive for a long time.

Finally, I was also looking at using a DISKANN index (but was unable to get it to work). Does this index type also require the collection to be loaded?

Any suggestions about the best way to approach this would be appreciated 😄

yhmo · 2022-12-28T08:01:16Z

yhmo
Dec 28, 2022
Collaborator

If you have enough CPU memory, creating a collection for each user would be the best way. Keep all the collections loaded in memory.
Otherwise, DISKANN is a good choice. DISKANN requires less memory than other index types, but its search performance is a bit slower.

8 replies

yhmo Jan 11, 2023
Collaborator

How many users are there in total?
On average, how many embeddings does each user have?

GitIgnoreMaybe Jan 11, 2023

I don't know yet. But let's say 10k - 100k users, and each user has text documents which are split by SBERT into passages of 1–3 sentences. Each user has 100 documents, each with 20 SBERT embeddings, which would mean 2000 embeddings per user. The number of users can grow and the number of documents can grow.

yhmo Jan 12, 2023
Collaborator

I am not sure this is a good solution, but you can try:

Hash the users into multiple collections by user_id. For example, user_1/user_3/user_5 hash to collection_1, and user_2/user_4/user_6 hash to collection_2.
Keep a "hot" list of collections on the client side. For example, when we search for user_2, we load collection_2 into memory, put collection_2 into the "hot" list, and release the other collections in the "hot" list.
When we search for user_2, use an expression to filter, such as collection.search(..., expr="user_id == 'user_2'").

Assuming 100 users per collection, there will be 100 to 1k collections in total.
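
The hashing and "hot" list idea above can be sketched roughly as follows. This is only an illustration under assumptions that are not in the thread: the names `collection_for` and `HotList`, the CRC32 hash, the bucket count of 1000, and the generalization to a least-recently-used list with a configurable capacity are all hypothetical (the suggestion above keeps a single collection loaded, which corresponds to `capacity=1`). The `load` and `release` callbacks are placeholders; with pymilvus they would presumably wrap `Collection(name).load()` and `Collection(name).release()`.

```python
import zlib
from collections import OrderedDict
from typing import Callable


def collection_for(user_id: str, num_collections: int = 1000) -> str:
    """Map a user to one of `num_collections` shared collections.

    CRC32 is used because it is stable across processes, unlike Python's
    built-in hash() for strings. The bucket count is an assumption.
    """
    bucket = zlib.crc32(user_id.encode("utf-8")) % num_collections
    return f"collection_{bucket}"


class HotList:
    """Client-side list of loaded collections; evicts the least recently used."""

    def __init__(
        self,
        capacity: int,
        load: Callable[[str], None],
        release: Callable[[str], None],
    ) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._load = load
        self._release = release
        self._hot: "OrderedDict[str, None]" = OrderedDict()

    def touch(self, name: str) -> None:
        """Ensure `name` is loaded and mark it as most recently used."""
        if name in self._hot:
            self._hot.move_to_end(name)
            return
        self._load(name)
        self._hot[name] = None
        # Release the coldest collections until we are back under capacity.
        while len(self._hot) > self._capacity:
            cold, _ = self._hot.popitem(last=False)
            self._release(cold)
```

Before each search, the client would call `hot.touch(collection_for(user_id))` and then run the filtered search against that collection.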

GitIgnoreMaybe Jan 12, 2023

That sounds very interesting, I need to think about it.

Meanwhile, I thought about your idea "one collection for all users" and tried to understand in the documentation why this approach might have a bad search performance.

Is it because the query for the meta filtering is slow (filtering all vector_id for user_id = 1234), or
because applying the vector_id list to the index is slow (vectorList = [0,1,2,3,n] performing ANN on index)?

yhmo Jan 13, 2023
Collaborator

If we put all users into one collection, you need to load the entire collection into memory to search, and the filtering process (filtering 1 user out of 10k-100k users) is also slow.

xiaofan-luan · 2022-12-29T06:00:07Z

xiaofan-luan
Dec 29, 2022
Maintainer

We will support collection LRU loaded but it is still under design, and it might also be super slow (loading data from S3 is slow).
So far I recommend using DiskANN or IVFPQ and loading all data into memory/disk.
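
For reference, here is a sketch of what the index parameters for these two options might look like in pymilvus. The field name `embedding`, the `L2` metric, and the concrete values of `nlist`, `m`, and `nbits` are illustrative assumptions, not recommendations from this thread, and the helper `ivf_pq_params` is hypothetical. The one constraint it checks is that IVF_PQ needs the vector dimension to be divisible by `m`.

```python
from typing import Any, Dict

# DISKANN index definition; no build parameters are passed in this sketch.
DISKANN_INDEX: Dict[str, Any] = {
    "index_type": "DISKANN",
    "metric_type": "L2",  # assumption; use the metric your embeddings need
    "params": {},
}


def ivf_pq_params(
    dim: int, nlist: int = 1024, m: int = 8, nbits: int = 8
) -> Dict[str, Any]:
    """Build IVF_PQ index parameters, checking that `m` divides `dim`."""
    if dim % m != 0:
        raise ValueError(f"dim ({dim}) must be divisible by m ({m})")
    return {
        "index_type": "IVF_PQ",
        "metric_type": "L2",  # assumption, as above
        "params": {"nlist": nlist, "m": m, "nbits": nbits},
    }


# With a connected pymilvus Collection object, either dict could be passed as:
#   collection.create_index(field_name="embedding", index_params=DISKANN_INDEX)
```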

2 replies

GitIgnoreMaybe Jan 6, 2023

Something like IVF65536_HNSW32,Flat is definitely extremely quick and has good recall, but the issue will be the retriever and re-ranker setup.

Let's say a user queries "green bottle" and the retriever returns top_k = 100. Then there is a pretty good chance that the top 100 will not include any data from that user. Even if top_k is much bigger, the issue grows as more users and data are added. Another challenge will be pruning the top_k down to only the user's data in Milvus; scalar filtering will not work, from what I understand, because it's a pre-step.

Any idea how to deal with it or how to separate the user data?
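
To put a rough number on this concern: if a single index holds every user's vectors and the unfiltered top_k were spread evenly across users (a crude simplifying assumption, since real results cluster by topic rather than uniformly), the expected number of hits belonging to one particular user is top_k divided by the number of users. The function name and the example figures (100,000 users with 2,000 embeddings each, taken from the estimates mentioned elsewhere in this thread) are illustrative only.

```python
def expected_user_hits(top_k: int, num_users: int, vectors_per_user: int) -> float:
    """Expected hits from one user in an unfiltered top_k.

    Assumes every user has the same number of vectors and that the top_k
    results are spread uniformly over all vectors.
    """
    total_vectors = num_users * vectors_per_user
    return top_k * vectors_per_user / total_vectors


# 100 results over 100,000 users with 2,000 vectors each
print(expected_user_hits(100, 100_000, 2_000))  # 0.001
```

Under that assumption, post-filtering an unfiltered top 100 would almost always leave nothing for the querying user.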

yhmo Jan 9, 2023
Collaborator

Milvus splits data into segments, but this is internal machinery; users have no interface to control how the data is separated.
ANN search is approximate search, so different parameters (index parameters, search parameters) return different results. If the results are not good, you can change the index/search parameters to get better ones.

GitIgnoreMaybe · 2023-01-06T03:59:06Z

GitIgnoreMaybe
Jan 6, 2023

@gvilums You wrote:

However, this approach has the problem that loading a collection takes multiple seconds, whereas we would like to keep total query latency under 200ms.

Do you know how long it takes? I've never tried it. But if it's not too long, then you can play with the timing when to load the collection. E.g. initiate the collection loading on page load or when the input field is in focus. The user needs time to write the query anyway, so maybe the collection is ready by then.
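
A minimal sketch of that timing trick, assuming the backend exposes some hook that fires on page load or input focus. The class name `Preloader`, its method names, and the `load` callback are hypothetical; with pymilvus the callback would presumably wrap `Collection(name).load()`. The load starts in a background thread and the search path only blocks for whatever loading time is left.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, Optional


class Preloader:
    """Start loading a collection early and wait for it at query time."""

    def __init__(self, load: Callable[[str], None], max_workers: int = 4) -> None:
        self._load = load
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._pending: Dict[str, Future] = {}
        self._lock = threading.Lock()

    def on_focus(self, name: str) -> Future:
        """Call on page load or when the search box gains focus."""
        with self._lock:
            future = self._pending.get(name)
            if future is None:
                # First request for this collection: load it in the background.
                future = self._pool.submit(self._load, name)
                self._pending[name] = future
            return future

    def wait_ready(self, name: str, timeout: Optional[float] = None) -> None:
        """Call right before searching; blocks until the load has finished."""
        self.on_focus(name).result(timeout=timeout)
```

This sketch never releases anything, so in practice it would need to be combined with some eviction policy to avoid the memory growth discussed at the top of the thread.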

1 reply

yhmo Jan 9, 2023
Collaborator

It depends on the bandwidth between the storage and the Milvus server. If you deploy MinIO and Milvus on the same machine, loading is fast.

dan-pav · 2024-07-17T02:40:58Z

dan-pav
Jul 17, 2024

Hi @gvilums
Can you please share what approach you eventually took? How did it work out?

1 reply

xiaofan-luan Jul 17, 2024
Maintainer

Using the user ID as the partition key is what you are looking for.
See https://milvus.io/docs/multi_tenancy.md
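
A hedged sketch of what the partition-key approach could look like with pymilvus, assuming a recent Milvus release that supports partition keys and a connection already opened with `connections.connect(...)`. The collection name `passages`, the field names, the 384-dimensional vectors, the `L2` metric, and the helper functions are all illustrative assumptions; the linked documentation is the authoritative reference.

```python
import re
from typing import Any, List


def user_filter(user_id: str) -> str:
    """Build the boolean expression that restricts a search to one user."""
    # Only allow simple IDs so the value cannot break out of the quotes.
    if not re.fullmatch(r"[A-Za-z0-9_-]+", user_id):
        raise ValueError(f"unexpected characters in user_id: {user_id!r}")
    return f'user_id == "{user_id}"'


def create_collection(name: str = "passages", dim: int = 384) -> Any:
    """Create one shared collection whose user_id field is the partition key."""
    # Imported lazily so user_filter() can be used without pymilvus installed.
    from pymilvus import Collection, CollectionSchema, DataType, FieldSchema

    fields = [
        FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
        FieldSchema(
            name="user_id",
            dtype=DataType.VARCHAR,
            max_length=64,
            is_partition_key=True,
        ),
        FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=dim),
    ]
    return Collection(name=name, schema=CollectionSchema(fields))


def search_for_user(
    collection: Any, user_id: str, query: List[float], top_k: int = 10
) -> Any:
    """Search only one user's vectors (collection must be indexed and loaded)."""
    return collection.search(
        data=[query],
        anns_field="embedding",
        param={"metric_type": "L2", "params": {}},  # must match the index metric
        limit=top_k,
        expr=user_filter(user_id),
    )
```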

nairan-deshaw · 2024-10-15T05:49:46Z

nairan-deshaw
Oct 15, 2024

@xiaofan-luan Can you share any relevant links for the below. Is it still under consideration?

We will support collection LRU loaded but it is still under design

1 reply

xiaofan-luan Oct 15, 2024
Maintainer

@xiaofan-luan Can you share any relevant links for the below. Is it still under consideration?

We will support collection LRU loaded but it is still under design

The code is actually already in, but this is not something we really recommend, because search could be very slow (more than several seconds).

Here is the PR if you are interested and willing to give it a try: #32567
