FAISSDocumentStore scores are always in same range #5781

mlnsharma · 2023-09-12T15:46:35Z

I am using FAISSDocumentStore for a QA system. When FAISSDocumentStore.query_by_embedding() is called (internally, during pipeline retrieval) with top_k=10, the matched documents have normalised scores in the range 0.50-0.51 and actual (unscaled) scores in the range 0.72-0.76.

When a query has no matching documents, the top 10 documents returned still have scores in the same range as above. This prevents me from determining whether the query actually matches a document.

I am using the default settings for similarity, index factory, and scale_score. The constructor call looks like this:

document_store = FAISSDocumentStore(index_path=<>, config_path=<>, duplicate_documents='skip', validate_index_sync=False)

How can I determine whether a score represents a lower or higher similarity match? Are there any other config options that make the score reflect the actual similarity?

ZanSara · 2023-09-13T08:12:56Z

Hey @mlnsharma, I believe that also depends on the embeddings. How do you generate them? Are you sure the model actually generates good embeddings for your documents?

mlnsharma Sep 13, 2023
Author

Hi @ZanSara
I am using OpenAI's 'text-embedding-ada-002' model to generate the embeddings. Can you please suggest any specific settings or alternatives to get correct scores?

anakin87 · 2023-09-14T11:56:25Z

Maintainer

I would suggest trying to set scale_score=False.

Related API reference
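
To illustrate why scale_score=False may help, here is a minimal sketch of the kind of scaling that, as far as I understand, Haystack 1.x applies when scale_score=True. It assumes the store uses its default dot_product similarity and that dot-product scores are squashed with a logistic function of score / 100, while cosine scores are mapped with (score + 1) / 2. The function below is an illustrative stand-in, not the library's code, so check it against the Haystack version you are running.

```python
import math


def scale_to_unit_interval(score: float, similarity: str) -> float:
    """Illustrative stand-in for the assumed Haystack 1.x score scaling."""
    if similarity == "cosine":
        # Cosine similarity lies in [-1, 1]; shift and halve it into [0, 1].
        return (score + 1) / 2
    # Assumption: unbounded scores such as dot_product go through a
    # logistic function of score / 100, which is very flat near zero.
    return 1 / (1 + math.exp(-score / 100))


# Raw score range reported in the question.
raw_scores = [0.72, 0.76]
scaled_scores = [scale_to_unit_interval(s, "dot_product") for s in raw_scores]

raw_spread = raw_scores[1] - raw_scores[0]
scaled_spread = scaled_scores[1] - scaled_scores[0]

print("scaled:", [round(s, 4) for s in scaled_scores])
print("raw spread:", round(raw_spread, 4))
print("scaled spread:", round(scaled_spread, 6))
```

Under these assumptions, raw scores of 0.72 and 0.76 both land just above 0.50, which is consistent with the 0.50-0.51 range reported in the question. The scaling shrinks the differences between documents by a factor of roughly 400, so the raw scores are the more useful ones to compare or threshold.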
