Documents retrieved from Retriever and their similarity score #6350
Unanswered
demongolem-biz2 asked this question in Questions
Replies: 2 comments
-
@demongolem-biz2 We also see potential for improvement with regard to retriever scores. Fine-tuning retrievers on your own data can help to improve the scores. I could also imagine that splitting long documents into shorter documents helps a bit, yes. What I can also recommend is to use a Ranker after your Retriever: https://docs.haystack.deepset.ai/docs/ranker
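A minimal sketch of what a Retriever-plus-Ranker pipeline could look like with Haystack 1.x (the model names and `top_k` values are illustrative assumptions, not taken from this thread). The cross-encoder Ranker scores query and document text together, which typically separates good and bad matches more sharply than the retriever's embedding similarity:

```python
# Sketch: re-rank the Retriever's candidates with a cross-encoder Ranker.
# Assumes documents have already been written to the store and embeddings
# updated with document_store.update_embeddings(retriever).
from haystack import Pipeline
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import EmbeddingRetriever, SentenceTransformersRanker

document_store = InMemoryDocumentStore(embedding_dim=768)

retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="sentence-transformers/multi-qa-mpnet-base-dot-v1",
    top_k=20,  # cast a wide net; the Ranker re-orders this candidate set
)
ranker = SentenceTransformersRanker(
    model_name_or_path="cross-encoder/ms-marco-MiniLM-L-6-v2",
    top_k=10,  # keep only the best-ranked candidates
)

pipeline = Pipeline()
pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])
pipeline.add_node(component=ranker, name="Ranker", inputs=["Retriever"])

result = pipeline.run(query="What does the warranty cover?")
for doc in result["documents"]:
    print(f"{doc.score:.3f}  {doc.content[:80]}")
```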
-
Yes, I found that the SentenceTransformersRanker had some impact. In this case I may see around four results with fairly high scores of 0.8 or 0.9, then a large gap, with the remainder of the results plummeting to well below 0.1. In general it behaves the way I would like; however, for a query or two the best result still scores only around 0.02 even though the top document seems to match the query fairly well.
-
In its most basic form, the general usage seems to be comparing a short query against a much longer document to produce a score, which is returned with each Document from the retriever. What I see in practice is that all the scores are smooshed together: my top 10 results all have very similar scores even though, reading the Documents against the query, they vary considerably in how correct they actually are.
It looks like cosine similarity works quite well for two short strings. But when a document contains some information that matches the query and a lot of other information that does not, the unrelated content dilutes the score, and I suspect this is why the similarity scores bunch together.
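A quick, hypothetical way to see this dilution effect with sentence-transformers (the model name and texts below are my own illustration, not from the discussion). The long, padded document will usually score noticeably lower against the query than the short, focused passage, even though it contains the same relevant sentence:

```python
# Sketch: cosine similarity of a short query vs. a focused passage and vs.
# the same passage buried in a lot of unrelated text (illustrative only).
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

query = "How long is the product warranty?"
short_passage = "The product comes with a two-year warranty covering defects."
long_document = (
    "The product comes with a two-year warranty covering defects. "
    + "Our company was founded in 1987 and operates offices in twelve countries. "
    + "The cafeteria menu changes weekly and includes vegetarian options. " * 10
)

embeddings = model.encode([query, short_passage, long_document], convert_to_tensor=True)
print("query vs short passage:", util.cos_sim(embeddings[0], embeddings[1]).item())
print("query vs long document:", util.cos_sim(embeddings[0], embeddings[2]).item())
```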
Any insights? One possibility is that if the Documents were split into very small pieces the similarity might be more meaningful; however, that would reduce the ability to match larger stretches of content for semantic search, which is my use case. At the end of the day, I want really good query results to stand out more.
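For reference, a hedged sketch of the splitting idea using Haystack 1.x's PreProcessor (the split sizes are guesses to tune, not values anyone suggested in this thread). Each shorter passage gets its own embedding, so a match is diluted less by unrelated surrounding text:

```python
# Sketch: split long documents into shorter, overlapping passages before
# indexing, so each passage embedding is less diluted by unrelated content.
from haystack import Document
from haystack.nodes import PreProcessor

long_doc = Document(content="The warranty covers manufacturing defects. " * 200)

preprocessor = PreProcessor(
    clean_empty_lines=True,
    clean_whitespace=True,
    split_by="word",
    split_length=150,   # passage size in words; tune for your data
    split_overlap=30,   # overlap so answers spanning a boundary are not lost
    split_respect_sentence_boundary=False,
)

passages = preprocessor.process([long_doc])
print(len(passages), "passages from 1 document")
# Then write `passages` to the document store and update embeddings as usual.
```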
I guess the ideas are pretty much the same as in #6305, but neither of the ideas there is really satisfying when it comes to making really good query results stand out.