Replies: 1 comment · 1 reply
Hi, my idea for a RAG project was to feed only relevant documents into the prompt for the LLM. So I'm trying to find a significant difference in the returned scores. I followed the tutorial and have something like this:
```python
from haystack import Document
from haystack.document_stores import FAISSDocumentStore
from haystack.nodes import EmbeddingRetriever

# In-memory FAISS index ("Flat" = exact search), metadata stored in SQLite
document_store = FAISSDocumentStore(
    faiss_index_factory_str="Flat", sql_url="sqlite:////tmp/faiss_document_store.db"
)

# One clearly relevant document and one nonsense document
documents = [
    Document(content="The english channel is 30 kilometers wide."),
    Document(content="la le li la di da"),
]
document_store.write_documents(documents)

retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="sentence-transformers/multi-qa-mpnet-base-dot-v1",
)
document_store.update_embeddings(retriever)

my_query = "How wide is the english channel?"
docs = retriever.retrieve(query=my_query, top_k=2)
print(docs)
```
However, the difference in score is pretty small: 0.5734 vs. 0.5290. On my real text base I get nearly identical scores for documents that match perfectly and documents that don't match at all. My plan was to apply a general threshold... Am I misunderstanding something, or is there a better approach?
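For illustration, the kind of thresholding I have in mind would look roughly like this (assuming Haystack 1.x; I read that `scale_score=True`, the retriever's default, squashes raw similarities into [0, 1], which may be why everything clusters around 0.5, so this sketch turns it off and uses cosine similarity instead; the cutoff value and the db path are arbitrary placeholders, not recommendations):

```python
from haystack import Document
from haystack.document_stores import FAISSDocumentStore
from haystack.nodes import EmbeddingRetriever

# Hypothetical separate store; similarity="cosine" keeps raw scores interpretable
document_store = FAISSDocumentStore(
    faiss_index_factory_str="Flat",
    sql_url="sqlite:////tmp/faiss_threshold_demo.db",  # placeholder path
    similarity="cosine",
)
document_store.write_documents([
    Document(content="The english channel is 30 kilometers wide."),
    Document(content="la le li la di da"),
])

retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="sentence-transformers/multi-qa-mpnet-base-dot-v1",
    scale_score=False,  # return raw cosine similarity instead of scaled scores
)
document_store.update_embeddings(retriever)

docs = retriever.retrieve(query="How wide is the english channel?", top_k=10)

# Keep only documents above an arbitrary, corpus-dependent cutoff
# before building the LLM prompt.
SCORE_THRESHOLD = 0.5  # placeholder value, would need tuning on real data
valid_docs = [d for d in docs if d.score is not None and d.score >= SCORE_THRESHOLD]
for d in valid_docs:
    print(f"{d.score:.4f}  {d.content[:60]}")
```

Even with raw scores, though, I'm unsure whether a single global cutoff can be reliable across different embedding models and corpora, which is really the heart of my question.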
-

Hello, @Peveld! Two quick ideas: