Replies: 2 comments 2 replies
-
The whole situation is not particularly clear to me.
-
(related: #3390) I'm using Haystack 1.19.0 and the latest Docker image of Weaviate.

# I manually run Weaviate using Docker:
# sudo docker run -d -p 8080:8080 --env AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED='true' --env PERSISTENCE_DATA_PATH='/var/lib/weaviate' --name weaviate semitechnologies/weaviate:latest
from haystack.nodes.retriever.dense import DensePassageRetriever
from tqdm import tqdm
from haystack import Document
from haystack.document_stores import WeaviateDocumentStore
from haystack.utils import launch_weaviate

# 20,000 identical dummy documents, enough to exercise batched writes
docs_process = [Document("my document", id=i) for i in range(20_000)]

# Weaviate is running at http://localhost:8080
document_store_Weaviate = WeaviateDocumentStore(
    index="DPR",
    recreate_index=True,
    similarity="dot_product",
    progress_bar=True,
    duplicate_documents="overwrite",
)

print('Running the pre-embedding')
retriever_dense_DPR = DensePassageRetriever(
    document_store=document_store_Weaviate,
    query_embedding_model="etalab-ia/dpr-question_encoder-fr_qa-camembert",
    passage_embedding_model="etalab-ia/dpr-ctx_encoder-fr_qa-camembert",
    use_gpu=True,
    embed_title=True,
    max_seq_len_passage=380,
    batch_size=32,
)

# Compute all embeddings up front, outside the document store
embeds = retriever_dense_DPR.embed_documents(docs_process)

cpt = 0
print("Adding the embeddings!")
# Attach each embedding to its document before writing anything to Weaviate
for doc, emb in tqdm(zip(docs_process, embeds)):
    doc.embedding = emb
    cpt += 1
print("number of embeddings:", cpt)

# Write documents and embeddings together in a single pass
document_store_Weaviate.write_documents(docs_process, batch_size=1000)

I suspect that your client configuration may somehow interfere...
-
Hello,
I use a retrieval pipeline with a DPR model as the retriever and Weaviate as the document store. I have a lot of documents, and to get around the limitations of updating embeddings in the Weaviate database, I used a trick proposed in an old discussion: first write the embeddings into the documents, then write everything to the database. However, with the DPR model, when I write the documents and their embeddings, I get this error:
Add property to class! Unexpected status code: 422, with response body: {'error': [{'message': "extend idx 'dpr' with property 'page: create property 'page' value index on shard 'dpr_TMfpeIusk8NL': store is read-only"}]}.
This happens when I execute this line:
document_store_Weaviate.write_documents(documents=docs_process, index="DPR", batch_size=100, duplicate_documents="overwrite")
I don't know how to resolve the read-only state of the store.
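For context on the "store is read-only" message: one common cause is that Weaviate marks a shard READONLY once disk usage on the node crosses its read-only threshold, and the shard may stay in that state until it is reset. The sketch below is only a starting point, not a confirmed fix for this case. It assumes a weaviate-client v3 style schema object exposing `get_class_shards` and `update_class_shard`, and it assumes the Haystack store exposes its client as `weaviate_client`. The helper names `read_only_shards` and `reset_read_only_shards` are hypothetical.

```python
from typing import Any, Dict, List


def read_only_shards(shards: List[Dict[str, Any]]) -> List[str]:
    """Return the names of the shards whose status is READONLY."""
    return [s["name"] for s in shards if s.get("status") == "READONLY"]


def reset_read_only_shards(schema: Any, class_name: str) -> List[str]:
    """Set every READONLY shard of `class_name` back to READY.

    `schema` is assumed to behave like `weaviate.Client(...).schema`
    in weaviate-client v3 (get_class_shards / update_class_shard).
    Returns the names of the shards that were reset.
    """
    names = read_only_shards(schema.get_class_shards(class_name))
    for name in names:
        schema.update_class_shard(
            class_name=class_name, status="READY", shard_name=name
        )
    return names


# Hypothetical usage against the store from this thread (needs a live Weaviate):
# reset_read_only_shards(document_store_Weaviate.weaviate_client.schema, "DPR")
```

If disk pressure is the cause, space has to be freed first (or the threshold raised), otherwise the shard will most likely switch back to READONLY on the next write.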