I have a deeplake vector database containing code chunks of a project. Given an issue, I want to find the corresponding code chunks. For this I have written a SelfQueryRetriever.
But it throws an error exactly when I mention an expression like 'train.py script' in the query. If I leave this out, I get no error. The whole thing is supposed to work automatically for all possible issues, so simply keeping such expressions out of the issues is not an option.
Steps to Reproduce
import traceback

from langchain.chains.query_constructor.base import AttributeInfo
from langchain.chat_models import ChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain.vectorstores import DeepLake


def CustomRetriever(files, dataset_path, issue):
    # Metadata fields the LLM may use when it builds a filter for the query.
    metadata_field_info = [
        AttributeInfo(
            name="source",
            description="The source file the chunk was extracted from",
            type="string",
        ),
        AttributeInfo(
            name="file_name",
            description="The name of the file the chunk was extracted from",
            type="string",
        ),
        AttributeInfo(
            name="chunk_id",
            description="the id of the chunk",
            type="string",
        ),
    ]
    document_content_description = "The sourcecode of a project"
    model = ChatOpenAI(model="gpt-4")
    embeddings = OpenAIEmbeddings(disallowed_special=())
    # Open the existing dataset read-only with the 'python' exec option.
    db = DeepLake(dataset_path=dataset_path, read_only=True, embedding=embeddings, exec_option='python')
    # Fetch all chunks (the result is not used further in this snippet).
    docs = db.similarity_search(query=" ", k=10000000)
    retriever = SelfQueryRetriever.from_llm(
        model, db, document_content_description, metadata_field_info, verbose=True
    )
    try:
        # The call that raises the error
        print('TEST', retriever.get_relevant_documents(
            f"Which documents contain code to resolve the following issue? -> {issue}"))
    except ValueError as e:
        print(traceback.format_exc())
Here is the error:
query='CNN instead of BERT model in train.py script, handle data better, generated using Tensorflow, integrated into logic, adapted to word vectors, change code' filter=Operation(operator=<Operator.AND: 'and'>, arguments=[Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='source', value='train.py'), Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='file_name', value='train.py')]) limit=None
Traceback (most recent call last):
File "/Users/kaanerbay/GitHub/Github_Issue_Solver/langchainLogic/retriever2.py", line 93, in CustomRetriever
print('TEST', retriever.get_relevant_documents(
File "/Users/kaanerbay/miniconda3/envs/main/lib/python3.10/site-packages/langchain/schema/retriever.py", line 208, in get_relevant_documents
raise e
File "/Users/kaanerbay/miniconda3/envs/main/lib/python3.10/site-packages/langchain/schema/retriever.py", line 201, in get_relevant_documents
result = self._get_relevant_documents(
File "/Users/kaanerbay/miniconda3/envs/main/lib/python3.10/site-packages/langchain/retrievers/self_query/base.py", line 135, in _get_relevant_documents
docs = self.vectorstore.search(new_query, self.search_type, **search_kwargs)
File "/Users/kaanerbay/miniconda3/envs/main/lib/python3.10/site-packages/langchain/vectorstores/base.py", line 121, in search
return self.similarity_search(query, **kwargs)
File "/Users/kaanerbay/miniconda3/envs/main/lib/python3.10/site-packages/langchain/vectorstores/deeplake.py", line 475, in similarity_search
return self._search(
File "/Users/kaanerbay/miniconda3/envs/main/lib/python3.10/site-packages/langchain/vectorstores/deeplake.py", line 348, in _search
return self._search_tql(
File "/Users/kaanerbay/miniconda3/envs/main/lib/python3.10/site-packages/langchain/vectorstores/deeplake.py", line 267, in _search_tql
result = self.vectorstore.search(
File "/Users/kaanerbay/miniconda3/envs/main/lib/python3.10/site-packages/deeplake/core/vectorstore/deeplake_vectorstore.py", line 429, in search
utils.parse_search_args(
File "/Users/kaanerbay/miniconda3/envs/main/lib/python3.10/site-packages/deeplake/core/vectorstore/vector_search/utils.py", line 229, in parse_search_args
raise ValueError(
ValueError: User-specified TQL queries are not support for exec_option=python.
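For readers following the traceback: the structured query printed above carries a filter on `source` and `file_name`, and the call chain goes through `_search_tql` before the `python` exec option rejects it. The snippet below is only a simplified, self-contained model of that behavior as it appears from the traceback, not the actual langchain or deeplake code. The names `to_tql` and `search`, and the exact shape of the generated query string, are assumptions made for illustration.

```python
from typing import Optional


def to_tql(filters: dict) -> str:
    """Hypothetical helper: render equality filters as a TQL-like WHERE clause."""
    conditions = " and ".join(
        f"metadata['{key}'] == '{value}'" for key, value in filters.items()
    )
    return f"SELECT * WHERE {conditions}"


def search(query: str, exec_option: str, tql: Optional[str] = None) -> str:
    """Toy stand-in for the vector store search; mimics the check in the traceback."""
    if tql is not None and exec_option == "python":
        # Message copied verbatim from the traceback above.
        raise ValueError(
            "User-specified TQL queries are not support for exec_option=python."
        )
    return f"embedding search for {query!r}"


# No file name in the query: no filter, so no TQL, so the search runs.
print(search("CNN instead of BERT in the training process", exec_option="python"))

# File name in the query: the LLM emits a filter, which becomes TQL and fails.
tql = to_tql({"source": "train.py", "file_name": "train.py"})
try:
    search("CNN instead of BERT", exec_option="python", tql=tql)
except ValueError as err:
    print("ValueError:", err)
```

If this model is accurate, the error depends on whether the LLM produces a filter at all, which would explain why only queries naming a file fail.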
Here is the used issue:
a CNN should be used instead of the BERT model in the train.py script, because it can handle the type of data better.
The CNN should not be too complex, but also not too simple and should be generated using Tensorflow.
The CNN should be integrated into the logic and adapted according to the word vectors used. Change the code of it, as good as you can.
Expected/Desired Behavior
If you replace the expression 'train.py script' with, for example, 'training process', the error disappears and the query is executed correctly.
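To check in advance whether an issue text mentions a file name (the case that seems to trigger the filter), a rough heuristic such as the following could be used. This is only a sketch: the pattern `FILE_NAME_PATTERN`, the helper `find_file_names`, and the list of extensions are assumptions for illustration and would need to be adapted to the project.

```python
import re

# Assumed list of extensions; extend it to match the files in the project.
FILE_NAME_PATTERN = re.compile(r"\b[\w-]+\.(?:py|json|js|ts|md|txt|ya?ml)\b")


def find_file_names(text: str) -> list:
    """Return all tokens in the text that look like file names."""
    return FILE_NAME_PATTERN.findall(text)


print(find_file_names("a CNN should be used instead of the BERT model in the train.py script"))
# → ['train.py']
print(find_file_names("a CNN should be used instead of the BERT model in the training process"))
# → []
```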
Python Version
3.10.13
OS
MacOS Ventura 13.5.2
IDE
PyCharm
Packages
langchain==0.0.293, lark==1.1.7, deeplake==3.6.26
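The installed versions can also be read out programmatically, which makes it easier to confirm what the environment actually uses. A minimal sketch using only the standard library; the helper name `installed_version` is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Print the packages from this report in pip's "name==version" style.
for name in ("langchain", "lark", "deeplake"):
    print(f"{name}=={installed_version(name)}")
```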
Additional Context
No response
Possible Solution
No response
Are you willing to submit a PR?
I'm willing to submit a PR (Thank you!)
Thanks for sharing the error. I am curious: have you installed the latest deeplake version? Have you also installed deeplake[enterprise]? The problem is related to exec_option not being cast correctly, which can happen either because you're using an old deeplake version or because you haven't installed deeplake[enterprise].
To install deeplake[enterprise] please run the following command:
Hey @adolkhan
I already had deeplake[enterprise] installed, and I have the latest deeplake version installed. Unfortunately this is not the solution to this error.
I noticed that this error occurs when the query contains file names like 'train.py' or 'package.json' in combination with text.
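Until the root cause is fixed, one possible stopgap is to retry without the metadata filter when the filtered search fails. The sketch below is not a fix for the underlying problem; `retrieve_with_fallback` is a hypothetical helper, and the two stub functions only simulate the behavior described in this issue.

```python
from typing import Any, Callable, List


def retrieve_with_fallback(
    filtered_search: Callable[[str], List[Any]],
    plain_search: Callable[[str], List[Any]],
    query: str,
) -> List[Any]:
    """Try the self-query path first; on ValueError, retry without a filter."""
    try:
        return filtered_search(query)
    except ValueError as err:
        print(f"Filtered search failed ({err}); retrying without filter.")
        return plain_search(query)


# Stubs that simulate the two search paths for demonstration purposes.
def failing_search(query: str) -> List[Any]:
    raise ValueError("User-specified TQL queries are not support for exec_option=python.")


def plain_search_stub(query: str) -> List[Any]:
    return [f"chunk matching {query!r}"]


print(retrieve_with_fallback(failing_search, plain_search_stub, "CNN in train.py"))
```

With the objects from the reproduction code, this would presumably be called as `retrieve_with_fallback(retriever.get_relevant_documents, db.similarity_search, query)`. The trade-off is that the fallback drops the file filter, so the results may include chunks from other files, and catching `ValueError` this broadly can hide unrelated errors.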
Severity
None