-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Es vector store basic implementation #21
base: main
Are you sure you want to change the base?
Conversation
botnim/sync.py
Outdated
tool_resources = None | ||
tools = None | ||
print(f'Updating assistant: {config["name"]}') | ||
# Load context, if necessary | ||
if config.get('context'): | ||
vs = VectorStoreOpenAI(config, config_dir, production, client) | ||
if config.get('context') and replace_context: # Only runs if both conditions are true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic is wrong - we need the tools and tools_resources even if we don't replace the context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
# Generate embedding | ||
response = self.openai_client.embeddings.create( | ||
input=content, | ||
model="text-embedding-ada-002", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did you choose this older model?
'text-embedding-3-small' is cheaper and more advanced.
https://platform.openai.com/docs/guides/embeddings#embedding-models
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are very right, and actually it was already included in the configurations. Notice we might change this selection later. For now - fixed.
while len(file_streams) > 0: | ||
current = file_streams[:32] | ||
|
||
for filename, content_file, content_type in current: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need for the double loop I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree. good catch.
|
||
def get_or_create_vector_store(self, context, context_name, replace_context): | ||
ret = None # return value | ||
vs_name = self.env_name(self.config['name']).lower().replace(' ', '_') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to include the context name as well here, as we would want each context to be in a separate index.
return 0 | ||
|
||
def update_tools(self, context_, vector_store): | ||
if len(self.tools) == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the tool added needs to be a function, not file-search.
Something like:
{
"type": "function",
"name": "search_common_knowledge",
"description": "Semantic search the 'common-knowledge' vector store",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "The query string to use for searching"
}
},
"required": ["query"]
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks. I did not think of that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @noamoss overall pretty good - I left some small comments.
), | ||
)) | ||
|
||
def update_tool_resources(self, context_, vector_store): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no internal vector store ids for elasticsearch
closes #12
closes #13
closes #16