Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(pinecone): Add rerank task for Pinecone component #773

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

MubeenKodvavi
Copy link
Contributor

@MubeenKodvavi MubeenKodvavi commented Oct 23, 2024

Because

This commit

  • Added Pinecone Rerank Support.
    • Inputs: Documents and Query input. We can restrict output to n documents by setting optional field (top-n)
    • Output: Reranked Documents and Scores
  • Made URL in Pinecone Setup Optional as for rerank input URL is not required. For others, this will still need to be input.

Define task in pinecone's definition.json
Define inputs and outputs of component in task.json
URL is no longer required as Rerank task does not require an index
…tput response

Implement required structs to parse input data to API request and parse API output to component output
Introduce new client to make rerank task
…changes

run campogen to update documentation with new changes
@MubeenKodvavi
Copy link
Contributor Author

Test Details

Tested locally by creating a pinecone component's rerank task and verified data using sample input from API documentation

Sample Pipeline Recipe:

version: v1beta

variable:
  query:
    title: Query
    instill-format: string
  documents:
    title: Documents
    instill-format: array:string

component: 
  pinecone-0:
    type: pinecone
    input:
      query: ${variable.query}
      documents: ${variable.documents}
      top-n: 3
    setup:
      api-key: ${secret.pineconetoken} 
    task: TASK_RERANK

output:
  reranked_documents:
    title: Reranked Documents
    value: ${pinecone-0.output.documents}
  scores:
    title: scores
    value: ${pinecone-0.output.scores} 

Input:

image

Component Output:

image

… formed

remove api version 2024-10 from headers as this is latest stable version now and supports rerank api by default
…client

pinecone is redirecting towards oldest stable version that does not support rerank endpoint
also recommended by pinecone in their documentation to specify api version in headers
Copy link
Member

@chuang8511 chuang8511 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution!

I left a few comments that we need you to modify.
And, if possible, please also help us add the test code.

}

return &rerankReq{
Model: "bge-reranker-v2-m3",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though it only provides a model now, we can make it a field with a default value.

You can refer to this.

"instillUIOrder": 0,
"properties": {
"query": {
"description": "The query to rerank the documents",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"description": "The query to rerank the documents",
"description": "The query to rerank the documents.",

Now, we always add period in the end for description fields.

"type": "string"
},
"documents": {
"description": "The documents to rerank",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"description": "The documents to rerank",
"description": "The documents to rerank.",

"type": "array"
},
"top-n": {
"description": "The number of results to return sorted by relevance. Defaults to the number of inputs.\n\n",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"description": "The number of results to return sorted by relevance. Defaults to the number of inputs.\n\n",
"description": "The number of results to return sorted by relevance. Defaults to the number of inputs.",

"instillUIOrder": 0,
"properties": {
"documents": {
"description": "Reranked documents",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"description": "Reranked documents",
"description": "Reranked documents.",

"type": "string"
},
"instillUIOrder": 0,
"title": "Reranked documents",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"title": "Reranked documents",
"title": "Reranked Documents",

resp := rerankResp{}
req.SetResult(&resp).SetBody(inputStruct.asRequest())
if _, err := req.Post(rerankPath); err != nil {
job.Error.Error(ctx, httpclient.WrapURLError(err))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recently, I found the clients don't reply the meaningful error message in err when sending http request.

Could you add the errMsg like this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WrapURLError is a helper to add an end-user message to http transport errors.
This message is extracted in ErrorHandler interface for component job.

@@ -145,7 +167,34 @@ func (e *execution) Execute(ctx context.Context, jobs []*base.Job) error {
job.Error.Error(ctx, err)
continue
}
case taskRerank:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be good to have the test code if possible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Status: No status
Development

Successfully merging this pull request may close these issues.

[Pinecone] Add rerank task
3 participants