There are two ways to solve this problem. You can either reduce the input length to the model. For this, you might want to use our …
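For example, one way to reduce the input length is simply to retrieve fewer documents per query, so that the rendered prompt stays under the 512-token limit. A minimal sketch, assuming a Haystack 1.x pipeline built as in the tutorial with the retriever node named 'retriever' (adjust the node name and top_k to your own setup):
< Code >
# Sketch: pass a smaller top_k to the retriever at query time so that
# fewer documents are joined into the prompt.
# The node name "retriever" is an assumption taken from the tutorial.
output = pipe.run(
    query="What does Rhodes Statue look like?",
    params={"retriever": {"top_k": 2}},  # fewer documents -> shorter prompt
)
print(output["answers"][0].answer)
With fewer documents joined into the prompt, the rendered input is more likely to fit within the 512-token limit minus the 100 tokens reserved for the answer.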
-
Hello,
While working through the tutorial 'Creating a Generative QA Pipeline with Retrieval-Augmentation', I got the following warning about the maximum token length in the response:
< Code >
output = pipe.run(query="What does Rhodes Statue look like?")
print(output["answers"][0].answer)
< Response >
Token indices sequence length is longer than the specified maximum sequence length for this model (560 > 512). Running this sequence through the model will result in indexing errors
WARNING:haystack.nodes.prompt.invocation_layer.hugging_face:The prompt has been truncated from 560 tokens to 412 tokens so that the prompt length and answer length (100 tokens) fit within the max token limit (512 tokens). Shorten the prompt to prevent it from being cut off
The Colossus was a mythical figure, and the mythical figure is the mythical Colossus.
The model used in the PromptNode is 'google/flan-t5-large', and I know its maximum sequence length is 512 tokens.
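For context, the PromptNode is set up roughly as in the tutorial (a sketch, assuming Haystack 1.x; the prompt template text is shortened here, and max_length=100 is the default answer length):
< Code >
from haystack.nodes import AnswerParser, PromptNode, PromptTemplate

# Sketch of the tutorial-style setup; the real template text is longer.
rag_prompt = PromptTemplate(
    prompt="Synthesize an answer from the following text.\n"
           "Related text: {join(documents)}\nQuestion: {query}\nAnswer:",
    output_parser=AnswerParser(),
)
prompt_node = PromptNode(
    model_name_or_path="google/flan-t5-large",
    default_prompt_template=rag_prompt,
    max_length=100,  # tokens reserved for the generated answer (the "100 tokens" in the warning)
)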
I guess the input length exceeded the model's maximum length, which is why the warning above appeared.
In addition, I can see that this issue hurts the performance of the QA pipeline because of the indexing errors.
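For reference, the prompt length can be checked with the model's own tokenizer before running the pipeline (a minimal sketch; prompt_text is a placeholder for the fully rendered prompt, i.e. the template with the retrieved documents filled in, which is not shown here):
< Code >
from transformers import AutoTokenizer

# Same tokenizer that the PromptNode uses for google/flan-t5-large.
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-large")

prompt_text = "..."  # placeholder: paste the fully rendered prompt here
n_tokens = len(tokenizer.encode(prompt_text))
print(n_tokens)  # 560 in the run above, which exceeds the 512-token limit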
How can I use 'google/flan-t5-large' without running into this maximum sequence length problem?
Thank you in advance for your help.