I'm running into token limits when creating embeddings during the create_final_entities step, and I can't figure out what it's trying to create embeddings of.
My embedding (and chat) LLM has a pretty strict limit of about 2048 tokens per input. However, graphrag is passing it inputs slightly larger than that. How do I prevent this? The logs are at the bottom.
I've set max_tokens under llm to 1500 to ensure that the instruction LLMs aren't generating large outputs. max_length for community_reports and summarize_descriptions is already set below 2048. I've played with embeddings:tokens_per_batch too. None of this seems to be related. Poking around, the actual inputs appear to roughly match text generated by entity extraction, but not exactly. And I'm a bit lost reading the create_final_entities code in the repo.
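For reference, here's a sketch of the relevant settings.yaml keys as I understand them (key names may differ between graphrag versions, so treat this as illustrative rather than definitive):

```yaml
llm:
  max_tokens: 1500          # caps chat-model *output* length

summarize_descriptions:
  max_length: 500           # caps generated entity-description summaries

community_reports:
  max_length: 2000          # caps generated community reports

embeddings:
  llm:
    model: cohere.embed-english-v3
  batch_size: 16            # max number of inputs per embedding request
  batch_max_tokens: 8191    # caps *total* tokens per batch, not per input
```

Note that batch_max_tokens (the max_tokens=8191 in the log below) only seems to bound the sum across a batch; as far as I can tell, nothing here bounds the length of a single input, which may be why tuning these hasn't helped.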
02:28:54,462 graphrag.index.run INFO Running workflow: create_final_entities...
02:28:54,468 graphrag.index.run INFO dependencies for create_final_entities: ['create_base_entity_graph']
02:28:54,498 graphrag.index.run INFO read table from storage: create_base_entity_graph.parquet
02:28:54,648 datashaper.workflow.workflow INFO executing verb unpack_graph
02:28:56,773 datashaper.workflow.workflow INFO executing verb rename
02:28:56,788 datashaper.workflow.workflow INFO executing verb select
02:28:56,802 datashaper.workflow.workflow INFO executing verb dedupe
02:28:56,815 datashaper.workflow.workflow INFO executing verb rename
02:28:56,825 datashaper.workflow.workflow INFO executing verb filter
02:28:56,885 datashaper.workflow.workflow INFO executing verb text_split
02:28:56,928 datashaper.workflow.workflow INFO executing verb drop
02:28:56,956 datashaper.workflow.workflow INFO executing verb merge
02:28:57,415 datashaper.workflow.workflow INFO executing verb text_embed
02:28:57,416 graphrag.llm.openai.create_openai_client INFO Creating OpenAI client base_url=http://URL.com/api/v1
02:28:57,426 graphrag.index.llm.load_llm INFO create TPM/RPM limiter for cohere.embed-english-v3: TPM=0, RPM=0
02:28:57,426 graphrag.index.llm.load_llm INFO create concurrency limiter for cohere.embed-english-v3: 25
02:28:57,723 graphrag.index.verbs.text.embed.strategies.openai INFO embedding 2019 inputs via 2019 snippets using 127 batches. max_batch_size=16, max_tokens=8191
02:28:58,20 httpx INFO HTTP Request: POST http://URL.com/api/v1/embeddings "HTTP/1.1 400 Bad Request"
....
02:28:58,53 datashaper.workflow.workflow ERROR Error executing verb "text_embed" in create_final_entities: Error code: 400 - {'detail': 'An error occurred (ValidationException) when calling the InvokeModel operation: Malformed input request: #/texts/6: expected maxLength: 2048, actual: 2163, please reformat your input and try again.'}
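The error suggests the failure is per input, not per batch: one text in the batch (#/texts/6) is 2163 tokens against the model's 2048 cap. The text_split and merge verbs that run just before text_embed suggest graphrag is embedding each entity's name and merged description concatenated, which could push a description that is individually under the limit past it. As a stopgap I'm considering pre-clipping each input before it reaches the embedding endpoint. A minimal sketch, assuming a small wrapper in front of the embeddings API (the helper name is my own, and tiktoken's cl100k_base only approximates Cohere's tokenizer, hence the headroom):

```python
import tiktoken

# Hypothetical pre-clipping helper; graphrag doesn't expose this hook, so it
# would have to live in a small proxy in front of the /embeddings endpoint.
MAX_INPUT_TOKENS = 1900  # headroom under the model's 2048-token cap
enc = tiktoken.get_encoding("cl100k_base")  # approximation of the real tokenizer

def clip_text(text: str) -> str:
    """Truncate one embedding input to at most MAX_INPUT_TOKENS tokens."""
    tokens = enc.encode(text)
    return text if len(tokens) <= MAX_INPUT_TOKENS else enc.decode(tokens[:MAX_INPUT_TOKENS])

# Applied to each batch the proxy receives before forwarding it:
# texts = [clip_text(t) for t in request_texts]
```

Clipping loses a tail of the description, but since the overflow is only slightly above the cap (2163 vs 2048), the truncation should be minor. Is there a supported way to bound per-input length in graphrag itself?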