
[Bug] Calling Ollama for vectorization fails #6827

Open
69gg opened this issue Mar 9, 2025 · 4 comments
Labels
🐛 Bug Something isn't working | 缺陷 ollama Relative to Ollama Provider and ollama models

Comments

@69gg

69gg commented Mar 9, 2025

📦 Deployment environment

Docker

📦 Deployment mode

Server-side mode (lobe-chat-database image)

📌 Software version

v1.49.12

💻 System environment

Other

🌐 Browser

Edge

🐛 Problem description

Server edition running on a Raspberry Pi 5B.
During vectorization, the following error occurs: embeddingChunks error {
message: '{"error":{"message":"Error 429: Too Many Requests","name":"ResponseError","status_code":429},"errorType":"OllamaBizError","provider":"ollama"}',
name: 'EmbeddingError'
}

📷 Steps to reproduce

No response

🚦 Expected result

No response

📝 Additional information

No response

@69gg 69gg added the 🐛 Bug Something isn't working | 缺陷 label Mar 9, 2025
@github-project-automation github-project-automation bot moved this to Roadmap - Chat 1.x in Lobe Chat Routine Mar 9, 2025
@lobehubbot
Member

👀 @69gg

Thank you for raising an issue. We will look into the matter and get back to you as soon as possible.
Please make sure you have given us as much context as possible.

@dosubot dosubot bot added the ollama Relative to Ollama Provider and ollama models label Mar 9, 2025

dosubot bot commented Mar 9, 2025

@69gg Hi! I'm Dosu, a bot that can help you solve problems, answer questions, and become a contributor. I'll help you while you wait for a human maintainer.

I found a similar open issue about batch chunking not retrying previously failed embeddings. The suggested solution is a more robust retry mechanism that can handle batch retries programmatically: modify the retry logic so that a batch retry properly re-queues and reprocesses each failed item instead of failing immediately, and improve the error handling so it distinguishes between error types (such as rate limits) and applies an automated retry strategy to those specific cases. For now, a JavaScript workaround run in the Chrome Developer Console is a practical way to automate the manual retry process [1].

In your case, the error message indicates a "Too Many Requests" error (status code 429) from the Ollama provider, which suggests that the requests are being rate-limited. Implementing a retry mechanism as described above might help in handling such rate limit errors more effectively.
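As a rough sketch of the backoff idea described above (this is not lobe-chat's actual code; the endpoint, the nomic-embed-text model name, and the error shape are assumptions based on Ollama's public /api/embeddings API):

```ts
// Sketch of a 429-aware retry with exponential backoff for embedding calls.
// Assumes a local Ollama server; lobe-chat's real code path differs.

async function embedChunk(chunk: string): Promise<number[]> {
  const res = await fetch('http://localhost:11434/api/embeddings', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model: 'nomic-embed-text', prompt: chunk }),
  });
  if (!res.ok) {
    // Mirror the HTTP status so the caller can tell rate limits apart.
    throw Object.assign(new Error(res.statusText), { status_code: res.status });
  }
  const data = (await res.json()) as { embedding: number[] };
  return data.embedding;
}

async function embedWithRetry(chunk: string, maxAttempts = 5): Promise<number[]> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await embedChunk(chunk);
    } catch (err) {
      const status = (err as { status_code?: number }).status_code;
      // Retry only on 429; rethrow everything else immediately.
      if (status !== 429 || attempt === maxAttempts) throw err;
      // Exponential backoff with a little jitter: ~1s, 2s, 4s, ...
      const delay = 2 ** (attempt - 1) * 1000 + Math.random() * 250;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw new Error('unreachable');
}
```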

To continue talking to Dosu, mention @dosu.



@69gg
Author

69gg commented Mar 9, 2025


I'm using a local Ollama deployment; there are no rate limits configured on my side.
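Worth noting: the log shows the 429 being raised by the Ollama provider itself, and Ollama has its own server-side limits (the OLLAMA_NUM_PARALLEL and OLLAMA_MAX_QUEUE environment variables control concurrent and queued requests), so requests can be rejected when a small host such as a Raspberry Pi falls behind, even with no external rate limiting in place. A minimal client-side throttling sketch, reusing the hypothetical embedWithRetry from the earlier comment:

```ts
// Sketch: cap the number of in-flight embedding requests so a low-power
// host (e.g. a Raspberry Pi 5) only ever serves a few at a time.
// `embedWithRetry` is the hypothetical helper sketched above.

async function embedAll(chunks: string[], concurrency = 2): Promise<number[][]> {
  const results: number[][] = new Array(chunks.length);
  let next = 0;
  // A fixed pool of workers drains a shared index counter; because JS is
  // single-threaded, the `next++` read-and-increment cannot race.
  const workers = Array.from({ length: concurrency }, async () => {
    while (next < chunks.length) {
      const i = next++;
      results[i] = await embedWithRetry(chunks[i]);
    }
  });
  await Promise.all(workers);
  return results;
}
```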

