
Exiting Chain EOF #85

Closed

ImVexed opened this issue Apr 12, 2024 · 5 comments
Labels
bug Something isn't working

Comments


ImVexed commented Apr 12, 2024

Describe the bug
When submitting a question:
Exiting chain with error: Post "http://ollama:11434/api/chat": EOF

To Reproduce
Steps to reproduce the behavior:

  1. Enter a message
  2. EOF

Expected behavior
Not break.

Screenshots
[screenshot attached to the original issue]

Additional context
I think this may be some sort of timeout issue? I've noticed it only happens with Command-R. I can use Command-R (18.8 GB) fine in Ollama's Web UI, with 39/41 layers offloaded to the GPU (3090, 24 GB). But when I use LLocalSearch, I only see 19/41 layers offloaded. Not sure if that has anything to do with it, but it's confusing me, since when I use Mixtral-8x7B (19 GB) it loads all layers to the GPU and has no issues with LLocalSearch.

ImVexed added the bug label on Apr 12, 2024
@nilsherzig (Owner)

Hi :). I assume you're using the default 2k context window in open-webui? Until today, my project used a much larger context window where possible (as in the case of command-r). I just pushed an update that adds a new settings window, which allows you to adjust the context window. Please confirm whether this is what causes the increase in VRAM usage / decrease in offloaded layers.
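For reference, here is a minimal sketch of how a context-window setting reaches Ollama. The `num_ctx` option is part of Ollama's public `/api/chat` API; the struct names and hard-coded URL below are illustrative, not LLocalSearch's actual code:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// Illustrative request types; only the JSON field names match Ollama's API.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string         `json:"model"`
	Messages []chatMessage  `json:"messages"`
	Stream   bool           `json:"stream"`
	Options  map[string]any `json:"options"`
}

func main() {
	req := chatRequest{
		Model:    "command-r",
		Messages: []chatMessage{{Role: "user", Content: "hello"}},
		Stream:   false,
		// A larger num_ctx means a larger KV cache, so more VRAM per loaded
		// model and potentially fewer layers offloaded to the GPU.
		Options: map[string]any{"num_ctx": 2048},
	}
	body, _ := json.Marshal(req)

	resp, err := http.Post("http://ollama:11434/api/chat", "application/json", bytes.NewReader(body))
	if err != nil {
		// An abruptly killed or out-of-memory Ollama shows up here as an
		// EOF on the POST, like the error in this issue.
		fmt.Println("chat request failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```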

@nilsherzig (Owner)

If that's the case, I assume Ollama just ran out of memory on your system?


ImVexed commented Apr 14, 2024

Yes, it's certainly quicker when I lower the context window size, though it still seems to break. It froze here for maybe a minute or so while trying to pull info from the internet:

ollama        | [GIN] 2024/04/14 - 00:29:19 | 200 |  3.916807898s |      172.30.0.3 | POST     "/api/chat"
searxng-1     | 2024-04-14 00:29:19,854 WARNING:searx.engines.google: ErrorContext('searx/search/processors/online.py', 116, "response = req(params['url'], **request_args)", 'searx.exceptions.SearxEngineTooManyRequestsException', None, ('Too many request',)) False
searxng-1     | 2024-04-14 00:29:19,854 ERROR:searx.engines.google: Too many requests
searxng-1     | Traceback (most recent call last):
searxng-1     |   File "/usr/local/searxng/searx/search/processors/online.py", line 163, in search
searxng-1     |     search_results = self._search_basic(query, params)
searxng-1     |                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
searxng-1     |   File "/usr/local/searxng/searx/search/processors/online.py", line 147, in _search_basic
searxng-1     |     response = self._send_http_request(params)
searxng-1     |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
searxng-1     |   File "/usr/local/searxng/searx/search/processors/online.py", line 116, in _send_http_request
searxng-1     |     response = req(params['url'], **request_args)
searxng-1     |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
searxng-1     |   File "/usr/local/searxng/searx/network/__init__.py", line 164, in get
searxng-1     |     return request('get', url, **kwargs)
searxng-1     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
searxng-1     |   File "/usr/local/searxng/searx/network/__init__.py", line 95, in request
searxng-1     |     return future.result(timeout)
searxng-1     |            ^^^^^^^^^^^^^^^^^^^^^^
searxng-1     |   File "/usr/lib/python3.11/concurrent/futures/_base.py", line 456, in result
searxng-1     |     return self.__get_result()
searxng-1     |            ^^^^^^^^^^^^^^^^^^^
searxng-1     |   File "/usr/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
searxng-1     |     raise self._exception
searxng-1     |   File "/usr/local/searxng/searx/network/network.py", line 289, in request
searxng-1     |     return await self.call_client(False, method, url, **kwargs)
searxng-1     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
searxng-1     |   File "/usr/local/searxng/searx/network/network.py", line 272, in call_client
searxng-1     |     return Network.patch_response(response, do_raise_for_httperror)
searxng-1     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
searxng-1     |   File "/usr/local/searxng/searx/network/network.py", line 245, in patch_response
searxng-1     |     raise_for_httperror(response)
searxng-1     |   File "/usr/local/searxng/searx/network/raise_for_httperror.py", line 76, in raise_for_httperror
searxng-1     |     raise SearxEngineTooManyRequestsException()
searxng-1     | searx.exceptions.SearxEngineTooManyRequestsException: Too many request, suspended_time=3600
searxng-1     | 2024-04-14 00:29:22,329 ERROR:searx.engines.duckduckgo: engine timeout
searxng-1     | 2024-04-14 00:29:22,423 WARNING:searx.engines.duckduckgo: ErrorContext('searx/engines/duckduckgo.py', 118, 'res = get(query_url)', 'httpx.ConnectTimeout', None, (None, None, 'duckduckgo.com')) False
searxng-1     | 2024-04-14 00:29:22,423 ERROR:searx.engines.duckduckgo: HTTP requests timeout (search duration : 3.0941880460013635 s, timeout: 3.0 s) : ConnectTimeout
backend-1     | 2024/04/14 00:29:22 WARN Error downloading website error="no content found"

And after that went through, it got stuck in a loop:
[screenshot attached to the original issue]

Here are the full logs:
temp.log

@nilsherzig (Owner)

I'm pretty sure it ran out of context. 2k tokens isn't much. You can see an estimate of the current context size in the backend logs. I assume the format instructions aren't in the context anymore at that point, which results in the LLM ignoring the requested structure.
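To make that failure mode concrete, here is a rough sketch of such an estimate, assuming the common ~4-characters-per-token heuristic. The helper name and prompt strings are hypothetical; the real backend may count tokens differently:

```go
package main

import "fmt"

// estimateTokens approximates the token count of a prompt using the rough
// rule of thumb of ~4 characters per token. This is only a heuristic, not
// LLocalSearch's actual implementation.
func estimateTokens(text string) int {
	return len(text) / 4
}

func main() {
	const numCtx = 2048 // the default context window discussed in this thread

	// Hypothetical prompt parts; in practice, the scraped search results
	// are what grow until the window overflows.
	formatInstructions := "Answer using the following structure: ..."
	history := "... accumulated chat history and scraped search results ..."

	used := estimateTokens(formatInstructions + history)
	fmt.Printf("estimated context: ~%d / %d tokens\n", used, numCtx)

	if used > numCtx {
		// Once the prompt exceeds num_ctx, the oldest tokens (here the
		// format instructions) fall out of the window, and the LLM stops
		// following the requested structure.
		fmt.Println("warning: format instructions likely truncated")
	}
}
```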

@nilsherzig (Owner)

Closing in favor of #91.
