[Bug]: Repeated clicking on the same file eventually leads to server "Too many open files" error and session becomes unusable #4538
Comments
I did some debugging into this. It appears that many unix sockets to the docker daemon socket are being opened per "click". There appears to be a leak somewhere. |
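One way to confirm this kind of observation from inside a process is to count its open descriptors before and after an action. The following is a minimal sketch, not code from the project: count_open_fds is a hypothetical helper, and the socketpair only simulates a leak.

```python
import os
import socket
import stat


def count_open_fds():
    """Return (total, sockets) for file descriptors open in this process."""
    # /proc/self/fd exists on Linux; /dev/fd is the macOS/BSD equivalent.
    fd_dir = "/proc/self/fd" if os.path.isdir("/proc/self/fd") else "/dev/fd"
    total = sockets = 0
    for name in os.listdir(fd_dir):
        try:
            mode = os.fstat(int(name)).st_mode
        except (OSError, ValueError):
            # The descriptor used to list the directory is already closed.
            continue
        total += 1
        if stat.S_ISSOCK(mode):
            sockets += 1
    return total, sockets


before = count_open_fds()
# Simulate a leak: two connected sockets that are opened and not yet closed.
leaked = socket.socketpair()
after = count_open_fds()
print("before:", before, "after:", after)

# Closing the sockets returns the counts to their starting values.
for s in leaked:
    s.close()
```

If the socket count keeps growing after every click and never drops back, descriptors are being leaked rather than merely held briefly.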
Every time the docker container log is retrieved, a bunch of unix socket FDs are leaked. The issue is the same as what's reported in docker/docker-py#3282. As far as I can tell, the best way to fix this would be to:
The other leak of the TCP socket is from
|
This issue should now be resolved on main |
Bug is less severe, but it's still fundamentally there, take a look at the output of
TCP CLOSE_WAIT is the one that's easy to solve, by setting the connection header to CLOSE. |
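As a rough illustration of the header mentioned above, assuming it refers to the standard HTTP "Connection: close" request header: the sketch below starts a throwaway local server and sends one request that asks the server not to keep the connection alive. The server, handler, and port are stand-ins for illustration, not the OpenHands client or runtime.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    # HTTP/1.1 keeps connections alive unless told otherwise.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


# Port 0 lets the OS pick a free port for this throwaway server.
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
# Ask the server to close the TCP connection once the response is sent.
conn.request("GET", "/", headers={"Connection": "close"})
resp = conn.getresponse()
status = resp.status
payload = resp.read()

# The client still has to close its own end of the socket.
conn.close()
server.shutdown()
server.server_close()
```

The header only tells the peer not to reuse the connection; each side still needs to close its own socket for the descriptor to be released.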
The TCP isn't being leaked, so now the only remaining accumulating leak is the unix docker sockets. |
@diwu-sf is this bug finally resolved? tofarr put in a few fixes. |
No, there's still a Unix socket leak to the docker socket due to the log streamer. Use the same repo and you should see that Unix sockets still accumulate per click. |
Was able to see the leakage by following lsof -p for this process, and clicking on files:
~/.cache/pypoetry/virtualenvs/openhands-ai-uYxnY0EM-py3.12/bin/python -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=5, pipe_handle=7) --multiprocessing-fork
Strange that the uvicorn proc itself doesn't own the leak, just this one. Which I guess points to the leak being in a thread? |
It's the docker log streamer thread. It never actually closed the log line generator. |
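To illustrate why an unclosed generator can hold a socket open, here is a minimal sketch in plain Python. It is not the docker SDK API: stream_lines is a hypothetical stand-in for a log line generator, and a socketpair stands in for the connection to the docker daemon.

```python
import socket


def stream_lines(sock):
    """Yield decoded lines from sock and close it when the generator ends."""
    try:
        buffer = b""
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            buffer += chunk
            while b"\n" in buffer:
                line, buffer = buffer.split(b"\n", 1)
                yield line.decode()
    finally:
        # Runs on exhaustion, on an exception, and on generator.close().
        sock.close()


reader, writer = socket.socketpair()
writer.sendall(b"first\nsecond\n")

lines = stream_lines(reader)
first = next(lines)  # the consumer stops early, as a streamer thread might
lines.close()        # raises GeneratorExit at the yield, so the finally block runs
writer.close()
```

Without the explicit close() call, the socket stays open until the generator object is garbage collected, which may be much later in a long-lived thread. Whether closing the generator is enough for the real docker SDK stream is a separate question.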
OK I fixed several leakage issues, but this one persists 🙃 I did start calling for posterity, you can watch the leakage with
All the leaks go away if you make LogBuffer a null class (replace all method logic with |
I've further confirmed that if you don't instantiate the log_generator in LogBuffer, the leak goes away.
I've also confirmed that we're closing every log_generator we create.
TBH at this point I'm assuming there's a bug in the docker SDK. |
There is a bug, look at the SDK reference I had earlier in this thread. There's also a workaround fix for the leak in that docker issues thread. |
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days. |
This issue was closed because it has been stalled for over 30 days with no activity. |
I am trying to run inference on swe-bench-lite and hit a "too many open files" error, using OpenHands version 0.21.0.
|
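For anyone hitting the same error, a quick way to see how much headroom a process has is to read its open-file limit. This is a small sketch using only the Python standard library on Unix-like systems; it does not address the leak itself, and raising the limit would only postpone the error.

```python
import errno
import resource

# RLIMIT_NOFILE bounds the number of descriptors this process may hold open.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open-file limit: soft={soft}, hard={hard}")

# "Too many open files" is how the OS reports EMFILE once the soft limit is hit.
emfile_name = errno.errorcode[errno.EMFILE]
print("errno name:", emfile_name)
```

Comparing the soft limit with the descriptor count reported by lsof -p gives an idea of how many clicks it takes before the server fails.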
Is there an existing issue for the same bug?
Describe the bug and reproduction steps
Note, I searched for the error message "Too many open files" and didn't see any open issue against this error.
Repro steps:
do nothing
/workspace
, just a few files will do
Eventually, the UI and backend server become broken.
In the UI, the message "Failed to fetch file" will show up.
In the backend, when running with
DEBUG=1 make run
this error message shows up:
OpenHands Installation
Docker command in README
OpenHands Version
main
Operating System
MacOS
Logs, Errors, Screenshots, and Additional Context
No response