Failure connecting to [ip]: dial tcp [ip]:[port]: connection timed out #443
Comments
After restarting my entire system, I'm able to get individual forwarders to connect, but they time out regardless of how large a timeout I set:
When this happens, my entire Logstash system appears to stop processing logs silently, and individual forwarders begin failing one by one until all of them are emitting these log messages. Any info on this at all? |
This error is usually caused by something in Logstash getting stuck (a slow or stalled filter or output), which prevents the inputs from processing more data and prevents the lumberjack input from acknowledging events. My guess would be that something in Logstash (or downstream of it, like an output) is stuck for you. Can you attach your config? |
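One quick way to check whether the pipeline itself is stuck is to temporarily swap the real outputs for stdout. This is only a sketch, assuming a lumberjack input on port 5043 and placeholder certificate paths, not the poster's actual config:

```
input {
  lumberjack {
    port => 5043
    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
  }
}

output {
  # Temporary stand-in for the real outputs: if events appear on stdout,
  # the blockage is in the original output (e.g. Elasticsearch), not the input.
  stdout { codec => rubydebug }
}
```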
I'm building up the 'x-message' flow to include one more filter, the elapsed filter, so that I can track time diffs between messages going in and out of my system. |
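For reference, the elapsed filter pairs a "start" and an "end" event by a shared ID and records the time difference between them; a minimal sketch (the tag names and ID field here are hypothetical, not taken from the thread):

```
filter {
  elapsed {
    start_tag       => "x-message-in"    # tag on the event entering the system
    end_tag         => "x-message-out"   # tag on the matching event leaving it
    unique_id_field => "message_id"      # field that pairs the two events
  }
}
```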
Here's the forwarder conf:
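(The attached config did not survive here; as a stand-in, a logstash-forwarder config generally has this shape, with hypothetical server, paths, and timeout:)

```
{
  "network": {
    "servers": [ "logstash.example.com:5043" ],
    "ssl ca": "/etc/pki/tls/certs/logstash-forwarder.crt",
    "timeout": 60
  },
  "files": [
    { "paths": [ "/var/log/app/*.log" ], "fields": { "type": "x-message" } }
  ]
}
```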
|
Let's see if Logstash is stuck. Are you seeing events flowing out of Logstash into your outputs? If so, how far behind real-time are they? Also, your config comment above omitted the output section. |
Oops, yep, I've got that in a different file:
30 - output file
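(That file's contents are not shown; for Logstash 1.5, an Elasticsearch output of the kind described might look like this, with a placeholder host:)

```
output {
  elasticsearch {
    host     => "es.example.com"
    protocol => "http"
    index    => "logstash-%{+YYYY.MM.dd}"
  }
}
```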
|
I actually just saw this message for the first time:
Clearly something I need to solve, but I'm a little afraid it's a new problem, since I hadn't seen it before. Thanks a bunch for the help. |
Not sure if the OOM is a symptom or a cause right now. There's a known issue with lumberjack/Logstash when an output is stuck, mentioned somewhat in this ticket and this ticket. In your case, if Elasticsearch is down or not accepting new documents for some reason (network or other fault), then Logstash will block until that resolves, lsf will keep retrying on timeouts, and Logstash will eventually OOM due to a bug mentioned in the above tickets; in that case, the OOM would be a symptom, not a cause. Is Elasticsearch healthy? |
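A quick way to answer that last question is the cluster health API (hostname and port here are placeholders):

```
curl -s 'http://es.example.com:9200/_cluster/health?pretty'
# "status": "green" (or "yellow") means the cluster is accepting writes;
# "red" or a connection failure points at Elasticsearch as the source of the backpressure.
```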
I'm not quite sure how to interpret these logs, but perhaps they're suggesting a node was restarted?
|
#323 will provide more info to debug such backpressure situations. Meanwhile, for more help you can create a topic at https://discuss.elastic.co/c/logstash/logstash-forwarder; GitHub is reserved for issues/enhancements. |
After upgrading to v1.5.0.rc2, all of my logstash-forwarders (on ~12 machines) are doing the following. They previously ran well for months.
Any ideas?