MQTT disconnects/reconnects due to missed KeepAlive interval to send ping #317
Comments
We encountered a serious problem while regression-testing the current version: as you mentioned, random disconnections, plus problems with reconnecting. For now we have reverted to the previous version. |
The older version basically relies on the rounding behaviour of Unix(), which leaves the PING something like 0.5s to be sent; that might not be enough on a heavily loaded system. The new code actually exposes the real problem behind it. @odedva You can also try https://github.com/eclipse/paho.mqtt.golang/pull/316/files |
We have actually been dealing for a very long time with connect/reconnect issues in bad-network scenarios with this client, mainly due to the nature of the publish channels, etc. |
I haven't tried it in a harsh network environment, but with the current implementation this 0.5s could be the issue. I'm not sure I understand what problems you have with the publish channels. Is there a ticket for them here? It is a bit strange that nobody has reacted to this major issue in the last 10 days. I won't be using the mqtt library in the near future, so someone else will have to carry on this fight :))) |
As per the spec, the server is supposed to allow 1.5 times the keepalive interval to receive a PINGREQ.
I can see this would be a problem if the keepalive interval is short. I appreciate the work in the associated PR, but I cannot merge it without a signed ECA. |
I've run into this when the client is under load and orderMatters is set to true (#210). We also found that when we removed ordering, the app would overflow goroutines under load. Not sure if this is still the case, but we ended up forking and modifying the client as a fix: https://github.com/meshifyiot/paho.mqtt.golang |
The issue is related to #300, which aimed to "Use monotonic time for keep alive".
Unfortunately, the originally used time.Unix() rounds the time to whole seconds, so you always had at least 0.5 seconds to send the Ping. With checkInterval always set to KeepAlive/2, every second check therefore succeeded and sent the appropriate Ping message.
Currently, I get "random" MQTT disconnects/reconnects due to missed Pings. Because of the higher precision and the lack of "rounding", the code now sends either one or zero Ping messages per KeepAlive interval.
However, I don't believe the original code was intended to rely on rounding to whole seconds in order to send the Ping message.
Thus, I've split the KeepAlive interval into 5 checks, so the Ping is sent in the last fifth (at the 4th check) of the KeepAlive interval.
Pull request: #316
Could you please review and approve it? I don't want to create another account for Eclipse :)