Error code: ERROR_EXCEEDED_MAXIMUM_ACK_TIMEOUT_COUNT #617
Comments
Can you clarify which thread? Is it the bellows-specific thread? Is the EventLoop thread required for bellows, and what is the impact if we disable it? [EDIT]
Yes, it's a part of bellows.
It's delivered by bellows in its thread.
If the thread is disabled, you'll get more errors like this.
To summarize your assumption: the thread that runs the EventLoop does not get enough CPU time, so all tasks in the EventLoop are delayed.
I agree with you. My production system, which runs many integrations, has always had problems losing contact with 2 Chromecasts, and it started losing contact with the EZSP some months ago, while still running stable firmware (6.10.x) with 80 devices.
@MattWestb we are not using HA; we are using zigpy with the Zigbee4Domoticz plugin, which is Python embedded in a C++ application. The user who is having these issues has more than 90 devices (70 routers), and there is quite heavy Zigbee traffic. @puddly may I suggest adding the following lines to the EventLoop? That would let us instrument it and confirm whether a task is delayed by more than xx ms. I don't know how long the EZSP waits for the ACK.
PS: If the EZSP expects an ACK, why are we not processing this request synchronously? [EDIT] I have investigated the system where we have the issue and found a matching pattern: a database backup starts at the same time. So most likely the system is heavily loaded in terms of both I/O and CPU, and less time is given to the threads and the EventLoop to do their work.
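The lines originally proposed are not preserved in this thread. As a rough illustration of the idea only, here is a minimal sketch of an event loop lag monitor built on plain `asyncio`. The names `monitor_loop_lag` and `demo`, and the 100 ms threshold, are hypothetical and chosen for illustration; none of this is part of bellows or zigpy, and the real ACK deadline is not stated in this thread.

```python
import asyncio
import time


async def monitor_loop_lag(interval=0.05, threshold=0.1, samples=3, report=print):
    """Sleep for `interval` repeatedly and measure how late each wake-up is.

    A wake-up later than `threshold` seconds means something kept the event
    loop from running (a blocking call, or the thread not being scheduled).
    Returns the worst lag observed, in seconds.
    """
    loop = asyncio.get_running_loop()
    worst = 0.0
    for _ in range(samples):
        expected = loop.time() + interval
        await asyncio.sleep(interval)
        lag = max(0.0, loop.time() - expected)
        worst = max(worst, lag)
        if lag > threshold:
            report(f"event loop delayed by {lag * 1000:.0f} ms")
    return worst


async def demo():
    """Starve the loop on purpose so the monitor has something to report."""
    messages = []
    task = asyncio.create_task(monitor_loop_lag(report=messages.append))
    await asyncio.sleep(0)   # let the monitor start its first sleep
    time.sleep(0.3)          # blocking call: nothing else can run meanwhile
    worst = await task
    return worst, messages


if __name__ == "__main__":
    worst, messages = asyncio.run(demo())
    print(f"worst lag: {worst * 1000:.0f} ms", messages)
```

In a real deployment the monitor would presumably run for the lifetime of the loop rather than a fixed number of samples, and its reports could be correlated with the timestamps of the ACK timeout errors.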
I guess we are talking about:
Do we have any idea what MAXIMUM_ACK_TIMEOUT_COUNT is and how much time it represents?
After investigation, the problem we are facing is that the GIL is not being released to the other threads because of blocking I/O during a backup. We are trying to fix that on the Domoticz application side.
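One way to check this kind of starvation from the Python side is to measure how late a worker thread wakes from a short sleep. The sketch below is only an illustration under that assumption; `measure_wake_lateness` and `lateness_in_thread` are hypothetical names, not part of Domoticz, zigpy, or bellows. Note that it cannot tell GIL contention apart from ordinary OS scheduling delay: it only shows that the thread did not run when it wanted to.

```python
import threading
import time


def measure_wake_lateness(interval=0.01, samples=5):
    """Return how many seconds late each `time.sleep(interval)` wake-up was.

    `time.sleep` releases the GIL while sleeping, but the thread must get the
    GIL back before it can continue. If another thread (for example the host
    application) holds the GIL for a long time, the wake-up is late.
    """
    lateness = []
    for _ in range(samples):
        start = time.monotonic()
        time.sleep(interval)
        elapsed = time.monotonic() - start
        lateness.append(max(0.0, elapsed - interval))
    return lateness


def lateness_in_thread(interval=0.01, samples=5):
    """Run the measurement in a separate thread, as a plugin thread would."""
    result = []
    worker = threading.Thread(
        target=lambda: result.extend(measure_wake_lateness(interval, samples)),
        daemon=True,
    )
    worker.start()
    worker.join()
    return result


if __name__ == "__main__":
    for value in lateness_in_thread():
        print(f"woke up {value * 1000:.1f} ms late")
```

Logging these values while a backup is running, and comparing them with an idle period, would give a rough indication of whether the plugin threads are being held back.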
Since the recent versions of zigpy and bellows, we have been seeing this issue.
If we fall back to the previous version, we do not get this error.
We have put an automatic restart in place for when the connection is lost, but it is frustrating not to understand the root cause.
A full log file is available here. This is a large network with more than 90 devices.
zigpy/zigpy#1378 (comment)