SDK 5.3.1 WiFi still bugged in TCP/IP stack (IDFGH-14128) #14932
Comments
@filzek Could you share more detail to reproduce this issue? |
We have not yet found a straightforward way to replicate this process. However, our analysis so far indicates that btm_rrm_t is not being properly destroyed when the Wi-Fi/LWIP stack is reinitialized. This oversight results in the continuous creation of new tasks, leading to task duplication and potential resource exhaustion. |
The Wi-Fi layer still gets corrupted and stops working under complex multitasking workloads. We have had these problems across many versions, and we would like to know of any truly robust system running on the ESP32 today without network issues. For the last 4 years the problem seems to have stayed the same: the Wi-Fi stack halts, loses the connection, and never comes back. Now the TCP/IP layer has the same problems. SDK 5.3.1 is not okay. We really want to understand why keeping the connection working is this badly broken. Why do we need to create so many patches just to make the Wi-Fi and IP stack barely workable in a production environment? I am about to put up an offer of thousands of USD to show that the solutions are not working at all at the development level inside Espressif, and that chips which could be extremely effective cannot hold up in a production environment. Things have already gone too far, and no real answer has come to the table to solve it. Everyone at Espressif passes the problem to someone else and no one really owns it. It's time for someone to come aboard and solve the problem with the Wi-Fi and IP layer; thousands of offline devices that need to be powered off and on again is not a real solution for this kind of service. @euripedesrocha, can someone come aboard to solve this problem for real? |
Do you mean the older sdk versions (e.g. 5.2.x, 5.1.x) also have the same issue? |
@AxelLin No way. The last SDK with a stable, working Wi-Fi/IP stack is SDK 3; in all other versions Wi-Fi on the ESP32 is completely bugged and makes devices crash at random. Espressif knows all about it and has not come out publicly to describe the problem so far. The dev team tries to hide the problem as if nothing were happening. We always try to show them the problem so they can fix it, but they tend to turn a deaf ear instead of acknowledging that this is symptomatic and widespread, can occur at random, and needs a physical reset to make the hardware work again. This is why 3.1 is out for use, but what about the 1.1 and 3.0 hardware on the market, and the customers who aren't stable because of a software failure that hangs the Wi-Fi interface and makes it unrecoverable? |
So why can't the Wi-Fi/TCP stack survive, and why does it start to degrade at random? This happens everywhere, so why is it not clear what to do or what not to do? We can release code that gets stuck and halts all internal ESP32 registers even across a CPU restart; it stays a mess, so only a full power-down and power-on can recover the internal registers, and it is very simple to make it happen. This is related to the current chaos, but it is not the main point. The big question is why the Wi-Fi can't keep running and collapses. Why does it not report any information about the problem? |
@AxelLin I think the problem could start with something related to the software/hardware BLE/Wi-Fi coexistence; some things point in that direction. |
@filzek, could you attach an sdkconfig? We've also had some networking troubles, but coexistence seems to be working pretty well for us. |
@filzek |
Hi @hansw123 @bryghtlabs-richard @AxelLin

We are tracking the issue down to the lowest level possible, but we can't make the problem happen in bench development; so far it happens only in field deployments. In our tracking, the problem shows up as follows:

2 - Wi-Fi reconnection sometimes loses the IP and can't get it back, so a manual DHCP stop and start must be done, but the IP address must be cleared to all zeroes first.
3 - DHCP loses the IP while the Wi-Fi layer is still connected.

We track item 3 with a ping running continuously to the device's own IP on the STA interface: if it is connected, ping losses should be minimal and it is working, but if the IP stops answering pings, the lwIP/DHCP layer is somehow broken. Fix as in step 2.

When problem 2 happens, the Wi-Fi event handler acts on the IP event but there is no IP on the interface, so applying the fix described above from another part of the code resolves the issue. The log says it got the IP, but it really didn't, and there is no activity on any HTTP server, WebSocket, or any other part; DHCP simply doesn't work as intended, so forcing solution 2 fixes everything.

Item 1 is sometimes extremely difficult to track, because the Wi-Fi simply stops working without calling any disconnection or Wi-Fi event handlers. This makes it very difficult to track on field-deployed devices, so a set of supervisory IPs, pings, external actions, and layer checks is used to detect the break in the Wi-Fi, and then solution 1 is applied.

NimBLE on the latest SDK 5.4 (latest commit as of Feb 24, 2025) is working as intended, but there are still problems with asserts.

Tomorrow I will add the sdkconfig here. We have fixed the bugs with alternative corrections; it would be best if this could be fixed in the Wi-Fi driver, since the tracking above shows issues that could be easy for the Wi-Fi/lwIP team to patch. |
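For readers trying to reproduce the workaround in items 2 and 3 above, here is a minimal sketch of how the self-ping check and the forced DHCP renew might fit together. It is not the poster's actual patch. The names `dhcp_ops_t`, `ip_watchdog_t`, `dhcp_force_renew`, `ip_watchdog_feed`, and the `demo_*` backend are hypothetical; on ESP-IDF the three hooks would presumably wrap `esp_netif_dhcpc_stop()`, `esp_netif_set_ip_info()` with an all-zero `esp_netif_ip_info_t`, and `esp_netif_dhcpc_start()`. The failure threshold is also an assumption, since the comment does not say how many lost pings are tolerated.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hooks so the logic can be tried off-target. On ESP-IDF they
 * might wrap esp_netif_dhcpc_stop(), esp_netif_set_ip_info() with an
 * all-zero esp_netif_ip_info_t, and esp_netif_dhcpc_start(). */
typedef struct {
    bool (*dhcp_stop)(void *ctx);
    bool (*clear_ip)(void *ctx);
    bool (*dhcp_start)(void *ctx);
} dhcp_ops_t;

typedef struct {
    uint32_t fail_count;  /* consecutive failed checks */
    uint32_t fail_limit;  /* failures tolerated before forcing a renew (assumed) */
    uint32_t recoveries;  /* how many times a renew was forced */
} ip_watchdog_t;

/* Item 2: stop DHCP, zero the address, start DHCP again, in that order.
 * Stops at the first hook that reports failure. */
static bool dhcp_force_renew(const dhcp_ops_t *ops, void *ctx)
{
    return ops->dhcp_stop(ctx) && ops->clear_ip(ctx) && ops->dhcp_start(ctx);
}

/* Item 3: feed one self-ping result plus the current STA address.
 * An all-zero address counts as a failure even if the ping flag is set.
 * Returns true when this call triggered a forced renew. */
static bool ip_watchdog_feed(ip_watchdog_t *wd, bool ping_ok, uint32_t ip_addr,
                             const dhcp_ops_t *ops, void *ctx)
{
    if (ping_ok && ip_addr != 0) {
        wd->fail_count = 0;
        return false;
    }
    if (++wd->fail_count < wd->fail_limit) {
        return false;
    }
    wd->fail_count = 0;
    wd->recoveries++;
    (void)dhcp_force_renew(ops, ctx);
    return true;
}

/* Demo backend: records the call order instead of touching a network stack. */
static char demo_log[16];
static size_t demo_len;

static bool demo_record(char c)
{
    if (demo_len < sizeof demo_log) {
        demo_log[demo_len++] = c;
    }
    return true;
}
static bool demo_stop(void *ctx)  { (void)ctx; return demo_record('S'); }
static bool demo_clear(void *ctx) { (void)ctx; return demo_record('C'); }
static bool demo_start(void *ctx) { (void)ctx; return demo_record('G'); }

static const dhcp_ops_t demo_ops = { demo_stop, demo_clear, demo_start };
```

With `fail_limit` set to 3, the third consecutive failed check calls the three hooks in stop, clear, start order. Keeping the stack calls behind function pointers is only a convenience for trying the logic on a host; on a device the hooks could be called directly from a periodic task.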
Updating findings |
Answers checklist.
General issue report
SDK 5.3.1 has a very deep bug in the IP/Wi-Fi stack where TCP gets stuck while UDP keeps working.
The Wi-Fi also sometimes behaves strangely during de-handshake.
Using WebSocket makes the problem worse.
The only way to recover is to deinitialize lwIP and Wi-Fi and recreate everything.
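The report does not show the teardown code, so the following is only a sketch of what an ordered "tear down and recreate" sequence might look like. The step labels name real ESP-IDF functions, but their order, the stop-on-first-failure policy, and the helper names (`restart_step_t`, `run_restart`, `demo_step`, `demo_plan`) are assumptions for illustration; the demo steps are stand-ins that only count calls so the runner can be tried off-target.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    const char *name;   /* label for logging; names the call a real step might wrap */
    bool (*run)(void);  /* returns true on success */
} restart_step_t;

/* Runs the steps in order and stops at the first failure.
 * Returns the number of steps that completed (== n on full success). */
static size_t run_restart(const restart_step_t *steps, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (!steps[i].run()) {
            break;
        }
    }
    return i;
}

/* Demo stand-in: counts calls and fails at a chosen index (-1 = never). */
static int demo_calls;
static int demo_fail_at = -1;
static bool demo_step(void) { return demo_calls++ != demo_fail_at; }

/* Assumed order: tear down Wi-Fi and its netif, then bring both back up. */
static const restart_step_t demo_plan[] = {
    { "esp_wifi_disconnect",               demo_step },
    { "esp_wifi_stop",                     demo_step },
    { "esp_wifi_deinit",                   demo_step },
    { "esp_netif_destroy_default_wifi",    demo_step },
    { "esp_netif_create_default_wifi_sta", demo_step },
    { "esp_wifi_init",                     demo_step },
    { "esp_wifi_set_mode",                 demo_step },
    { "esp_wifi_set_config",               demo_step },
    { "esp_wifi_start",                    demo_step },
    { "esp_wifi_connect",                  demo_step },
};
#define DEMO_PLAN_LEN (sizeof demo_plan / sizeof demo_plan[0])
```

Returning the count of completed steps lets the caller log which stage failed and decide whether to retry or fall back to a reboot. A real teardown may want to continue past some failures (for example, a disconnect on an already-disconnected station), which this sketch does not attempt.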
Merge branch 'bugfix/fix_some_wifi_bugs_241024_v5.3' into 'release/v5.3'
fix(wifi): fix some wifi bugs 241024 v5.3
See merge request espressif/esp-idf!34420