Fix handling of errors returned by writev() #36

Merged
merged 1 commit into from
Mar 26, 2023

oxan
Owner

@oxan oxan commented Mar 22, 2023

This should fix the problems reported in #32 and #34.

@LZ7AA

LZ7AA commented Mar 23, 2023

I am happy to report that the stream has been stable for 8 hours now, though with glitches in the data. I will continue monitoring and report again. The glitches seem to be missing chunks of characters - message lines cut here and there.
Disconnecting and reconnecting the TCP and UART streams does not break operation. Tested with 2-4 simultaneous TCP clients: stable.
One weird thing I noticed, though: I use the RealTerm terminal to connect to the TCP stream, and I observe the TCP flow on my Mikrotik router.
Sometimes the client connects in "slow" mode, with 50 kbps and 4 TCP packets per second reported by the Mikrotik. The terminal screen updates about 2 times per second, hence "slow".
Other times the client connects in "fast" mode, with 70~90 kbps and 20 packets per second. The terminal screen updates much more frequently and scrolls up smoothly.
As far as I remember, a year ago (with the old version of the stream server) I was able to achieve 700 kbps throughput from the same UART on the same ESP8266.
Any thoughts Oxan?
I am still interested to try and compare the old version of the stream-server. Will you assist please to get it?
Thanks,
Kiril

@oxan
Owner Author

oxan commented Mar 23, 2023

The glitches seem to be missing chunks of characters - message lines cut here and there.

Can you try increasing the UART RX buffer size and the stream server buffer size?

uart:
  rx_buffer_size: 8192

stream_server:
  buffer_size: 8192

I think the stream server might currently be limited to transmitting (about) 60 times its buffer size per second; if you can confirm this fixes it I'll have another look at that.
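
A rough way to reason about this limit: if the server drains its buffer at most about 60 times per second (the estimate above, not a measured value), the buffer has to hold at least one sixtieth of a second of data. The Python sketch below turns a target rate into a minimum buffer size under that assumption. The function name `min_buffer_size` is hypothetical, and kbps is taken to mean kilobits per second.

```python
import math


def min_buffer_size(target_bits_per_second: float,
                    flushes_per_second: float = 60.0) -> int:
    """Smallest buffer size in bytes that can carry the target rate.

    Assumes the buffer is drained at most `flushes_per_second` times per
    second (an estimate from the discussion, not a measured value).
    """
    bytes_per_second = target_bits_per_second / 8
    return math.ceil(bytes_per_second / flushes_per_second)


if __name__ == "__main__":
    # Rates mentioned in this thread, read as kilobits per second.
    for kbps in (50, 170, 700):
        size = min_buffer_size(kbps * 1000)
        print(f"{kbps} kbit/s needs a buffer of at least {size} bytes")
```

By this estimate, 700 kbit/s would need at least 1459 bytes per flush.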

Sometimes the client connects in "slow" mode, with 50 kbps and 4 TCP packets per second reported by the Mikrotik. The terminal screen updates about 2 times per second, hence "slow".
Other times the client connects in "fast" mode, with 70~90 kbps and 20 packets per second. The terminal screen updates much more frequently and scrolls up smoothly.

That's weird, I can't immediately think of anything that would cause this.
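
One way to compare the two modes is the average amount of data per TCP packet, computed from the figures reported by the router. The Python sketch below assumes that kbps means kilobits per second and that the router's rate covers the same packets it counts; the helper name `avg_bytes_per_packet` is hypothetical.

```python
def avg_bytes_per_packet(kbit_per_second: float,
                         packets_per_second: float) -> float:
    """Average bytes carried per packet for a given rate and packet count."""
    bytes_per_second = kbit_per_second * 1000 / 8
    return bytes_per_second / packets_per_second


if __name__ == "__main__":
    # Figures reported for the "slow" and "fast" modes.
    print("slow:", avg_bytes_per_packet(50, 4))
    print("fast (low end):", avg_bytes_per_packet(70, 20))
    print("fast (high end):", avg_bytes_per_packet(90, 20))
```

With the reported numbers, slow mode works out to roughly 1560 bytes per packet and fast mode to roughly 440 to 560, which suggests that slow mode sends fewer but fuller packets. This does not identify the cause.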

I am still interested to try and compare the old version of the stream-server. Will you assist please to get it?

It is available in the async-tcp branch:

external_components:
  - source: github://oxan/esphome-stream-server@async-tcp

@LZ7AA

LZ7AA commented Mar 23, 2023

All right, weird enough. I increased both buffers to 8k. BTW, in my previous tests I played with the UART buffer only; I didn't know there was a parameter for the stream server too.
This reduced the glitches; they are almost gone. I get a glitch every 10 seconds or so. I will try increasing the buffers further and report back here.
Now the weirdness of the bandwidth is more pronounced. I can get the "fast" rate only once out of 10 tries. The "slow" rate is the same at 50kbps; the "fast" one went way up to 150-170kbps. The fast one is not stable, though. When I disconnect and reconnect the client (only 1 client, the RealTerm terminal) I can hardly get it to work fast, and when I do, it lasts for 10-30 seconds and then falls back to the choppy slow screen flow, and the rate drops to the (seemingly constant) 50kbps. On a few occasions I saw it go from fast to slow, then to fast and back to slow, with the fast period lasting a few seconds, while slow can run seemingly forever.
The fast rate is 180k (increased compared to the previous test with the 512 UART buffer) with 20 packets per second.
There are occasional moments when the rate drops to 0 for a second or 2 and the screen looks frozen (no input coming).

@LZ7AA

LZ7AA commented Mar 23, 2023

Now I managed to choke the ESP CPU to death by increasing both buffers further to 16384. The flow rate is reeeally slow and choppy: it is barely able to transfer a hundred characters and then goes completely silent for 4-5 seconds.
When I tried to reflash it (OTA), it got stuck at 10-15% and timed out. I have seen this before - CPU overload.
After a couple of dozen retries and power cycles I managed to revert it to the 8192 buffer. I have no access to the device: it is located on the rooftop close to the receiver antenna, but I am able to cut its power.

@LZ7AA

LZ7AA commented Mar 23, 2023

Test - async server.
Now I switched back to the old async server and everything seems quite OK, as it was.
There are no glitches in the messages at all. The rate is fast - 170kbps and going down to 100k, but this seems natural as the number of airplanes decreased this late in the evening.
I observed some hiccups in the flow. The flow stops for a second and then resumes at full speed, like a hiccup. It is not like the choppy flow of the new server when it goes slow. Those hiccups coincide with TCP packets going to HomeAssistant, so I guess the CPU gets busy when HomeAssistant polls for updates. It is very pronounced when I increase the UART buffer to 4096 (I didn't dare to increase it any further). I'm leaving it at 512, as this level is fast and stable.

Oxan, please keep the async server in the repository; I might need it in the future.

Any suggestions on how to proceed with diagnosing the new server?

@LZ7AA

LZ7AA commented Mar 25, 2023

Here I provide a graphic to illustrate the throughput from the ADSB receiver.
[Image: dump1090rate-48h]

As you can see, Thursday is on the new component and Friday is on the old async component.
The new component seems capped at 50 messages per second, with a rate of 50kbps.

@oxan
Owner Author

oxan commented Mar 26, 2023

Thanks for your input!

The new component seems capped at 50 messages per second, with a rate of 50kbps.

I can explain the 50 messages per second: it's a result of the design of the "new" socket-based version. I changed that on purpose, because with the old version the CPU sometimes got fully saturated with processing all the UART data and didn't get around to any of the normal ESPHome tasks anymore. I'd rather drop a few bytes than choke the entire CPU, though in this case it's clearly behaving suboptimally. I'll have to think a bit about how to get the best of both worlds...
What buffer sizes did you use? The throughput should be 50 times the smaller of the stream server buffer and the UART buffer, so if you used 8K buffers it should be 400kbps instead of 50kbps.
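
To make the arithmetic explicit, the Python sketch below computes this ceiling from the two buffer sizes. It assumes the buffers are sized in bytes and that at most one buffer's worth of data moves per loop iteration, 50 times per second; the function names are hypothetical. Under those assumptions, 50 × 8192 is 409,600 bytes per second, so the 400k figure corresponds to kilobytes per second (about 3.3 Mbit/s).

```python
def throughput_ceiling(uart_rx_buffer: int, stream_buffer: int,
                       iterations_per_second: int = 50) -> int:
    """Upper bound in bytes per second.

    Assumes at most one buffer's worth of data is moved per loop
    iteration, limited by the smaller of the two buffers.
    """
    return min(uart_rx_buffer, stream_buffer) * iterations_per_second


def to_kbit_per_second(bytes_per_second: float) -> float:
    """Convert bytes per second to kilobits per second (1 kbit = 1000 bits)."""
    return bytes_per_second * 8 / 1000


if __name__ == "__main__":
    for size in (512, 8192):
        ceiling = throughput_ceiling(size, size)
        print(f"{size}-byte buffers: {ceiling} B/s "
              f"= {to_kbit_per_second(ceiling)} kbit/s")
```

By this estimate even 512-byte buffers would allow about 205 kbit/s, well above the observed 50 kbps, so buffer size alone may not account for the cap.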

@oxan
Owner Author

oxan commented Mar 26, 2023

I'll merge this now, as it at least resolves the stream corruption, and I'll leave #32 open to further discuss the issue(s) at high baud rates.

@oxan oxan merged commit 104c4f7 into master Mar 26, 2023
@oxan oxan deleted the esp8266-disconnects branch March 26, 2023 15:38
@LZ7AA

LZ7AA commented Mar 26, 2023

The largest buffers I was able to test were 8k UART + 8k TCP. Then I went straight up to 16384 and it saturated the CPU (ESP8266).
Data corruption was almost gone (perhaps some data was still going missing between buffer handlings by the stream server), so I'm OK here. The overall performance was bad, though, compared to the async stream server.
I will continue the discussion in issue #32.
