Address the telnet console death bug. #2347
Conversation
…re stuck on drain then the buffer isn't getting emptied. Five seconds after drain() blocks, an exception will be thrown and the client will be removed from the connection table, so it will no longer be a problem.
Oops, just noticed it's a 10 second timeout and not 5.
One additional note about the fix: I switched to looking up items by key instead of by value, and to iterating over a list of the keys, so that we can safely remove items from the dictionary while traversing it.
@spikefishjohn so merging this pull request will completely fix the console freeze bug when using telnet?
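For context, a tiny illustration of why iterating over a copy of the keys matters; this is a generic Python example, not code taken from the patch:

```python
# Deleting from a dict while iterating over it directly raises
# "RuntimeError: dictionary changed size during iteration".
connections = {"client-a": object(), "client-b": object()}

# Iterating over a snapshot of the keys avoids that, so stalled
# clients can be dropped from the table mid-loop.
for key in list(connections.keys()):
    if key == "client-b":          # pretend this client timed out
        del connections[key]

print(connections)                 # only 'client-a' remains
```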
It does for me. We may want to lower the timeout, as 10 seconds feels a little long. That being said, the patch could use additional testing. Do you have a way to replicate the issue, by the way?

The way I recreate it is to clear the firewall's connection table for the telnet session (I'm accessing GNS3 through a firewall, to be clear). This causes the firewall to silently drop the TCP session. Then open a second telnet session and do something that creates a large amount of output on the console; I use find / on a Linux node, for example. Without the patch the console will output data for about 10 seconds or so and then lock up. You'll be able to open new sessions, get connected and even send data, but you won't see anything happening on the console because the echo of your data never gets sent back to you. With the patch there will be a pause while GNS3 tries to communicate with the dead telnet session; after 10 seconds it gives up, deletes that session, and then session 2 starts working again, as does any new session.

Also note that the next version of GNS3 will have its TCP keepalive timers lowered so that it sends keepalives more often than every 2 hours. This will prevent a firewall from killing a session for being idle too long, but it won't fix the broken console issue.
I can only replicate this by leaving the telnet console session open for a while (no activity for several hours, I think) until I receive a "connection refused" message.
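For reference, this is roughly what lowering TCP keepalive timers looks like at the socket level on Linux. The option names are standard Python socket constants; the intervals are placeholders, not the values GNS3 will actually ship with:

```python
import socket

def enable_fast_keepalive(sock: socket.socket) -> None:
    """Send TCP keepalive probes far more often than the ~2 hour default.

    The intervals below are illustrative only.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Linux-specific knobs: start probing after 60s of idle time,
    # probe every 10s, and give up after 5 unanswered probes.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)
```

Keepalives only detect a silently dropped session after the probes go unanswered, which is why they help with idle timeouts but not with the frozen console itself.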
Hmmm, connection refused. OK, I can't say for sure this will address your issue, only because I don't know what happens if the telnet console is left in a bad state for long enough. I personally have never seen it throw "connection refused", which would mean the listening socket closed. That being said, if the GNS3 node keeps trying to push data out and can't write it, at some point I would expect the service to do something bad (like crash), so it's possible. If you can give me more details I'll see if I can replicate it, or by all means give the update a try and report back. For me to replicate it, can you tell me how you're accessing the telnet console, network topology wise? To give you an idea of the bullet points I'm looking for.
Sorry I got a bit confused. I have two different issues related to telnet console connections.
I have two different setups for GNS3.
The first issue, the telnet console freeze, happens only with the remote server setup, not when running both the server and the client on the same machine.
Ah OK. For sure, this will fix issue number 1; I'll bet my burrito on it. Please give the patch a try to see if I owe you one. Issue 2 sounds like a completely different thing, like a timing issue where it hasn't waited for the device to return to running, or maybe for the console to have started. I would make a new bug report for that.
@grossmj does the 3.0 branch need this patch applied, or should the other pull request using the telnetlib3 rewrite in theory fix this annoying issue?
The patch will be merged into 3.0, and the other pull request using telnetlib3 should make it there as well once it has been sufficiently tested.
EDIT: This is completely wrong. Doh! I need to stop the late night bug research.

I did notice something: this calls except if it fails, which calls close(), which then does this:

```python
async def close(self):
    for writer, connection in self._connections.items():
        try:
            writer.write_eof()
            await writer.drain()
            writer.close()
            # await writer.wait_closed()  # this doesn't work in Python 3.6
        except (AttributeError, ConnectionError):
            continue
```

So my theory is the drain() call needs a wait_for as well. I'm going to see if I can recreate the problem and then take a swing at it again.
Add wait_for for the drain() call with a 5 second timeout. If we're stuck on drain then the buffer isn't getting emptied. Five seconds after drain() blocks, an exception will be thrown and the client will be removed from the connection table, so it will no longer be a problem. There will be a pause in console output during this 5 second wait; after that the console will return to normal.
When the exception is triggered, the following debug message will be logged.
2024-02-03 03:07:15 DEBUG telnet_server.py:309 Timeout while sending data to client: ('REMOTE_IP', REMOTE_PORT), closing and removing from connection table.
It just dawned on me that peerinfo() is a list, so if you want that reformatted it would be easy enough to do.
Fixes Issue 2344
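Putting the pieces together, here is a minimal sketch of the send path described above. The function name, connection-table shape and logger setup are illustrative, not the actual telnet_server.py code:

```python
import asyncio
import logging

log = logging.getLogger(__name__)

async def send_to_clients(connections: dict, data: bytes, timeout: float = 5.0):
    """Write data to every connected telnet client.

    A client whose send buffer never drains (e.g. its TCP session was
    silently dropped by a firewall) is logged, closed and removed so it
    cannot stall the other consoles.
    """
    # Iterate over a snapshot of the keys so stalled clients can be
    # deleted from the table while it is being traversed.
    for writer in list(connections.keys()):
        writer.write(data)
        try:
            await asyncio.wait_for(writer.drain(), timeout=timeout)
        except (asyncio.TimeoutError, ConnectionError):
            peer = writer.get_extra_info("peername")
            log.debug("Timeout while sending data to client: %s, closing and "
                      "removing from connection table.", peer)
            writer.close()
            del connections[writer]
```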