Reproduction Steps
Start the Consul agent
Register a service with a UDP health check with a timeout of 1s
Let the check pass once, then stop the service
The service is still reported as passing in Consul (logs, UI, API)
Overview of the Issue
On some machines we've experienced an issue where a UDP health check passes once and then hangs in the "passing" status even though the service has gone down and is no longer available. Monitoring the UDP traffic shows that Consul sends just one check and then stops sending any more. We're using a simple server that responds to an empty UDP payload with an empty UDP payload.
After adding extra logging to Consul, it seems the check just hangs at consul/agent/checks/check.go, line 835 in c1a887e.
Looking through the code further, it seems there is no actual timeout set for UDP checks. The dialer timeout set in `func (c *CheckUDP) Start()`, as far as I understand, relates to establishing the connection (which makes no sense for UDP, except perhaps as a timeout for resolving the endpoint address?) and is irrelevant to the actual `conn` timeouts, which should be set via `SetReadDeadline` in `func (c *CheckUDP) check()`?
However, just adding `SetReadDeadline` doesn't fix anything because of the check in consul/agent/checks/check.go, lines 836 to 839 in c1a887e, which I also do not understand at all. Why does the check actually pass on a read timeout? Doesn't a UDP health check expect a response?
Setting `SetReadDeadline` and deleting the "i/o timeout" error case (by the way, why does it check for a string value? I'm not a Go developer, but this seems odd to me) makes the check update correctly when the UDP endpoint goes up or down.
Since I'm not a Go developer, I might be grossly misinterpreting what the aforementioned functions do, but after making the changes I've described, Consul stopped misreporting the UDP service status for me.
Consul info for both Client and Server
Server info
Operating system and Environment details
Linux 5.4.0-198-generic #218-Ubuntu SMP Fri Sep 27 20:18:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux