v2dns makes SRV records unusable #21325
Comments
Hey @setaou, thank you for the report. It definitely looks like a regression. I'm looking into it now and will report back with a fix or any additional information. For anyone else trying out 1.19 in the interim who is having problems with SRV records or anything DNS related, you can set |
@DanStough in case it wasn't clear from the description of the issue, the problem is not only about services having the same name: whichever string you prepend to the record, it will always be resolved.
dig hey.there.how.is.it.going.consul.service.ha.geant.net @127.0.0.1 -p8600 -t SRV +short
1 1 8300 test-consul02.node.test-geant.ha.geant.net.
1 1 8300 test-consul03.node.test-geant.ha.geant.net.
1 1 8300 test-consul01.node.test-geant.ha.geant.net. |
The title of the issue does not tell the whole story. When issuing a standard A lookup against Consul DNS, the tag is ignored and all registered instances are returned. Since #21336 was closed as a duplicate of this issue, this issue should clearly indicate that the problem is not limited to SRV records: filtering by tag via Consul DNS is generally broken. When I started noticing problems, I did not consider looking into this issue, because it did not resemble my observations. If we make that a little clearer in the title, other users might find the workaround faster than I did. I can confirm by turning |
@faryon93 I am going to re-open. |
Actually, querying SRV or A for |
Thanks all for the details. I've reproduced both the tag behavior (#21336) and the SRV results (this issue). I should have a PR up today or tomorrow to fix it. |
FWIW, we also found this change appears to break/confuse HAProxy DNS Discovery. This older page describes the setup, but the most relevant part is:
It looks like HAProxy constantly detects different IPs for the same service because the SRV records are all returning the same host name which is resolving to different IPs. That causes it to constantly flap between backend servers. We see a lot of messages in our HAProxy logs for a Consul service "dashboard" like:
|
This should be resolved with the linked PR. We're discussing putting out 1.19.1 sooner than expected. BOLO for that release. |
I should note that this version breaks HAProxy Enterprise dynamic backend configurations with Consul, so I would consider it a high priority to release the patch before bigger firms start having issues. Fortunately, my issues were on my dev cluster, and your configuration change seemed to fix my problems. |
Overview of the Issue
Since Consul 1.19 (with v2dns enabled by default), SRV requests return the same hostname for every allocation. If allocations use different ports, there is no way to tell which IP corresponds to which port, rendering DNS SRV queries useless.
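To illustrate why a shared SRV target is a problem for clients, here is a minimal Python sketch of how a consumer typically pairs SRV ports with A-record addresses. Everything in it is made up for illustration: the function name, the node names, the ports, and the 192.0.2.x addresses are assumptions, not actual Consul output, and the shared target name is only assumed to be the service name.

```python
def pair_ports_with_addresses(srv_records, a_records):
    """Pair each SRV (port, target) with the single address of its target.

    srv_records: list of (port, target) tuples, as parsed from an SRV answer.
    a_records:   dict mapping a target name to the list of IPs it resolves to.
    Raises ValueError when a target does not resolve to exactly one address,
    because the port can then no longer be matched to a specific host.
    """
    pairs = []
    for port, target in srv_records:
        ips = a_records.get(target, [])
        if len(ips) != 1:
            raise ValueError(
                f"{target} resolves to {len(ips)} addresses; "
                f"cannot tell which one listens on port {port}"
            )
        pairs.append((ips[0], port))
    return pairs


# Hypothetical data: one distinct target per allocation (the 1.18-style behavior).
SRV_PER_NODE = [
    (20001, "node-a.node.paris.consul."),
    (20002, "node-b.node.paris.consul."),
]
A_PER_NODE = {
    "node-a.node.paris.consul.": ["192.0.2.11"],
    "node-b.node.paris.consul.": ["192.0.2.12"],
}

# Hypothetical data: every SRV record points at the same name (the reported behavior).
SRV_SHARED = [
    (20001, "livetiler.service.paris.consul."),
    (20002, "livetiler.service.paris.consul."),
]
A_SHARED = {
    "livetiler.service.paris.consul.": ["192.0.2.11", "192.0.2.12"],
}

if __name__ == "__main__":
    print(pair_ports_with_addresses(SRV_PER_NODE, A_PER_NODE))
    try:
        pair_ports_with_addresses(SRV_SHARED, A_SHARED)
    except ValueError as err:
        print("ambiguous:", err)
```

With distinct targets, each port maps to exactly one address. With a shared target, the lookup yields several addresses per port and the pairing cannot be recovered.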
Reproduction Steps
Given a service "livetiler" tagged "prod" in the datacenter "paris" with 5 allocations on different hosts, here is the result of an SRV request on Consul 1.19.0:
Obviously, the hosts can be fetched using an A query on livetiler.service.paris.consul., but it is impossible to know which port corresponds to which host.
By contrast, on Consul 1.18 (or 1.19.0 with the v1dns option), an SRV request returns a different name for each allocation:
Consul info for both Client and Server
Client info
Server info
Operating system and Environment details
Ubuntu Linux 22.04 LTS, amd64
Log Fragments
n/a