Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(kuma-dp): prevent watchers from being cleaned up unexpectedly #12886

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

jijiechen
Copy link
Member

@jijiechen jijiechen commented Feb 19, 2025

Motivation

fixing potential xDS update lost by unexpected watcher cleanups

Implementation information

Only allow a single active connection between a DP and the CP:

  • [this change] In the pkg/xds/server/callbacks/dataplane_callbacks.go, check if there is existing active stream for a dataplane: if there is, reject this connect attempt.
  • [this change] In the pkg/xds/server/callbacks/dataplane_sync_tracker.go, OnProxyConnected will wait for the actual cleanup to complete and then return
  • [existing behaviour] the DP will then retry connecting in a loop-backoff manner
  • [known fact] the DP may reach other CP instances which do not have this stale state

This does not change the behaviour of current CP XDS server as we never supported multiple xDS streams from a DP.

Supporting documentation

fixes #12885

…e polling on the state checking and make it synchronous to unblock other proxies

Signed-off-by: Jay Chen <[email protected]>
@jijiechen jijiechen added the ci/run-full-matrix PR: Runs all possible e2e test combination (expensive use carefully) label Feb 19, 2025
@jijiechen jijiechen requested a review from a team as a code owner February 19, 2025 06:31
Copy link
Contributor

Reviewer Checklist

🔍 Each of these sections need to be checked by the reviewer of the PR 🔍:
If something doesn't apply please check the box and add a justification if the reason is non obvious.

  • Is the PR title satisfactory? Is this part of a larger feature and should be grouped using > Changelog?
  • PR description is clear and complete. It Links to relevant issue as well as docs and UI issues
  • This will not break child repos: it doesn't hardcode values (.e.g "kumahq" as an image registry)
  • IPv6 is taken into account (.e.g: no string concatenation of host port)
  • Tests (Unit test, E2E tests, manual test on universal and k8s)
    • Don't forget ci/ labels to run additional/fewer tests
  • Does this contain a change that needs to be notified to users? In this case, UPGRADE.md should be updated.
  • Does it need to be backported according to the backporting policy? (this GH action will add "backport" label based on these file globs, if you want to prevent it from adding the "backport" label use no-backport-autolabel label)

@lahabana
Copy link
Contributor

There's already code that attempts to do this in xdsCallbacks I believe if anything needs to change it's in there

Signed-off-by: Jay Chen <[email protected]>
…nowing that they need to send channel signal from a goroutine

Signed-off-by: Jay Chen <[email protected]>
@jijiechen jijiechen requested a review from Icarus9913 February 20, 2025 06:23
@jijiechen jijiechen enabled auto-merge (squash) February 21, 2025 12:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci/run-full-matrix PR: Runs all possible e2e test combination (expensive use carefully)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

dataplane proxies do not get mTLS identity cert and other xDS resources updated
3 participants