DNSPolicy scale test #615

Merged: 1 commit into main from dnspolicy_scale_test, Feb 6, 2025
Conversation

@mikenairn (Member) commented Jan 13, 2025

Adds a DNSPolicy specific scale test using kube-burner.

Part of #928

Based on the existing scale test, but with a focus on DNSPolicy and shared hostnames being updated by multiple dns operator instances.

The workload will create multiple instances of the dns operator in separate namespaces (kuadrant-dns-operator-x), and multiple test namespaces (scale-test-x) that the corresponding dns operator is configured to watch. The number of dns operator instances and test namespaces created is determined by the JOB_ITERATIONS environment variable.
In each test namespace a test app and service are deployed, and one or more gateways are created, determined by the NUM_GWS environment variable. The number of listeners added to each gateway is determined by the NUM_LISTENERS environment variable.
Each listener hostname is generated from the listener number and the KUADRANT_ZONE_ROOT_DOMAIN environment variable. In each test namespace a dns provider credential is created; its type is determined by the DNS_PROVIDER environment variable, and additional environment variables may need to be set depending on the provider type.
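
As a rough sketch, a run might look like the following. The make target name is hypothetical (check this PR's Makefile for the actual entry point), and the variable values are examples only:

```sh
# Hypothetical invocation of the workload described above.
export JOB_ITERATIONS=2                       # dns operator instances / test namespaces
export NUM_GWS=1                              # gateways per test namespace
export NUM_LISTENERS=2                        # listeners per gateway
export KUADRANT_ZONE_ROOT_DOMAIN=example.com  # root domain for listener hostnames
export DNS_PROVIDER=aws                       # provider credential type; provider-specific
                                              # variables (e.g. AWS credentials) are also needed
make test-scale-dnspolicy                     # hypothetical target name
```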

Requires:

Comments/Thoughts:

  • Kube-burner does not have the concept of running workloads across multiple instances. This was one of the asks in this issue. It is probably possible to run multiple kube-burner tasks simultaneously using the same configuration in order to have multiple updates to the same record set from multiple clusters, but there would be no orchestration from kube-burner's POV (see the sketch after this list). It should also use a single Thanos instance instead of one deployed on each cluster.
  • For these workloads to be of any use we need good metrics and alerts that are expected to fire when things are not working. It's not a test suite with assertions on the state; rather, it expects alerts to fire in order to fail the test run.
  • Separating the DNS Operator specific templates/metrics/alerts into the dns operator repo makes sense as long as we have a similar scale test in that repo. TBD if we do want that.
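
A minimal sketch of the multi-instance idea from the first point, assuming one kubeconfig per cluster and a shared kube-burner configuration file (the config path is a placeholder, not from this PR):

```sh
#!/usr/bin/env bash
# Launch the same kube-burner workload against several clusters in parallel.
# There is no orchestration between the runs; they simply race to update the
# shared record set from their own cluster.
set -euo pipefail

CONFIG=${CONFIG:-scale_test/dnspolicy-config.yaml}  # placeholder path

for kubeconfig in "$@"; do                          # one kubeconfig per cluster
  KUBECONFIG="$kubeconfig" kube-burner init -c "$CONFIG" &
done
wait  # block until every run has finished
```

Invoked as e.g. `./multi-cluster-run.sh cluster1.kubeconfig cluster2.kubeconfig`; a single shared Thanos instance would then aggregate the metrics from all clusters.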

Alerts
A small list of alerts that I realised would be useful, but really there are probably hundreds required.

  • Alert when a gateway has not been assigned an address in an appropriate amount of time (can be hit quite easily when using kind if you only have a few IPs available). This isn't strictly a Kuadrant issue.
  • Alert when DNSRecords are in a failing state for a given amount of time (a sketch of this one follows the list).
  • Alert if the managers are restarting an unexpected number of times during the test run. I hit this as part of the DNSRecord scale test and wrote an alert for it here.
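
For illustration, the DNSRecord alert from the second point could look something like this sketch. The metric name `kuadrant_dnsrecord_ready` and its labels are assumptions, not the dns operator's actual metrics:

```sh
# Hypothetical PrometheusRule for the "DNSRecords failing" alert.
cat <<'EOF' | kubectl apply -f -
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: dnsrecord-failing
  namespace: monitoring
spec:
  groups:
    - name: dns-operator
      rules:
        - alert: DNSRecordFailing
          # Assumed metric: 1 when a DNSRecord's Ready condition is true.
          expr: kuadrant_dnsrecord_ready == 0
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "DNSRecord {{ $labels.name }} failing for over 10 minutes"
EOF
```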

@maleck13 left a comment

I didn't actually try this and would prefer if @trepel or another of the QE team took a look and approved, but the changes look good to me

@trepel (Contributor) commented Jan 20, 2025

I tried against an OCP cluster and Route53 and it worked as described, except for that missing github.com/kuadrant/dns-operator/config/observability?ref=main, but that's being discussed. What surprised me was that no matter how many GWs/HTTPRoutes are created there are just NUM_LISTENERS unique hostnames. That's by design if I got it right, and it seems to be working (checked with dig +short).
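
A sketch of that check, assuming hostnames of the form `api-<n>.$KUADRANT_ZONE_ROOT_DOMAIN` (the actual pattern is derived from the listener number, so adjust to match the workload's templates):

```sh
# Resolve each expected listener hostname; every gateway/HTTPRoute shares the
# same NUM_LISTENERS names, so the set of unique hostnames stays fixed.
for i in $(seq 1 "${NUM_LISTENERS:-1}"); do
  dig +short "api-${i}.${KUADRANT_ZONE_ROOT_DOMAIN}"
done
```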

@maleck13 commented
@mikenairn do we want to move this to ready and get it merged? @trepel, from your comment it seems good to merge?

@mikenairn force-pushed the dnspolicy_scale_test branch from 0be3332 to 82a7848 on January 29, 2025 12:01
@mikenairn marked this pull request as ready for review January 29, 2025 12:02
@mikenairn (Member, Author) commented

> What surprised me was that no matter how many GWs/HTTPRoutes are created there are just NUM_LISTENERS unique hostnames.

This is intentional for the workload being added. Since it's testing dns at scale, part of that is testing that multiple records, all contributing to the same dns name, work with increasing numbers of owners.

@mikenairn mentioned this pull request Jan 29, 2025
Commit message (9aaa301):

Adds a DNSPolicy specific scale test using kube burner.

The workload will create multiple instances of the dns operator in
separate namespaces(kuadrant-dns-operator-x), and multiple test
namespaces (scale-test-x) that the corresponding dns operator is
configured to watch.  The number of dns operator instances and test
namespaces created is determined by the `JOB_ITERATIONS` environment
variable.
In each test namespace a test app and service is deployed and one or
more gateways are created determined by the `NUM_GWS` environment
variable.  The number of listeners added to the gateway is determined by
the `NUM_LISTENERS` environment variable.
Each listener hostname is generated using the listener number and the
`KUADRANT_ZONE_ROOT_DOMAIN` environment variable.  In each test
namespace a dns provider credential is created, the type created is
determined by the `DNS_PROVIDER` environment variable, additional
environment variables may need to be set depending on the provider type.

Signed-off-by: Michael Nairn <[email protected]>
@mikenairn force-pushed the dnspolicy_scale_test branch from 82a7848 to 9aaa301 on January 29, 2025 13:04
@maleck13 commented
@trepel are you ok to approve and merge this?

@trepel (Contributor) left a comment

Sorry, this managed to get buried in my todo list. Yes, I am fine to merge this one

@trepel merged commit 4409d64 into main on Feb 6, 2025. 4 checks passed.