-
Notifications
You must be signed in to change notification settings - Fork 556
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pkg/naming: shorten regexp when matching many similar names #670
Conversation
Welcome @abursavich! |
Hi @abursavich. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any big issue with that feature. Even if the configuration is templated, I can hardly a scenario where a valid configuration that already expected a regex would break with the simplified version of the expression.
/assign |
/ok-to-test |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: abursavich The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest |
Hey, @dgrisonnet! Please take another look. Tests are passing. |
cae9317
to
69ae2d1
Compare
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /close |
@k8s-triage-robot: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Fixes #668.
Pods from a deployment will have long names like:
The naive regexp that matches these pods is
foobar-59c8d74c75-9gv24|foobar-59c8d74c75-dhfbj|foobar-59c8d74c75-kggvk|foobar-59c8d74c75-nd8jm|foobar-59c8d74c75-nqctr|foobar-59c8d74c75-pxlnd|foobar-59c8d74c75-rnxk7|foobar-59c8d74c75-rpj7q|foobar-59c8d74c75-vl6n6|foobar-59c8d74c75-xhvp4
(len=239).A shorter regexp that factors out the shared prefixes and matches the same pods is
foobar-59c8d74c75-(9gv24|dhfbj|kggvk|n(d8jm|qctr)|pxlnd|r(nxk7|pj7q)|vl6n6|xhvp4)
(len=81).A similar set of only 5 pods achieves 59% compression. This example with 10 pods achieves 66% compression. At 20 pods it's ~71%, at 40 pods it's ~73%, at 80 pods it's ~74%, and at 200 pods it starts to converge to ~75%. The savings ultimately depend on the length of the deployment name, but even pods with short deployment names (e.g.
foobar
) still benefit because the replicaset hash (e.g.59c8d74c75
) contributes a lot of redundancy.Instead of trying to implement this transformation directly, this change lets the
regexp/syntax
package do all the heavy lifting. In some cases this doesn't result in the shortest possible string (e.g. some short prefix repetitions are more optimal than complete prefix factoring:nd8jm|nqctr
vsn(d8jm|qctr)
), but writing and maintaining a separate implementation isn't worth it to get at these crumbs.