
Update prometheus-federator to use dependencies from k8s 1.32.X+ #141

Open - wants to merge 1 commit into base: main from the full-update branch

Conversation

@alexandreLamarre force-pushed the full-update branch 6 times, most recently from ccd3fcd to 50a263a on January 13, 2025 19:56
pkg/chart/data.go (outdated review thread, resolved)
scripts/build-chart (outdated review thread, resolved)
@alexandreLamarre changed the title from "Update prometheus-federator to use dependencies from k8s 1.30+" to "Update prometheus-federator to use dependencies from k8s 1.32.X+" on Jan 14, 2025
@alexandreLamarre marked this pull request as ready for review on January 15, 2025 15:39
@alexandreLamarre requested a review from a team as a code owner on January 15, 2025 15:39
go.mod (3 outdated review threads, resolved)
@mallardduck (Member):

BTW - I just realized that since this is targeting 1.32, we should ideally wait to ship this along with Rancher 2.11, which is what officially supports 1.32. We're also using RC libraries intended for Rancher 2.11, which makes it feel wrong to include this so soon. Because we don't do branching right now, we can't merge this into main and expect it not to affect the February release cycle.

So what if you changed this to target 1.31 and Rancher 2.10 based libraries? That way it doesn't have to be something we only do in 2.11. Otherwise we have to somehow implement a branching strategy and do partial reverts before the February release cycle.

@alexandreLamarre (Contributor, Author):

@mallardduck The intention was always to bump dependencies as aggressively as possible, branch off the main branch, and downgrade dependencies selectively.

@mallardduck (Member):

@alexandreLamarre - I guess that wasn't super clear. Either way works in the end I guess.

However, after merging this we need to be careful about what tag is used to test main. If main needs to match Rancher main, then the tag needs to logically match that version and leave a gap. That way we have a major/minor that correlates to the Rancher version, with versions under it that we can use to fill in gaps later.

i.e. after merging this, the tag for main becomes 6.x by default and covers 2.11; so in turn 5.x is for 2.10, and so on.

I think my misunderstanding there may also mean my opinion on change ordering is different here. Specifically, I think we do need to sort out my effort to split the charts + image code first, since having that complete will make us better prepared for the branching once it's implemented here.

@alexandreLamarre force-pushed the full-update branch 3 times, most recently from 5e3a49e to 0094819 on January 29, 2025 19:31
Review comment (Member):

Is this worth keeping? It looks commented out.

# condition that causes the helm-locker tests to fail, because either controller can acquire and harden a helm-release.

echo "Running helm-locker integration tests"
KUBECONFIG=$KUBECONFIG go test -cover ./internal/helm-locker
Review comment (Member):

🤔 KUBECONFIG=$KUBECONFIG - I don't think this is needed?

Or does having KUBECONFIG=something set help the integration tests work better? If so, what if it was done as export KUBECONFIG=$KUBECONFIG in the version script? I think that would keep that same context consistent for any make scripts.

@@ -148,7 +148,7 @@ helmProjectOperator:
   job:
     image:
       repository: rancher/klipper-helm
-      tag: v0.7.0-build20220315
+      tag: v0.9.3-build20241008
Review comment (Member):

I think that rancher/klipper-helm:v0.9.4-build20250113 will match the helm-controller we're targeting with this branch.

managedCrds := common.ManagedCRDsFromRuntime(f.RuntimeOptions)

// dynamically read instance type to override the crds required by automatically identifying them
// and always disable helm controller crd management since the cluster-scoped one should always manage those
Review comment (Member):

This part may want to be removed so users can control it.

I'm worried that, if:

  • the user disabled the k3s/rke2 helm-controller, and
  • the user has namespaced helm-controller mode set up, and
  • the user has DETECT_K3S_RKE2=true - because one of the namespaced controllers owns the CRDs, and
  • the user manually set helmController.enabled=true so PromFed is the controller,

then this change will be a breaking change. Maybe that's fine, but at the very least we need to document / release-note it so users in that situation know:

  • They should also deploy a helm-controller in namespaced mode to cover PromFed too.

Review comment (Member):

Maybe I'm just confused about the wording of the second line of the comment here?

Review comment (Contributor, Author):

I can double-check, but I'm fairly certain this matches 1:1 the existing control flow in main.

I also want to double-check that when we say "namespaced helm-controller" we mean one that watches all namespaces but has the controller-name set to something prometheus-federator-specific -- since the embedded helm-controller expects to watch across arbitrary namespaces.

Review comment by @alexandreLamarre (Contributor, Author), Jan 31, 2025:

onK3sRke2 := k8sRuntimeType == "k3s" || k8sRuntimeType == "rke2"
if onK3sRke2 {
	logrus.Debug("the cluster is running on k3s (or rke2), `helm-controller` CRDs will not be managed by `prometheus-federator`")
}
return !onK3sRke2

Every time we attempt to create CRDs in main we do the following, which automatically disables helm-controller CRDs:

if shouldManageHelmControllerCRDs(cfg) {
	crdDefs = append(crdDefs, helmControllerCrdDefs...)
}

Review comment (Contributor, Author):

Note that here, in the shared CRD abstraction:

managedCrds = common.ManagedCRDsFromRuntime(common.RuntimeOptions{
	DisableEmbeddedHelmLocker:     f.RuntimeOptions.DisableEmbeddedHelmLocker,
	DisableEmbeddedHelmController: true,
})

We don't actually disable the helm-controller (since we're not overwriting the config passed to the operator); we just tell the CRD tracker not to track CRDs related to helm-controller. Users can still have the embedded helm-controller running, but prom-fed won't manage those CRDs.
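
For context, here is a minimal sketch of how a shared CRD abstraction along these lines can filter which CRDs get tracked without touching whether the embedded controllers run. The type and function names mirror the snippets quoted above; the CRD lists, field layout, and bodies are illustrative assumptions, not the repository's actual code.

// Sketch only: illustrates the idea behind the shared CRD abstraction.
package main

import "fmt"

type RuntimeOptions struct {
	DisableEmbeddedHelmLocker     bool
	DisableEmbeddedHelmController bool
}

// crdDef stands in for whatever CRD definition type the real package uses.
type crdDef struct {
	Group string
	Kind  string
}

// Assumed CRD lists for illustration.
var (
	helmLockerCRDs     = []crdDef{{Group: "helm.cattle.io", Kind: "HelmRelease"}}
	helmControllerCRDs = []crdDef{{Group: "helm.cattle.io", Kind: "HelmChart"}, {Group: "helm.cattle.io", Kind: "HelmChartConfig"}}
)

// ManagedCRDsFromRuntime returns only the CRD definitions this operator should
// apply and track; disabling tracking here does not stop the controller itself.
func ManagedCRDsFromRuntime(opts RuntimeOptions) []crdDef {
	var managed []crdDef
	if !opts.DisableEmbeddedHelmLocker {
		managed = append(managed, helmLockerCRDs...)
	}
	if !opts.DisableEmbeddedHelmController {
		managed = append(managed, helmControllerCRDs...)
	}
	return managed
}

func main() {
	// As in the quoted call site: keep helm-locker tracking driven by the flag,
	// but never track helm-controller CRDs through this abstraction.
	crds := ManagedCRDsFromRuntime(RuntimeOptions{DisableEmbeddedHelmController: true})
	fmt.Println(crds) // helm-controller CRDs are absent; the controller itself may still be running
}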

Review comment (Member):

> when we say namespaced helm-controller

Oh - sorry, that's right, I had a brain fart. The PromFed embedded helm-controller would be a second instance of a "global controller", since it's listening to all namespaces - but only for charts designated to be managed by it.

Whereas if they opt to disable the PromFed-provided controller, then to be compliant with helm-controller policy and still keep PromFed working, they must:
a) use a global controller (either embedded in k3s/rke2, or added to any k8s cluster), OR
b) use namespaced controller(s) so that each namespace they use for PromFed is covered.

Review comment (Member):

So yeah - I think we're good and I was confused by the wording initially. I don't have any proposed changes to fix the wording - I probably just needed to read it slower 😆

Signed-off-by: Alexandre Lamarre <[email protected]>

add updated helm-locker

Signed-off-by: Alexandre Lamarre <[email protected]>

add updated helm-project-operator

Signed-off-by: Alexandre Lamarre <[email protected]>

Add node identification to internal package

Signed-off-by: Alexandre Lamarre <[email protected]>

Add internal helmcommon package to manage crds

Signed-off-by: Alexandre Lamarre <[email protected]>

update helm-locker to use shared crds abstraction

Signed-off-by: Alexandre Lamarre <[email protected]>

update helm-project-operator to use shared crds abstraction

Signed-off-by: Alexandre Lamarre <[email protected]>

update older sub-packages

Signed-off-by: Alexandre Lamarre <[email protected]>

update main functions

slightly rework crd management

Signed-off-by: Alexandre Lamarre <[email protected]>

update go generate targets

Signed-off-by: Alexandre Lamarre <[email protected]>

update to go 1.23

run go mod tidy

Signed-off-by: Alexandre Lamarre <[email protected]>

update golangci-lint

Signed-off-by: Alexandre Lamarre <[email protected]>

remove internal go sub modules

Signed-off-by: Alexandre Lamarre <[email protected]>

update controller & crd generation

Signed-off-by: Alexandre Lamarre <[email protected]>

Run go generate

Signed-off-by: Alexandre Lamarre <[email protected]>

update helm-project-operator with new generated code

Signed-off-by: Alexandre Lamarre <[email protected]>

fix crd generation path

Signed-off-by: Alexandre Lamarre <[email protected]>

tweak test labels

Signed-off-by: Alexandre Lamarre <[email protected]>

run go generate

Signed-off-by: Alexandre Lamarre <[email protected]>

update images to go 1.23

Signed-off-by: Alexandre Lamarre <[email protected]>

simplify dev scripts

Signed-off-by: Alexandre Lamarre <[email protected]>

remove dapper from Makefile

Signed-off-by: Alexandre Lamarre <[email protected]>

[WIP] simplify CI

Signed-off-by: Alexandre Lamarre <[email protected]>

vendor chart data in public package

Signed-off-by: Alexandre Lamarre <[email protected]>

update integration workflow again

Signed-off-by: Alexandre Lamarre <[email protected]>

use upstream helm during image builds

Signed-off-by: Alexandre Lamarre <[email protected]>

[WIP] importing images for integration CI

Signed-off-by: Alexandre Lamarre <[email protected]>

fix integration testing

Signed-off-by: Alexandre Lamarre <[email protected]>

[temp] disable validate-ci

Signed-off-by: Alexandre Lamarre <[email protected]>

lint fixes

Signed-off-by: Alexandre Lamarre <[email protected]>

Add back deprecated flag logic

Signed-off-by: Alexandre Lamarre <[email protected]>

remove unnecessary CI stuff

Signed-off-by: Alexandre Lamarre <[email protected]>

update helm-project-operator build-chart destination

Signed-off-by: Alexandre Lamarre <[email protected]>

make crd management consistent with existing flags and add tests

Signed-off-by: Alexandre Lamarre <[email protected]>

update unit test command to exclude integration tests

Signed-off-by: Alexandre Lamarre <[email protected]>

Add helm-project-operator chart to examples dir

Signed-off-by: Alexandre Lamarre <[email protected]>

update path for HPO chart in integration tests

Signed-off-by: Alexandre Lamarre <[email protected]>

revert public pkg/chart package

Signed-off-by: Alexandre Lamarre <[email protected]>

vendor correct helm-controller dependencies

Signed-off-by: Alexandre Lamarre <[email protected]>

update wrangler/go toolchain

Signed-off-by: Alexandre Lamarre <[email protected]>

remove helm-locker e2e CI

Signed-off-by: Alexandre Lamarre <[email protected]>

bump rancher/klipper-helm to v0.9.4-build20250113

Signed-off-by: Alexandre Lamarre <[email protected]>

add KUBECONFIG to version script

Signed-off-by: Alexandre Lamarre <[email protected]>

add back embedded controller name logic

Signed-off-by: Alexandre Lamarre <[email protected]>