Tests running on k8s latest commit are failing #10142

Closed
adilGhaffarDev opened this issue Feb 12, 2024 · 5 comments
Labels
kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@adilGhaffarDev
Contributor

Which jobs are failing?

Jobs that run on k8s latest commit

  • periodic-cluster-api-e2e-conformance-ci-latest-main
  • capi-e2e-main-1-29-1-30

Which tests are failing?

  • When testing K8S conformance with K8S latest ci [Conformance] [K8s-Install-ci-latest] Should create a workload cluster and run kubetest

  • When upgrading a workload cluster using ClusterClass and testing K8S conformance [Conformance] [K8s-Upgrade] [ClusterClass] Should create and upgrade a workload cluster and eventually run kubetest

Since when has it been failing?

Failing from 16:26 EET 08-02-2024 to 00:29 EET 09-02-2024.
Then green for a few hours: 02:30 EET 09-02-2024 to 16:31 EET 09-02-2024.
Failing again since 18:32 EET 09-02-2024.

Testgrid link

https://testgrid.k8s.io/sig-cluster-lifecycle-cluster-api#capi-e2e-conformance-ci-latest-main

Reason for failure (if possible)

The machine is stuck in the provisioning state, and in the journalctl logs we see this error:
PullImage \"registry.k8s.io/kube-apiserver:v1.30.0-alpha.1.154_73f19e4c0162a2\" failed" error="rpc error: code = NotFound desc = failed to pull and unpack image \"registry.k8s.io/kube-apiserver:v1.30.0-alpha.1.154_73f19e4c0162a2\": failed to resolve reference \"registry.k8s.io/kube-apiserver:v1.30.0-alpha.1.154_73f19e4c0162a2\": registry.k8s.io/kube-apiserver:v1.30.0-alpha.1.154_73f19e4c0162a2: not found"

Because these tests run against the latest k8s commit, we build the kindest/node image locally from that commit, so these k8s images should already be present in kindest/node. Nevertheless, the pull fails with the error above.
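
To make the failure easier to spot in a long journal, here is a minimal Python sketch that extracts the image references from `PullImage ... failed` lines like the one quoted above. The function names (`failed_image_refs`, `split_ref`) are hypothetical helpers, not part of Cluster API or kind, and the pattern only assumes the line shape shown in this issue.

```python
import re

# Matches 'PullImage "<ref>" failed' as quoted in this issue. Quotes in the
# journal output may be backslash-escaped, so the backslash is optional.
PULL_FAILED = re.compile(r'PullImage \\?"(?P<ref>[^"\\]+)\\?" failed')


def failed_image_refs(journal_text):
    """Return the distinct image references whose pull failed, in first-seen order."""
    seen = []
    for match in PULL_FAILED.finditer(journal_text):
        ref = match.group("ref")
        if ref not in seen:
            seen.append(ref)
    return seen


def split_ref(ref):
    """Split 'repo/name:tag' into (repository, tag); tag is '' when absent."""
    repo, sep, tag = ref.rpartition(":")
    # A colon that belongs to a registry port (e.g. host:5000/img) is not a tag separator.
    if not sep or "/" in tag:
        return ref, ""
    return repo, tag
```

The extracted references could then be compared by hand with the images actually present on the node (for example, the output of `crictl images`) to confirm whether the locally built tag is missing or merely named differently.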

Anything else we need to know?

The kubekins-e2e image has changed a lot since these tests started failing. Ref: kubernetes/test-infra#31961

Label(s) to be applied

/kind failing-test
One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels.

@k8s-ci-robot k8s-ci-robot added kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Feb 12, 2024
@sbueringer
Member

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Feb 13, 2024
@chrischdi
Member

chrischdi commented Feb 19, 2024

Should be fixed by bumping kind via:

Root-cause:

kind fix:

The issue started when the kubekins images were bumped to a version that includes docker >= v25.0.0 (to be more specific: v25.0.3).
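
As a rough illustration of the version boundary mentioned here, the following Python sketch checks whether a docker version string is at or above v25.0.0. The helper names and the `AFFECTED_FROM` constant are hypothetical. The check looks only at the docker version and does not account for whether the kind version in use already contains the fix.

```python
import re

# Per the comment above: the problem appeared with docker >= v25.0.0.
AFFECTED_FROM = (25, 0, 0)


def parse_version(text):
    """Parse 'v25.0.3' or '25.0.3' (ignoring any suffix) into (25, 0, 3)."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(docker_version):
    """Return True when the docker version is at or above the affected boundary."""
    return parse_version(docker_version) >= AFFECTED_FROM
```

The version string could come from `docker version --format '{{.Server.Version}}'` run inside the CI image; for instance, `is_affected("v25.0.3")` is true while `is_affected("24.0.7")` is false.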

Only jobs on main are currently affected, because we don't build latest-ci on old branches. If we ever want to do the same on release-1.6 (or older), we will need this fix there too, or a proper workaround (e.g. via preKubeadmScript).

@fabriziopandini
Member

/assign @cahillsf
since he already has a PR out with the fix

@chrischdi
Member

Back to green since

merged.

/close

@k8s-ci-robot
Contributor

@chrischdi: Closing this issue.

