Startup probe on cluster-jwks-proxy fails with an HTTP 404 error #2963

Open
4 of 7 tasks
mprahl opened this issue Jan 20, 2025 · 8 comments · May be fixed by #2971
@mprahl

mprahl commented Jan 20, 2025

Validation Checklist

  • Is this a Kubeflow issue?
  • Are you posting in the right repository?
  • Did you follow the Kubeflow installation guideline?
  • Is the issue report properly structured and detailed with version numbers?
  • Is this for Kubeflow development?
  • Would you like to work on this issue?
  • You can join the CNCF Slack and access our meetings at the Kubeflow Community website. Our channel on the CNCF Slack is here #kubeflow-platform.

Version

master

Describe your issue

The cluster-jwks-proxy deployment, installed as part of the Oauth2-proxy instructions, is in CrashLoopBackOff because its startup probe fails.

  Normal   Created    10s (x2 over 28s)   kubelet            Created container: kubectl-proxy
  Normal   Started    10s (x2 over 28s)   kubelet            Started container kubectl-proxy
  Normal   Killing    10s                 kubelet            Container kubectl-proxy failed startup probe, will be restarted
  Warning  Unhealthy  5s (x4 over 20s)    kubelet            Startup probe failed: HTTP probe failed with statuscode: 404

The logs of the pod are:

❯ kubectl -n istio-system logs -f cluster-jwks-proxy-5dd544bcd-w44cj
Starting to serve on [::]:8080
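
Two quick checks help narrow this down: dump the probe definition to see which path is probed, and ask the API server for the OIDC discovery document directly; a 404 from the second command matches the failing probe. (A debugging sketch, not part of the install steps.)

# show the startup probe configured on the deployment
kubectl -n istio-system get deployment cluster-jwks-proxy \
  -o jsonpath='{.spec.template.spec.containers[0].startupProbe}'

# query the OIDC discovery document straight from the API server
kubectl get --raw /.well-known/openid-configuration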

Steps to reproduce the issue

Follow the installation instructions on a Kind cluster but use "option 2" for the "Oauth2-proxy" section.
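
For reference, a rough sketch of the reproduction; the overlay path is a placeholder, the real one is whatever the install instructions name for option 2:

# create the Kind cluster per the kubeflow/manifests instructions
kind create cluster --name kubeflow

# apply the manifests, picking "option 2" in the Oauth2-proxy section
# (<option-2-overlay> is a placeholder path; use the one from the docs)
kustomize build <option-2-overlay> | kubectl apply -f -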

Put here any screenshots or videos (optional)

No response

@juliusvonkohout
Member

Are you using Kind as in our instructions or do you bring your own Kubernetes cluster?

@mprahl
Author

mprahl commented Jan 22, 2025

Are you using Kind as in our instructions or do you bring your own Kubernetes cluster?

Following the instructions exactly using Kind.

@juliusvonkohout
Member

That is interesting. @tarekabouzeid can you help @mprahl with debugging?

@tarekabouzeid
Member

That is interesting. @tarekabouzeid can you help @mprahl with debugging?

Yeah, sure, I will try to reproduce it on my side.

@AhmedMousa-ag

AhmedMousa-ag commented Jan 25, 2025

I got the exact same issue when following option 2. I'm using Kali Linux.

@tarekabouzeid
Member

tarekabouzeid commented Jan 26, 2025

It's reproducible. I am looking into it.

UPDATE: This is due to no default issuer in Kind cluster.

$ curl --insecure -H "Authorization: Bearer `cat /var/run/secrets/kubernetes.io/serviceaccount/token`"  https://kubernetes.default/.well-known/openid-configuration
404 page not found
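
To see which issuer the API server was actually started with, check the flag on the static pod (pod name taken from the list below):

# print the issuer flag from the kube-apiserver static pod
kubectl -n kube-system get pod kube-apiserver-kubeflow-control-plane -o yaml \
  | grep service-account-issuer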

Pods:

NAMESPACE            NAME                                             READY   STATUS             RESTARTS       AGE
cert-manager         cert-manager-5f864bbfd-s7mn7                     1/1     Running            0              20m
cert-manager         cert-manager-cainjector-589dc747b5-fh9kz         1/1     Running            0              20m
cert-manager         cert-manager-webhook-5987c7ff58-fzw79            1/1     Running            0              20m
istio-system         cluster-jwks-proxy-5dd544bcd-drnrs               0/1     CrashLoopBackOff   6 (2m5s ago)   6m21s
istio-system         istio-ingressgateway-65f4848f75-vm5zx            1/1     Running            0              8m45s
istio-system         istiod-6ddd5f9c49-xc9bv                          1/1     Running            0              8m45s
kube-system          coredns-668d6bf9bc-ghv4d                         1/1     Running            0              23m
kube-system          coredns-668d6bf9bc-mxwbn                         1/1     Running            0              23m
kube-system          etcd-kubeflow-control-plane                      1/1     Running            0              23m
kube-system          kindnet-hb2b5                                    1/1     Running            0              23m
kube-system          kube-apiserver-kubeflow-control-plane            1/1     Running            0              23m
kube-system          kube-controller-manager-kubeflow-control-plane   1/1     Running            0              23m
kube-system          kube-proxy-dlbvl                                 1/1     Running            0              23m
kube-system          kube-scheduler-kubeflow-control-plane            1/1     Running            0              23m
local-path-storage   local-path-provisioner-58cc7856b6-6mdnz          1/1     Running            0              23m
oauth2-proxy         oauth2-proxy-649b9846d8-pjxmr                    1/1     Running            0              6m20s
oauth2-proxy         oauth2-proxy-649b9846d8-wsdlp                    1/1     Running            0              6m20s

Pod logs

  Normal   Scheduled  4m24s                  default-scheduler  Successfully assigned istio-system/cluster-jwks-proxy-5dd544bcd-drnrs to kubeflow-control-plane
  Normal   Pulling    4m23s                  kubelet            Pulling image "docker.io/bitnami/kubectl:1.30.4"
  Normal   Pulled     4m3s                   kubelet            Successfully pulled image "docker.io/bitnami/kubectl:1.30.4" in 20.585s (20.585s including waiting). Image size: 110751853 bytes.
  Normal   Created    2m8s (x6 over 4m3s)    kubelet            Created container: kubectl-proxy
  Normal   Started    2m8s (x6 over 4m2s)    kubelet            Started container kubectl-proxy
  Normal   Pulled     2m8s (x5 over 3m43s)   kubelet            Container image "docker.io/bitnami/kubectl:1.30.4" already present on machine
  Warning  Unhealthy  113s (x18 over 3m53s)  kubelet            Startup probe failed: HTTP probe failed with statuscode: 404
  Normal   Killing    112s (x6 over 3m43s)   kubelet            Container kubectl-proxy failed startup probe, will be restarted
  Warning  BackOff    98s (x7 over 2m58s)    kubelet            Back-off restarting failed container kubectl-proxy in pod cluster-jwks-proxy-5dd544bcd-drnrs_istio-system(98a5ca77-6f63-422d-aaef-b365c5452e27)

@juliusvonkohout
Member

It's reproducible. I am looking into it.

UPDATE: This is due to no default issuer in Kind cluster.

Can you configure our Kind setup and documentation to have a default issuer?

@warjiang linked a pull request Feb 2, 2025 that will close this issue
@warjiang

warjiang commented Feb 2, 2025

I encountered the same problem in my local environment. It is a problem with the Kind config. If you start the Kind cluster with the following config:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  image: fronted-cn-beijing.cr.volces.com/container/kindest/node:v1.31.4
  kubeadmConfigPatches:
  - |
    kind: ClusterConfiguration
    apiServer:
      extraArgs:
        "service-account-issuer": "kubernetes.default.svc"
        "service-account-signing-key-file": "/etc/kubernetes/pki/sa.key"

start the proxy with kubectl proxy --address=0.0.0.0 --port=8080 and then run curl http://localhost:8080/.well-known/openid-configuration, the response code is still 404. But if you remove the kubeadmConfigPatches field, you will find that curl http://localhost:8080/.well-known/openid-configuration gets the right response.

Looking at the kube-apiserver command line, the difference is the service-account-issuer flag:

(screenshot: kube-apiserver command line, showing the --service-account-issuer flag)

The default --service-account-issuer=https://kubernetes.default.svc.cluster.local can be shortened to --service-account-issuer=https://kubernetes.default.svc without the cluster domain, so just adding the https:// prefix to the patched value makes it work. As for the deeper cause of this behaviour, something may be wrong in kubeadm.
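
Based on that, a corrected version of the Kind config above would keep the patch but give the issuer a proper URL; a minimal sketch, not verified against every Kind/kubeadm version:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  kubeadmConfigPatches:
  - |
    kind: ClusterConfiguration
    apiServer:
      extraArgs:
        # the issuer must be a URL, otherwise the OIDC discovery endpoints return 404
        "service-account-issuer": "https://kubernetes.default.svc"
        "service-account-signing-key-file": "/etc/kubernetes/pki/sa.key"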

The simplest way to fix the problem is to recreate the Kind cluster and follow the rest of the install instructions. If you don't want to recreate the Kind cluster, you can instead attach to the corresponding Docker container and edit the /etc/kubernetes/manifests/kube-apiserver.yaml file, adding https:// to service-account-issuer:

(screenshot: editing kube-apiserver.yaml to add https:// to --service-account-issuer)
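
A rough sketch of that in-place edit, assuming the issuer currently has no scheme (as in the config above) and the node container is named kubeflow-control-plane as in the pod list:

# open a shell inside the Kind node container
docker exec -it kubeflow-control-plane bash

# inside the node: prefix the issuer with https:// in the static pod manifest;
# the kubelet notices the change and restarts kube-apiserver on its own
sed -i 's|--service-account-issuer=kubernetes|--service-account-issuer=https://kubernetes|' \
  /etc/kubernetes/manifests/kube-apiserver.yaml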

Save the kube-apiserver config and wait a while; kube-apiserver will be restarted and you will see the start command updated:

(screenshot: kube-apiserver restarted with the updated --service-account-issuer flag)

This also works. 😄

@juliusvonkohout linked a pull request Feb 3, 2025 that will close this issue