401 Unauthorized when looking for entrypoint/cmd of an image hosted on a private AWS ECR with v3.6.0 #13947

Open
3 of 4 tasks
Sirz3chs opened this issue Nov 27, 2024 · 13 comments · May be fixed by #14008
Labels
area/controller (Controller issues, panics) · type/bug · type/regression (Regression from previous behavior; a specific type of bug)

Comments

@Sirz3chs

Sirz3chs commented Nov 27, 2024

Pre-requisites

  • I have double-checked my configuration
  • I have tested with the :latest image tag (i.e. quay.io/argoproj/workflow-controller:latest) and can confirm the issue still exists on :latest. If not, I have explained why, in detail, in my description below.
  • I have searched existing issues and could not find a match for this bug
  • I'd like to contribute the fix myself (see contributing guide)

What happened? What did you expect to happen?

I'm upgrading from v3.5.12 to v3.6.0 using the official Helm chart. After the upgrade, workflows with images hosted on ECR fail to start.

I always get this error:

failed to look-up entrypoint/cmd for image "<account-id>.dkr.ecr.<region>.amazonaws.com/<image>:<version>", 
you must either explicitly specify the command, or list the image's command in the index: https://argo-workflows.readthedocs.io/en/latest/workflow-executors/#emissary-emissary: 
GET https://<account-id>.dkr.ecr.<region>.amazonaws.com/v2/<image>/manifests/<version>: unexpected status code 401 Unauthorized: Not Authorized

To be clear, my image does contain an ENTRYPOINT. I tried granting the workflow-controller IAM permissions to read the ECR repository via IRSA, but that didn't work.
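
The URL in the 401 response follows the registry v2 manifest layout, `/v2/<repository>/manifests/<tag>`. As a rough sketch only (the `manifest_url` helper is hypothetical and not part of Argo Workflows), the request the controller attempts can be derived from an image reference as below. It assumes a `<registry>/<repository>:<tag>` reference and does not handle digest references (`@sha256:...`).

```python
def manifest_url(image: str) -> str:
    """Build the registry v2 manifest URL for <registry>/<repository>[:<tag>]."""
    registry, _, remainder = image.partition("/")
    if not remainder:
        raise ValueError(f"no registry host in image reference: {image!r}")
    repository, sep, tag = remainder.rpartition(":")
    if not sep:
        # No tag given: container tooling conventionally falls back to "latest".
        repository, tag = remainder, "latest"
    return f"https://{registry}/v2/{repository}/manifests/{tag}"


# The image reference below is the one that appears in the controller logs.
print(manifest_url("012345678910.dkr.ecr.eu-west-3.amazonaws.com/mycli:2.1.0"))
# prints https://012345678910.dkr.ecr.eu-west-3.amazonaws.com/v2/mycli/manifests/2.1.0
```

Requesting that URL without credentials is expected to be rejected by a private registry, which is consistent with the 401 above.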

Version(s)

v3.6.0

Paste a minimal workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

Any simple workflow whose image is hosted on a private ECR repository.
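
For illustration, here is a hedged sketch of such a workflow, loosely reconstructed from the `storedWorkflowTemplateSpec` in the controller logs below and emitted as a JSON manifest. The error message says the lookup applies when no command is specified, so this variant sets `command` explicitly. The `workflow_manifest` helper and the `/usr/local/bin/mycli` path are assumptions for the example, not values from the report.

```python
import json


def workflow_manifest(image, command, args):
    """Return a minimal Workflow manifest with an explicit container command."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": "mycli-"},
        "spec": {
            "entrypoint": "mycli",
            "templates": [
                {
                    "name": "mycli",
                    "container": {
                        "image": image,
                        # An explicit command is what the error message asks for.
                        "command": command,
                        "args": args,
                    },
                }
            ],
        },
    }


manifest = workflow_manifest(
    image="<account-id>.dkr.ecr.<region>.amazonaws.com/<image>:<version>",
    command=["/usr/local/bin/mycli"],  # hypothetical path; use the image's real ENTRYPOINT
    args=["--version"],
)
print(json.dumps(manifest, indent=2))
```

Because the manifest uses `generateName`, it would be submitted with `kubectl create -f` rather than `kubectl apply -f`.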

Logs from the workflow controller

{"Phase":"","ResourceVersion":"714280881","level":"info","msg":"Processing workflow","namespace":"simple-workflow","time":"2024-11-27T06:56:44.261Z","workflow":"mycli-l5rc4"}
{"base":"*v1alpha1.Workflow (namespace=simple-workflow,name=mycli-l5rc4)","level":"debug","msg":"Resolving the template","time":"2024-11-27T06:56:44.268Z","tmpl":"*v1alpha1.WorkflowStep (mycli/mycli#false)"}
{"base":"*v1alpha1.Workflow (namespace=simple-workflow,name=mycli-l5rc4)","level":"debug","msg":"Found stored template","time":"2024-11-27T06:56:44.268Z","tmpl":"*v1alpha1.WorkflowStep (mycli/mycli#false)"}
{"base":"*v1alpha1.Workflow (namespace=simple-workflow,name=mycli-l5rc4)","level":"debug","msg":"Resolving the template","time":"2024-11-27T06:56:44.274Z","tmpl":"*v1alpha1.WorkflowStep (mycli/mycli#false)"}
{"base":"*v1alpha1.Workflow (namespace=simple-workflow,name=mycli-l5rc4)","level":"debug","msg":"Found stored template","time":"2024-11-27T06:56:44.274Z","tmpl":"*v1alpha1.WorkflowStep (mycli/mycli#false)"}
{"level":"info","msg":"Task-result reconciliation","namespace":"simple-workflow","numObjs":0,"time":"2024-11-27T06:56:44.297Z","workflow":"mycli-l5rc4"}
{"level":"debug","msg":"Skipping artifact GC","namespace":"simple-workflow","time":"2024-11-27T06:56:44.297Z","workflow":"mycli-l5rc4"}
{"level":"info","msg":"Updated phase  -\u003e Running","namespace":"simple-workflow","time":"2024-11-27T06:56:44.297Z","workflow":"mycli-l5rc4"}
{"level":"debug","msg":"Event(v1.ObjectReference{Kind:\"Workflow\", Namespace:\"simple-workflow\", Name:\"mycli-l5rc4\", UID:\"8f9e51d3-2334-40f9-93f0-2f2f3321903f\", APIVersion:\"argoproj.io/v1alpha1\", ResourceVersion:\"714280881\", FieldPath:\"\"}): type: 'Normal' reason: 'WorkflowRunning' Workflow Running","time":"2024-11-27T06:56:44.298Z"}
{"level":"debug","msg":"Evaluating node mycli-l5rc4: template: *v1alpha1.WorkflowStep (mycli), boundaryID: ","namespace":"simple-workflow","time":"2024-11-27T06:56:44.303Z","workflow":"mycli-l5rc4"}
{"level":"warning","msg":"Node was nil, will be initialized as type Skipped","namespace":"simple-workflow","time":"2024-11-27T06:56:44.303Z","workflow":"mycli-l5rc4"}
{"level":"debug","msg":"Initializing node mycli-l5rc4: template: *v1alpha1.WorkflowStep (mycli), boundaryID: ","namespace":"simple-workflow","time":"2024-11-27T06:56:44.303Z","workflow":"mycli-l5rc4"}
{"level":"info","msg":"was unable to obtain node for , letting display name to be nodeName","namespace":"simple-workflow","time":"2024-11-27T06:56:44.303Z","workflow":"mycli-l5rc4"}
{"level":"info","msg":"Pod node mycli-l5rc4 initialized Pending","namespace":"simple-workflow","time":"2024-11-27T06:56:44.303Z","workflow":"mycli-l5rc4"}
{"level":"debug","msg":"Executing node mycli-l5rc4 with container template: mycli\n","namespace":"simple-workflow","time":"2024-11-27T06:56:44.303Z","workflow":"mycli-l5rc4"}
{"level":"debug","msg":"","namespace":"simple-workflow","needLocation":false,"time":"2024-11-27T06:56:44.303Z","workflow":"mycli-l5rc4"}
{"level":"warning","msg":"couldn't get boundaryTemplate through nodeName mycli-l5rc4","namespace":"simple-workflow","time":"2024-11-27T06:56:44.303Z","workflow":"mycli-l5rc4"}
{"error":"failed to look-up entrypoint/cmd for image \"012345678910.dkr.ecr.eu-west-3.amazonaws.com/mycli:2.1.0\", you must either explicitly specify the command, or list the image's command in the index: https://argo-workflows.readthedocs.io/en/latest/workflow-executors/#emissary-emissary: GET https://012345678910.dkr.ecr.eu-west-3.amazonaws.com/v2/mycli/manifests/2.1.0: unexpected status code 401 Unauthorized: Not Authorized\n","level":"error","msg":"Mark error node","namespace":"simple-workflow","nodeName":"mycli-l5rc4","time":"2024-11-27T06:56:44.333Z","workflow":"mycli-l5rc4"}
{"level":"info","msg":"node mycli-l5rc4 phase Pending -\u003e Error","namespace":"simple-workflow","time":"2024-11-27T06:56:44.333Z","workflow":"mycli-l5rc4"}
{"level":"info","msg":"node mycli-l5rc4 message: failed to look-up entrypoint/cmd for image \"012345678910.dkr.ecr.eu-west-3.amazonaws.com/mycli:2.1.0\", you must either explicitly specify the command, or list the image's command in the index: https://argo-workflows.readthedocs.io/en/latest/workflow-executors/#emissary-emissary: GET https://012345678910.dkr.ecr.eu-west-3.amazonaws.com/v2/mycli/manifests/2.1.0: unexpected status code 401 Unauthorized: Not Authorized\n","namespace":"simple-workflow","time":"2024-11-27T06:56:44.333Z","workflow":"mycli-l5rc4"}
{"level":"info","msg":"node mycli-l5rc4 finished: 2024-11-27 06:56:44.333377165 +0000 UTC","namespace":"simple-workflow","time":"2024-11-27T06:56:44.333Z","workflow":"mycli-l5rc4"}
{"error":"failed to look-up entrypoint/cmd for image \"012345678910.dkr.ecr.eu-west-3.amazonaws.com/mycli:2.1.0\", you must either explicitly specify the command, or list the image's command in the index: https://argo-workflows.readthedocs.io/en/latest/workflow-executors/#emissary-emissary: GET https://012345678910.dkr.ecr.eu-west-3.amazonaws.com/v2/mycli/manifests/2.1.0: unexpected status code 401 Unauthorized: Not Authorized\n","level":"error","msg":"error in entry template execution","namespace":"simple-workflow","time":"2024-11-27T06:56:44.333Z","workflow":"mycli-l5rc4"}
{"level":"debug","msg":"Task results completion status: map[]","namespace":"simple-workflow","time":"2024-11-27T06:56:44.333Z","workflow":"mycli-l5rc4"}
{"level":"info","msg":"Updated phase Running -\u003e Error","namespace":"simple-workflow","time":"2024-11-27T06:56:44.333Z","workflow":"mycli-l5rc4"}
{"level":"info","msg":"Updated message  -\u003e error in entry template execution: failed to look-up entrypoint/cmd for image \"012345678910.dkr.ecr.eu-west-3.amazonaws.com/mycli:2.1.0\", you must either explicitly specify the command, or list the image's command in the index: https://argo-workflows.readthedocs.io/en/latest/workflow-executors/#emissary-emissary: GET https://012345678910.dkr.ecr.eu-west-3.amazonaws.com/v2/mycli/manifests/2.1.0: unexpected status code 401 Unauthorized: Not Authorized\n","namespace":"simple-workflow","time":"2024-11-27T06:56:44.334Z","workflow":"mycli-l5rc4"}
{"level":"info","msg":"Marking workflow completed","namespace":"simple-workflow","time":"2024-11-27T06:56:44.334Z","workflow":"mycli-l5rc4"}
{"level":"info","msg":"Marking workflow as pending archiving","namespace":"simple-workflow","time":"2024-11-27T06:56:44.334Z","workflow":"mycli-l5rc4"}
{"level":"debug","msg":"Event(v1.ObjectReference{Kind:\"Workflow\", Namespace:\"simple-workflow\", Name:\"mycli-l5rc4\", UID:\"8f9e51d3-2334-40f9-93f0-2f2f3321903f\", APIVersion:\"argoproj.io/v1alpha1\", ResourceVersion:\"714280881\", FieldPath:\"\"}): type: 'Warning' reason: 'WorkflowFailed' error in entry template execution: failed to look-up entrypoint/cmd for image \"012345678910.dkr.ecr.eu-west-3.amazonaws.com/mycli:2.1.0\", you must either explicitly specify the command, or list the image's command in the index: https://argo-workflows.readthedocs.io/en/latest/workflow-executors/#emissary-emissary: GET https://012345678910.dkr.ecr.eu-west-3.amazonaws.com/v2/mycli/manifests/2.1.0: unexpected status code 401 Unauthorized: Not Authorized\n","time":"2024-11-27T06:56:44.334Z"}
{"level":"debug","msg":"Log changes patch: {\"metadata\":{\"annotations\":{\"workflows.argoproj.io/pod-name-format\":\"v2\"},\"labels\":{\"workflows.argoproj.io/archive-strategy\":\"false\",\"workflows.argoproj.io/completed\":\"true\",\"workflows.argoproj.io/phase\":\"Error\",\"workflows.argoproj.io/workflow-archiving-status\":\"Pending\"}},\"status\":{\"artifactGCStatus\":{\"notSpecified\":true},\"artifactRepositoryRef\":{\"artifactRepository\":{\"s3\":{\"bucket\":\"my-argo-workflows-artifacts\",\"endpoint\":\"s3.amazonaws.com\",\"insecure\":false,\"keyFormat\":\"{{workflow.creationTimestamp.Y}}/{{workflow.creationTimestamp.m}}/{{workflow.creationTimestamp.d}}/{{workflow.name}}/{{pod.name}}\",\"region\":\"eu-west-3\",\"useSDKCreds\":true}},\"default\":true},\"conditions\":[{\"status\":\"True\",\"type\":\"Completed\"}],\"finishedAt\":\"2024-11-27T06:56:44Z\",\"message\":\"error in entry template execution: failed to look-up entrypoint/cmd for image \\\"012345678910.dkr.ecr.eu-west-3.amazonaws.com/mycli:2.1.0\\\", you must either explicitly specify the command, or list the image's command in the index: https://argo-workflows.readthedocs.io/en/latest/workflow-executors/#emissary-emissary: GET https://012345678910.dkr.ecr.eu-west-3.amazonaws.com/v2/mycli/manifests/2.1.0: unexpected status code 401 Unauthorized: Not Authorized\\n\",\"nodes\":{\"mycli-l5rc4\":{\"displayName\":\"mycli-l5rc4\",\"finishedAt\":\"2024-11-27T06:56:44Z\",\"id\":\"mycli-l5rc4\",\"message\":\"failed to look-up entrypoint/cmd for image \\\"012345678910.dkr.ecr.eu-west-3.amazonaws.com/mycli:2.1.0\\\", you must either explicitly specify the command, or list the image's command in the index: https://argo-workflows.readthedocs.io/en/latest/workflow-executors/#emissary-emissary: GET https://012345678910.dkr.ecr.eu-west-3.amazonaws.com/v2/mycli/manifests/2.1.0: unexpected status code 401 Unauthorized: Not Authorized\\n\",\"name\":\"mycli-l5rc4\",\"phase\":\"Error\",\"startedAt\":\"2024-11-27T06:56:44Z\",\"templateName\":\"mycli\",\"templateScope\":\"local/\",\"type\":\"Pod\"}},\"phase\":\"Error\",\"startedAt\":\"2024-11-27T06:56:44Z\",\"storedWorkflowTemplateSpec\":{\"activeDeadlineSeconds\":21600,\"arguments\":{},\"entrypoint\":\"mycli\",\"podGC\":{\"deleteDelayDuration\":\"24h\",\"strategy\":\"OnWorkflowCompletion\"},\"templates\":[{\"container\":{\"args\":[\"--version\"],\"image\":\"012345678910.dkr.ecr.eu-west-3.amazonaws.com/mycli:2.1.0\",\"name\":\"\",\"resources\":{}},\"inputs\":{},\"metadata\":{},\"name\":\"mycli\",\"outputs\":{},\"serviceAccountName\":\"simple-workflow\"}],\"ttlStrategy\":{\"secondsAfterCompletion\":259200},\"workflowMetadata\":{\"labels\":{\"workflows.argoproj.io/archive-strategy\":\"false\"}},\"workflowTemplateRef\":{\"name\":\"mycli\"}}}}","time":"2024-11-27T06:56:44.334Z"}
{"level":"info","msg":"Workflow update successful","namespace":"simple-workflow","phase":"Error","resourceVersion":"714280885","time":"2024-11-27T06:56:44.346Z","workflow":"mycli-l5rc4"}
{"level":"debug","msg":"Event(v1.ObjectReference{Kind:\"Workflow\", Namespace:\"simple-workflow\", Name:\"mycli-l5rc4\", UID:\"8f9e51d3-2334-40f9-93f0-2f2f3321903f\", APIVersion:\"argoproj.io/v1alpha1\", ResourceVersion:\"714280885\", FieldPath:\"\"}): type: 'Normal' reason: 'WorkflowNodeRunning' Running node mycli-l5rc4: failed to look-up entrypoint/cmd for image \"012345678910.dkr.ecr.eu-west-3.amazonaws.com/mycli:2.1.0\", you must either explicitly specify the command, or list the image's command in the index: https://argo-workflows.readthedocs.io/en/latest/workflow-executors/#emissary-emissary: GET https://012345678910.dkr.ecr.eu-west-3.amazonaws.com/v2/mycli/manifests/2.1.0: unexpected status code 401 Unauthorized: Not Authorized\n","time":"2024-11-27T06:56:44.346Z"}
{"level":"debug","msg":"Event(v1.ObjectReference{Kind:\"Workflow\", Namespace:\"simple-workflow\", Name:\"mycli-l5rc4\", UID:\"8f9e51d3-2334-40f9-93f0-2f2f3321903f\", APIVersion:\"argoproj.io/v1alpha1\", ResourceVersion:\"714280885\", FieldPath:\"\"}): type: 'Warning' reason: 'WorkflowNodeError' Error node mycli-l5rc4: failed to look-up entrypoint/cmd for image \"012345678910.dkr.ecr.eu-west-3.amazonaws.com/mycli:2.1.0\", you must either explicitly specify the command, or list the image's command in the index: https://argo-workflows.readthedocs.io/en/latest/workflow-executors/#emissary-emissary: GET https://012345678910.dkr.ecr.eu-west-3.amazonaws.com/v2/mycli/manifests/2.1.0: unexpected status code 401 Unauthorized: Not Authorized\n","time":"2024-11-27T06:56:44.346Z"}
{"level":"info","msg":"archiving workflow","namespace":"simple-workflow","time":"2024-11-27T06:56:44.354Z","uid":"8f9e51d3-2334-40f9-93f0-2f2f3321903f","workflow":"mycli-l5rc4"}
{"level":"info","msg":"Queueing Error workflow simple-workflow/mycli-l5rc4 for delete in 72h0m0s due to TTL","time":"2024-11-27T06:56:44.412Z"}
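
The controller emits one JSON object per line, so the error entries can be isolated from the debug and info noise with a few lines of code. The following is a minimal sketch, not an Argo tool: `error_entries` is a hypothetical helper, and the two sample lines are abbreviated stand-ins for the entries above.

```python
import json


def error_entries(lines):
    """Yield (time, error-or-msg) for each JSON log line whose level is "error"."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip wrapped or otherwise non-JSON lines
        if isinstance(entry, dict) and entry.get("level") == "error":
            # Prefer the "error" field; fall back to "msg" when it is absent.
            yield entry.get("time"), entry.get("error") or entry.get("msg")


# Abbreviated sample lines in the same shape as the controller output.
sample = [
    '{"level":"info","msg":"Processing workflow","time":"2024-11-27T06:56:44.261Z"}',
    '{"error":"unexpected status code 401 Unauthorized","level":"error",'
    '"msg":"Mark error node","time":"2024-11-27T06:56:44.333Z"}',
]

for when, what in error_entries(sample):
    print(when, what)
```

In practice the lines would come from the saved output of `kubectl logs` for the controller pod instead of the `sample` list.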

Logs from your workflow's wait container

kubectl logs -n argo -c wait -l workflows.argoproj.io/workflow=${workflow},workflow.argoproj.io/phase!=Succeeded
@Sirz3chs changed the title from "401 Unauthorized when looking for entrypoint/cmd of an image hosted on AWS ECR with v3.6.0" to "401 Unauthorized when looking for entrypoint/cmd of an image hosted on a private AWS ECR with v3.6.0" on Nov 27, 2024
@Joibel
Member

Joibel commented Nov 27, 2024

@tico24, could this be related to the helm chart changes rather than the controller? I'm not aware of controller differences that might cause this.

@tico24
Member

tico24 commented Nov 27, 2024

The controller YAML doesn't change between these chart versions, nor do any controller permissions:
argoproj/argo-helm@argo-workflows-0.42.7...argo-workflows-0.45.0

@tooptoop4
Contributor

@Sirz3chs try #9802 (comment)

@Sirz3chs
Author

@tooptoop4 I tried with a random secret but it didn't change anything.

@Joibel
Member

Joibel commented Dec 16, 2024

Is this only occurring with Amazon ECR, or is anyone seeing it with a different container registry?

@Joibel added the area/controller and type/regression labels on Dec 16, 2024
@Joibel
Member

Joibel commented Dec 16, 2024

@Sirz3chs you said:

To be clear, my image does contain an ENTRYPOINT. I tried granting the workflow-controller IAM permissions to read the ECR repository via IRSA, but that didn't work.

Through which mechanism was your controller getting permission to access ECR before you upgraded to 3.6?

@Joibel
Member

Joibel commented Dec 16, 2024

I believe this should be handled automatically by the k8schain package. Nothing has changed here since 3.5, but perhaps it is all simply too out of date relative to aws-sdk-go-v2.

I'll put up a speculative fix.

On Dec 16, 2024, @Joibel linked a pull request that will close this issue
@Joibel
Member

Joibel commented Dec 16, 2024

I have put up something that might work in #14008. If anyone needs help testing it, reach out to me here or in Slack.

It won't get merged without confirmation that it helps.

@Sirz3chs
Author

Through which mechanism was your controller getting permission to access ECR before you upgraded to 3.6?

@Joibel I didn't grant any specific permissions to the controller. Only the EKS nodes have the necessary rights to pull ECR images.

@tooptoop4
Contributor

google/go-containerregistry#1950 and tektoncd/pipeline#7698 have details on the cause.

@omerlh
Contributor

omerlh commented Dec 29, 2024

The same issue happened here when trying to upgrade from 3.5.8 to 3.6.2 on EKS (Kubernetes 1.25). Rolling back to 3.5.8 solved the issue.

@manishkumar-ops

Same issue when upgrading from 3.5.11 to 3.6.2 on EKS (Kubernetes 1.28). Rolling back to 3.5.11 works.

@Yaworski-Joseph-bah

https://github.com/google/go-containerregistry/releases/tag/v0.20.3 is released and fixes this issue. If we upgrade our dependencies, it should fix it.
