Replies: 6 comments 10 replies
-
Are there any plans to support this feature? 🤔
-
You can use
-
A really great idea would be to let users specify a pool of pods (a workflow pod pool) that is kept "hot" for workflows to use. For example, say I have a task that requires a large ML model to be loaded into memory (which can take minutes depending on its size), and several workflows and/or tasks within a workflow all use it. I want that pod to be kept "hot" (running) while workflows run, instead of each workflow spawning the same pod over and over again and incurring the initial startup/scheduling cost every time. The container template would work for some use cases (keeping inference steps together, maybe) but comes with a lot of limitations.
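Until such a pool exists, the closest built-in option (assuming the comment refers to Argo's containerSet template) is to run several steps as containers in a single pod, so per-workflow setup, such as fetching model weights to a shared volume, happens once per workflow rather than once per step. A minimal sketch; the image and script names are placeholders:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: model-in-one-pod-
spec:
  entrypoint: main
  volumes:
    - name: workspace
      emptyDir: {}          # shared between the containers below
  templates:
    - name: main
      containerSet:
        volumeMounts:
          - name: workspace
            mountPath: /workspace
        containers:
          - name: fetch-model
            image: my-ml-image                  # placeholder
            command: [python, fetch_model.py]   # placeholder: download weights to /workspace
          - name: infer
            image: my-ml-image                  # placeholder
            command: [python, infer.py]         # placeholder: load weights from /workspace
            dependencies: [fetch-model]         # run after fetch-model completes
```

Note the limitation the comment alludes to: the containers share a volume, not process memory, so a model loaded into one container's memory is not visible to the next, and every workflow still pays the pod startup cost once.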
-
The solution we adopted is to use an HTTP/Plugin Template instead of a Container Template.
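For readers landing here: Argo's HTTP template (available since v3.2, executed by the Argo agent rather than a per-task pod) lets a step call a long-lived service instead of spawning a new pod. A minimal sketch; the service URL and request body are placeholders:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: http-template-
spec:
  entrypoint: call-model
  templates:
    - name: call-model
      http:
        # placeholder: a long-running model server already deployed in the cluster
        url: "http://model-server.default.svc.cluster.local:8080/predict"
        method: POST
        timeoutSeconds: 60
        body: '{"input": "example"}'
        successCondition: "response.statusCode == 200"
```

The service stays warm between workflows, which sidesteps the repeated pod startup cost entirely.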
-
Do you want to use a service that is already running in the cluster, such as a text-to-image model, so that all workflows call this service with some input and get some output? If so, you can probably use this: https://argo-workflows.readthedocs.io/en/release-3.5/async-pattern/
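The linked async pattern boils down to handing the work to the external service along with the workflow's identity, then suspending the workflow until the service reports back and resumes it. A rough sketch of the suspend half; the template names and the notion of a `call-service` step are placeholders:

```yaml
templates:
  - name: kickoff
    steps:
      - - name: submit-job
          template: call-service   # placeholder: pass {{workflow.name}} to the service
      - - name: wait
          template: wait-for-callback
  - name: wait-for-callback
    suspend: {}   # the external service resumes it, e.g. via `argo resume <workflow-name>`
```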
-
One hacky workaround using daemon containers to reuse a pod. Working example:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: kubectl-daemon-loop
spec:
  entrypoint: main
  templates:
    - name: main
      steps:
        - - name: run-daemon
            template: daemon
        - - name: lookup-pod-name
            template: lookup-pod-name
            arguments:
              parameters:
                - name: daemon-ip
                  value: "{{steps.run-daemon.ip}}"
        - - name: exec-in-daemon
            template: exec-command
            arguments:
              parameters:
                - name: daemon-pod-name
                  value: "{{steps.lookup-pod-name.outputs.parameters.pod-name}}"
                - name: command
                  value: "{{item}}"
            withItems: # or use withSequence
              - ls / | head -n 3
              - ls / | head -n 4
              - ls / | head -n 5
              - ls / | head -n 6
    - name: daemon
      daemon: true
      container:
        image: busybox
        command: ["sh", "-c", "tail -f /dev/null"]
      nodeSelector:
        kubernetes.io/os: linux
    - name: lookup-pod-name
      inputs:
        parameters:
          - name: daemon-ip
      outputs:
        parameters:
          - name: pod-name
            valueFrom:
              path: /tmp/pod-name.txt
      container:
        image: bitnami/kubectl
        command: [sh, -c]
        args:
          - |
            POD_NAME=$(kubectl get pods -n argo -o custom-columns=NAME:.metadata.name --no-headers --field-selector status.podIP="{{inputs.parameters.daemon-ip}}") &&\
            echo $POD_NAME &&\
            echo -n $POD_NAME > /tmp/pod-name.txt
      nodeSelector:
        kubernetes.io/os: linux
    - name: exec-command
      inputs:
        parameters:
          - name: daemon-pod-name
          - name: command
      container:
        image: bitnami/kubectl
        command: [sh, "-c"]
        args: ["kubectl exec {{inputs.parameters.daemon-pod-name}} -- {{inputs.parameters.command}}"]
      nodeSelector:
        kubernetes.io/os: linux
```
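One caveat with this workaround: the lookup and exec steps run kubectl from inside workflow pods, so the workflow's service account needs RBAC permission to list pods and to create pods/exec. A sketch of a suitable Role (the name is an assumption; bind it to the workflow's service account with a RoleBinding):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: workflow-pod-exec   # assumed name
  namespace: argo
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list"]
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["create"]
```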
-
Instead of creating a new pod per task, Argo could check whether an existing pod is available and reuse it. I guess this would improve performance, since the pod wouldn't need to be created and scheduled again.