provide an exclusive lock step in a pipeline to avoid concurrent builds #5471

Closed
jstrachan opened this issue Sep 13, 2019 · 11 comments
Labels: area/lighthouse, area/tekton, kind/enhancement, lifecycle/rotten, priority/important-soon

Comments

jstrachan (Member) commented Sep 13, 2019

Summary

When performing a release via GitOps we don't want to allow concurrent builds, as odd things can happen due to concurrency. We can happily process PRs concurrently, but for releases we often want them to run one at a time for a given repository + branch.

Eventually this may end up being a core part of Tekton: tektoncd/pipeline#1305

Until then, a workaround could be to write a leader-election step of sorts: a step which blocks until it grabs a lock identifying it as the one pipeline allowed to run for a given key (the git URL + branch).

Steps to reproduce the behavior

Run two release pipelines concurrently against a staging/production environment and issues can occur.

Expected behavior

We force a second pipeline for a given environment release to block until the first one completes.

Strawman design

We already expose pod metadata into each pod at /etc/podinfo via the Downward API, though we should maybe be more explicit and expose the current PipelineRun name to the pipeline via a $PIPELINE_RUN environment variable.
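
A minimal sketch of one way the $PIPELINE_RUN variable could be injected, assuming the step pod carries the tekton.dev/pipelineRun label mentioned in the list below; the Downward API can project a pod label into an environment variable. This is an illustration, not the jx implementation:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// pipelineRunEnvVar exposes the owning PipelineRun's name to a step container
// by reading the pod's tekton.dev/pipelineRun label through the Downward API.
func pipelineRunEnvVar() corev1.EnvVar {
	return corev1.EnvVar{
		Name: "PIPELINE_RUN",
		ValueFrom: &corev1.EnvVarSource{
			FieldRef: &corev1.ObjectFieldSelector{
				FieldPath: "metadata.labels['tekton.dev/pipelineRun']",
			},
		},
	}
}

func main() {
	fmt.Printf("%+v\n", pipelineRunEnvVar())
}
```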

We could create a new step - say jx step pipeline lock - which does the following (a rough Go sketch follows the list):

  • grab the current pod's PipelineRun name and the repository owner/repo/branch
  • find the SourceRepository for the owner/repo (we could expose this as a label/env var too to simplify the code)
  • use an annotation key of jenkins.io/lock/owner/repo/branch on the SourceRepository
  • look up the annotation for that key
  • if it is empty, set the value to the PipelineRun name
  • if the update succeeds and the value of the updated annotation is the current PipelineRun name, the current pod is the leader and the step can terminate
  • if not, the value of the key is the leader PipelineRun to watch. The step then watches that PipelineRun until it is complete/deleted and all pods which have the label tekton.dev/pipelineRun=leaderPipelineRun have completed/failed
  • once there is no running leader PipelineRun or pods, it is time to try to become the leader again, so try to update the SourceRepository again
  • repeat until some timeout is reached and then fail?
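
A minimal sketch of how the locking loop could look, assuming the SourceRepository CRD is served as the jenkins.io/v1 sourcerepositories resource, that the PipelineRun name arrives via the $PIPELINE_RUN variable suggested above, and that the namespace, repository name, lock key and timeout are placeholders. Note that a Kubernetes annotation key may contain at most one "/", so the owner/repo/branch part would need to be encoded into the key name or stored in the value:

```go
// Rough sketch only: the GroupVersionResource, namespace, lock key and the
// polling fallback are illustrative assumptions, not the final jx design.
package main

import (
	"context"
	"fmt"
	"os"
	"time"

	"k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/rest"
)

var sourceRepoGVR = schema.GroupVersionResource{
	Group: "jenkins.io", Version: "v1", Resource: "sourcerepositories",
}

// acquireLock blocks until this PipelineRun owns the lock annotation on the
// SourceRepository, or until the timeout expires.
func acquireLock(ctx context.Context, client dynamic.Interface, ns, repoName, lockKey, pipelineRun string, timeout time.Duration) error {
	repos := client.Resource(sourceRepoGVR).Namespace(ns)
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		sr, err := repos.Get(ctx, repoName, metav1.GetOptions{})
		if err != nil {
			return err
		}
		annotations := sr.GetAnnotations()
		if annotations == nil {
			annotations = map[string]string{}
		}
		owner := annotations[lockKey]
		if owner == "" || owner == pipelineRun {
			// Claim the lock. The resourceVersion from Get makes this an
			// optimistic-concurrency update, so a racing run fails with a
			// conflict and loops round to re-read the annotation.
			annotations[lockKey] = pipelineRun
			sr.SetAnnotations(annotations)
			if _, err := repos.Update(ctx, sr, metav1.UpdateOptions{}); err == nil {
				return nil // we are the leader
			} else if !errors.IsConflict(err) {
				return err
			}
			continue
		}
		// Another PipelineRun holds the lock. The real step would watch that
		// PipelineRun and its tekton.dev/pipelineRun=<owner> pods; polling
		// keeps this sketch short.
		time.Sleep(10 * time.Second)
	}
	return fmt.Errorf("timed out waiting for lock %s on SourceRepository %s", lockKey, repoName)
}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := dynamic.NewForConfigOrDie(cfg)
	// The namespace, repository name and timeout here are placeholders.
	err = acquireLock(context.Background(), client, "jx", "my-owner-my-repo",
		"jenkins.io/lock", os.Getenv("PIPELINE_RUN"), 30*time.Minute)
	if err != nil {
		panic(err)
	}
	fmt.Println("acquired release lock; continuing pipeline")
}
```
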
jstrachan added the area/tekton, kind/enhancement and priority/important-soon labels on Sep 13, 2019
msvticket (Member) commented:

After getting the urgent issue (#5281) fixed with my PR, I continued to think in a similar direction. The principal difference is that your solution seems to put the lock on the whole pipeline run, while I was thinking of having the lock on a particular step (so rather the TaskRun), the reasoning being that there might be time-consuming steps that can run concurrently. Both varieties would be useful though.

msvticket (Member) commented Sep 13, 2019

There are also cases where the order of PipelineRuns is important, so a lock with FIFO behaviour would be preferable. Maybe solve this by storing a queue of runs in the SourceRepository? But then you need to take into account that the pod first in the queue might die while waiting.
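
A minimal sketch of the queue bookkeeping such a FIFO lock might need, assuming (purely for illustration) that the queue is stored as a comma-separated list of PipelineRun names in a single SourceRepository annotation, read and updated with the same optimistic-concurrency pattern as the sketch above; isFinished stands in for a check against the PipelineRun's actual status, which also evicts a head-of-queue run whose pod died while waiting:

```go
package main

import (
	"fmt"
	"strings"
)

// decode/encode the queue stored in the annotation value.
func decode(value string) []string {
	if value == "" {
		return nil
	}
	return strings.Split(value, ",")
}
func encode(queue []string) string { return strings.Join(queue, ",") }

// enqueue appends a run unless it is already queued.
func enqueue(queue []string, run string) []string {
	for _, r := range queue {
		if r == run {
			return queue
		}
	}
	return append(queue, run)
}

// prune drops runs that are finished or have disappeared, so a head-of-queue
// pod that died while waiting cannot block everyone behind it.
func prune(queue []string, isFinished func(run string) bool) []string {
	var live []string
	for _, r := range queue {
		if !isFinished(r) {
			live = append(live, r)
		}
	}
	return live
}

// isLeader reports whether run has reached the head of the queue.
func isLeader(queue []string, run string) bool {
	return len(queue) > 0 && queue[0] == run
}

func main() {
	q := enqueue(decode("run-a,run-b"), "run-c")
	q = prune(q, func(run string) bool { return run == "run-a" }) // run-a finished or died
	fmt.Println(encode(q), isLeader(q, "run-b"))                  // run-b,run-c true
}
```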

jenkins-x-bot (Contributor) commented:

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Provide feedback via https://jenkins-x.io/community.
/lifecycle stale

jenkins-x-bot (Contributor) commented:

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Provide feedback via https://jenkins-x.io/community.
/lifecycle rotten

msvticket (Member) commented:

/remove-lifecycle rotten

jenkins-x-bot (Contributor) commented:

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Provide feedback via https://jenkins-x.io/community.
/lifecycle stale

jenkins-x-bot (Contributor) commented:

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Provide feedback via https://jenkins-x.io/community.
/lifecycle rotten

jenkins-x-bot (Contributor) commented:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Provide feedback via https://jenkins-x.io/community.
/close

jenkins-x-bot (Contributor) commented:

@jenkins-x-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Provide feedback via https://jenkins-x.io/community.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the jenkins-x/lighthouse repository.

keskad (Contributor) commented Mar 14, 2022

It would be nice to be able to limit concurrency for specific jobs or even stages to avoid overlapping, especially when external services are used.

ankitm123 (Member) commented:

I think we want to solve this upstream in Tekton, instead of in jx (there's a TEP for that: tektoncd/experimental#699). We have something basic in Lighthouse using max_concurrency, I believe (https://github.com/jenkins-x/lighthouse/blob/main/docs/trigger/github-com-jenkins-x-lighthouse-pkg-config-job.md).
