race conditions when having more than one pipeline of the same branch #394

Closed
gerardcl opened this issue Jan 12, 2022 · 1 comment · Fixed by #474

@gerardcl
Member

We have detected at least two situations in which two or more pipelines run at the same time (never with the same start time):

  • pushing two consecutive commits in quick succession, or pushing while a pipeline is already running:
    [image: two-consecutives-commits]

  • pushing a commit and then creating a PR:
    [image: commit-and-pr]

Depending on the timing, errors can occur in any task:

  • task ods-start:
step-ods-start
+ ods-start -project=myproject -environment= -version= -git-full-ref=refs/heads/test-multiple-commits-ods-bug -git-ref-spec= -url=https://bb.example.com/scm/myproject/myproject-acomponent.git -pr-key=0 -pr-base= -http-proxy= -https-proxy= -no-proxy= -ssl-verify=true -submodules=true -depth=1 -pipeline-run-name=poc-ast-test-multiple-commits-ods-bug-wctxc
[INFO] Cleaning checkout directory ...
[INFO] Checking out https://bb.example.com/scm/myproject/myproject-acomponent.git@refs/heads/test-multiple-commits-ods-bug into /workspace/source ...
[ERROR] {"level":"info","ts":1641973524.5227394,"caller":"git/git.go:169","msg":"Successfully cloned https://bb.example.com/scm/myproject/myproject-acomponent.git @ f76f8cf66880e6409668953e34c0d175eba1522b (grafted, HEAD, origin/test-multiple-commits-ods-bug) in path /workspace/source"}
{"level":"error","ts":1641973524.5810995,"caller":"git/git.go:54","msg":"Error running git [submodule update --recursive --init --depth=1]: exit status 1\nfatal: not a git repository (or any parent up to mount point /workspace)\nStopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).\nfatal: /usr/libexec/git-core/git-submodule cannot be used without a working tree.\n","stacktrace":"github.com/tektoncd/pipeline/pkg/git.run\n\t/opt/app-root/src/go/src/github.com/tektoncd/pipeline-0.24.0/pkg/git/git.go:54\ngithub.com/tektoncd/pipeline/pkg/git.submoduleFetch\n\t/opt/app-root/src/go/src/github.com/tektoncd/pipeline-0.24.0/pkg/git/git.go:204\ngithub.com/tektoncd/pipeline/pkg/git.Fetch\n\t/opt/app-root/src/go/src/github.com/tektoncd/pipeline-0.24.0/pkg/git/git.go:171\nmain.main\n\t/opt/app-root/src/go/src/github.com/tektoncd/pipeline-0.24.0/cmd/git-init/main.go:53\nruntime.main\n\t/usr/lib/golang/src/runtime/proc.go:225"}
{"level":"fatal","ts":1641973524.581184,"caller":"git-init/main.go:54","msg":"Error fetching git repository: exit status 1","stacktrace":"main.main\n\t/opt/app-root/src/go/src/github.com/tektoncd/pipeline-0.24.0/cmd/git-init/main.go:54\nruntime.main\n\t/usr/lib/golang/src/runtime/proc.go:225"}

2022/01/12 07:45:24 exit status 1
  • task <tech>-build, e.g. sonar:
2022/01/12 08:31:54 scan failed: scanning failed: exit status 1, stderr: ERROR: Error during SonarQube Scanner execution
ERROR: Another SonarQube analysis is already in progress for this project
ERROR: 
ERROR: Re-run SonarQube Scanner using the -X switch to enable full debug logging.
  • task deploy-helm:
Upgrading Helm release to acomponent-0.1.0+a71a414ad68ab8f57a92934a8d820e56fad8e129.tgz...
helm --namespace=myproject-dev secrets upgrade --wait --install --values=./chart/secrets.yaml --set=image.tag=a71a414ad68ab8f57a92934a8d820e56fad8e129 acomponent acomponent-0.1.0+a71a414ad68ab8f57a92934a8d820e56fad8e129.tgz
[helm-secrets] Decrypt skipped: ./chart/secrets.yaml
Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress
Error: plugin "secrets" exited with error
  • task ods-finish:
[INFO] Uploading .ods/artifacts/xunit-reports/report.xml to Nexus repository ods-temporary-artifacts, group /myproject/myproject-acomponent/a2a63f6fdc27ba35654679bfc7996366f74b53d2/xunit-reports ...

2022/01/07 15:32:15 cannot upload artifacts of main repository: could not upload component: component not uploaded: 400 Bad Request

So race conditions of many kinds can occur whenever more than one pipeline runs at the same time.

For now, the usual workaround is to be aware of this and re-run the failed pipeline in OpenShift.

Possible solutions:
1- no solution, just awareness
2- queueing/semaphore mechanisms (PVC checks, file checks,...)
3- investigate what is already possible with Tekton and/or OpenShift Pipelines

New proposals, comments, etc., are welcome!

NOTE: this might also happen across different branches, not only within the same branch.
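
Option 2 above could be sketched roughly as follows. This is only an illustration in Go, not code from this project: `RepoLocks`, `NewRepoLocks`, `TryAcquire` and `Release` are hypothetical names, and the lock lives in memory, so it assumes that a single process decides when runs start and it does not survive a restart of that process.

```go
package main

import "sync"

// RepoLocks is a hypothetical in-memory semaphore with one slot per
// repository: at most one pipeline run per repository may hold the slot.
type RepoLocks struct {
	mu   sync.Mutex
	busy map[string]bool
}

// NewRepoLocks returns an empty set of repository locks.
func NewRepoLocks() *RepoLocks {
	return &RepoLocks{busy: make(map[string]bool)}
}

// TryAcquire reports whether the caller obtained the slot for repo.
// It never blocks: if another run holds the slot, it returns false.
func (l *RepoLocks) TryAcquire(repo string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.busy[repo] {
		return false
	}
	l.busy[repo] = true
	return true
}

// Release frees the slot for repo so that the next run can start.
func (l *RepoLocks) Release(repo string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	delete(l.busy, repo)
}
```

A caller would invoke `TryAcquire` with the repository name before starting a run, keep the run queued when it returns false, and call `Release` once the run has finished. Runs of different repositories do not block each other.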

gerardcl added the bug ("Something isn't working") label Jan 12, 2022
@michaelsauter
Member

Thanks for opening this.

I wrongly assumed that pipelines wouldn't start in parallel because they mount the same PVC. It seems this is not the case.

Solution (1) is the short-term solution but not one I am ready to accept.

Related to (2) and (3) I found:

michaelsauter self-assigned this Jan 14, 2022
michaelsauter added this to the 0.3.0 milestone Jan 14, 2022
This was referenced Jan 14, 2022
michaelsauter added a commit that referenced this issue Mar 21, 2022
Pipeline runs belonging to one repository now run sequentially. If a
pipeline run cannot start immediately, it is created as "pending", and a
process is started to periodically check if it can start.

Since the pipeline manager service may be restarted, it checks on boot
whether there are any pending runs for the repositories under its control
and starts the periodic check for those.

We can improve on the design by adding a signal in the finish task of a
pipeline run that the run will soon finish, reducing the up to 30s wait
time for the next run.

Closes #394.
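
The behaviour described in this commit message can be illustrated with a small Go sketch. It is not the code from the commit: `Run`, `nextToStart`, `hasPending` and `pollQueue` are hypothetical names, runs are plain structs instead of Tekton objects, and `list` is assumed to return the runs of one repository ordered oldest first.

```go
package main

import (
	"context"
	"time"
)

// Run is a hypothetical, simplified stand-in for a pipeline run.
type Run struct {
	Name    string
	Pending bool // created, but not yet allowed to start
	Done    bool // finished (successfully or not)
}

// nextToStart returns the oldest pending run if no run of the repository
// is currently in progress, and nil otherwise. runs must be oldest first.
func nextToStart(runs []*Run) *Run {
	var oldest *Run
	for _, r := range runs {
		if r.Done {
			continue
		}
		if !r.Pending {
			return nil // a run is still in progress, nothing may start
		}
		if oldest == nil {
			oldest = r
		}
	}
	return oldest
}

// hasPending reports whether any unfinished run is still pending.
func hasPending(runs []*Run) bool {
	for _, r := range runs {
		if !r.Done && r.Pending {
			return true
		}
	}
	return false
}

// pollQueue checks once immediately and then every interval whether the
// next pending run can start. It returns when nothing is pending anymore
// or when ctx is cancelled. interval must be greater than zero.
func pollQueue(ctx context.Context, interval time.Duration, list func() []*Run) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		runs := list()
		if r := nextToStart(runs); r != nil {
			r.Pending = false // stands in for actually starting the run
		}
		if !hasPending(runs) {
			return
		}
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
		}
	}
}
```

On service start, the same `pollQueue` could be launched for every repository that still has pending runs, which mirrors the boot-time check mentioned above. The polling interval bounds how long the next run waits after the previous one finishes, which is the wait of up to 30s that the commit message proposes to shorten with a signal from the finish task. The sketch is not safe for concurrent modification of the runs; a real implementation would read the state from the cluster on every check.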
michaelsauter added a commit that referenced this issue Mar 23, 2022
Related to #394.
michaelsauter added further commits that referenced this issue, repeating the Mar 21 commit message above, on Mar 24, Mar 25, Mar 28 and Mar 30, 2022.