I noticed strange behavior when I run multiple instances of the worker (say 3), all pointing to the same database. I'm currently using the Postgres implementation with some changes.
Here's the screenshot
You can see that the trace contains several orchestration:SimpleOrchestration spans. The same thing happens with activities.
I also see several of these logs.
{"time":"2024-01-13T14:55:54.491837674+08:00","level":"ERROR","msg":"orchestration-processor: failed to complete work item: instance 'db1659b0-1528-4042-a500-0cb3822f2cad' no longer exists or was locked by a different worker"}
{"time":"2024-01-13T14:55:54.497473338+08:00","level":"ERROR","msg":"orchestration-processor: failed to abandon work item: lock on work-item was lost"}
I think this happens because several workers process the same work item at once, and one of them has already transitioned or completed it by the time the others try to complete or abandon it.
I found the issue: the different executor/worker instances are picking up the same rows from the database.
I fixed it by using SELECT ... FOR UPDATE SKIP LOCKED, so rows already locked by one process are simply skipped by the others. I applied this in both GetOrchestrationWorkItem and GetActivityWorkItem.
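For anyone who wants to see the shape of the fix, here is a minimal sketch of claiming a work item with FOR UPDATE SKIP LOCKED. It is not the code from the repo: the table name orchestration_work_items, the columns instance_id, locked_by and created_at, and the function claimNextWorkItem are all hypothetical, and the $1 placeholders assume a Postgres driver.

```go
package main

import (
	"context"
	"database/sql"
	"errors"
)

// lockClause makes the SELECT take a row lock and skip rows that another
// transaction has already locked, instead of blocking on them.
const lockClause = "FOR UPDATE SKIP LOCKED"

// claimQuery uses hypothetical table and column names; the real schema may differ.
const claimQuery = `
SELECT instance_id
FROM orchestration_work_items
WHERE locked_by IS NULL
ORDER BY created_at
LIMIT 1
` + lockClause

// ErrNoWorkItems means no unclaimed, unlocked row was found.
var ErrNoWorkItems = errors.New("no work items available")

// claimNextWorkItem selects one unclaimed row and marks it as owned by
// workerID, all inside a single transaction. The row lock is held until
// Commit, so concurrent workers running the same query skip this row.
func claimNextWorkItem(ctx context.Context, db *sql.DB, workerID string) (string, error) {
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return "", err
	}
	defer tx.Rollback() // has no effect once Commit has succeeded

	var instanceID string
	err = tx.QueryRowContext(ctx, claimQuery).Scan(&instanceID)
	if errors.Is(err, sql.ErrNoRows) {
		return "", ErrNoWorkItems
	}
	if err != nil {
		return "", err
	}

	// Record ownership so the row stays claimed after the lock is released.
	_, err = tx.ExecContext(ctx,
		`UPDATE orchestration_work_items SET locked_by = $1 WHERE instance_id = $2`,
		workerID, instanceID)
	if err != nil {
		return "", err
	}
	return instanceID, tx.Commit()
}
```

The SELECT and the UPDATE have to run in the same transaction. If the SELECT runs on its own, the row lock is released as soon as the statement's implicit transaction ends, and another worker can pick up the same row before it is marked as claimed.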
The code is in this repo.
This repo is a copy of the original PR #33; I just wanted to see how it works with a Postgres database.