Support chunking of actions submitted to backend #44

Open
cgillum opened this issue Nov 11, 2023 · 0 comments

Some backend types may not be able to save an unbounded number of actions to storage in a single operation. In such cases, a large batch of actions must be broken into smaller chunks to stay within the store's limits. For example, Azure Cosmos DB doesn't allow saving more than 100 documents in a single batch transaction, and that transaction must not exceed 2 MB in size. This is an issue for the Dapr Workflow project, as noted in dapr/dapr#6544.

Even for stores that support saving an unbounded number of records in a single transaction, it may be desirable to break those transactions into smaller chunks. One reason is that large transactions can occupy too many database resources. Another is that large transactions can take a long time, increasing the chance of failures and causing work to be redone more often. In degenerate cases, this can cause workflows to get stuck, continuously consuming large amounts of resources and scheduling the same work over and over.

Rather than making each backend implementation do its own chunking, the durabletask-go engine should support this directly. Depending on configuration, the orchestration engine can submit multiple calls to the backend, one for each logical chunk. The configuration could, for example, include MaxNewHistoryEventCount and MaxNewHistoryEventBytes settings. When the payload of an orchestration result is about to exceed either limit, the engine calls Backend.CompleteOrchestrationWorkItem to save the current chunk. The engine then continues building the next payload until a final call to Backend.CompleteOrchestrationWorkItem is made with the final set of updates. A sketch of this flow appears below.
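
To make the proposal concrete, here's a minimal sketch of the chunking loop. The MaxNewHistoryEventCount and MaxNewHistoryEventBytes names come from this proposal; historyEvent and the complete callback are hypothetical stand-ins for the engine's real history-event type and for Backend.CompleteOrchestrationWorkItem, not the actual durabletask-go API:

```go
package chunking

// ChunkingOptions mirrors the proposed engine configuration. The field
// names come from this issue; everything else in this sketch is a
// simplified stand-in for the real durabletask-go types.
type ChunkingOptions struct {
	MaxNewHistoryEventCount int // max events per backend call (0 = unlimited)
	MaxNewHistoryEventBytes int // max total payload bytes per backend call (0 = unlimited)
}

// historyEvent stands in for the engine's history event type.
type historyEvent struct {
	payload []byte
}

// completeFunc stands in for Backend.CompleteOrchestrationWorkItem.
// isFinal is true only for the call that carries the final set of updates.
type completeFunc func(chunk []historyEvent, isFinal bool) error

// submitInChunks splits newEvents into chunks that respect both limits and
// submits each chunk to the backend with a separate call.
func submitInChunks(newEvents []historyEvent, opts ChunkingOptions, complete completeFunc) error {
	var chunk []historyEvent
	chunkBytes := 0

	for _, e := range newEvents {
		eventBytes := len(e.payload)

		// Flush the current chunk if adding this event would exceed
		// either the count limit or the byte limit.
		exceedsCount := opts.MaxNewHistoryEventCount > 0 && len(chunk)+1 > opts.MaxNewHistoryEventCount
		exceedsBytes := opts.MaxNewHistoryEventBytes > 0 && chunkBytes+eventBytes > opts.MaxNewHistoryEventBytes
		if len(chunk) > 0 && (exceedsCount || exceedsBytes) {
			if err := complete(chunk, false /* isFinal */); err != nil {
				return err
			}
			chunk, chunkBytes = nil, 0
		}

		chunk = append(chunk, e)
		chunkBytes += eventBytes
	}

	// The final call carries the remaining events and the final set of updates.
	return complete(chunk, true /* isFinal */)
}
```

Marking only the last call as final would let the backend distinguish intermediate chunk saves from the completing update, so it could defer work such as releasing the orchestration work item until the last chunk lands.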
