Replace remove_at with processing_attempts #171

Closed
markstory opened this issue Feb 6, 2025 · 0 comments · Fixed by #179

The remove_at column has not worked out as well as planned. We cannot distinguish activations that are stuck in sqlite because workers are unable to process them (worker death) from activations that are stuck in sqlite because no workers are available (absent workers).

This scenario has come up in sandbox testing: we can fill a broker's db with activations and then shut it down. When it starts again later, every activation is already past its remove_at, so the broker is unable to make progress.

The solution discussed on Feb 6 for this was to replace remove_at with processing_attempts. Each time we reset an activation from processing -> pending, we also increment the processing_attempts counter.
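As a rough illustration of the reset step, here is a minimal sketch using Python's sqlite3 module. The `activations` table, its columns other than processing_attempts, the status strings, and the `reset_to_pending` helper are all assumptions made for the example, not the broker's actual schema or code.

```python
import sqlite3

# Hypothetical, simplified schema; the real table has more columns.
SCHEMA = """
CREATE TABLE activations (
    id TEXT PRIMARY KEY,
    status TEXT NOT NULL,
    processing_attempts INTEGER NOT NULL DEFAULT 0
)
"""


def reset_to_pending(conn: sqlite3.Connection, activation_id: str) -> int:
    """Move one activation from processing back to pending.

    The attempt counter is incremented in the same UPDATE so the status
    change and the count cannot drift apart. Returns the number of rows
    changed (0 if the activation was not in processing).
    """
    cur = conn.execute(
        """
        UPDATE activations
        SET status = 'pending',
            processing_attempts = processing_attempts + 1
        WHERE id = ? AND status = 'processing'
        """,
        (activation_id,),
    )
    conn.commit()
    return cur.rowcount
```

Because the counter only moves when a worker actually held the activation, an activation that sits in pending with no workers available never accumulates attempts.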

In upkeep we can scan for pending tasks that have a processing_attempts value higher than the max allowed attempts, and discard/deadletter those activations. This will allow us to move from timestamp-based purging to attempt-based purging, which also simplifies the absent worker scenario.
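The upkeep scan could look something like the sketch below. It assumes a hypothetical `activations` table with `status` and `processing_attempts` columns and string status values; `fail_exhausted_activations` is an illustrative name, not an existing function.

```python
import sqlite3


def fail_exhausted_activations(
    conn: sqlite3.Connection, max_processing_attempts: int
) -> int:
    """Mark pending activations that exceeded the attempt limit as failed.

    Only rows with processing_attempts strictly greater than the limit are
    touched. Returns the number of activations moved to failed.
    """
    cur = conn.execute(
        """
        UPDATE activations
        SET status = 'failed'
        WHERE status = 'pending'
          AND processing_attempts > ?
        """,
        (max_processing_attempts,),
    )
    conn.commit()
    return cur.rowcount
```

Note that no timestamp is consulted, so a broker that was shut down for a long time does not fail anything on restart.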

Changes to make

  • Add processing_attempts to sqlite.
  • Remove remove_at from sqlite.
  • Each time an activation is moved out of processing into pending, increment the attempt counter.
  • Add max_processing_attempts to configuration.
  • Remove remove_deadline from config.
  • During upkeep, any activations with processing_attempts in excess of max_processing_attempts should be moved to failed so that they can be discarded/deadlettered.
  • Simplify the remove_completed logic so that it no longer requires an incomplete task to follow a complete one. All completed tasks can be removed from sqlite.
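
For the last item, a simplified remove_completed might reduce to a single unconditional delete. This is only a sketch: the `activations` table name and the `'complete'` status string are assumptions for illustration.

```python
import sqlite3


def remove_completed(conn: sqlite3.Connection) -> int:
    """Delete every completed activation, regardless of neighbouring rows.

    Returns the number of rows removed.
    """
    cur = conn.execute("DELETE FROM activations WHERE status = 'complete'")
    conn.commit()
    return cur.rowcount
```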