
Add a flag to override allowed migrations in health check #2135

Open
ecordell opened this issue Nov 15, 2024 · 0 comments
Labels
kind/proposal Something fundamentally needs to change

Comments

@ecordell
Contributor

ecordell commented Nov 15, 2024

Problem Statement

Right now, the health check for SpiceDB is very restrictive; in most cases (except where we hardcode specific exceptions) the datastore health check requires that the database be at exactly one migration level, the latest for that specific version of SpiceDB.

This health check only has to succeed once on startup, and then the health check ignores the migration state. Under normal circumstances, this doesn't cause problems during upgrades because new SpiceDB versions (with newer migrations) will start up while the old instances are still running.

This can cause (and has caused) problems if the "old" instances of SpiceDB need to restart for some reason after a newer migration has run - they will fail to come up healthy. This is in spite of the majority of the migrations for SpiceDB being fully backwards-compatible (the exceptions being phased migrations, in which case there is always some step to run that is backwards-compatible).

We can't simply allow the "next" migration, because at the time we publish a specific version of SpiceDB we don't yet know what the "next" one will be. We may also want to skip ahead several migrations (e.g. the SpiceDB operator allows this as long as it is safe).

Solution Brainstorm

We could potentially add something smarter to the health check, so that SpiceDB only ensures that it has what it needs to work rather than requiring a specific migration name (e.g. check that field X exists on table Y).
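A minimal sketch of what such a schema-based check could look like. Everything here is hypothetical and for illustration only: the `schemaInspector` interface, the `columnRequirement` type, and the in-memory `fakeSchema` are not part of SpiceDB, and a real implementation would query the datastore's own catalog.

```go
package main

import "fmt"

// columnRequirement names a table/column pair that this binary needs
// in order to operate. Hypothetical type, not a SpiceDB API.
type columnRequirement struct {
	Table  string
	Column string
}

// schemaInspector abstracts however a datastore exposes its schema.
// Hypothetical interface, assumed for this sketch.
type schemaInspector interface {
	HasColumn(table, column string) (bool, error)
}

// missingColumns returns the requirements that the schema does not satisfy.
// An empty result means the binary has what it needs, regardless of which
// migration revision the datastore reports.
func missingColumns(s schemaInspector, reqs []columnRequirement) ([]columnRequirement, error) {
	var missing []columnRequirement
	for _, r := range reqs {
		ok, err := s.HasColumn(r.Table, r.Column)
		if err != nil {
			return nil, fmt.Errorf("inspecting %s.%s: %w", r.Table, r.Column, err)
		}
		if !ok {
			missing = append(missing, r)
		}
	}
	return missing, nil
}

// fakeSchema is an in-memory schemaInspector (table name -> set of columns)
// standing in for a real database in this example.
type fakeSchema map[string]map[string]bool

func (f fakeSchema) HasColumn(table, column string) (bool, error) {
	// Indexing a missing table yields a nil map, which reads as false.
	return f[table][column], nil
}
```

With a check like this, the health check would pass whenever `missingColumns` returns an empty slice, so a newer, backwards-compatible migration would not block an older binary from starting.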

But a simpler option would be to add a flag, e.g. --allowed-migrations, that allows a user to specify a set of additional migrations that are considered safe on startup. The operator can set this flag to the migration of the newer version and roll the "old" version with that flag before deploying the migration itself. Then the old instances will continue to work even if they need to restart / reschedule after the migration has run.
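As a rough sketch of the flag-based option, the startup check could accept either the revision the binary was built against or any revision listed in the flag value. The function names, the comma-separated flag format, and the revision names used with them are assumptions for illustration, not existing SpiceDB code.

```go
package main

import "strings"

// parseAllowedMigrations splits a comma-separated flag value into revision
// names, trimming whitespace and dropping empty entries.
// The comma-separated format is an assumption of this sketch.
func parseAllowedMigrations(raw string) []string {
	var out []string
	for _, part := range strings.Split(raw, ",") {
		if name := strings.TrimSpace(part); name != "" {
			out = append(out, name)
		}
	}
	return out
}

// migrationAllowed reports whether the datastore's current revision is
// acceptable at startup: either the head revision this binary expects,
// or one of the extra revisions the user declared safe.
func migrationAllowed(current, head string, extraAllowed []string) bool {
	if current == head {
		return true
	}
	for _, rev := range extraAllowed {
		if current == rev {
			return true
		}
	}
	return false
}
```

For example, an "old" instance whose head is a hypothetical revision `rev-a` would come up healthy against a datastore at `rev-b` only if the flag value includes `rev-b`. Any revision not listed still fails the check, which preserves the current strict behavior by default.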

ecordell added the kind/proposal label on Nov 15, 2024