Problem Statement
Right now, the health check for SpiceDB is very restrictive; in most cases (except where we hardcode specific exceptions) the datastore health check requires that the database be at exactly one migration level, the latest for that specific version of SpiceDB.
This health check only has to succeed once on startup, and then the health check ignores the migration state. Under normal circumstances, this doesn't cause problems during upgrades because new SpiceDB versions (with newer migrations) will start up while the old instances are still running.
This can cause (and has caused) problems if the "old" instances of SpiceDB need to restart for some reason after a newer migration has run - they will fail to come up healthy. This is in spite of the majority of the migrations for SpiceDB being fully backwards-compatible (the exceptions being phased migrations, in which case there is always some step to run that is backwards-compatible).
We can't simply allow the "next" migration, because we don't know what the "next" one is before we publish a specific version of SpiceDB. We may also want to skip ahead several migrations (e.g. the SpiceDB operator allows this as long as it is safe).
Solution Brainstorm
We could potentially add something smarter to the health check, so that SpiceDB only ensures that it has what it needs to work rather than requiring a specific migration name (e.g. check that field X exists on table Y).
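As a rough illustration of the "check what you need" idea, here is a minimal Go sketch. Everything in it is hypothetical: the `columnRequirement` type, the `missingColumns` function, and the table and column names are made up for this example and are not part of SpiceDB. It assumes the set of present columns has already been loaded from the datastore (on Postgres that could come from something like `information_schema.columns`).

```go
package main

import (
	"fmt"
	"sort"
)

// columnRequirement names one column this binary needs in order to work.
// Hypothetical type, used only for this sketch.
type columnRequirement struct {
	Table  string
	Column string
}

// missingColumns returns the required columns that are absent from the
// schema, formatted as "table.column" and sorted. The present argument maps
// a table name to the set of column names found in the datastore.
func missingColumns(required []columnRequirement, present map[string]map[string]bool) []string {
	var missing []string
	for _, req := range required {
		// Indexing a missing table yields a nil map, which reads as false.
		if !present[req.Table][req.Column] {
			missing = append(missing, req.Table+"."+req.Column)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	required := []columnRequirement{
		{Table: "table_y", Column: "field_x"},
		{Table: "table_y", Column: "field_z"},
	}
	// The schema has an extra column from a newer migration, which is fine,
	// but it lacks field_z, so the check reports it.
	present := map[string]map[string]bool{
		"table_y": {"field_x": true, "added_by_newer_migration": true},
	}
	fmt.Println(missingColumns(required, present))
}
```

With a check like this, a schema that is ahead of the binary passes as long as nothing the binary depends on was removed, which is the backwards-compatible case described above.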
But a simpler option would just be to add a flag, e.g. --allowed-migrations, that allows a user to specify a set of additional migrations that are considered safe on startup. The operator can set this flag to the migration of the newer version and roll the "old" version with that flag before deploying the migration itself. Then the old instances will continue to work even if they need to restart / reschedule after the migration has run.
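A minimal sketch of how the flag-based check could behave, written in Go. The function names, the comma-separated flag format, and the migration names are all assumptions made for illustration. None of this is existing SpiceDB code.

```go
package main

import (
	"fmt"
	"strings"
)

// parseAllowedMigrations turns a comma-separated flag value into a set of
// migration names. The comma-separated format is an assumption.
func parseAllowedMigrations(flagValue string) map[string]struct{} {
	allowed := make(map[string]struct{})
	for _, name := range strings.Split(flagValue, ",") {
		name = strings.TrimSpace(name)
		if name != "" {
			allowed[name] = struct{}{}
		}
	}
	return allowed
}

// migrationIsAcceptable reports whether the datastore's current migration is
// either the one this binary expects or one the user explicitly allowed.
func migrationIsAcceptable(current, expected string, allowed map[string]struct{}) bool {
	if current == expected {
		return true
	}
	_, ok := allowed[current]
	return ok
}

func main() {
	// Hypothetical migration names: the old binary expects "initial-schema",
	// and the newer release migrates the datastore to "add-foo-column".
	allowed := parseAllowedMigrations("add-foo-column, add-bar-index")

	fmt.Println(migrationIsAcceptable("add-foo-column", "initial-schema", allowed))
	fmt.Println(migrationIsAcceptable("some-unknown-migration", "initial-schema", allowed))
}
```

The first call passes because the newer migration was listed explicitly, so a restarted "old" instance would come up healthy. The second fails as it does today, so the check stays strict for anything not vetted.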