Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

839: Setup DB migrations #841

Closed
wants to merge 1 commit into from
Closed

Conversation

sarahsporck
Copy link
Contributor

Flyway seems to be the most popular tool in regard to db migrations. There is a community and an enterprise version and I think the community edition provides enough for our regards.
Setup seems to be quite simple.

@sarahsporck sarahsporck force-pushed the 839-setup-db-migrations branch from 42868c4 to dc30c96 Compare March 3, 2023 14:52
@sarahsporck sarahsporck linked an issue Mar 3, 2023 that may be closed by this pull request
@michael-markl
Copy link
Member

I feel like I cannot really review this right now as I don't have much time and not a lot of expertise on this matter... There's probably some more considerations to take regarding DB migrations:

  • Proper documentation of the workflow
  • When do we do DB migrations? At the start of every backend run? Manually using another CLI command?
  • How do we make sure, the backend runs in sync with the DB schema? (Migrating on start would work; Alternatively: On backend start: Verify that all migrations found in the backend code have been applied)
  • Backup / Recovery strategy: Before every DB migration, we should probably do a backup, try to migrate and verify everything worked out as expected. If not, roll back.

Maybe the CMS team has more insights how to handle / setup DB migrations?

@sarahsporck
Copy link
Contributor Author

I feel like I cannot really review this right now as I don't have much time and not a lot of expertise on this matter...

Same... I already worked with migrations, but not in any critical environment.

There's probably some more considerations to take regarding DB migrations:

  • Proper documentation of the workflow

I also thought of this. especially when changing branches and stuff the database might not match the current state.

  • When do we do DB migrations? At the start of every backend run? Manually using another CLI command?

It's possible to create a gradle task for this. But for production I think it is better to execute it every time we start the backend. Already applied migrations will not be reapplied.

  • How do we make sure, the backend runs in sync with the DB schema? (Migrating on start would work; Alternatively: On backend start: Verify that all migrations found in the backend code have been applied)
  • Backup / Recovery strategy: Before every DB migration, we should probably do a backup, try to migrate and verify everything worked out as expected. If not, roll back.

The framework tries to apply the migrations in a single transaction. If anything fails, the database will roll back.
Regardless I think backing up the db becomes very important.

Maybe the CMS team has more insights how to handle / setup DB migrations?

As far as I know the CMS automatically generates the migration files, as migrations are already integrated into their db framework. This is definitely nothing, that exposed currently supports. I guess I will still make a call with the backend team to gather some more experience. Maybe @maxammann finds a few minutes to take a look.

Either way dropping the index is currently not really a pressing issue. I'll merge #833 and keep this PR open for further discussion.

Base automatically changed from 826-emails-should-not-be-case-sensitive-when-logging-in to main March 6, 2023 12:24
Copy link
Member

@timobrembeck timobrembeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like I cannot really review this right now as I don't have much time and not a lot of expertise on this matter...

Same... I already worked with migrations, but not in any critical environment.

Django takes care of most critical issues completely automatically, so with my comments I'm just throwing another unqualified opinion into the ring 😅

  • Proper documentation of the workflow

I also thought of this. especially when changing branches and stuff the database might not match the current state.

For our CMS, Django provides an easy way of undoing migrations in such cases... if I understood correctly, flyway only supports this as part of their "Teams" plan, and also since the migrations are not autogenerated, you have to manually write an "undo" migration for every migration you write, so nearly twice the work? If you don't need to persist data between branch changes, dropping the db and re-creating it with the new schema could be sufficient for the development workflow? Honestly, this is also what we're doing most of the time, since re-creating is faster than figuring out which migrations to undo 😄

  • When do we do DB migrations? At the start of every backend run? Manually using another CLI command?

It's possible to create a gradle task for this. But for production I think it is better to execute it every time we start the backend. Already applied migrations will not be reapplied.

Depending on how often the server restarts, this could be a bit overkill... I mean it's really only required when the code base changes, so on every new deployment? Since it's automated anyway, maybe just add the migration step to the autoupdate cronjob?

  • How do we make sure, the backend runs in sync with the DB schema? (Migrating on start would work; Alternatively: On backend start: Verify that all migrations found in the backend code have been applied)

Maybe the validate command can be used on every server start?
On the other hand, if the deployment and migration steps are fully automated, there isn't much potential for errors anymore (we should only set up notifications in case the migration itself fails)...

  • Backup / Recovery strategy: Before every DB migration, we should probably do a backup, try to migrate and verify everything worked out as expected. If not, roll back.

The framework tries to apply the migrations in a single transaction. If anything fails, the database will roll back. Regardless I think backing up the db becomes very important.

Database backups seem to exist on our server infra? But restoring them should probably be tested at least once in a while to make sure we didn't miss anything important.

@@ -0,0 +1 @@
alter table administrators drop constraint administrators_email_unique;
Copy link
Member

@timobrembeck timobrembeck Mar 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be different for this specific setup, but in all migration frameworks I know the initial migration starts with an empty database and creates all tables and columns to allow a deterministic outcome (or is this done as part of the setupDatabase* functions?). Otherwise, this might even throw an error on a new setup where the administrators_email_unique constraint doesn't exist yet?

Of course this would normally cause an error on all systems which already have initialized the database, so they have to be baselined (or the new initial migration would have to be applied with the skipExecutingMigrations option) to tell flyway that it should mark the initial migration as applied without really changing anything.
This file would then be v2 and could be applied on top of the initial one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jep you're completely right. It must be at least V2 to be properly applied. Tested that again today. The setupDatabase* functions will create all the tables. Correct. But as this is coupled into the kotlin code and nondeterministic, meaning there can be a different outcome when applying the setupDatabase function e.g. initially and on an already existing Database, As the tables are created in kotlin directly, and columns and constraints are added automatically when we add them in java. See jetbrains exposed.

@michael-markl
Copy link
Member

michael-markl commented Mar 7, 2023

On another note:
Maybe we can use exposed's statementsRequiredToActualizeScheme (or related methods in this file) to validate that our backend with the schema defined in Exposed actually runs against a DB that supports that schema? (And maybe even suggest us, devs, potential migrations statements for Flyway)

@sarahsporck
Copy link
Contributor Author

So we could also consider using a simple native implementation for migrations. In that case we could take the following project as a basis: https://gitlab.com/andreas-mausch/exposed-migrations/-/tree/master/
I'll also try implementing this "simple" implementation in the following weeks. then we can compare.

Copy link
Contributor

@f1sh1918 f1sh1918 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the V1 file should be removed since this was just an example and we start with V2, no?
I feel fine with this solution and tested some sql and kotlin migrations on it.
Maybe we can trigger a database backup before running migration. Just couldn't figure out yet how we can check if a migration has to be executed and then trigger a backup. To create a backup on every backup start before starting migration is also an option

@@ -0,0 +1 @@
alter table administrators drop constraint administrators_email_unique;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes the problem I described above, but I think it's the wrong approach - the more migrations you add, the more complex it will get to keep all possible database states in mind and write your migrations conditionally for all these cases.
Instead, I think it's preferable to add exactly one deterministic history of migrations, starting with an empty database, creating the existing schema in v1, and so on. Then, you can always rely on the fact that all n-1 migrations have been applied and you know exactly which state the database is in when you run a new migration.

@sarahsporck
Copy link
Contributor Author

I feel like the custom approach comes with higher flexibility and serves the same features as flyway. So I'd go with that solution and close this PR.

@sarahsporck sarahsporck deleted the 839-setup-db-migrations branch August 18, 2023 08:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Setup DB migration
4 participants