Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

911: Setup migrations #906

Merged
merged 7 commits into from
Apr 4, 2023
Merged

911: Setup migrations #906

merged 7 commits into from
Apr 4, 2023

Conversation

sarahsporck
Copy link
Contributor

So this PR is basically the counterpart to #841. I did not use an external migration tool, but only created a migration table, as it was also suggested by @maxammann .

Pros:

  • we do not need an external closed source dependency
  • missing features such as rollback can be added later
  • migrations can be written in *.java

Cons:

  • missing experience on how migrations work and best practices
  • potentially additional maintenance
  • migrations cannot be written in sql (only indirect by passing the sql string to exposed)

While originally, I was pretty convinced that flyway is the better solution I now think that using a custom migration implementation has the same value, and is probably also implemented very similar.

Please, leave you're opinions on this in a timely manner so we can decide, which way we want to go :)

Copy link
Contributor

@f1sh1918 f1sh1918 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well i checked both PRs now and i feel good with flyway to be honest i just created some basic migration scripts in sql and kotlin and everything worked fine. For our concerns it should be fine. Since i have no real experience in database migration i would tend to use a library which is for my opinion easy to use and has a good documentation and you can find some easy examples

Copy link
Member

@michael-markl michael-markl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally (and somewhat surprisingly for me :D), I think I like this approach better as it has quite a little footprint, is easy to understand (no need to fight through flyway docs), and super flexible.

I think in both PRs you are missing one important aspect:
V1 should always create all tables at some baseline (I would suggest baseline=current tables in production).

This means, we'll have to create a lot of duplication in the V1 migration file (as we should not use the schema definitions in the "production code", as they can change in the future), but that's fine.
That being said, consequently, we should remove all other setupDatabase functions. We never want files other than migration files to alter the DB schema.

Copy link
Member

@michael-markl michael-markl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a next step / issue, we could set up our CI to run all migrations starting from an empty table, just to have some confidence on push.

@f1sh1918
Copy link
Contributor

In a next step / issue, we could set up our CI to run all migrations starting from an empty table, just to have some confidence on push.

But does this migration apply @michael-markl i couldn't get it work. Still having the email index

@f1sh1918
Copy link
Contributor

f1sh1918 commented Mar 30, 2023

Personally (and somewhat surprisingly for me :D), I think I like this approach better as it has quite a little footprint, is easy to understand (no need to fight through flyway docs), and super flexible.

I think in both PRs you are missing one important aspect: V1 should always create all tables at some baseline (I would suggest baseline=current tables in production).

This means, we'll have to create a lot of duplication in the V1 migration file (as we should not use the schema definitions in the "production code", as they can change in the future), but that's fine. That being said, consequently, we should remove all other setupDatabase functions. We never want files other than migration files to alter the DB schema.

OK so as i got that right, for every adjustment i make for the schema like adding a column, i have to write a migration.
Hm so it would make sense also to be able to execute sql files?!
Maybe a stupid question but can we not just run the setupDatabase functions once in the first migration file to create the baseline?

@michael-markl
Copy link
Member

michael-markl commented Mar 30, 2023

OK so as i got that right, for every adjustment i make for the schema like adding a column, i have to write a migration.

That's my understanding of DB migrations, yes.

Hm so it would make sense also to be able to execute sql files?!

I'm not entirely sure what your questions refers to, but you can also execute arbitrary SQL statements through exposed.

Maybe a stupid question but can we not just run the setupDatabase functions once in the first migration file to create the baseline?

Not a stupid question. But: I think, what DB migrations usually want to achieve is, that if you start with an empty DB and apply all migrations, you end up in a deterministic state. This means, that once you have written a migration (and executed it on the prod system), you should usually never change it. Otherwise, you cannot replicate the production schema by simply running all migrations again on an empty DB.

So let's assume, we would just use the setupDatabase functions in the V1 migration step. At some point we want to add/drop a column from a table. Then you would adjust the schema in the production code (that we call from our webservice). But with this adjustment, you also change the V1 migration step.

What we could do, I think, is redefine our "first" schema inside the V1 migration file (basically copy pasting every Schema we have into the migration file) and call exposed's SchemaUtils.create on these schemas.
Edit1: Another possibility would be to use pg_dump --schema-only on our current prod DB and to use that as our V1 script.

Edit2:

But does this migration apply @michael-markl i couldn't get it work. Still having the email index

I haven't tested it.

Copy link
Member

@timobrembeck timobrembeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very promising 👍

I don't have a strong preference towards one of the two solutions, but I'd give this one a slight edge since having the possibility to use the exposed ORM in migrations can be a crucial advantage, especially when dealing with data migrations where you not only alter the schema of a table, but also its contents.

I noticed there is a fork of the initial code base, Suwayomi/exposed-migrations, which seems to be more or less maintained and has a few improvements, so I guess we could also import that library and provide all our suggestions as upstream contributions.

Other than that, I agree with everything @michael-markl said 😁

@sarahsporck sarahsporck force-pushed the setup-db-migrations-2 branch from ba46271 to dcd88e3 Compare March 31, 2023 09:50
@sarahsporck
Copy link
Contributor Author

@michael-markl did you create the baseline migration from the production db?

@michael-markl
Copy link
Member

@michael-markl did you create the baseline migration from the production db?

No, that still needs to be done (see Todo in V2 step). I don't have access to the backend so I couldn't do it 😅

@sarahsporck
Copy link
Contributor Author

@maxammann or @f1sh1918 could one of you send me the current production sql schemes? I think you just need to run pg_dump --schema-only. Correct me if I'm wrong @michael-markl 😅

@sarahsporck sarahsporck force-pushed the setup-db-migrations-2 branch from 2095012 to 8987b88 Compare March 31, 2023 10:52
@sarahsporck sarahsporck force-pushed the setup-db-migrations-2 branch from 8987b88 to 8e0f2d7 Compare March 31, 2023 13:12
@sarahsporck sarahsporck changed the title Custom migrations 911: Setup migrations Mar 31, 2023
@sarahsporck sarahsporck added this to the Bayern Launch milestone Mar 31, 2023
@sarahsporck sarahsporck linked an issue Mar 31, 2023 that may be closed by this pull request
@sarahsporck sarahsporck self-assigned this Mar 31, 2023
@sarahsporck sarahsporck added prio: high Issue must be solved within the next weeks. and removed prio: high Issue must be solved within the next weeks. labels Mar 31, 2023
@sarahsporck sarahsporck removed this from the Bayern Launch milestone Mar 31, 2023
@f1sh1918
Copy link
Contributor

f1sh1918 commented Mar 31, 2023

@maxammann or @f1sh1918 could one of you send me the current production sql schemes? I think you just need to run pg_dump --schema-only. Correct me if I'm wrong @michael-markl 😅

I've tried several times now. No idea why it didn't work. No error and nothing.. Just couldn't find any created schema sql file on the server. Gonna try to get it tommorow again...

Update:
Okay had to use postgres instead of backend user. Some error messages would be nice :)
I commited the baseline.sql file

@maxammann

This comment was marked as duplicate.

import app.ehrenamtskarte.backend.migration.Migration
import app.ehrenamtskarte.backend.migration.Statement

internal class V1_InitializeMigrations : Migration() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, the migration table should be created by the migration framework and not by a migration itself.
It just does not make sense design-wise, since if you change the schema of the migration table and its usage in MigrationUtils, these changes need to be present from the first migration on (e.g. when setting up a new system), but this first migration is supposed to be immutable. So if you define your migration schema inside a migration, you will never be able to change it, which kind of defies the point of a migration.
So I would suggest to use SchemaUtils.create() somewhere inside the migration utils in this specific case. Most likely you will never change the migration schema anyways, but just in case you do, you are flexible enough to handle the transition in the code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree :D So the changes to the migrations table should probably be handled separately.

So I would suggest to use SchemaUtils.create() somewhere inside the migration utils in this specific case. Most likely you will never change the migration schema anyways, but just in case you do, you are flexible enough to handle the transition in the code.

I think it is reasonable to create the migrations table and do any other Migrations-Table-Migrations (:D) just before we apply any actual migrations (but still inside the transaction, so if migration fails, we can safely go back to the old backend code (with the old migrations table))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, but I would not handle migration-schema changes inside separate migrations, because then we would have to create a migrations-migration table to keep track of which migration-migrations are already applied 😂

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep! I guess, changes to the migration schema will happen rarely enough to handle them in the code directly without keeping track of them in another table :D
Also, once these changes have been applied in prod, we can drop the code for migrating to new new migration schema :D

@sarahsporck sarahsporck marked this pull request as ready for review April 3, 2023 07:51
Copy link
Member

@michael-markl michael-markl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once this gets deployed (I guess once it's merged), we should not forget to run migrateSkipBaseline on prod.

@sarahsporck
Copy link
Contributor Author

@michael-markl is there anything holding this back to be merged? I also tested this locally and did not encounter any issues.

@michael-markl
Copy link
Member

No, imo it can be merged. I am excited for this😻

@sarahsporck sarahsporck merged commit 7009894 into main Apr 4, 2023
@sarahsporck sarahsporck deleted the setup-db-migrations-2 branch April 4, 2023 12:26
@f1sh1918
Copy link
Contributor

f1sh1918 commented Apr 4, 2023

@sarahsporck
yes we should have waited for deactivation of auto update on production
i wanted to test it now and i get this error with an empty database

Exception in thread "main" app.ehrenamtskarte.backend.migration.DatabaseOutOfSyncException: 
Latest migration versions do not match: Version on DB null - Code Version 2

@sarahsporck
Copy link
Contributor Author

@sarahsporck
yes we should have waited for deactivation of auto update on production
i wanted to test it now and i get with an empty database

Exception in thread "main" app.ehrenamtskarte.backend.migration.DatabaseOutOfSyncException: 
Latest migration versions do not match: Version on DB null - Code Version 2

Locally? And you did execute just the migrate command?

@f1sh1918
Copy link
Contributor

f1sh1918 commented Apr 4, 2023

@sarahsporck
yes we should have waited for deactivation of auto update on production
i wanted to test it now and i get with an empty database

Exception in thread "main" app.ehrenamtskarte.backend.migration.DatabaseOutOfSyncException: 
Latest migration versions do not match: Version on DB null - Code Version 2

Locally? And you did execute just the migrate command?

Ok got it working. I didn't know how to drop these functions in the beginning of the V0001_Baseline
I thought we just wait for staging server for migration but its fine :)

@sarahsporck
Copy link
Contributor Author

Currently migratuons are not yet run in the ci. But can only executed locally. So as long as you don't accidentally migrate on the live system everything should be fine.

@michael-markl
Copy link
Member

michael-markl commented Apr 5, 2023

Since #916 is merged, migrations are run in PRs and on merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Setup DB migration
5 participants