Introduce basic migration logic #428
base: main
Conversation
@@ -0,0 +1,782 @@
```sql
CREATE EXTENSION IF NOT EXISTS pg_trgm;
```
Why do we have this file? It seems to be pretty much the same as what is already there.
I assume it at the very least means we have to remove the old ones?
Yes, the old one needs to be removed and I will do that right before merging.
Merging in changes to the old file from main is just simpler while I keep this around.
@@ -60,6 +64,9 @@ async fn main() -> anyhow::Result<()> {
```rust
        .connect(&cli.database_url)
        .await
        .context("Failed constructing database connection pool")?;
    // Ensure the database schema is compatible with supported schema version.
    migrations::ensure_compatible_schema_version(&pool, SUPPORTED_SCHEMA_VERSION).await?;
```
Just do the same migration here instead.
Doing it this way, we have an annoying situation where we depend on the API initialising only after the migration has occurred on the indexer, which is pretty annoying to implement deployment-wise.
At least it cannot be done as is, since it happens application-side.
I really don't think this should be doing migrations.
Most database migrations require changes to the indexer, so if the API starts doing migrations, it might crash the indexer.
New columns need to be populated by the indexer, before the API should start using them anyway.
Removing columns would crash the indexer.
We also don't want several migrations running at once; since we only run one indexer instance, the indexer is probably a better fit.
Currently you are not making changes to the indexer; this is database-relevant only. As is, these changes will cause deployment failures, because the SQL schema will at times be incompatible upon startup of the API. The simple solution is to do what I suggested above.
An alternative could be to expose externally whether the API schema has been applied, for example via the health endpoint.
A simpler approach yet is to just keep retrying until the schema is valid, instead of failing.
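A retry loop along those lines could be sketched as follows (a minimal sketch with pure logic and no database access; `retry_until_ok` and the simulated check are hypothetical stand-ins for the real schema-compatibility check):

```rust
use std::{thread, time::Duration};

/// Retry a fallible check up to `max_attempts` times, sleeping `delay`
/// between attempts, instead of failing on the first incompatibility.
fn retry_until_ok<F>(mut check: F, max_attempts: u32, delay: Duration) -> Result<(), String>
where
    F: FnMut() -> Result<(), String>,
{
    let mut last_err = String::from("no attempts made");
    for attempt in 1..=max_attempts {
        match check() {
            Ok(()) => return Ok(()),
            Err(err) => {
                // Schema not compatible yet; wait for the indexer to migrate.
                last_err = format!("attempt {attempt}: {err}");
                thread::sleep(delay);
            }
        }
    }
    Err(last_err)
}

fn main() {
    // Simulate a schema check that only succeeds on the third attempt.
    let mut calls = 0;
    let result = retry_until_ok(
        || {
            calls += 1;
            if calls < 3 { Err("schema version 0, expected 1".into()) } else { Ok(()) }
        },
        5,
        Duration::from_millis(1),
    );
    assert!(result.is_ok());
    assert_eq!(calls, 3);
    println!("schema became compatible after {calls} attempts");
}
```

In a real deployment one would likely bound the total wait time and log each failed attempt, so operators can tell the API is waiting on the indexer rather than hanging.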
@@ -3,7 +3,6 @@ use async_graphql::{scalar, InputValueError, InputValueResult, Scalar, ScalarTyp
```rust
pub type Amount = UnsignedLong;
pub type TokenId = String;
pub type RewardPeriodLength = i64;
```
After a rebase on top of main, this line and the rust-submodule-link
update should disappear.
```shell
PGPASSWORD=${DB_PASSWORD}

# gRPC interface of the node.
CCDSCAN_INDEXER_GRPC_ENDPOINTS=http://localhost:20000
```
Can we use a public node URL as the default? That would make it easier for external people to start the service successfully (we could even mention the node URLs for all three networks), since people then don't need to figure out how to set up a node (or how to point this setting at a public node).
```shell
CCDSCAN_INDEXER_DATABASE_URL=postgres://postgres:$PGPASSWORD@localhost/ccdscan

# Database connection used by the ccdscan-api.
CCDSCAN_API_DATABASE_URL=${CCDSCAN_INDEXER_DATABASE_URL}
```
Running `make setup && make` (as described in the README) now overwrites the `CCDSCAN_API_DATABASE_URL`/`PGPASSWORD` string with an empty string when run for the first time. The make files need to be tested with these changes.
I am a bit in favor of keeping it simple and telling people to rename the `.env_template` file to `.env`, without providing make files. Our make files will get out of date or break sooner or later (if we keep them, we should have some CI pipelines running the make files long-term).
## Database schema setup and migrations

To set up the database schema, either from an empty database or by migrating from an older release of `ccdscan-indexer`, run:
Can we add the command to start an empty database here?
@@ -39,7 +59,7 @@ cargo run --bin ccdscan-indexer -- --node https://grpc.testnet.concordium.com:20

Note: Since the indexer puts a lot of load on the node, use your own local node whenever possible.
If using the public nodes, run the indexer as short as possible.

<!-- TODO When service become stable: add documentation of arguments and environment variables. -->
Both binaries read variables from a `.env` file if present in the directory, use `.env.template` in this project as the starting point.
How about the following order, to get someone who reads the README for the first time up to speed as fast as possible:

- First paragraph includes all dependencies (the `## Dependencies` and `## Setup for development` paragraphs).
- Second paragraph about setup:
  - mention the envs file (so people know they need to set up envs before running/migrating anything), e.g. the sentence in line 62
  - mention the database, e.g. how to get an empty database set up (CLI command)
- Third paragraph: migration/running the services (rest of the README).
```

where `<description>` is replaced by a short description of the nature of the migration.

NOTE: Having compile-time checked queries will cause issues, since the queries are invalid until the database has been properly migrated. This is avoided by _not_ having the `DATABASE_URL` environment variable set until after running the migrations, or by using:
This note could be added to the `ccdscan-api` section further up as well, with your given CLI command as a workaround, since the issue can occur in the `ccdscan-api` service too.
```rust
    #[display("0000:Empty database with no tables yet.")]
    Empty = 0,
    #[display(
        "0001:Initial schema version with tables for blocks, transaction, accounts, contracts, \
```
Suggested change:
```diff
-        "0001:Initial schema version with tables for blocks, transaction, accounts, contracts, \
+        "0001:Initial schema version with tables for blocks, transactions, accounts, contracts, \
```
```rust
        "Database is using an older schema version not supported by this version of \
         `ccdscan-api`.

         Use `ccdscan-indexer --migrate` migrate the database schema."
```
Suggested change:
```diff
-         Use `ccdscan-indexer --migrate` migrate the database schema."
+         Use `ccdscan-indexer --migrate` to migrate the database schema."
```
```rust
    /// The latest known version of the schema.
    const LATEST: SchemaVersion = SchemaVersion::InitialFirstHalf;
```
This requires updating the `LATEST` constant every time we extend the `SchemaVersion` enum.
Some alternatives:

- Something along the lines of the nightly Rust feature `std::mem::variant_count` (not available in stable Rust though).
- Some helper crates, e.g.:

```rust
use strum::IntoEnumIterator;
use strum_macros::EnumIter;

#[derive(Debug, PartialEq, Eq, EnumIter, PartialOrd, Ord, Clone, Copy, derive_more::Display)]
#[repr(i64)]
pub enum SchemaVersion {
    #[display("0000:Empty database with no tables yet.")]
    Empty,
    #[display(
        "0001:Initial schema version with tables for blocks, transaction, accounts, contracts, \
         tokens and more."
    )]
    InitialFirstHalf,
}

impl SchemaVersion {
    pub fn latest() -> Self {
        SchemaVersion::iter().max().unwrap()
    }
}
```
```rust
    fn from_version(version: i64) -> Option<SchemaVersion> {
        match version {
            0 => Some(SchemaVersion::Empty),
            1 => Some(SchemaVersion::InitialFirstHalf),
```
You explain well in the README how the `SchemaVersion` and its functions need to be extended in the future. To be bulletproof against devs extending the `SchemaVersion` without having read that part of the README, it would be great if we got compile errors when extending the `SchemaVersion` enum and forgetting to update any of the functions here.
```rust
#[repr(i64)]
pub enum SchemaVersion {
    #[display("0000:Empty database with no tables yet.")]
    Empty = 0,
```
Suggested change:
```diff
-    Empty = 0,
+    Empty,
```
```rust
    /// Query the migrations table for any migrations which have been destructive
    /// since the provided version.
```
?
```rust
    if let Some(destructive_migration) = destructive_migration {
        anyhow::bail!(
            "Database is using a newer schema version, which is not compatible with the current \
             version:
             Support {}
             Found {}",
            Migration::from(supported),
            destructive_migration
```
Might be good to return all breaking changes (even though practically it is unlikely we have more than one) and to distinguish between the supported, current, and breaking versions.
Suggested change:
```rust
    if let Some(destructive_versions) = destructive_migrations {
        anyhow::bail!(
            "Database is using a newer schema version which is not compatible with the supported \
             version of this service. The following breaking schema migrations have happened:
             Found {}
             Support {}
             List of breaking schema versions: {}",
            current,
            Migration::from(supported),
            destructive_versions
```
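The message-building part of that idea could be sketched as pure logic (no database access; `incompatibility_report` and its parameters are hypothetical stand-ins for the real query results):

```rust
/// Build an error message listing every destructive migration newer than the
/// version this service supports; `None` means the schema is compatible.
fn incompatibility_report(supported: i64, current: i64, destructive: &[i64]) -> Option<String> {
    // Only versions strictly newer than the supported one break this service.
    let breaking: Vec<String> =
        destructive.iter().filter(|v| **v > supported).map(|v| v.to_string()).collect();
    if breaking.is_empty() {
        return None;
    }
    Some(format!(
        "Database schema version {current} is newer than the supported version {supported}. \
         Breaking schema versions: {}",
        breaking.join(", ")
    ))
}

fn main() {
    // Two destructive migrations happened after the supported version 3.
    let report = incompatibility_report(3, 6, &[2, 5, 6]);
    assert_eq!(
        report.as_deref(),
        Some(
            "Database schema version 6 is newer than the supported version 3. \
             Breaking schema versions: 5, 6"
        )
    );
    // No destructive migrations past the supported version: compatible.
    assert!(incompatibility_report(5, 6, &[2]).is_none());
    println!("{}", report.unwrap());
}
```

Keeping this as a separate function would also make the supported/current/breaking distinction easy to unit-test without a database.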
```rust
        "0001:Initial schema version with tables for blocks, transaction, accounts, contracts, \
         tokens and more."
    )]
    InitialFirstHalf = 1,
```
Suggested change:
```diff
-    InitialFirstHalf = 1,
+    InitialFirstHalf,
```
```rust
    fn from_version(version: i64) -> Option<SchemaVersion> {
        match version {
            0 => Some(SchemaVersion::Empty),
            1 => Some(SchemaVersion::InitialFirstHalf),
            _ => None,
        }
    }
```
Implementing the `TryFrom` trait might be more common than a custom conversion function.

Suggested change, replacing `from_version`:
```rust
impl TryFrom<i64> for SchemaVersion {
    type Error = anyhow::Error;

    fn try_from(version: i64) -> Result<Self, Self::Error> {
        match version {
            0 => Ok(SchemaVersion::Empty),
            1 => Ok(SchemaVersion::InitialFirstHalf),
            _ => anyhow::bail!("Unknown database schema version"),
        }
    }
}
```
```rust
    /// Run migrations for this schema version to the next.
    async fn migration_to_next(&self, pool: &PgPool) -> anyhow::Result<SchemaVersion> {
        let mut tx = pool.begin().await?;
        let start_time = chrono::Utc::now();
        let new_version = match self {
            SchemaVersion::Empty => {
                // Set up the initial database schema.
                tx.as_mut()
                    .execute(sqlx::raw_sql(include_str!("./migrations/m0001-initialize.sql")))
                    .await?;
                SchemaVersion::InitialFirstHalf
            }
            SchemaVersion::InitialFirstHalf => unimplemented!(
                "No migration implemented for database schema version {}",
                self.as_i64()
            ),
        };
        let end_time = chrono::Utc::now();
        insert_migration(&mut tx, &new_version.into(), start_time, end_time).await?;
        tx.commit().await?;
        Ok(new_version)
    }
```
Suggested change, factoring out a `next_version` helper:
```rust
    fn next_version(self) -> anyhow::Result<SchemaVersion> {
        (self.as_i64() + 1).try_into()
    }

    /// Run migrations for this schema version to the next.
    async fn migration_to_next(&self, pool: &PgPool) -> anyhow::Result<SchemaVersion> {
        let mut tx = pool.begin().await?;
        let start_time = chrono::Utc::now();
        match self {
            SchemaVersion::Empty => {
                // Set up the initial database schema.
                tx.as_mut()
                    .execute(sqlx::raw_sql(include_str!("./migrations/m0001-initialize.sql")))
                    .await?;
            }
            SchemaVersion::InitialFirstHalf => unimplemented!(
                "No migration implemented for database schema version {}",
                self.as_i64()
            ),
        };
        let new_version = self.next_version()?;
        let end_time = chrono::Utc::now();
        insert_migration(&mut tx, &new_version.into(), start_time, end_time).await?;
        tx.commit().await?;
        Ok(new_version)
    }
```
Purpose

Introduce migration logic into the `ccdscan-indexer` binary, allowing us to run migrations from Rust. `ccdscan-api` now ensures the database schema version is compatible, and `ccdscan-indexer` ensures it is the latest before running.

Changes

- Renamed the `DATABASE_URL` environment variable for both binaries, to allow disabling compile-time checked queries while running the service and to follow the naming convention.
- Updated `.env.template` with more settings and comments.