Set up Cloud SQL Postgres database for dagster storage #2996

Merged: 11 commits, Nov 3, 2023
Set PUDL_SETTINGS_YML to etl_full.yml and add git sha to Cloud SQL database name
bendnorman committed Nov 1, 2023
commit ab9425a332060d67972c3fe649e15fabf5f1ae2a
2 changes: 1 addition & 1 deletion .github/workflows/build-deploy-pudl.yml
@@ -116,7 +116,7 @@ jobs:
--container-env DAGSTER_PG_PASSWORD="$DAGSTER_PG_PASSWORD" \
--container-env DAGSTER_PG_HOST="104.154.182.24" \
Member

It might be nice to run Postgres in the container itself, so we don't have to do all the weird Cloud SQL lifecycle management stuff.

On the other hand, it is weird to run your application and database in the same container. But gcloud compute only supports one container per VM (unless you want to run docker compose manually on the VM). So I'm OK with the Cloud SQL stuff for now as well.

Member Author

Based on my experience, it seems like it's best to keep the app and database containers separate. I went with Cloud SQL because it's what dagster recommends and it seemed simpler than figuring out how to launch multiple containers.

I guess it's a little strange to spin the database up and down, but I don't want to pay for the time it goes unused.

--container-env DAGSTER_PG_DB="dagster-storage" \
- --container-env PUDL_SETTINGS_YML="/home/catalyst/src/pudl/package_data/settings/etl_fast.yml" \
+ --container-env PUDL_SETTINGS_YML="/home/catalyst/src/pudl/package_data/settings/etl_full.yml" \

# Start the VM
- name: Start the deploy-pudl-vm
4 changes: 2 additions & 2 deletions docker/gcp_pudl_etl.sh
@@ -8,7 +8,7 @@ function setup_dagster_storage() {
# Start dagster-storage Cloud SQL instance
gcloud sql instances patch dagster-storage --activation-policy=ALWAYS
# Create database
- gcloud sql databases create dagster-storage --instance=dagster-storage
+ gcloud sql databases create "dagster-storage-$ACTION_SHA-$GITHUB_REF" --instance=dagster-storage
}

function send_slack_msg() {
@@ -53,7 +53,7 @@ function shutdown_vm() {
upload_file_to_slack $LOGFILE "pudl_etl logs for $ACTION_SHA-$GITHUB_REF:"

# Delete dagster-storage database
- gcloud sql databases delete dagster-storage --instance=dagster-storage
+ gcloud sql databases delete "dagster-storage-$ACTION_SHA-$GITHUB_REF" --instance=dagster-storage
# Stop dagster-storage Cloud SQL instance
gcloud sql instances patch dagster-storage --activation-policy=NEVER
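
The per-run database name now appears in both the create and delete steps, so one way to keep them in sync would be to build it in a single helper. The sketch below is only an illustration, not code from this PR: `GCLOUD`, `INSTANCE`, `dagster_db_name`, `create_dagster_db` and `delete_dagster_db` are hypothetical names, and replacing characters outside `[A-Za-z0-9_-]` is an assumption made in case `$GITHUB_REF` looks like `refs/heads/dev`. The `gcloud` subcommands and flags are the ones already used in `docker/gcp_pudl_etl.sh`.

```shell
#!/usr/bin/env bash
# Sketch only: every name defined here is hypothetical, not part of the PR.
GCLOUD="${GCLOUD:-gcloud}"   # overridable so the functions can be dry-run
INSTANCE="dagster-storage"

# Build the per-run database name in one place so create and delete agree.
# Assumption: characters outside [A-Za-z0-9_-] (e.g. "/" in a ref) become "-".
dagster_db_name() {
    printf 'dagster-storage-%s-%s' "$ACTION_SHA" "$GITHUB_REF" \
        | tr -c 'A-Za-z0-9_-' '-'
}

# Start the Cloud SQL instance, then create this run's database.
create_dagster_db() {
    "$GCLOUD" sql instances patch "$INSTANCE" --activation-policy=ALWAYS
    "$GCLOUD" sql databases create "$(dagster_db_name)" --instance="$INSTANCE"
}

# Delete this run's database, then stop the instance so it is not billed while idle.
delete_dagster_db() {
    "$GCLOUD" sql databases delete "$(dagster_db_name)" --instance="$INSTANCE"
    "$GCLOUD" sql instances patch "$INSTANCE" --activation-policy=NEVER
}
```

Setting `GCLOUD=echo` before calling either function prints the commands instead of running them, which is a cheap way to check the generated name. If something like this were adopted, `DAGSTER_PG_DB` in the workflow would presumably need to carry the same name.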