Set up Cloud SQL Postgres database for dagster storage #2996
Conversation
Codecov Report

Additional details and impacted files:

@@          Coverage Diff          @@
##             dev   #2996   +/-  ##
=====================================
  Coverage   88.7%   88.7%
=====================================
  Files         91      91
  Lines      11007   11007
=====================================
  Hits        9766    9766
  Misses      1241    1241

☔ View full report in Codecov by Sentry.
This should work, I think, though lots of aspects of the postgres setup aren't actually in this PR.
I have some blocking questions about the lifecycle management stuff - I don't think we need to fully figure out how to make it robust, but we should at least talk about that before moving on.
@@ -85,6 +85,8 @@ jobs:
      # Deploy PUDL image to GCE
      - name: Deploy
        env:
          DAGSTER_PG_PASSWORD: ${{ secrets.DAGSTER_PG_PASSWORD }}
In the future we should probably store this secret in Google Secret Manager instead - but this is fine for now. We need to get our nightly builds working consistently again, so there's urgency. And, I think we're building up momentum to revamp the infra completely - what's one more piece of jankiness at this juncture?
You're saying we should pass secrets from Google Secret Manager directly to the VM, instead of storing GCP secrets in GitHub Actions and passing them to the VM?
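If we do go the Secret Manager route later, a minimal sketch of the VM-side fetch might look like the function below. Everything here is an assumption, not part of this PR: the secret name `dagster-pg-password` is hypothetical, and the VM's service account would need the `roles/secretmanager.secretAccessor` IAM binding.

```shell
# Hypothetical sketch: fetch the Dagster Postgres password on the VM itself
# instead of routing it through GitHub Actions. Assumes a Secret Manager
# secret named "dagster-pg-password" (not created in this PR) and a VM
# service account granted roles/secretmanager.secretAccessor.
fetch_dagster_pg_password() {
  gcloud secrets versions access latest --secret=dagster-pg-password
}

# Usage (requires gcloud on the VM and the IAM binding above):
# DAGSTER_PG_PASSWORD="$(fetch_dagster_pg_password)"
# export DAGSTER_PG_PASSWORD
```

That would keep the password out of the Actions secrets store entirely, at the cost of tying the deploy to GCP IAM setup.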
@@ -110,6 +112,11 @@ jobs:
          --container-env AWS_ACCESS_KEY_ID=${{ secrets.AWS_ACCESS_KEY_ID }} \
          --container-env AWS_SECRET_ACCESS_KEY=${{ secrets.AWS_SECRET_ACCESS_KEY }} \
          --container-env AWS_DEFAULT_REGION=${{ secrets.AWS_DEFAULT_REGION }} \
          --container-env DAGSTER_PG_USERNAME="postgres" \
          --container-env DAGSTER_PG_PASSWORD="$DAGSTER_PG_PASSWORD" \
          --container-env DAGSTER_PG_HOST="104.154.182.24" \
It might be nice to run Postgres in the container itself, so we don't have to do all the weird Cloud SQL lifecycle management stuff.
On the other hand, it is weird to run your application and database in the same container. But gcloud compute only supports one container per VM (unless you want to run docker compose manually on the VM). So I'm OK with the Cloud SQL stuff for now as well.
Based on my experience, it seems like it's best to keep the app and database containers separate. I went with Cloud SQL because it's what dagster recommends and it seemed simpler than figuring out how to launch multiple containers.
I guess it's a little strange to spin the database up and down but I don't want to pay for the time it goes unused.
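For context, dagster picks these `DAGSTER_PG_*` env vars up through its Postgres storage configuration in `dagster.yaml`. A rough sketch of that wiring, based on dagster's documentation (the actual file in this repo may differ):

```yaml
storage:
  postgres:
    postgres_db:
      username:
        env: DAGSTER_PG_USERNAME
      password:
        env: DAGSTER_PG_PASSWORD
      hostname:
        env: DAGSTER_PG_HOST
      db_name:
        env: DAGSTER_PG_DB
      port: 5432
```

This is why only the env vars need to change between a local Postgres and Cloud SQL: the container config stays the same either way.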
docker/gcp_pudl_etl.sh (outdated)
# Start dagster-storage Cloud SQL instance
gcloud sql instances patch dagster-storage --activation-policy=ALWAYS
# Create database
gcloud sql databases create $DAGSTER_PG_DB --instance=dagster-storage
- Does this wait for the database to spin up before returning?
- Does create throw an error if the DB already exists? Do we expect the DB to persist data over runs, or do we want to drop everything between runs? I see we delete the database below, but what happens if we error out and get interrupted before we delete the database?
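One way the interruption concern could be handled is with a shell `trap`, sketched below. This is a hedged suggestion, not what the PR implements: the function names are hypothetical, while `dagster-storage` and `$DAGSTER_PG_DB` follow the PR.

```shell
# Sketch only: tolerate a leftover database from an interrupted previous run,
# and clean up even when the build errors out partway through.
# Instance/variable names (dagster-storage, $DAGSTER_PG_DB) follow this PR;
# the trap-based cleanup itself is hypothetical.

stop_dagster_storage() {
  # Best-effort delete so a failed build doesn't leave the DB behind,
  # then stop the instance to avoid idle Cloud SQL charges.
  gcloud sql databases delete "$DAGSTER_PG_DB" --instance=dagster-storage --quiet || true
  gcloud sql instances patch dagster-storage --activation-policy=NEVER
}

start_dagster_storage() {
  gcloud sql instances patch dagster-storage --activation-policy=ALWAYS
  # `databases create` errors if the DB already exists; ignore that case
  # so reruns with the same SHA/ref don't fail here.
  gcloud sql databases create "$DAGSTER_PG_DB" --instance=dagster-storage || true
  # Run cleanup on any exit path, success or failure.
  trap stop_dagster_storage EXIT
}
```

The trap fires on normal exit and on most failures, though a hard kill of the runner would still leave the database for the next run's `|| true` to absorb.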
- I'll double check the database is created before returning.
- I figured it was fine to delete the database even if the build fails because the dagster event logs aren't actually that helpful when debugging build failures.
I removed the Cloud SQL lifecycle logic from the docker/gcp_pudl_etl.sh script.
PR Overview

This PR sets up a Cloud SQL Postgres database for dagster storage in our nightly builds. This will hopefully prevent the dagster backend storage database from locking up and killing our builds.

Changes:
- Update docker/gcp_pudl_etl.sh to start and stop the dagster-storage Cloud SQL instance, and to create and delete a dagster-storage-$ACTION_SHA-$GITHUB_REF database. This way we have a fresh database for each nightly build run and won't incur Cloud SQL instance charges when we aren't running a nightly build.
- Update the pudl-deployment-tag and pudl-deployment-dev VMs so they can be whitelisted by the Cloud SQL instance.
- Pass PUDL_SETTINGS_YAML to the container. I think this var was previously set manually in the GCP Console.

PR Checklist
- Merge the most recent version of the branch you are merging into (probably dev).