-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move dev DB schema creation and stamping to a separate process #1123
Conversation
#!/usr/bin/env python3 | ||
import sys | ||
"""Initialize the dev environment's DB.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The #!
needs to be above the docstring or it doesn't work! 🤦
bin/make_db
actually comes from the cookiecutter, so I'll need to move these changes into the cookiecutter
via/db.py
Outdated
def create_engine(settings): | ||
"""Return a SQLAlchemy engine.""" | ||
return sqlalchemy.create_engine(settings["database_url"]) | ||
|
||
|
||
def create_all(engine): | ||
"""Create all DB tables that don't already exist.""" | ||
Base.metadata.create_all(engine) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pulled create_all()
out into a separate process that's not called by the includeme()
below anymore, and is imported and called by bin/make_db
instead.
create_engine()
also had to be pulled out because make_db
needs to call it in order to get an engine to pass to create_all(engine)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could probably call "Base.metadata.create_all(engine)" directly in the script? Save ourselves a test. I'm not sure wrapping it here provides benefit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've inlined the create_all()
and also the stamping, but left create_engine()
in db.py
because it's actually called in two places
if config.registry.settings.get("dev"): | ||
# Create the database tables if they don't already exist. | ||
Base.metadata.create_all(engine) | ||
stamp(engine) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly not having dev specific stuff in here seems like a win to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
6cb8fbd
to
884c296
Compare
1ec445e
to
7f96bf1
Compare
Context ------- At startup (when you run `make dev`) Via's Gunicorn processes attempt to create the DB tables if they don't already exist. This is done in dev only, not in production: # via/db.py def includeme(config): ... if config.registry.settings.get("dev"): # Create the database tables if they don't already exist. Base.metadata.create_all(engine) Problem ------- Via actually has four Gunicorn workers in dev (see the `workers = 4` in `conf/gunicorn/dev.conf.py`). All 4 try to create the DB tables at the same time and 3 of them crash, printing out error messages. The issue is kind of difficult to reproduce: 1. You need to actually have a model, so you won't be able to produce the issue on the `db` branch without changes. See #1072 and #1093 for branches on which you can reproduce the issue. 2. The model's table needs to actually be missing from the DB table. You can do that by running this command: make services args=down && make services && make dev 3. Since the issue is a race condition you won't be able to reproduce it reliably, but if you run the above command several times you should see it. Solution -------- This commit moves the `sqlalchemy.create_all()` call out of the Gunicorn worker process and instead adds a separate `make_db` process to the list of processes that `supervisord-dev.conf` runs when you run `make dev`. This new `make_db` process starts up, calls `create_all()`, and then exits, and since it's a singular process there's no collision.
@@ -1,5 +1,9 @@ | |||
{ | |||
"programs": { | |||
"make_db": { | |||
"command": "bin/make_db", | |||
"startsecs": "0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By default if bin/make_db
exits in under a second (which it easily can do) Supervisor will consider this a crash regardless of the exit code and will restart the process. Setting startsecs
to 0 tells Supervisor not to consider this a crash and not to restart the process (unless the exit code is non-zero)
|
||
|
||
def main(): | ||
with bootstrap("conf/development.ini") as env: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with
is the recommended way to use bootstrap()
in the Pyramid docs, make_db
wasn't doing this before
|
||
if not is_stamped(engine): | ||
Base.metadata.create_all(engine) | ||
alembic.command.stamp(alembic.config.Config("conf/alembic.ini"), "head") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I mentioned on the other PR I reckon either here or in the Makefile
we should run the migrations as part of the make db
process
https://github.com/hypothesis/lms/blob/main/Makefile#L18
I reckon with that we'll get a "it just works" experience for dev that are not making changes here often and just want to have the service running.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Via does run the migrations in Makefile
. I think make db
has always done this for all our apps?
It may be a bit confusing that there's a bin/make_db
script, but calling that script is only one of two commands that make db
runs. May want to move the alembic uprade
command into the make_db
script somehow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've seen Jon already took a look here.
Made a comment about the migration in the make db
process. I reckon that's the right solution but if you prefer we can chat about that once we have actual migrations in this project.
This will make Via out of sync with the cookiecutter. There's a couple of follow-up cookiecutter PRs to fix this: hypothesis/cookiecutters#146, hypothesis/cookiecutters#147
Context
At startup (when you run
make dev
) Via's Gunicorn processes attempt tocreate the DB tables if they don't already exist. This is done in dev
only, not in production:
Problem
Via actually has four Gunicorn workers in dev (see the
workers = 4
inconf/gunicorn/dev.conf.py
). All 4 try to create the DB tables at thesame time and 3 of them crash, printing out error messages.
The issue is kind of difficult to produce:
You need to actually have a model, so you won't be able to produce
the issue on the
db
branch without changes. SeeCache YouTube transcripts in the DB #1072 and
Cache YouTube video titles in the DB #1093 for branches on which you
can reproduce the issue.
The model's table needs to actually be missing from the DB table. You
can do that by running this command:
Since the issue is a race condition you won't be able to reproduce it
reliably, but if you run the above command several times you should
see it.
Solution
This commit moves the
sqlalchemy.create_all()
call out of the Gunicornworker process and instead adds a separate
make_db
process to the listof processes that
supervisord-dev.conf
runs when you runmake dev
.This new
make_db
process starts up, callscreate_all()
, and thenexits, and since it's a singular process there's no collision.