create Docker builds and add docker-compose config for self-hosting #280

Open · wants to merge 48 commits into dev
Conversation

@jshimko commented Sep 26, 2024

By popular demand, I bring you... Docker builds! This PR adds support for building/starting the entire project with a single command and provides a base docker-compose.yml that can be used as a reference for how to self-host Stack using Docker builds instead of Vercel.

TL;DR

To test this out, you can clone my fork and just run the new Docker startup command...

git clone --single-branch -b docker-builds https://github.com/jshimko/stack.git stack-docker
cd stack-docker

pnpm docker:up

# which is just a convenient alias for...
# docker compose up -d && docker compose logs -f dashboard backend

When running this for the first time, Docker Compose will build the dashboard and backend Docker images and then start them both once complete. It will then tail the logs of both so you can watch for startup issues and follow request logs. Once the containers are up, you should be able to open the dashboard at http://localhost:8101 and sign in. That's it! The database is already migrated and a new "self-host" seed script has been run inside the container to ensure the required starting data exists.

Details

The first time you run pnpm docker:up, it will build the Docker images, but if you change code and want a new build, you have to explicitly run pnpm docker:build and then pnpm docker:up again. There's also a pnpm docker:reset command that kills all of the containers and deletes all data volumes so you can quickly start over from scratch with fresh databases/services. When the db is empty on first start, the backend will automatically migrate and re-seed the database. And if a build contains new migrations, the backend will apply them at startup. See the root package.json for more details on these new commands.

Most of the Dockerfile and docker-compose stuff is fairly self-explanatory, and I tried to include plenty of comments for anyone who might be digging into those bits. Other than that, I just had to make a few small tweaks to the code to support build/deployment in Docker rather than Vercel. The biggest one is the support for runtime environment variables on the client side.

NextJS runtime env

As you probably know, NextJS compiles all NEXT_PUBLIC_ env vars into static values when running next build. This isn't an issue on Vercel, but it's definitely an issue in a Docker build because it would mean hard coding runtime values like URLs or analytics API keys into your Docker images, which means you'd have to create a new build for every URL change even though the code hasn't changed. Fortunately, you can use a package called next-runtime-env to fix this problem (see its README for a detailed explanation). The only change required to adopt that package is to replace every process.env.NEXT_PUBLIC_ value in the code with the package's env('NEXT_PUBLIC_WHATEVER') helper. That ensures the values set at runtime are always up to date and aren't hard coded in the build output. So, I replaced every process.env.NEXT_PUBLIC_X variable with env('NEXT_PUBLIC_X') in the dashboard and backend (mostly the dashboard).
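To make the substitution concrete, here's a rough sketch of the call-site change and of what env() does under the hood. The variable name is illustrative, and the simplified helper below is my approximation, not next-runtime-env's actual implementation:

```typescript
// Call-site change:
//   before: const url = process.env.NEXT_PUBLIC_STACK_URL;  // inlined at build time
//   after:  const url = env('NEXT_PUBLIC_STACK_URL');       // resolved at runtime
//
// Simplified approximation of what next-runtime-env's env() does (the real
// package injects window.__ENV via a <Script> rendered in the layout):
function env(key: string): string | undefined {
  const w = (globalThis as any).window;
  if (w && w.__ENV) {
    return w.__ENV[key];   // browser: values injected at request time
  }
  return process.env[key]; // server: plain runtime lookup
}
```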

Related to all of that, I also realized that the @stackframe/stack package was using several process.env.NEXT_PUBLIC_X values as well, so we'd need the same thing there. However, I didn't want to force next-runtime-env to be a dependency of that package. I also didn't want to create conflicts for customers who may already be using next-runtime-env, because that package writes the NEXT_PUBLIC_ values to window.__ENV in the browser, and having two instances of it could result in stepping on each other's toes when writing values to the window object. To rule that out, I created a simple helper component that does the exact same thing as next-runtime-env except that it writes internal Stack vars to window.__STACK_ENV__. You can find it in packages/stack/src/lib/env. It solves the same problem in a couple dozen lines of code with no third-party dependency and removes the possibility of conflicts for people already using next-runtime-env.
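The approach is roughly the following (a simplified sketch, not the actual contents of packages/stack/src/lib/env):

```typescript
// Server side: serialize only the public vars into a script string that a
// <script> tag runs before hydration, assigning them to window.__STACK_ENV__.
function stackEnvScript(allEnv: Record<string, string | undefined>): string {
  const publicEnv = Object.fromEntries(
    Object.entries(allEnv).filter(
      ([key, value]) => key.startsWith('NEXT_PUBLIC_') && value !== undefined
    )
  );
  return `window.__STACK_ENV__ = ${JSON.stringify(publicEnv)};`;
}

// Client side: read from window.__STACK_ENV__ in the browser, falling back
// to process.env on the server.
function getStackEnv(key: string): string | undefined {
  const w = (globalThis as any).window;
  if (w && w.__STACK_ENV__) return w.__STACK_ENV__[key];
  return process.env[key];
}
```

Because the values live under a Stack-specific global, a customer's own copy of next-runtime-env can keep writing to window.__ENV without either side clobbering the other.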

New NEXT_PUBLIC_INSECURE_COOKIE config

Since Docker builds result in a Node app running with NODE_ENV=production, cookies were expected to come from an https:// URL. That meant running the builds with docker-compose on localhost didn't work and resulted in an infinite loop of redirects in the browser. To work around that limitation, I added a new env var called NEXT_PUBLIC_INSECURE_COOKIE that allows you to use the dashboard on localhost when NODE_ENV=production. I added detailed comments about this in the code and noted that it should NEVER be used in production. See packages/stack/src/lib/cookie.ts for details. (Addressed outside of this PR)
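The logic boils down to something like this (an illustrative sketch, not the actual cookie.ts):

```typescript
// Whether the auth cookie should be marked Secure. NEVER set
// NEXT_PUBLIC_INSECURE_COOKIE in a real production deployment; it exists
// only so a production build can run on plain-http localhost.
function useSecureCookie(env: Record<string, string | undefined>): boolean {
  if (env.NEXT_PUBLIC_INSECURE_COOKIE === 'true') return false;
  return env.NODE_ENV === 'production';
}
```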

Self host seed script

Instead of messing with the existing seed script that is used for local dev, I created a new one that is intended to be run inside the backend Docker container on first startup. It's pretty self-explanatory if you read through it. The main difference from the original seed script is that it accepts a few new env vars that can configure a default admin user and optionally disable signups for the internal project. Both the admin user and the signup disable are optional and default to being skipped, so the default behavior is that you sign up to create your initial account just like with the original seed script. The downside is that you won't have admin access to the internal Stack Dashboard project once in the dashboard. You'll be a member of it, but you won't be able to manage it or add other users. The new seed script's admin user does get access to that project, so that's probably what most people will prefer when self-hosting.

You can find these new env options in the new .env file at apps/backend/.env.docker. Note that the new docker-compose.yml loads that file automatically. Also note that I added an apps/dashboard/.env.docker file as well so that Docker deployments and local dev have their own distinct configs. This is important for several reasons. First, the old docker-compose config used for local dev is still in place so that pnpm dev functions the same way it already did. That means the new docker-compose config has its own databases, etc., and the URLs and db ports are different to avoid any conflicts or confusion. Second, you can't set URL env vars to http://localhost:PORT inside a Docker container because localhost is no longer your machine in that context; it's the container itself. So you need a URL that resolves to the right place from within the container, but also resolves to localhost outside the container. Fortunately, Docker has a solution for this with host.docker.internal. In short, any URLs that contain localhost in local dev needed to be converted to host.docker.internal when running in a container (see Docker's networking docs for details). That is enabled in the docker-compose config by these lines on the backend and dashboard...

  backend:
    ...
    extra_hosts:
      - "host.docker.internal:host-gateway"

Docker should already have taken care of this hostname resolution for you when it was installed, but just in case it didn't, you can add the following to your /etc/hosts file...

127.0.0.1 host.docker.internal

CI Docker Builds

Lastly, I added a GitHub workflow to build and publish the new Docker builds. All that's required to get them working is to provide three secrets in your repo or org config:

DOCKER_REPO
DOCKER_USER
DOCKER_PASSWORD

So, for example, if you create a stack-auth org on Docker Hub, DOCKER_REPO would just be your org name of stack-auth and the user/password values could be for any user that has push access to your account.

As for the build tags that are created by this workflow, there are 3 different potential tag formats. Any time you merge to dev or main, the workflow will build and push two tags - one is the short SHA from the commit (first 7 chars) and the other is the branch name. So a commit to dev would look like this:

# assuming you go with the `stack-auth` org on Docker Hub

stack-auth/stack-dashboard:abc1234
stack-auth/stack-dashboard:dev

stack-auth/stack-backend:abc1234
stack-auth/stack-backend:dev

In this case, the :dev branch tag will always be an alias for the latest build on the dev branch while the :abc1234 tag is the specific commit hash. This allows users to just pull the "latest" of dev or pin to a specific commit. The main branch builds work the same way.

I also configured it to build when a commit is tagged with a version number. I know you don't currently use tags on your releases, but I think it'd be really helpful if you did, both so it's clear when package versions have actually changed and because a version tag triggers a special tag format in the docker/metadata-action that automates the tagging. When you tag a commit with a version number in the format 1.2.3, the metadata action treats it as an official release and the resulting Docker builds get pushed in the format:

stack-auth/stack-dashboard:1.2.3
stack-auth/stack-dashboard:latest

stack-auth/stack-backend:1.2.3
stack-auth/stack-backend:latest

That allows users to pick a specific production release or just always pull the "latest" stable prod release.

docker pull stack-auth/stack-dashboard:latest
docker pull stack-auth/stack-backend:latest
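To summarize the tagging rules in one place, the workflow's behavior is roughly equivalent to the function below (illustrative only; the real logic lives in docker/metadata-action):

```typescript
// Given a git ref and commit SHA, compute the Docker tags that get pushed.
// Branch pushes get a short-SHA tag plus a moving branch-name tag; version
// tags in the form 1.2.3 get the version plus a moving `latest` tag.
function dockerTags(ref: string, sha: string): string[] {
  const release = ref.match(/^refs\/tags\/(\d+\.\d+\.\d+)$/);
  if (release) {
    return [release[1], 'latest'];        // e.g. ['1.2.3', 'latest']
  }
  const branch = ref.replace(/^refs\/heads\//, '');
  return [sha.slice(0, 7), branch];       // e.g. ['abc1234', 'dev']
}
```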

Misc

I also updated Prisma to the latest 5.20.0 release. I know that probably seems unrelated, but you were previously on a fairly old version that didn't have a Prisma binary available for the linux/arm64 architecture. This mattered because anyone trying to build on an M series MacBook would get an error when running pnpm install inside the Debian container that these Docker images are based on. Updating to a more recent Prisma version solved this.

Ok, I think that's everything. I've been using all of this in our own Kubernetes deployments for the last couple weeks while iterating on things and everything is really stable for me, so I think I smoothed out all the rough edges. Let me know if you have any questions or if there's anything else I can do!

@N2D4 (Contributor) commented Oct 8, 2024

This is excellent, thank you so much for your contribution!

From a first glance, most of this seems great. My main concern is about the handling of the environment variables; this PR (and next-runtime-env) uses noStore, which disables static site generation and partial pre-rendering, both for ourselves and for our customers who use the @stackframe/stack package.

I would actually argue that requiring a Docker image rebuild when updating envvars is by design; this way, the statically generated files (and with those the initial Next.js response) can contain as much information as possible. This means we can't easily publish a Docker image to a registry, though. IMO this is an acceptable tradeoff; if you're self-hosting, you're setting yourself up to struggle with much harder things than just rebuilding a Docker image from scratch. (DB migrations, for example.)

What do you think?

@jshimko (Author) commented Oct 8, 2024

Thanks for the review @N2D4! So, a couple thoughts...

First, noStore only disables static rendering for the component that it is used in. A few lines from their docs...

unstable_noStore can be used to declaratively opt out of static rendering and indicate a particular component should not be cached.

unstable_noStore is preferred over export const dynamic = 'force-dynamic' as it is more granular and can be used on a per-component basis.

In this particular case, we're only affecting a single NextJS <Script/> component. Other components outside of that should render the same as before.

As for hard coding environment variables into Docker builds, that is very much against best practices for a variety of reasons, but particularly because it breaks the principle that Docker builds should be stateless (long recommended for software in general by the classic "12 Factor App" methodology). Docker even mentions this in its build best practices under "Create ephemeral containers", which links directly to the 12 Factor site...

Refer to Processes under The Twelve-factor App methodology to get a feel for the motivations of running containers in such a stateless fashion.

See also Factor 3 - Config: https://12factor.net/config

Requiring every user to build their own images just to set a dynamic URL also means nobody can even use the same build between dev, staging, production, etc. And the only reason for that is because the URL changes between those envs. Supporting runtime config entirely solves that. If I deploy a build for my staging environment and thoroughly test everything, I need to be able to use that same build again when I promote it to production. Otherwise a new Docker build doesn't guarantee everything is 100% the same. At the very least, it's unnecessary overhead to create a duplicate build that only changes a few environment variables.

Perhaps more importantly, hard coding env into Docker builds means that nobody can ever publish reusable Docker builds. That one is kind of a non-starter for us. For example, all of our Kubernetes deployments (dev/staging/production) are completely automated and the release process goes from 1) development (latest of main branch) to 2) staging (a commit tagged for release) to 3) production (same tagged release promoted from staging) and the same Docker builds are used from end to end. Having to build and publish a new image for every deployment adds a lot of opportunities for build inconsistencies, needless CI/CD complexity, and tons of Docker image storage that would all otherwise be avoided by supporting runtime configs in a single reusable build.

Lastly, having Docker builds be reusable means that you (Stack Auth) can publish "official" builds (using the GitHub workflow I added in this PR). Since most self-host users aren't modifying the code, most won't even need to bother with the build step. There's a big advantage to having a single official source of production builds that everyone uses. This completely removes the "works with my build" debugging headache that will inevitably turn into a time consuming community support nightmare. With official builds, the only thing that differs between users is the config they pass in. That greatly reduces the number of things that could be preventing a deployment from working correctly. If it works for one properly configured deployment, it should work for every properly configured deployment because you can be sure it is 100% the same code and build output. That literally allows you to point to working configs in the docs and just say "works on my machine"! :)

Unrelated, just wanted to clarify on this comment about migrations...

IMO this is an acceptable tradeoff; if you're self-hosting, you're setting yourself up to struggle with much harder things than just rebuilding a Docker image from scratch. (DB migrations, for example.)

Migrations are actually automated in the Docker builds. They run on backend container startup in the entrypoint script (and can be disabled with the STACK_SKIP_MIGRATIONS env var if needed). So any time a new migration is released, it will automatically apply when the new backend build is deployed. I'd argue this is even easier than deploying on Vercel, where you need to manually apply migrations to your database and try to time it with the code release that depends on them. Not running them on startup also assumes all self-host users will even be aware that a new Stack release contains new migrations. That's why automation is key here. As you know, deploying code that expects migrations to have been run can easily lead to production downtime when Prisma falls over due to schema mismatches. Automating migrations ensures the app can't even start up until the new migrations have been successfully applied. Also, manually doing stuff to a production database is no fun!
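In sketch form, the startup sequence the entrypoint enforces looks like this (the real entrypoint is a shell script; `migrate` here stands in for `prisma migrate deploy` and `startApp` for launching the server):

```typescript
// Run migrations before the app starts unless explicitly skipped via the
// STACK_SKIP_MIGRATIONS env var added in this PR.
function startBackend(
  env: Record<string, string | undefined>,
  migrate: () => void,
  startApp: () => void
): void {
  if (env.STACK_SKIP_MIGRATIONS !== 'true') {
    migrate(); // if this throws, the app never starts: no schema mismatch
  }
  startApp();
}
```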

So, not sure if I made a convincing case here, but happy to answer any questions or clarify anything further if you're still concerned about these changes. Let me know what you think!

@N2D4 (Contributor) commented Oct 8, 2024

In this particular case, we're only affecting a single NextJS <Script/> component. Other components outside of that should render the same as before.

What this means in practice is that this <Script /> component will suspend, which will pause the rendering of all components up to the closest Suspense boundary. Since <StackProvider /> is at the very top of the layout.tsx, this would be the entire page. We could wrap <StackProvider /> in a Suspense boundary, but in that case the <Script /> would not be the first script that runs on the page, and window.__STACK_ENV__ will not be available when the statically rendered components are hydrated in the browser.

Migrations are actually automated in the Docker builds. They run on backend container startup in the entrypoint script (which can be disabled with the STACK_SKIP_MIGRATIONS env var if needed). So any time a new migration is released, it will automatically apply when the new backend build is deployed. [...]

That only works if you don't worry about downtime during migrations. Think of the following scenario when renaming a column from A to B:

  • DB v1, server v1: The initial version where the col is named A.
  • DB v2, server v1: We add a new col named B.
  • DB v2, server v2: We update the server to write to both A and B, but it still reads from A.
  • DB v3, server v2: We do a database migration where we copy A to B.
  • DB v3, server v3: We update the server to write and read from B only.
  • DB v4, server v3: We delete A.

You need to coordinate the server and DB updates when migrating, but if you don't and you do all the updates at the same time (particularly if you're running multiple revisions of the server at the same time as part of your autoscaling/rollout), things will break. This is acceptable if you're fine with a few minutes of downtime, but not otherwise.


I get your point regarding configless container builds. Cal.com has a similar setup and they publish a Docker image that's designed for local usage, alongside a build script to customize it for production; we could have something like that. The other alternative is that we disable static rendering/PPR on the Docker version, and essentially do what you did here, while keeping it in the main deployment. I'll think about this a bit today.

@N2D4 (Contributor) commented Oct 29, 2024

I talked to the Next.js team and they have an --experimental-build-mode CLI flag that we should be able to use to not inline variables. This would do what we want, though I'm not sure if the behavior has already been released with Next.js 15 or not — at least it's a new avenue though.

@dbjpanda commented

@jshimko @N2D4 Great work happening here. Would love to try it out soon.
