-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add workflows for local development using containerised YNR #2485
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK then. Lets start off with the stuff that is just broken. I don't think these things necessarily are blockers before Sym can a go at bootstrapping the project, but @symroe - if you want to jump in and do it now, these are the things that I found were just broken and here are the workarounds/fixes. Hopefully that will stop you from wasting time hitting the same rough edges as I did.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Medium priority
There's a bag of related issues around how we configure the application. I'm going to structure this as one "review" containing a series of inline comments, but I think its all kind of related. This might be one where its useful for us to collectively discuss the approach to application configuration (again) to hammer this out and work out wehat next. Maybe we can compare notes when we've both spun the app up locally.
This stuff does need sorting, but I don't think its a pre-requisite before Sym has a go at spinning up the application and the next actions are perhaps not 100% defined yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lower priority things
Sorry. Another one that I didn't catch yesterday but I think this one matters. Every time I load a page, I get
in the console. This is using We could get round it for now by just using a mock or local filesystem cache in local dev, but then that's a difference between local and prod. Any bug relating to cache won't reproduce locally. This brings a question about how we're going to approach cache. I guess in our current setup, we just run memcached on the VM (and in local dev) but in a docker-based setup memcached probably needs to become a separate service (ElastiCache?). Ultimately I'd quite like our dev setup to mirror what we're going to do in prod, but also maybe that is something we come to when we look at deployment rather than while we're still doing docker in local dev but VMs in prod. This might be another one to discuss first. |
Having pulled one worm out of the can, as a follow up to that: @symroe - are you aware of anything else that we're running as a service on the VM at the moment aside from memcached? |
This adds a Containerfile (a technology-agnostic Dockerfile) that can successfully install the app's dependencies into a container image. The image can be built with the following command (note the trailing build context of "."): podman build . The image's contents are controlled via an allowlist in .containerignore, which is expected to change significantly in later commits as more of this repo is used inside the container. The image is not yet set up to be invoked.
This adds a postgresql compose manifest as a demonstration that podman-compose can successfully instatiate the service, via: podman compose -f container/run/compose.service.postgres.yml up This is not yet used in any context.
This changes the approach to building the app image by using an Ubuntu LTS release as the base, and starting by installing the current set of packages explicitly referenced in deploy/vars.yml (modulo some outdated entries such as redis-server). This change is based on discussions with DC personnel with knowledge of the current setup, and influenced by the likely path we'll take to adopting a container-based production setup in the future. The image that's produced is not small, as it contains many, many system packages that are pulled in as dependencies of the package set we explicitly specify. The resulting large package set does, however, permit the app's base Python requirements to be installed successfully, and this can act as a baseline for future size/efficiency improvements.
This adds a "container" workflow to CircleCI. The workflow is unoptimised, and initially serves only to test the buildability of the container image. The .containerignore file is not respected by CircleCI's Docker builders, so it is symlinked into place as .dockerignore.
This adds "python manage.py check" to the end of the container image build process. Whilst this isn't strictly necessary it serves two useful purposes by: - demonstrating that the container *can* run the check; and - pre-generating many .pyc files and including them in the image for faster app initialisation. The Containerfile is also tweaked to: - read the list of system packages from an external file; - install the app's python requirements before copying its code, so the DX happy path doesn't incur the pip-install cost when only modifying code; - install the app's python requirements based on the sopn_parsing.txt file as this transitively includes "pandas", without which the check command fails.
This adds a simplistic mechanism to invalidate CircleCI's Docker Layer Cache (DLC), and force an image rebuild from scratch. Locally, "podman build --no-cache ." should be preferred instead.
Invoke a cut-down version of the current CI test command inside the built container.
This adds some system and pip dependencies that are currently encoded in the CI config. The system dependencies are copied into the container's package file, but a manually fetched system dependency (pandoc) is translated into its convenience pip package equivalent. The change to pandoc's installation mechanism requires the use of an alternative version of the pypandoc library that also provides the underlying pandoc binary. This requires us to use a later version than the currently-used pypandoc version, as the currently-used version doesn't have a binary-including-package counterpart. We bump to the latest pip package version.
This adds a docker-compose.yml file which, along with the included changes, permits a developer to run a hot-reloading instance of the app. To make this work, it: - invokes `npm install` early in the image build process - adds gunicorn to requirements/base.txt (as there aren't any environments where this is optional) - makes some (hopefully harmless!) changes to ynr/settings/testing.py that allow the app to start successfully This change also: - adds a convenience script at scripts/container.image.build.bash, intended for use both locally and in CI - increases the commenting/structure in the Containerfile The compose file must be at the repo root (and not inside the defunct container/run/ directory) because relative bind mount sources are resolved relative to the compose file's location, and we need to mount the ynr/ directory from inside the repo root. This Docker (hence also Podman) feature even goes so far (or appears to!) as to resolve the location reading through a symlink, so we can't simply link a compose file sitting at the repo root to a file in a subdirectory. The developer currently needs to manually execute some asset-related commands before starting the compose stack, in order to make app render correctly. These commands place content into the ynr/ directory, which is bind-mounted from the developer's machine. They will be automated in a later change: podman compose run --rm --no-deps -e DJANGO_SETTINGS_MODULE=ynr.settings.testing frontend npm run build podman compose run --rm --no-deps -e DJANGO_SETTINGS_MODULE=ynr.settings.testing frontend python manage.py collectstatic --no-input
This cleans the NPM cache after installation, and also uses the more deterministic "npm ci" command instead of "npm install". Also: - help future beginners by defining SOPN in the top-level README - reduce top-level repo clutter by renaming .containerignore to .dockerignore.
This adds a convenience shim for invoking "python manage.py" inside a freshly instantiated container, connected to the same DB and bind mounts that the composed "frontend" container would be able to access: $ ./scripts/container.compose.manage-py.bash check 2>/dev/null WARNING: no local settings file found. See local.py.example System check identified no issues (0 silenced). Invocations are currently noisy (hence the 2> redirection); this will be improved later.
This expands the set of file and directory paths which are bind mounted into the frontend container when using compose. The set is aligned with the paths included in the built container image via the .dockerignore file's contents.
This permits the container image required by the local compose stack to be built directly using the compose subcommand: podman compose build or podman compose build frontend A test is added to CI that asserts compose is superficially happy. We don't (yet) use compose to run any containers in CI, so this step is necessary trivial. Also: fix a podman/docker inconsistency where podman permitted compose "ports" to be a scalar value, but docker requires a list of values.
This allows a developer to place a file at the gitignored path env/frontend.env containing VAR=val key/value pairs which the frontend compose container will read on startup and instantiate as environment variables. This mechanism has been chosen as it affects all frontend container invocations, including those that don't start Django as a web server. This consistency will reduce unexpected drift between the primary frontend container started by a developer and any secondary containers. There are complications that currently make it fiddly to get this file *also* bind-mounted as the .env file that ynr/wsgi.py loads. For the moment, this means that environment variable changes require a restart of the frontend container.
This adds a script that shims "podman compose exec", and changes the renamed shim script that invokes Django management commands to use it. It also removes a couple of envvars from the compose file which are more properly situated in the gitignored, per-developer env/frontend.env file.
/ynr/media/ doesn't need to be included inside the frontend container image.
YNR's production RDS runs v16.4. Both the local containerised setup and the in-CI DB should mirror this.
This adds a script that executes pytest inside a running frontend container, passing any parameters through to pytest.
Docker compose requires an explicit healthcheck, whereas podman appears to infer a healthy container when an exposed port is listening.
This reverts commit 4af80b5. This was added as docker required it, so that we could test using docker-compose in CI. This has proven impractical, given the warnings at https://circleci.com/docs/docker-compose/#using-docker-compose-with-docker-executor: The remote docker daemon runs on a different system than the docker CLI and docker compose, so you must move data around to make this work. Mounting can usually be solved by making content available in a docker volume. In order to make this work, we'd have to vary the compose file so that in local dev content was bind-mounted, whereas it would be made available in a volume in CI. This would negate the point of running the compose stack in CI: to prove that local-dev workflows are happy. Because podman appears to have a problem with this setup on certain machines: ERRO[0000] Failed to start transient timer unit: Unit cdb85bbc93bfd59cc38703842f96c12d0cced997a7ee582b2954a6bb87905804.timer was already loaded or has a fragment file. ... our easiest route forward is to revert this change until we genuinely need to solve docker's healthcheck problem.
This adds scripts/container.run.bash, which runs a command in a fresh frontend container. Also: fix trailing whitespace
This allows a developer who invokes scripts/container.image.build.bash to specify arbitrary parameters for the underlying builder. e.g. "--no-cache", which is a better way of avoiding using the local build cache than bumping the CI-focussed build ARG at the top of container/build/Containerfile.
This adds some docs to help a developer get started modifying and running YNR inside a container. Also: structure README.md's headings more hierarchically, leaving only a single H1 on the page; some light rewording in README.md to improve its flow, along with a short new section that points a member of the public towards whocanivotefor.co.uk if they happen to come across this repo.
a169175
to
b2b4d5c
Compare
Initial development of containerised YNR used podman-compose v1.0.6, but that version is incompatible with the current schema used by the docker-compose.yml file. v1.2.0 is noted as being specifically required as the very recent v1.3.0 has not yet been properly tested with this branch.
On cache: I use the dummy cache locally, but I suspect the easiest thing to do here is to switch to |
Some DB migrations require files that exist in the data directory. This change ensures that it exists in the container image by adding it to the .dockerignore include list, and also bind mounts the path so that developers can update its files without rebuilding the image.
Just a note on ways of working: With a big PR like this where there will be multiple rounds of comments and changes, I prefer it if we don't rebase and force-push the branch mid-review. The reason for this is: If we push additional commits to address review comments, it is easy for the reviewer to come back to a PR and quickly see what has changed since they last looked at it. The reviewer can easily tell from GitHub's UI that they've reviewed everything up to commit a169175 (for example) so everything after that is new since the last round of comments. If the entire history has been re-written since they last viewed the PR and new changes are squashed into commits further back in history, it is much harder to tell at a glance "what changed since I last looked at this?". The reviewer basically has to come back to the thing from first principles. It doesn't matter that we've done a rebase and force-push already. Lets just play the ball where it landed from this point, but for subsequent changes, can we just push more commits to the branch and then squash things to tidy up as a final step once we're ready to merge. Thanks |
Understood. The rebase was solely to remove the first-and-last commits' mitigations for a failing test that magically fixed itself on tip of |
Some editors complain if a Containerfile doesn't use uppercase instructions.
This tweaks some docs following PR feedback.
Whilst I can configure the webapp to use |
This alters the Containerfile's setting of NODE_ENV so it's less susceptible to future line-order changes having unintended side effects.
Yeah. Agreed 👍 Migrating to the DB cache engine should be another PR for another day. For the moment, we'll use dummy cache for local dev. I've captured this in the notes doc from this morning too. |
This reverts all changes to ynr.settings.testing, and defaults the containerised YNR app to use the base ynr.settings module throughout the build process and the compose invocation. The pytest wrapper script is allowed to use pytest's default settings already configured via pyproject.toml. This change is generally implemented by removing the explicit use of ynr.settings.testing across various scenarios, and opting in to manage.py's default of ynr.settings (which inherits ynr.settings.base). Also: small tweaks to docs. Signed-off-by: Jonathan Matthews <[email protected]>
Resolved by 196d5f3. The dummy cache is specified in |
Thanks. I've had a look at the latest changes and given it another spin locally. I think from my point of view everything is either addressed, or we've made the conscious decision to deal with it as a follow up PR and noted it somewhere else. So, 👍 from me. There's a couple of comment threads still open where it would be useful to get a final check from Sym before we merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I've run through the docs from start to finish on a computer that I didn't use previously (to isolate any problems that might happen when setting it up in on a system that had tested this half way through the PR).
Everything worked as expected 🎉
So: from the point of view of "can I install the app" I think this is good to go. This isn't the same as saying that all the code is perfect, but I don't have anything to add to that that we've not talked about and planned already.
👍
This removes mention of an alternative method for initially populating the DB, via the API, as the fact that it doesn't currently work is a known issue.
@symroe I'm not sure what the DC etiquette is for merging PRs, but if it would be my responsibility then I'll defer to yourself for this one. I'll let you choose a time when you'd be able to respond to any production deployment problems that might result from these changes. As discussed, we've carefully isolated this PR's scope so it shouldn't affect production; but this feels like the appropriate handling for this change. |
The general rule is "you break it, you fix it" (e.g merge your own PRs and own that through to production). Given that you don't have prod access, it's reasonable that we don't do that now. I'll merge it now, as I currently have time to fix any problems. |
This adds workflows, scripts, docs and in-repo infra for YNR to be invoked inside a container during local development. Non-local environments are not impacted by this change.
The branch being PR'd contains ~30 commits that generally build on top of each other in narrative fashion. These are presented as-is, as they contain information that's relevant about the development of the local-dev setup that results. They can, of course, be squashed if needed, but that loss of information should be an explicit project choice.
CI failure on tip of master previously blocking this PR
Edit: whilst this failure was previously reproducible, it has now fixed itself in CircleCI when the same commit is exercised (see the two runs visible at https://app.circleci.com/pipelines/github/DemocracyClub/yournextrepresentative/3940). I'll leave this section here for posterity, and in case the problem reoccurs. The first and last commits mentioned here have been removed.
Original: The first and last commits of this PR's stack relate to a single Python test that only started to fail after the stack was rebased on top of the recent #2481. The failure is present on the commit at the tip of the
master
branch, suggesting the failure is unrelated to this stack's changes. This pair of commits should be removed from this PR, pre-merge, after the root cause of the issue is addressed.This stack is considered complete, but the PR is WIP. Changes may be needed to address various items in the PR checklist.To review
README.md
docs/INSTALL.md
docs/DEVELOPMENT.md
as you feel appropriatedocs/DEVELOPMENT.md
to test the process of making a "normal" code change to YNR.container/build/Containerfile
scripts/*
docker-compose.yml
requirements/base.txt
requirements/sopn_parsing.txt
ynr/apps/sopn_parsing/README.md
PR Checklist