Clean up some nightly build infrastructure cruft (#3962)
* Bump Docker container to micromamba v2.0.3.

* Warn against using nightly build outputs directly for Zenodo.

* Remove obsolete refs to GCE_INSTANCE in nightly build workflow.

* Update RTD backend link to new interface.

* Update nightly build docs to reflect use of Google Batch.

* Bump a couple of non-Python dependency versions.

* Simplify nightly/stable/workflow_dispatch logic in nightly build script.

* Make VCE RARE row count asset check non-blocking for fast ETL testing.

* Add test distribution of parquet and other outputs.

* Use BUILD_ID as test distribution path to ensure uniqueness.

* Discontinue parquet distribution. Remove test distribution files.

* Remove AWS CLI commands and use gcloud storage instead.

* Add AWS credentials from envvars.

* Create ~/.aws directory before attempting to write credentials.

* Remove dangling && from now-separate commands in the build script.

* Remove AWS S3 access test.

* Don't && the removal of existing paths, in case they aren't there.

* Fix source path for AWS S3 distribution.

* Remove all testing shortcuts and revert to FULL ETL.

* Remove unnecessary copy_to_dist_path function.

* Use more specific version tag matching pattern.

* Remove unnecessary conditional in stable deployment.

* Use more generous timeouts/retries in Zenodo data release script.

* Relock dependencies.

* Switch to new Slack GitHub Action syntax.

* Switch to using Postgres 17 and fast ETL to run a quick test deployment.

* Use Postgres 16 since 17 isn't yet available in our Docker image sources.

* Update comment about Postgres version.

* Use Ubuntu 24.04 micromamba image.

* Go back to doing full ETL after Postgres 16 test.

* Relock dependencies.

* Remove jq, use envvar for PG_VERSION, test fast ETL.

* Add a little workflow to test pattern matching.

* Fix typo in regex-test workflow.

* Use a more restrictive tag matching pattern.

* Use a more specific tag pattern to trigger data releases.

* Revert to a simple version tag pattern v20*.

* Revert to running full ETL.

* Relock dependencies.
zaneselvans authored Nov 20, 2024
1 parent aaaabfc commit f3cdf14
Showing 11 changed files with 419 additions and 546 deletions.
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/versioned_release.md
@@ -24,7 +24,7 @@ assignees: ""
- [ ] Verify [`catalystcoop.pudl` PyPI (software) release](https://pypi.org/project/catalystcoop.pudl/)
- [ ] Verify that [PUDL repo archive on Zenodo](https://zenodo.org/doi/10.5281/zenodo.3404014) has been updated w/ new version
- [ ] Wait 6-10 hours for a successful build to complete
-- [ ] Activate new version on the [RTD admin panel](https://readthedocs.org/projects/catalystcoop-pudl/versions/) and verify that it builds successfully.
+- [ ] Activate new version on the [RTD admin panel](https://app.readthedocs.org/projects/catalystcoop-pudl/) and verify that it builds successfully.
- [ ] Verify that `stable` and the version tag point at same git ref
- [ ] Verify that [`stable` docs on RTD](https://catalystcoop-pudl.readthedocs.io/en/stable/) have been updated
- [ ] Verify `gs://pudl.catalyst.coop/vYYYY.M.x` has the new expected data.
24 changes: 7 additions & 17 deletions .github/workflows/build-deploy-pudl.yml
@@ -1,3 +1,4 @@
+---
name: build-deploy-pudl
on:
workflow_dispatch:
@@ -11,8 +12,6 @@ on:

env:
GCP_BILLING_PROJECT: ${{ secrets.GCP_BILLING_PROJECT }}
-  GCE_INSTANCE: pudl-deployment-tag # This is changed to pudl-deployment-dev if running on a schedule
-  GCE_INSTANCE_ZONE: ${{ secrets.GCE_INSTANCE_ZONE }}
GCS_OUTPUT_BUCKET: gs://builds.catalyst.coop
BATCH_JOB_JSON: batch_job.json

@@ -24,12 +23,6 @@ jobs:
contents: write
id-token: write
steps:
-      - name: Use pudl-deployment-dev vm if running on a schedule
-        if: ${{ (github.event_name == 'schedule') }}
-        run: |
-          echo "This action was triggered by a schedule."
-          echo "GCE_INSTANCE=pudl-deployment-dev" >> $GITHUB_ENV
- name: Checkout Repository
uses: actions/checkout@v4
with:
@@ -56,7 +49,6 @@
- name: Show freshly set envvars
if: ${{ env.SKIP_BUILD != 'true' }}
run: |
echo "GCE_INSTANCE: $GCE_INSTANCE"
echo "NIGHTLY_TAG: $NIGHTLY_TAG"
echo "BUILD_ID: $BUILD_ID"
echo "BATCH_JOB_ID: $BATCH_JOB_ID"
@@ -140,8 +132,6 @@ jobs:
--container-env BUILD_ID=${{ env.BUILD_ID }} \
--container-env BUILD_REF=${{ github.ref_name }} \
--container-env FLY_ACCESS_TOKEN=${{ secrets.FLY_ACCESS_TOKEN }} \
-          --container-env GCE_INSTANCE=${{ env.GCE_INSTANCE }} \
-          --container-env GCE_INSTANCE_ZONE=${{ env.GCE_INSTANCE_ZONE }} \
--container-env GCP_BILLING_PROJECT=${{ secrets.GCP_BILLING_PROJECT }} \
--container-env GITHUB_ACTION_TRIGGER=${{ github.event_name }} \
--container-env NIGHTLY_TAG=${{ env.NIGHTLY_TAG }} \
@@ -160,13 +150,13 @@
if: ${{ env.SKIP_BUILD != 'true' }}
run: gcloud batch jobs submit run-etl-${{ env.BATCH_JOB_ID }} --config ${{ env.BATCH_JOB_JSON }} --location us-west1

-      - name: Post to a pudl-deployments channel
+      - name: Post to pudl-deployments channel
         if: always()
         id: slack
         uses: slackapi/slack-github-action@v2
         with:
-          channel-id: "C03FHB9N0PQ"
-          slack-message: "`${{ env.BUILD_ID }}` build-deploy-pudl status: ${{ (env.SKIP_BUILD == 'true') && 'skipped' || job.status }}\n${{ env.GCS_OUTPUT_BUCKET }}/${{ env.BUILD_ID }}"
-        env:
-          channel-id: "C03FHB9N0PQ"
-          SLACK_BOT_TOKEN: ${{ secrets.PUDL_DEPLOY_SLACK_TOKEN }}
+          method: chat.postMessage
+          token: ${{ secrets.PUDL_DEPLOY_SLACK_TOKEN }}
+          payload: |
+            text: "`${{ env.BUILD_ID }}` build-deploy-pudl status: ${{ (env.SKIP_BUILD == 'true') && 'skipped' || job.status }}\n${{ env.GCS_OUTPUT_BUCKET }}/${{ env.BUILD_ID }}"
+            channel: "C03FHB9N0PQ"
29 changes: 14 additions & 15 deletions devtools/zenodo/zenodo_data_release.py
@@ -87,24 +87,26 @@ def __init__(self, env: str):

logger.info(f"Using Zenodo token: {token[:4]}...{token[-4:]}")

-    def retry_request(self, *, method, url, max_tries=5, timeout=5, **kwargs):
+    def retry_request(self, *, method, url, max_tries=6, timeout=2, **kwargs):
         """Wrap requests.request in retry logic.

         Passes method, url, and **kwargs to requests.request.
         """
-        base_timeout = 2
         for try_num in range(1, max_tries):
             try:
                 return requests.request(
-                    method=method, url=url, timeout=timeout, **kwargs
+                    method=method, url=url, timeout=timeout**try_num, **kwargs
                 )
             except requests.RequestException as e:
-                timeout = base_timeout**try_num
-                logger.warning(f"Attempt #{try_num} Got {e}, retrying in {timeout} s")
-                time.sleep(timeout)
+                logger.warning(
+                    f"Attempt #{try_num} Got {e}, retrying in {timeout**try_num} s"
+                )
+                time.sleep(timeout**try_num)

         # don't catch errors on the last try.
-        return requests.request(method=method, url=url, timeout=timeout, **kwargs)
+        return requests.request(
+            method=method, url=url, timeout=timeout**max_tries, **kwargs
+        )
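For reference, the new defaults replace the flat 5-second timeout with exponential backoff. A minimal sketch of the resulting schedule (illustrative arithmetic only, not code from this diff):

```python
# Backoff schedule implied by the new defaults: max_tries=6, timeout=2.
# Attempts 1-5 use timeout**try_num seconds as both the request timeout and
# the sleep before retrying; the final attempt runs outside the loop so its
# exceptions propagate to the caller.
max_tries, timeout = 6, 2
schedule = [timeout**try_num for try_num in range(1, max_tries)]
print(schedule)            # [2, 4, 8, 16, 32]
print(timeout**max_tries)  # 64 -- timeout for the final, uncaught attempt
```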

def get_deposition(self, deposition_id: int) -> _LegacyDeposition:
"""LEGACY API: Get JSON describing a deposition.
@@ -115,7 +117,6 @@ def get_deposition(self, deposition_id: int) -> _LegacyDeposition:
method="GET",
url=f"{self.base_url}/deposit/depositions/{deposition_id}",
headers=self.auth_headers,
-            timeout=5,
)
logger.debug(
f"License from JSON for {deposition_id} is "
@@ -132,7 +133,6 @@ def get_record(self, record_id: int) -> _NewRecord:
method="GET",
url=f"{self.base_url}/records/{record_id}",
headers=self.auth_headers,
-            timeout=5,
)
return _NewRecord(**response.json())

@@ -146,7 +146,6 @@ def new_record_version(self, record_id: int) -> _NewRecord:
method="POST",
url=f"{self.base_url}/records/{record_id}/versions",
headers=self.auth_headers,
-            timeout=5,
)
return _NewRecord(**response.json())

@@ -162,7 +161,7 @@ def update_deposition_metadata(
data = {"metadata": metadata.model_dump()}
logger.debug(f"Setting metadata for {deposition_id} to {data}")
response = self.retry_request(
method="PUT", url=url, json=data, headers=self.auth_headers, timeout=5
method="PUT", url=url, json=data, headers=self.auth_headers
)
return _LegacyDeposition(**response.json())

@@ -175,7 +174,6 @@ def delete_deposition_file(self, deposition_id: int, file_id) -> requests.Response:
method="DELETE",
url=f"{self.base_url}/deposit/depositions/{deposition_id}/files/{file_id}",
headers=self.auth_headers,
-            timeout=5,
)

def create_bucket_file(
@@ -196,7 +194,6 @@ def create_bucket_file(
url=url,
headers=self.auth_headers,
data=file_content,
-            timeout=5,
)
return response

@@ -206,7 +203,6 @@ def publish_deposition(self, deposition_id: int) -> _LegacyDeposition:
method="POST",
url=f"{self.base_url}/deposit/depositions/{deposition_id}/actions/publish",
headers=self.auth_headers,
-            timeout=5,
)
return _LegacyDeposition(**response.json())

@@ -375,7 +371,10 @@ def get_html_url(self):
required=True,
help="Path to a directory whose contents will be uploaded to Zenodo. "
"Subdirectories are ignored. Can get files from GCS as well - just prefix "
"with gs://.",
"with gs://. NOTE: nightly build outputs are NOT suitable for creating a Zenodo "
"data release, as they include hundreds of individual Parquet files, which we "
"archive on Zenodo as a single zipfile. Check what files should actually be "
"distributed. E.g. it may be *.log *.zip *.json ",
)
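To make the NOTE concrete, here is a hypothetical sketch (the helper name and patterns are illustrative, not part of this PR) of narrowing a build directory down to the files worth archiving:

```python
# Hypothetical helper, not part of this PR: keep only log/zip/json artifacts
# and skip the hundreds of per-table Parquet files mentioned in the NOTE.
from fnmatch import fnmatch
from pathlib import Path

DISTRIBUTABLE_PATTERNS = ("*.log", "*.zip", "*.json")

def release_files(source_dir: str) -> list[Path]:
    """Return the top-level files in source_dir matching the allowed patterns."""
    return [
        path
        for path in Path(source_dir).iterdir()
        if path.is_file()
        and any(fnmatch(path.name, pattern) for pattern in DISTRIBUTABLE_PATTERNS)
    ]
```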
@click.option(
"--publish/--no-publish",
19 changes: 7 additions & 12 deletions docker/Dockerfile
@@ -1,4 +1,4 @@
-FROM mambaorg/micromamba:2.0.2
+FROM mambaorg/micromamba:2.0.3-ubuntu24.04

ENV CONTAINER_HOME=/home/$MAMBA_USER
ENV PGDATA=${CONTAINER_HOME}/pgdata
@@ -8,10 +8,9 @@ USER root
SHELL [ "/bin/bash", "-exo", "pipefail", "-c" ]

# Install some linux packages
-# awscli requires unzip, less, groff and mandoc
# hadolint ignore=DL3008
RUN apt-get update && \
-    apt-get install --no-install-recommends -y git jq unzip less groff mandoc postgresql && \
+    apt-get install --no-install-recommends -y git postgresql && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*

@@ -23,10 +22,13 @@ RUN printf '[GoogleCompute]\nservice_account = default' > /etc/boto.cfg
# hadolint ignore=DL3059
RUN usermod -aG postgres "$MAMBA_USER"

+# We use an environment variable to set the Postgres version because it is also
+# used in the nightly build script, making it easier to keep the two in sync.
+# Remember to bump the Postgres version: Postgres 17 was released in September 2024.
+ENV PG_VERSION=16
# Create new cluster for Dagster usage that's owned by $MAMBA_USER.
-# When the PG major version changes we'll have to update this from 15 to 16
# hadolint ignore=DL3059
-RUN pg_createcluster 15 dagster -u "$MAMBA_USER" -- -A trust
+RUN pg_createcluster ${PG_VERSION} dagster -u "$MAMBA_USER" -- -A trust
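Since the Postgres version now travels through PG_VERSION, a consumer such as the nightly build script can check it against the clusters that actually exist in the image. A hedged sketch (hypothetical check, not part of this commit) using pg_lsclusters, from the same postgresql-common tooling as pg_createcluster:

```python
# Hypothetical sanity check, not part of this commit: confirm that a Postgres
# cluster matching $PG_VERSION exists in the image.
import os
import subprocess

expected = os.environ.get("PG_VERSION", "16")
clusters = subprocess.run(
    ["pg_lsclusters", "--no-header"], capture_output=True, text=True, check=True
).stdout
# The first column of pg_lsclusters output is the cluster's major version.
versions = [line.split()[0] for line in clusters.splitlines() if line.strip()]
assert expected in versions, f"No Postgres {expected} cluster found: {versions}"
```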

# Switch back to being non-root user and get into the home directory
USER $MAMBA_USER
@@ -62,13 +64,6 @@ COPY --chown=${MAMBA_USER}:${MAMBA_USER} . ${PUDL_REPO}
ENV LD_LIBRARY_PATH=${CONDA_PREFIX}/lib
RUN ${CONDA_RUN} pip install --no-cache-dir --no-deps --editable ${PUDL_REPO}

-# Install awscli2
-# Change back to root because the install script needs access to /usr/local/aws-cli
-# curl commands run within conda environment because curl is installed by conda.
-USER root
-RUN ${CONDA_RUN} bash -c 'curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" && unzip awscliv2.zip && ./aws/install'
-USER $MAMBA_USER

# Install flyctl
# hadolint ignore=DL3059
RUN ${CONDA_RUN} bash -c 'curl -L https://fly.io/install.sh | sh'