
Clean up some nightly build infrastructure cruft #3962

Merged
41 commits merged on Nov 20, 2024
c64ebcc
Bump docker container to mamba v2.0.3
zaneselvans Nov 16, 2024
d1d00c5
Warn against using nightly build outputs directly for Zenodo.
zaneselvans Nov 16, 2024
ec534ba
Remove obsolete refs to GCE_INSTANCE in nightly build workflow.
zaneselvans Nov 16, 2024
5a2e566
Update RTD backend link to new interface.
zaneselvans Nov 16, 2024
8e1b2d3
Update nightly build docs to reflect use of Google Batch.
zaneselvans Nov 16, 2024
5411011
Bump a couple of non-python dependency versions.
zaneselvans Nov 16, 2024
3cfbe3b
Simplify nightly/stable/workflow_dispatch logic in nightly build script.
zaneselvans Nov 16, 2024
9973748
Make VCE RARE row count asset check non-blocking for fast ETL testing
zaneselvans Nov 16, 2024
a5764db
Add test distribution of parquet and other outputs.
zaneselvans Nov 16, 2024
5772e07
Use BUILD_ID as test distribution path to ensure uniqueness.
zaneselvans Nov 16, 2024
611e08c
Discontinue parquet distribution. Remove test distribution files.
zaneselvans Nov 16, 2024
0811c29
Remove AWS CLI commands and use gcloud storage instead.
zaneselvans Nov 16, 2024
65b396e
Add AWS credentials from envvars
zaneselvans Nov 16, 2024
c0d1632
Create ~/.aws directory before attempting to write credentials
zaneselvans Nov 16, 2024
581507d
Remove dangling && from now separate commands in build script.
zaneselvans Nov 16, 2024
c4179b0
Remove AWS S3 access test.
zaneselvans Nov 16, 2024
b7087e4
Don't && the removal of existing paths, in case it isn't there.
zaneselvans Nov 17, 2024
6834e9e
Fix source path for AWS S3 distribution.
zaneselvans Nov 17, 2024
5904506
Remove all testing shortcuts and revert to FULL ETL.
zaneselvans Nov 17, 2024
9be3ab5
Remove unnecessary copy_to_dist_path function
zaneselvans Nov 17, 2024
13861e6
Use more specific verstion tag matching pattern.
zaneselvans Nov 17, 2024
6a28d75
Use more specific version tag matching pattern.
zaneselvans Nov 17, 2024
6d1f0d5
Remove unnecessary conditional in stable deployment
zaneselvans Nov 17, 2024
6da81b4
Use more generous timeouts/retries in Zenodo data release script
zaneselvans Nov 18, 2024
5fff59b
Merge branch 'main' into nightly-build-cruft
zaneselvans Nov 18, 2024
dcab875
Relock dependencies.
zaneselvans Nov 18, 2024
d753006
Switch to new Slack GitHub Action syntax.
zaneselvans Nov 19, 2024
d2049aa
Switch to using postgres 17 and fast ETL to run a quick test deployment.
zaneselvans Nov 19, 2024
7baf4b8
Use postgres 16 since 17 isn't yet available in our Docker image sour…
zaneselvans Nov 19, 2024
1b0614f
Update comment about postgres version.
zaneselvans Nov 19, 2024
fcca6e3
Use Ubuntu 24.04 micromamba image.
zaneselvans Nov 19, 2024
39b8447
Go back to doing full ETL after Postgres 16 test.
zaneselvans Nov 19, 2024
74350fa
Re-lock dependencies
zaneselvans Nov 19, 2024
eacbaf8
Remove jq, use envvar for PG_VERSION, test fast ETL.
zaneselvans Nov 19, 2024
7717829
Add a little workflow to test pattern matching.
zaneselvans Nov 19, 2024
f3f9a72
Fix typo in regex-test workflow.
zaneselvans Nov 19, 2024
e8954c0
Use a more restrictive tag matching pattern.
zaneselvans Nov 19, 2024
c92fcb4
Use a more specific tag pattern to trigger data releases.
zaneselvans Nov 19, 2024
7721947
Revert to a simple version tag pattern v20*
zaneselvans Nov 20, 2024
67e2ccc
revert to running full ETL.
zaneselvans Nov 20, 2024
6a0198f
Relock dependencies
zaneselvans Nov 20, 2024
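Several of the commits above iterate on the glob pattern that decides whether a pushed tag triggers a production data release, ultimately settling on the simple pattern `v20*`. GitHub Actions uses its own glob syntax for tag filters, so the following is only a rough illustration of the matching behavior using Python's `fnmatch`, not the Actions matcher:

```python
from fnmatch import fnmatch


def is_release_tag(ref: str, pattern: str = "v20*") -> bool:
    """Return True if a git ref looks like a calendar-versioned release tag.

    PUDL's versioned releases use calendar versions like v2024.11.0, which
    all share the "v20" prefix for the foreseeable future, so this simple
    pattern distinguishes them from nightly or feature-branch tags.
    """
    return fnmatch(ref, pattern)
```

This is why the earlier, more restrictive patterns were reverted: a broad prefix match is easier to reason about than a full version regex, at the cost of also matching any hypothetical non-release tag starting with `v20`.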
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/versioned_release.md
@@ -24,7 +24,7 @@ assignees: ""
- [ ] Verify [`catalystcoop.pudl` PyPI (software) release](https://pypi.org/project/catalystcoop.pudl/)
- [ ] Verify that [PUDL repo archive on Zenodo](https://zenodo.org/doi/10.5281/zenodo.3404014) has been updated w/ new version
- [ ] Wait 6-10 hours for a successful build to complete
-- [ ] Activate new version on the [RTD admin panel](https://readthedocs.org/projects/catalystcoop-pudl/versions/) and verify that it builds successfully.
+- [ ] Activate new version on the [RTD admin panel](https://app.readthedocs.org/projects/catalystcoop-pudl/) and verify that it builds successfully.
- [ ] Verify that `stable` and the version tag point at same git ref
- [ ] Verify that [`stable` docs on RTD](https://catalystcoop-pudl.readthedocs.io/en/stable/) have been updated
- [ ] Verify `gs://pudl.catalyst.coop/vYYYY.M.x` has the new expected data.
25 changes: 7 additions & 18 deletions .github/workflows/build-deploy-pudl.yml
@@ -11,8 +11,6 @@ on:

env:
GCP_BILLING_PROJECT: ${{ secrets.GCP_BILLING_PROJECT }}
-  GCE_INSTANCE: pudl-deployment-tag # This is changed to pudl-deployment-dev if running on a schedule
-  GCE_INSTANCE_ZONE: ${{ secrets.GCE_INSTANCE_ZONE }}
GCS_OUTPUT_BUCKET: gs://builds.catalyst.coop
BATCH_JOB_JSON: batch_job.json

@@ -24,12 +22,6 @@ jobs:
contents: write
id-token: write
steps:
-      - name: Use pudl-deployment-dev vm if running on a schedule
-        if: ${{ (github.event_name == 'schedule') }}
-        run: |
-          echo "This action was triggered by a schedule."
-          echo "GCE_INSTANCE=pudl-deployment-dev" >> $GITHUB_ENV

- name: Checkout Repository
uses: actions/checkout@v4
with:
@@ -56,7 +48,6 @@ jobs:
- name: Show freshly set envvars
if: ${{ env.SKIP_BUILD != 'true' }}
run: |
-          echo "GCE_INSTANCE: $GCE_INSTANCE"
echo "NIGHTLY_TAG: $NIGHTLY_TAG"
echo "BUILD_ID: $BUILD_ID"
echo "BATCH_JOB_ID: $BATCH_JOB_ID"
@@ -140,15 +131,13 @@ jobs:
--container-env BUILD_ID=${{ env.BUILD_ID }} \
--container-env BUILD_REF=${{ github.ref_name }} \
--container-env FLY_ACCESS_TOKEN=${{ secrets.FLY_ACCESS_TOKEN }} \
-            --container-env GCE_INSTANCE=${{ env.GCE_INSTANCE }} \
-            --container-env GCE_INSTANCE_ZONE=${{ env.GCE_INSTANCE_ZONE }} \
--container-env GCP_BILLING_PROJECT=${{ secrets.GCP_BILLING_PROJECT }} \
--container-env GITHUB_ACTION_TRIGGER=${{ github.event_name }} \
--container-env NIGHTLY_TAG=${{ env.NIGHTLY_TAG }} \
--container-env OMP_NUM_THREADS=4 \
--container-env PUDL_BOT_PAT=${{ secrets.PUDL_BOT_PAT }} \
--container-env PUDL_GCS_OUTPUT=${{ env.PUDL_GCS_OUTPUT }} \
-            --container-env PUDL_SETTINGS_YML="/home/mambauser/pudl/src/pudl/package_data/settings/etl_full.yml" \
+            --container-env PUDL_SETTINGS_YML="/home/mambauser/pudl/src/pudl/package_data/settings/etl_fast.yml" \
--container-env SLACK_TOKEN=${{ secrets.PUDL_DEPLOY_SLACK_TOKEN }} \
--container-env ZENODO_SANDBOX_TOKEN_PUBLISH=${{ secrets.ZENODO_SANDBOX_TOKEN_PUBLISH }} \
--container-env ZENODO_TARGET_ENV=${{ (startsWith(github.ref_name, 'v20') && 'production') || 'sandbox' }} \
@@ -160,13 +149,13 @@
if: ${{ env.SKIP_BUILD != 'true' }}
run: gcloud batch jobs submit run-etl-${{ env.BATCH_JOB_ID }} --config ${{ env.BATCH_JOB_JSON }} --location us-west1

-      - name: Post to a pudl-deployments channel
+      - name: Post to pudl-deployments channel
         if: always()
         id: slack
+        uses: slackapi/slack-github-action@v2
         with:
-          channel-id: "C03FHB9N0PQ"
-          slack-message: "`${{ env.BUILD_ID }}` build-deploy-pudl status: ${{ (env.SKIP_BUILD == 'true') && 'skipped' || job.status }}\n${{ env.GCS_OUTPUT_BUCKET }}/${{ env.BUILD_ID }}"
-        env:
-          SLACK_BOT_TOKEN: ${{ secrets.PUDL_DEPLOY_SLACK_TOKEN }}
+          method: chat.postMessage
+          token: ${{ secrets.PUDL_DEPLOY_SLACK_TOKEN }}
+          payload: |
+            text: "`${{ env.BUILD_ID }}` build-deploy-pudl status: ${{ (env.SKIP_BUILD == 'true') && 'skipped' || job.status }}\n${{ env.GCS_OUTPUT_BUCKET }}/${{ env.BUILD_ID }}"
+            channel: "C03FHB9N0PQ"
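For reference, the v2 action is a thin wrapper around Slack's `chat.postMessage` Web API method, and the same notification could be posted directly. A sketch under stated assumptions (the token and channel values are placeholders, and error handling is omitted):

```python
import json
import urllib.request

SLACK_API_URL = "https://slack.com/api/chat.postMessage"


def build_message_request(token: str, channel: str, text: str) -> urllib.request.Request:
    """Build the HTTP request that Slack's chat.postMessage method expects."""
    return urllib.request.Request(
        SLACK_API_URL,
        data=json.dumps({"channel": channel, "text": text}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


def post_build_status(token: str, channel: str, text: str) -> dict:
    """Send the message and return Slack's JSON response."""
    with urllib.request.urlopen(build_message_request(token, channel, text)) as resp:
        return json.load(resp)
```

Splitting request construction from sending keeps the Slack-specific details testable without network access, which is roughly the convenience the GitHub Action's declarative `payload` input provides.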
29 changes: 14 additions & 15 deletions devtools/zenodo/zenodo_data_release.py
@@ -87,24 +87,26 @@ def __init__(self, env: str):

logger.info(f"Using Zenodo token: {token[:4]}...{token[-4:]}")

-    def retry_request(self, *, method, url, max_tries=5, timeout=5, **kwargs):
+    def retry_request(self, *, method, url, max_tries=6, timeout=2, **kwargs):
         """Wrap requests.request in retry logic.
 
         Passes method, url, and **kwargs to requests.request.
         """
-        base_timeout = 2
         for try_num in range(1, max_tries):
             try:
                 return requests.request(
-                    method=method, url=url, timeout=timeout, **kwargs
+                    method=method, url=url, timeout=timeout**try_num, **kwargs
                 )
             except requests.RequestException as e:
-                timeout = base_timeout**try_num
-                logger.warning(f"Attempt #{try_num} Got {e}, retrying in {timeout} s")
-                time.sleep(timeout)
+                logger.warning(
+                    f"Attempt #{try_num} Got {e}, retrying in {timeout**try_num} s"
+                )
+                time.sleep(timeout**try_num)
 
         # don't catch errors on the last try.
-        return requests.request(method=method, url=url, timeout=timeout, **kwargs)
+        return requests.request(
+            method=method, url=url, timeout=timeout**max_tries, **kwargs
+        )
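The reworked retry schedule is exponential backoff: with `timeout=2`, attempt N uses a 2**N second request timeout and sleeps the same 2**N seconds before the next try, and the last attempt runs outside the try/except so its exception propagates. A standalone sketch of that pattern, where the `send` callable is a hypothetical stand-in for `requests.request`:

```python
import time


def retry_with_backoff(send, *, max_tries=6, timeout=2):
    """Retry a callable with exponentially growing timeouts and sleeps.

    Attempt 1 uses a 2 s timeout, attempt 2 uses 4 s, and so on, sleeping
    the same number of seconds between attempts. The final attempt is made
    outside the try/except, matching the "don't catch errors on the last
    try" behavior in retry_request.
    """
    for try_num in range(1, max_tries):
        try:
            return send(timeout=timeout**try_num)
        except Exception as err:
            print(f"Attempt #{try_num} got {err!r}, retrying in {timeout**try_num} s")
            time.sleep(timeout**try_num)
    # Don't catch errors on the last try.
    return send(timeout=timeout**max_tries)
```

With the new defaults (`max_tries=6`, `timeout=2`) a fully failing call sleeps 2 + 4 + 8 + 16 + 32 = 62 seconds between attempts, on top of the per-request timeouts, which is considerably more generous than the old flat 5-second timeouts.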

def get_deposition(self, deposition_id: int) -> _LegacyDeposition:
"""LEGACY API: Get JSON describing a deposition.
Expand All @@ -115,7 +117,6 @@ def get_deposition(self, deposition_id: int) -> _LegacyDeposition:
method="GET",
url=f"{self.base_url}/deposit/depositions/{deposition_id}",
headers=self.auth_headers,
-            timeout=5,
)
logger.debug(
f"License from JSON for {deposition_id} is "
Expand All @@ -132,7 +133,6 @@ def get_record(self, record_id: int) -> _NewRecord:
method="GET",
url=f"{self.base_url}/records/{record_id}",
headers=self.auth_headers,
-            timeout=5,
)
return _NewRecord(**response.json())

@@ -146,7 +146,6 @@ def new_record_version(self, record_id: int) -> _NewRecord:
method="POST",
url=f"{self.base_url}/records/{record_id}/versions",
headers=self.auth_headers,
-            timeout=5,
)
return _NewRecord(**response.json())

@@ -162,7 +161,7 @@ def update_deposition_metadata(
data = {"metadata": metadata.model_dump()}
logger.debug(f"Setting metadata for {deposition_id} to {data}")
response = self.retry_request(
-            method="PUT", url=url, json=data, headers=self.auth_headers, timeout=5
+            method="PUT", url=url, json=data, headers=self.auth_headers
)
return _LegacyDeposition(**response.json())

@@ -175,7 +174,6 @@ def delete_deposition_file(self, deposition_id: int, file_id) -> requests.Response:
method="DELETE",
url=f"{self.base_url}/deposit/depositions/{deposition_id}/files/{file_id}",
headers=self.auth_headers,
-            timeout=5,
)

def create_bucket_file(
@@ -196,7 +194,6 @@ def create_bucket_file(
url=url,
headers=self.auth_headers,
data=file_content,
-            timeout=5,
)
return response

@@ -206,7 +203,6 @@ def publish_deposition(self, deposition_id: int) -> _LegacyDeposition:
method="POST",
url=f"{self.base_url}/deposit/depositions/{deposition_id}/actions/publish",
headers=self.auth_headers,
-            timeout=5,
)
return _LegacyDeposition(**response.json())

@@ -375,7 +371,10 @@ def get_html_url(self):
required=True,
help="Path to a directory whose contents will be uploaded to Zenodo. "
"Subdirectories are ignored. Can get files from GCS as well - just prefix "
-    "with gs://.",
+    "with gs://. NOTE: nightly build outputs are NOT suitable for creating a Zenodo "
+    "data release, as they include hundreds of individual Parquet files, which we "
+    "archive on Zenodo as a single zipfile. Check what files should actually be "
+    "distributed. E.g. it may be *.log *.zip *.json ",
)
@click.option(
"--publish/--no-publish",
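The expanded `--source-dir` help text warns against passing nightly build outputs through wholesale and suggests checking which files actually belong in a release (e.g. `*.log *.zip *.json`). A hypothetical pre-flight filter along those lines — not part of `zenodo_data_release.py` itself:

```python
from fnmatch import fnmatch

# Patterns the help text suggests may be appropriate for distribution.
DIST_PATTERNS = ("*.log", "*.zip", "*.json")


def distributable(filenames, patterns=DIST_PATTERNS):
    """Keep only files matching one of the distribution patterns.

    Hundreds of individual Parquet outputs would be excluded here, since
    they are archived on Zenodo as a single zipfile instead.
    """
    return [name for name in filenames if any(fnmatch(name, p) for p in patterns)]
```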
17 changes: 5 additions & 12 deletions docker/Dockerfile
@@ -1,4 +1,4 @@
-FROM mambaorg/micromamba:2.0.2
+FROM mambaorg/micromamba:2.0.3-ubuntu24.04

ENV CONTAINER_HOME=/home/$MAMBA_USER
ENV PGDATA=${CONTAINER_HOME}/pgdata
@@ -8,10 +8,9 @@ USER root
SHELL [ "/bin/bash", "-exo", "pipefail", "-c" ]

# Install some linux packages
-# awscli requires unzip, less, groff and mandoc
# hadolint ignore=DL3008
RUN apt-get update && \
-    apt-get install --no-install-recommends -y git jq unzip less groff mandoc postgresql && \
+    apt-get install --no-install-recommends -y git jq postgresql && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*

@@ -24,9 +23,10 @@ RUN printf '[GoogleCompute]\nservice_account = default' > /etc/boto.cfg
RUN usermod -aG postgres "$MAMBA_USER"

# Create new cluster for Dagster usage that's owned by $MAMBA_USER.
-# When the PG major version changes we'll have to update this from 15 to 16
+# Remember to bump the Postgres version. Postgres 17 was released in September, 2024.
+# Note that the Postgres version is also hardcoded in 2 places the nightly build script.
# hadolint ignore=DL3059
-RUN pg_createcluster 15 dagster -u "$MAMBA_USER" -- -A trust
+RUN pg_createcluster 16 dagster -u "$MAMBA_USER" -- -A trust

# Switch back to being non-root user and get into the home directory
USER $MAMBA_USER
@@ -62,13 +62,6 @@ COPY --chown=${MAMBA_USER}:${MAMBA_USER} . ${PUDL_REPO}
ENV LD_LIBRARY_PATH=${CONDA_PREFIX}/lib
RUN ${CONDA_RUN} pip install --no-cache-dir --no-deps --editable ${PUDL_REPO}

-# Install awscli2
-# Change back to root because the install script needs access to /usr/local/aws-cli
-# curl commands run within conda environment because curl is installed by conda.
-USER root
-RUN ${CONDA_RUN} bash -c 'curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" && unzip awscliv2.zip && ./aws/install'
-USER $MAMBA_USER

# Install flyctl
# hadolint ignore=DL3059
RUN ${CONDA_RUN} bash -c 'curl -L https://fly.io/install.sh | sh'