Merge pull request #3138 from catalyst-cooperative/dev
Merge dev into main for 2023-12-08
zaneselvans authored Dec 8, 2023
2 parents 5a7e037 + 94bb8c5 commit 3df3df3
Showing 62 changed files with 2,266 additions and 2,408 deletions.
5 changes: 4 additions & 1 deletion .codecov.yml
@@ -1,9 +1,12 @@
---
coverage:
range: 70..100
round: down
round: nearest
precision: 1

ignore:
- "src/pudl/validate.py"

codecov:
token: 23a7ee04-6ac5-4d1b-9d36-86b0c50d40c5
require_ci_to_pass: true
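
To illustrate what the `round` setting in this hunk means for a reported figure, here is a minimal Python sketch. It assumes `down` truncates at the configured `precision` and `nearest` rounds half up; the diff does not state Codecov's exact tie-breaking rule, so that part is an assumption. The function name `round_coverage` and the sample value are hypothetical.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def round_coverage(percent: str, mode: str, precision: int = 1) -> Decimal:
    """Round a coverage percentage to `precision` decimal places.

    `mode` mirrors the two values seen in the hunk: "down" or "nearest".
    The half-up behaviour for "nearest" is an assumption, not Codecov's spec.
    """
    modes = {"down": ROUND_DOWN, "nearest": ROUND_HALF_UP}
    quantum = Decimal(1).scaleb(-precision)  # precision=1 gives a 0.1 step
    return Decimal(percent).quantize(quantum, rounding=modes[mode])


# Hypothetical coverage value, not taken from a real report.
print(round_coverage("87.46", "down"))     # 87.4
print(round_coverage("87.46", "nearest"))  # 87.5
```

With one decimal place of precision, the two modes differ by at most 0.1 percentage points.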
22 changes: 0 additions & 22 deletions .coveragerc

This file was deleted.

2 changes: 1 addition & 1 deletion .gitattributes
@@ -1,7 +1,7 @@
*.ipynb linguist-detectable=false
*.html linguist-detectable=false
eia861-transform.ipynb merge=ours
environments/conda-*lock.yml merge=ours
environments/conda-*lock.yml merge=ours linguist-generated=true
*.csv text
*.py text
*.json text
24 changes: 24 additions & 0 deletions .github/ISSUE_TEMPLATE/annual_updates.md
@@ -0,0 +1,24 @@
---
name: Integrate New Year of Data
about: Check-list for integrating a new year of data
title: ''
labels: new-data
assignees: ''

---

### New year of data integration check-list:

Based on the [Annual Updates Docs](https://catalystcoop-pudl.readthedocs.io/en/dev/dev/annual_updates.html)


- [ ] [Obtain fresh data](https://catalystcoop-pudl.readthedocs.io/en/latest/dev/annual_updates.html#obtain-fresh-data)
- [ ] [Map the structure of the new data](https://catalystcoop-pudl.readthedocs.io/en/latest/dev/annual_updates.html#map-the-structure-of-the-new-data)
- [ ] [Test data extraction](https://catalystcoop-pudl.readthedocs.io/en/latest/dev/annual_updates.html#test-data-extraction)
- [ ] [Update table and column transformations](https://catalystcoop-pudl.readthedocs.io/en/latest/dev/annual_updates.html#update-table-column-transformations)
- [ ] [Update the PUDL db schema](https://catalystcoop-pudl.readthedocs.io/en/latest/dev/annual_updates.html#update-the-pudl-db-schema)
- [ ] [Connect datasets](https://catalystcoop-pudl.readthedocs.io/en/latest/dev/annual_updates.html#connect-datasets)
- [ ] [Run the ETL](https://catalystcoop-pudl.readthedocs.io/en/latest/dev/annual_updates.html#run-the-etl)
- [ ] [Update the output routines and run full tests](https://catalystcoop-pudl.readthedocs.io/en/latest/dev/annual_updates.html#update-the-output-routines-and-run-full-tests)
- [ ] [Run and update data validations](https://catalystcoop-pudl.readthedocs.io/en/latest/dev/annual_updates.html#run-and-update-data-validations)
- [ ] [Update the documentation](https://catalystcoop-pudl.readthedocs.io/en/latest/dev/annual_updates.html#update-the-documentation)
30 changes: 30 additions & 0 deletions .github/ISSUE_TEMPLATE/new_dataset.md
@@ -0,0 +1,30 @@
---
name: New dataset
about: Provide information about a new dataset you'd like to see in PUDL
title: ''
labels: new-data
assignees: ''
---

### Overview

What is this dataset?

Why do you want it in PUDL?

Is it already partially in PUDL, or do we need to start from scratch?

### Logistics

Is this dataset publicly available?

Where can one download the actual data?

How often does this dataset get updated?

What licensing restrictions apply?

### What do you know about it so far?

What have you done with this dataset so far? Have you run into any problems with
it yet?
52 changes: 14 additions & 38 deletions .github/pull_request_template.md
@@ -1,49 +1,25 @@
<!--
Making a PUDL Pull Request
Before making a PR you may want to check out our:
Resources:
* contributing guidelines: https://catalystcoop-pudl.readthedocs.io/en/latest/CONTRIBUTING.html
* code of conduct: https://catalystcoop-pudl.readthedocs.io/en/latest/code_of_conduct.html
* development process: https://catalystcoop-pudl.readthedocs.io/en/latest/dev/index.html
## PR Process Overview
* PRs have to get an approving review before merging into their development branch.
* Most PRs should be made against the `dev` branch, unless they are part of some larger ongoing refactoring, in which case there will be a persistent development branch for that work.
* It is much easier to do timely code reviews on smaller chunks of code. We try to keep PRs under 500 lines of code.
* Draft PRs are a good way to get early feedback on designs or several incremental commits that will add up to larger changes. If you want a review of a draft PR, make sure you contact the reviewer directly or mention their username in the PR comment, so they get a notification.
* How quickly we can review a PR will depend on how large and complex it is, and how busy we are, but ideally we strive to get an initial review done within a week. If there are going to be delays, we should at least comment on the PR to let you know the situation.
* If you believe you've addressed a reviewer's comments, respond with a brief note and mark the comment resolved. If further discussion is required, respond and do not resolve the comment.
* Before a PR is merged all reviewer comments should be resolved. If a reviewer doesn't feel that their comment has been sufficiently addressed, they may unresolve a comment.
* Be careful not to accidentally "start a review" when responding to comments! If this does happen, don't forget to submit the review you've started so the other PR participants can see your comments (they are invisible to others if marked "Pending").
* In the period after an initial review when there is significant back-and-forth with the reviewer deciding what changes should actually be made, there should probably be daily interaction. If significant changes are required, it's usually best to request another review after those changes have been made.
Feel free to delete the commented-out parts of the template before submitting the PR.
-->
# Overview

# PR Overview
Closes #XXXX.

<!--
What problem does this address?

Include a short narrative summary of what's going on in the PR. This can be a bulleted list. You might want to include:
What did you change?

* What are you changing and why?
* Are there any known unsolved problems remaining in the PR?
* Is there anything that you want a reviewer to pay particular attention to?
* What kind of feedback are you looking for on the PR?
-->
# Testing

# PR Checklist
How did you make sure this worked? How can a reviewer verify this?

- [ ] Merge the most recent version of the branch you are merging into (probably `dev`).
- [ ] All CI checks are passing. [Run tests locally to debug failures](https://catalystcoop-pudl.readthedocs.io/en/latest/dev/testing.html#running-tests-with-tox)
- [ ] Make sure you've included good docstrings.
```[tasklist]
# To-do list
- [ ] Make sure full ETL runs & `make pytest-integration-full` passes locally
- [ ] For major data coverage & analysis changes, [run data validation tests](https://catalystcoop-pudl.readthedocs.io/en/latest/dev/testing.html#data-validation)
- [ ] Include unit tests for new functions and classes.
- [ ] Defensive data quality/sanity checks in analyses & data processing functions.
- [ ] Update the [release notes](https://catalystcoop-pudl.readthedocs.io/en/latest/release_notes.html) and reference the PR and related issues.
- [ ] Do your own explanatory review of the PR to help the reviewer understand what's going on and identify issues preemptively.
- [ ] If updating analyses or data processing functions: make sure to update or write data validation tests
- [ ] Update the [release notes](../docs/release_notes.rst): reference the PR and related issues.
- [ ] Review the PR yourself and call out any questions or issues you have
```
21 changes: 15 additions & 6 deletions .github/workflows/build-deploy-pudl.yml
@@ -12,6 +12,7 @@ env:
GITHUB_REF: ${{ github.ref_name }} # This is changed to dev if running on a schedule
GCE_INSTANCE: pudl-deployment-tag # This is changed to pudl-deployment-dev if running on a schedule
GCE_INSTANCE_ZONE: ${{ secrets.GCE_INSTANCE_ZONE }}
GCS_OUTPUT_BUCKET: gs://nightly-build-outputs.catalyst.coop

jobs:
build_and_deploy_pudl:
@@ -27,13 +28,14 @@ jobs:
echo "This action was triggered by a schedule." && echo "GCE_INSTANCE=pudl-deployment-dev" >> $GITHUB_ENV && echo "GITHUB_REF=dev" >> $GITHUB_ENV
- name: Checkout Repository
uses: actions/checkout@v3
uses: actions/checkout@v4
with:
ref: ${{ env.GITHUB_REF }}

- name: Get HEAD of the branch (main or dev)
run: |
echo "ACTION_SHA=$(git rev-parse HEAD)" >> $GITHUB_ENV
echo "SHORT_SHA=$(git rev-parse --short HEAD)" >> $GITHUB_ENV
- name: Print action vars
run: |
@@ -53,17 +55,17 @@ jobs:
type=ref,event=tag
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2.5.0
uses: docker/setup-buildx-action@v3.0.0

- name: Login to DockerHub
if: github.event_name != 'pull_request'
uses: docker/login-action@v2.1.0
uses: docker/login-action@v3.0.0
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}

- name: Build image and push to Docker Hub
uses: docker/build-push-action@v4.0.0
uses: docker/build-push-action@v5.1.0
with:
context: .
file: docker/Dockerfile
@@ -74,7 +76,7 @@ jobs:
cache-to: type=gha,mode=max

- id: "auth"
uses: "google-github-actions/auth@v1"
uses: "google-github-actions/auth@v2"
with:
workload_identity_provider: "projects/345950277072/locations/global/workloadIdentityPools/gh-actions-pool/providers/gh-actions-provider"
service_account: "deploy-pudl-github-action@catalyst-cooperative-pudl.iam.gserviceaccount.com"
@@ -83,6 +85,11 @@ jobs:
- name: Set up Cloud SDK
uses: google-github-actions/setup-gcloud@v1

- name: Determine commit information
run: |-
echo "COMMIT_BRANCH=$(git rev-parse --abbrev-ref HEAD)" >> $GITHUB_ENV
echo "COMMIT_TIME=$(git log -1 --format=%cd --date=format:%Y-%m-%d-%H%M)" >> $GITHUB_ENV
# Deploy PUDL image to GCE
- name: Deploy
env:
@@ -119,6 +126,7 @@ jobs:
--container-env DAGSTER_PG_DB="dagster-storage" \
--container-env FLY_ACCESS_TOKEN=${{ secrets.FLY_ACCESS_TOKEN }} \
--container-env PUDL_SETTINGS_YML="/home/mambauser/src/pudl/package_data/settings/etl_full.yml" \
--container-env PUDL_GCS_OUTPUT=${{ env.GCS_OUTPUT_BUCKET }}/${{ env.COMMIT_TIME }}-${{ env.SHORT_SHA }}-${{ env.COMMIT_BRANCH }}
# Start the VM
- name: Start the deploy-pudl-vm
@@ -129,6 +137,7 @@ jobs:
uses: slackapi/[email protected]
with:
channel-id: "C03FHB9N0PQ"
slack-message: "build-deploy-pudl status: ${{ job.status }}\n${{ env.ACTION_SHA }}-${{ env.GITHUB_REF }}"
slack-message: "build-deploy-pudl status: ${{ job.status }}\n${{ env.COMMIT_TIME}}-${{ env.SHORT_SHA }}-${{ env.COMMIT_BRANCH }}"
env:
channel-id: "C03FHB9N0PQ"
SLACK_BOT_TOKEN: ${{ secrets.PUDL_DEPLOY_SLACK_TOKEN }}
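
The new `PUDL_GCS_OUTPUT` value in this workflow is assembled from `GCS_OUTPUT_BUCKET`, `COMMIT_TIME`, `SHORT_SHA`, and `COMMIT_BRANCH`. A minimal Python sketch of the same naming scheme follows, assuming the commit time uses the `%Y-%m-%d-%H%M` format passed to `git log` above. The function name `build_output_path` and the sample timestamp, SHA, and branch are made up for illustration.

```python
from datetime import datetime

# Matches GCS_OUTPUT_BUCKET in the workflow's env block.
BUCKET = "gs://nightly-build-outputs.catalyst.coop"


def build_output_path(commit_time: datetime, short_sha: str, branch: str) -> str:
    """Mirror <GCS_OUTPUT_BUCKET>/<COMMIT_TIME>-<SHORT_SHA>-<COMMIT_BRANCH>."""
    stamp = commit_time.strftime("%Y-%m-%d-%H%M")  # same format as the git log call
    return f"{BUCKET}/{stamp}-{short_sha}-{branch}"


# Hypothetical values, not taken from a real build.
print(build_output_path(datetime(2023, 12, 8, 6, 0), "abc1234", "dev"))
```

Because the zero-padded timestamp leads the directory name, listings of the bucket sort chronologically by commit time.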
6 changes: 3 additions & 3 deletions .github/workflows/docker-build-test.yml
@@ -13,7 +13,7 @@ jobs:
id-token: write
steps:
- name: Checkout Repository
uses: actions/checkout@v3
uses: actions/checkout@v4

- name: Docker Metadata
id: docker_metadata
@@ -24,10 +24,10 @@ jobs:
latest=auto
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2.5.0
uses: docker/setup-buildx-action@v3.0.0

- name: Build image but do not push to Docker Hub
uses: docker/build-push-action@v4.0.0
uses: docker/build-push-action@v5.1.0
with:
context: .
file: docker/Dockerfile
2 changes: 1 addition & 1 deletion .github/workflows/pytest.yml
@@ -164,7 +164,7 @@ jobs:
- name: Set default GCP credentials
id: gcloud-auth
continue-on-error: true
uses: "google-github-actions/auth@v1"
uses: "google-github-actions/auth@v2"
with:
workload_identity_provider: "projects/345950277072/locations/global/workloadIdentityPools/gh-actions-pool/providers/gh-actions-provider"
service_account: "tox-pytest-github-action@catalyst-cooperative-pudl.iam.gserviceaccount.com"
10 changes: 5 additions & 5 deletions .github/workflows/run-etl.yml
@@ -13,7 +13,7 @@ jobs:
id-token: write
steps:
- name: Checkout Repository
uses: actions/checkout@v3
uses: actions/checkout@v4
- name: Docker Metadata
id: docker_metadata
uses: docker/[email protected]
@@ -24,15 +24,15 @@ jobs:
latest=auto
tags: type=sha
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2.5.0
uses: docker/setup-buildx-action@v3.0.0
- name: Login to DockerHub
if: github.event_name != 'pull_request'
uses: docker/login-action@v2.1.0
uses: docker/login-action@v3.0.0
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Build image and push to Docker Hub
uses: docker/build-push-action@v4.0.0
uses: docker/build-push-action@v5.1.0
with:
context: .
file: docker/Dockerfile
@@ -48,7 +48,7 @@ jobs:
contents: read
steps:
- id: gcloud-auth
uses: google-github-actions/auth@v1
uses: google-github-actions/auth@v2
with:
workload_identity_provider: "projects/345950277072/locations/global/workloadIdentityPools/gh-actions-pool/providers/gh-actions-provider"

2 changes: 1 addition & 1 deletion .github/workflows/zenodo-cache-sync.yml
@@ -63,7 +63,7 @@ jobs:
- name: Set default gcp credentials
id: gcloud-auth
uses: "google-github-actions/auth@v1"
uses: "google-github-actions/auth@v2"
with:
workload_identity_provider: "projects/345950277072/locations/global/workloadIdentityPools/gh-actions-pool/providers/gh-actions-provider"
service_account: "zenodo-cache-manager@catalyst-cooperative-pudl.iam.gserviceaccount.com"
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -80,7 +80,7 @@ repos:
verbose: false
pass_filenames: false
always_run: true
entry: pytest --doctest-modules src/pudl test/unit
entry: pytest --doctest-modules src/pudl test/unit -m "not slow"

# Configuration for pre-commit.ci
ci: