Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update excepted rows for no-fips id-ed respondents but keep annualize… #3023

Merged
merged 3 commits into from
Nov 7, 2023

Conversation

cmgosnell
Copy link
Member

@cmgosnell cmgosnell commented Nov 7, 2023

PR Overview

#3021

still don't know whyyyy the records from fipsified_respondents_ferc714 are now dropped, but those none of those records had any FIPS codes previously so I think it is okay to drop them.

in the summarized_demand_ferc714 case, the annualized records were being dropped because of a merge with the fipsified_respondents_ferc714. I made this merge a left merge so those records are now preserved. I also did a little merge/groupby cleanup for speed - shaved 15 seconds wahoo lol.

PR Checklist

  • Merge the most recent version of the branch you are merging into (probably dev).
  • All CI checks are passing. Run tests locally to debug failures
  • Make sure you've included good docstrings.
  • For major data coverage & analysis changes, run data validation tests
  • Include unit tests for new functions and classes.
  • Do your own explanatory review of the PR to help the reviewer understand what's going on and identify issues preemptively.
  • Defensive data quality/sanity checks in analyses & data processing functions.
  • Update the release notes and reference reference the PR and related issues.

@cmgosnell cmgosnell linked an issue Nov 7, 2023 that may be closed by this pull request
@cmgosnell cmgosnell marked this pull request as ready for review November 7, 2023 18:31
Copy link
Member

@zaneselvans zaneselvans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker, but we might want to slip the test that requires reading 15M rows into another place where they're already in memory. How long does that test take to run with --live-dbs as it is?


def test_report_year_discrepency_in_demand_hourly_pa_ferc714(pudl_out_orig):
"""Test if the vast majority of the years in the two date columns line up."""
demand_hourly_pa_ferc714 = pudl_out_orig.demand_hourly_pa_ferc714()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will take a while to load ~15M rows from SQLite. If there's another test (or the transform) in which the table is already loaded, we might want to slip it in there rather than reading one of the longest tables multiple times.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it took one minute to run for me

(pudl-dev) [pudl] pytest test/validate/service_territory_test.py::test_report_year --live-dbs                 13:28:04ferc714_mystery_date_fix ☂ ⚡
================================================================== test session starts ==================================================================
platform darwin -- Python 3.11.6, pytest-7.4.3, pluggy-1.3.0 -- /Users/christinagosnell/miniforge3/envs/pudl-dev/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase(PosixPath('/Users/christinagosnell/code/pudl/.hypothesis/examples'))
rootdir: /Users/christinagosnell/code/pudl
configfile: pyproject.toml
plugins: hypothesis-6.88.1, anyio-4.0.0, console-scripts-1.4.1, cov-4.1.0, mock-3.12.0, typeguard-4.1.5
collected 1 item

test/validate/service_territory_test.py::test_report_year
-------------------------------------------------------------------- live log setup ---------------------------------------------------------------------
2023-11-07 13:28:13 [    INFO] test.conftest:346 Starting unit tests with output path /Users/christinagosnell/code/pudl_work/output
2023-11-07 13:28:13 [    INFO] catalystcoop.pudl.workspace.setup:175 Skipping existing file at /Users/christinagosnell/code/pudl/src/pudl/package_data/settings/etl_full.yml
2023-11-07 13:28:13 [    INFO] catalystcoop.pudl.workspace.setup:175 Skipping existing file at /Users/christinagosnell/code/pudl/src/pudl/package_data/settings/etl_fast.yml
2023-11-07 13:28:14 [    INFO] alembic.runtime.migration:213 Context impl SQLiteImpl.
2023-11-07 13:28:14 [    INFO] alembic.runtime.migration:216 Will assume non-transactional DDL.
2023-11-07 13:28:14 [    INFO] alembic.runtime.migration:619 Running stamp_revision  -> ec80dd91891a
2023-11-07 13:28:14 [   DEBUG] alembic.runtime.migration:827 new branch insert ec80dd91891a
2023-11-07 13:28:14 [    INFO] alembic.runtime.migration:213 Context impl SQLiteImpl.
add report year validation test
2023-11-07 13:28:14 [    INFO] alembic.runtime.migration:216 Will assume non-transactional DDL.
2023-11-07 13:28:14 [    INFO] alembic.runtime.migration:619 Running stamp_revision  -> ec80dd91891a
2023-11-07 13:28:14 [   DEBUG] alembic.runtime.migration:827 new branch insert ec80dd91891a
2023-11-07 13:28:14 [    INFO] alembic.runtime.migration:213 Context impl SQLiteImpl.
2023-11-07 13:28:14 [    INFO] alembic.runtime.migration:216 Will assume non-transactional DDL.
2023-11-07 13:28:14 [    INFO] alembic.runtime.migration:619 Running stamp_revision  -> ec80dd91891a
2023-11-07 13:28:14 [   DEBUG] alembic.runtime.migration:827 new branch insert ec80dd91891a
2023-11-07 13:28:14 [    INFO] alembic.runtime.migration:213 Context impl SQLiteImpl.
2023-11-07 13:28:14 [    INFO] alembic.runtime.migration:216 Will assume non-transactional DDL.
2023-11-07 13:28:14 [    INFO] alembic.runtime.migration:619 Running stamp_revision  -> ec80dd91891a
2023-11-07 13:28:14 [   DEBUG] alembic.runtime.migration:827 new branch insert ec80dd91891a
2023-11-07 13:28:15 [    INFO] test.conftest:288 setting up the pudl_engine fixture
2023-11-07 13:28:15 [    INFO] alembic.runtime.migration:213 Context impl SQLiteImpl.
2023-11-07 13:28:15 [    INFO] alembic.runtime.migration:216 Will assume non-transactional DDL.
2023-11-07 13:28:15 [    INFO] alembic.runtime.migration:619 Running stamp_revision  -> ec80dd91891a
2023-11-07 13:28:15 [   DEBUG] alembic.runtime.migration:827 new branch insert ec80dd91891a
2023-11-07 13:28:15 [    INFO] alembic.runtime.migration:213 Context impl SQLiteImpl.
2023-11-07 13:28:15 [    INFO] alembic.runtime.migration:216 Will assume non-transactional DDL.
2023-11-07 13:28:15 [    INFO] alembic.runtime.migration:619 Running stamp_revision  -> ec80dd91891a
2023-11-07 13:28:15 [   DEBUG] alembic.runtime.migration:827 new branch insert ec80dd91891a
2023-11-07 13:28:15 [    INFO] alembic.runtime.migration:213 Context impl SQLiteImpl.
2023-11-07 13:28:15 [    INFO] alembic.runtime.migration:216 Will assume non-transactional DDL.
PASSED                                                                                                                                            [100%]

=================================================================== warnings summary ====================================================================
test/validate/service_territory_test.py::test_report_year
  /Users/christinagosnell/miniforge3/envs/pudl-dev/lib/python3.11/site-packages/dagster/_core/storage/sqlalchemy_compat.py:13: RemovedIn20Warning: Deprecated API features detected! These feature(s) are not compatible with SQLAlchemy 2.0. To prevent incompatible upgrades prior to updating applications, ensure requirements files are pinned to "sqlalchemy<2.0". Set environment variable SQLALCHEMY_WARN_20=1 to show all deprecation warnings.  Set environment variable SQLALCHEMY_SILENCE_UBER_WARNING=1 to silence this message. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
    return db.select(items)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
======================================================== 1 passed, 1 warning in 83.09s (0:01:23) ========================================================

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i couuuuld add it into test_minmax_rows but that feels... wrong.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh hm there is currently no validation test for think chonker of a table. Since we are now loading it i can convert the test to check for both things.

test/validate/service_territory_test.py Show resolved Hide resolved
Copy link
Collaborator

@rousik rousik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without the automated diffing (I know) it's hard to asses the actual impact on the outputs. Looking at the code this seems sane and reasonable.

Copy link

codecov bot commented Nov 7, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (5a96335) 88.7% compared to head (f25e921) 88.7%.
Report is 1 commits behind head on dev.

Additional details and impacted files
@@          Coverage Diff          @@
##             dev   #3023   +/-   ##
=====================================
  Coverage   88.7%   88.7%           
=====================================
  Files         91      91           
  Lines      11010   11010           
=====================================
  Hits        9768    9768           
  Misses      1242    1242           
Files Coverage Δ
src/pudl/output/ferc714.py 96.2% <ø> (ø)

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@cmgosnell cmgosnell merged commit fcf4ccc into dev Nov 7, 2023
9 checks passed
@cmgosnell cmgosnell deleted the ferc714_mystery_date_fix branch November 7, 2023 21:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Fix FERC714 validation failure
3 participants