Strip leading zeros from CEMS emission_unit_id_epa
#402
Purpose
We noticed that the 2023 PJM power sector results file had a missing fuel category. I traced this to plant 50852. The cause was that the fuel type backfill using the PSDC was failing: we were not stripping leading zeros from the unit ID when loading the table, so the merge found no matches.
After digging into this more, I realized that pudl does not automatically strip leading zeros from emissions_unit_id_epa (see catalyst-cooperative/pudl#3992). This PR adds functionality to always strip these leading zeros whenever CEMS data is loaded.
This PR also updates the sandbox notebook to use 2023 as the year and adds some starter functions for loading intermediate output data for exploration, since these frequently need to be loaded.
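To illustrate the leading-zero mismatch described above, here is a minimal pandas sketch. It is not the code in this PR: the helper name `strip_leading_zeros`, the plant ID, the unit IDs, and the fuel codes are all invented for illustration, and it assumes the ID column already holds strings.

```python
import pandas as pd


def strip_leading_zeros(df, column="emissions_unit_id_epa"):
    """Return a copy of df with leading zeros removed from a string ID column.

    Hypothetical helper for illustration only. The lookahead keeps the last
    character, so an ID of "0" or "000" becomes "0" rather than an empty string.
    """
    out = df.copy()
    out[column] = out[column].str.replace(r"^0+(?=.)", "", regex=True)
    return out


# Toy data (invented values): CEMS reports zero-padded unit IDs,
# while the crosswalk table does not.
cems = pd.DataFrame(
    {"plant_id_eia": [99999, 99999], "emissions_unit_id_epa": ["001", "002"]}
)
crosswalk = pd.DataFrame(
    {
        "plant_id_eia": [99999, 99999],
        "emissions_unit_id_epa": ["1", "2"],
        "energy_source_code": ["NG", "DFO"],
    }
)

keys = ["plant_id_eia", "emissions_unit_id_epa"]

# Without stripping, no keys match and the backfilled fuel column is all NaN.
before = cems.merge(crosswalk, how="left", on=keys)

# Stripping both sides makes the keys comparable, so the merge succeeds.
after = strip_leading_zeros(cems).merge(
    strip_leading_zeros(crosswalk), how="left", on=keys
)
```

Normalizing both sides of the merge, rather than only the CEMS table, guards against either source reintroducing zero-padded IDs later.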
Testing
Ran pipeline for 2023
Where to look
It's helpful to clarify where your new code lives if you moved files around or there could be confusion.
What files are most important?
Usage Example/Visuals
How the code can be used and/or images of any graphs, tables or other visuals (not always applicable).
Review estimate
How long will it take for reviewers and observers to understand this code change?
Future work
What issues were identified that are not being addressed in this PR but should be addressed in future work?
Checklist
black