Rename FERC Form 1 core and output assets #2992
Comments
Okay I have two FERC1 naming questions that I'd love some feedback on:
This whole comment will make more sense if you go look at the three iterations of suggestions in this tab. its
Removing To me having the explicit The old DBF table names were hopelessly illegible with their inconsistent and highly abbreviated names. The XBRL table names are almost ridiculously long and descriptive, but I think that does help identify what the heck is inside them -- especially if someone is coming in with a familiarity with the "paper" FERC Form 1, table names that somewhat correspond to the schedule titles in the PDF will help them find what they're looking for, even if they aren't the 150 character long titles that they use in XBRL.

There are also some limited cases in which there's non-electric utility information being reported in the FERC 1 tables. If we were to remove both

Another option that could help folks navigate the PDF to database chasm is using the schedule number in the table names, as the XBRL table names do. That makes it very easy to find the corresponding information in the XBRL Taxonomy viewer or to find the pages in the PDF that correspond to the information in the table. They also seem to be pretty stable over the years. I think having e.g.

.tables -- get a list of all the tables in the DB
SHOW ALL TABLES; -- gives more information about the tables, but often gets truncated.
DESCRIBE table_name; -- list the columns in the table and their types, nullability, etc.
SUMMARIZE table_name; -- calculate some summary statistics for the table. Min/max, median, null-ness, etc.

It seems like the tension here is between the inconvenience and potentially duplicative nature of longer names, and the desire for the table names to be legible and connected to information users are likely to already be familiar with.
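The commands listed in this comment are DuckDB-specific. As a rough analogue, here is a minimal sketch of the same "find a table by its name" workflow using Python's standard-library `sqlite3` module. The table names and the `sched204`-style suffix are invented for illustration and are not the names proposed in this thread.

```python
import sqlite3

# Hypothetical table names, invented for illustration only.
EXAMPLE_TABLES = [
    "example_ferc1_table_a_sched204",
    "example_ferc1_table_b_sched219",
    "example_eia_table_c",
]


def list_tables(conn: sqlite3.Connection, pattern: str = "%") -> list[str]:
    """Return table names matching a SQL LIKE pattern, sorted alphabetically.

    Note that in LIKE patterns "%" matches any run of characters and "_"
    matches any single character.
    """
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name LIKE ? ORDER BY name",
        (pattern,),
    ).fetchall()
    return [name for (name,) in rows]


# Build a throwaway in-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
for table in EXAMPLE_TABLES:
    conn.execute(f'CREATE TABLE "{table}" (report_year INTEGER)')

# Someone holding the paper form could search by schedule number.
print(list_tables(conn, "%sched204"))
```

With a schedule-number suffix in the name, a single pattern match is enough to go from a page in the PDF to the corresponding table, which is the navigation benefit described above.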
There's also our colloquial usage of "plants" in the case of generation plants vs. FERC's use of the capital accounting "plant", and if we're going to remove "plant" then should we also remove "plants"? Or should we specify that the tables containing "plants" now are specifically
I mostly dislike the
Electric
Are there utilities in these tables where However, if there are tables that don't describe electric utilities, we should probably include "electric" in the table names of tables that do describe electric utilities.
Plant
I agree with @zaneselvans that users should get a sense of what is in the table without having to look at columns or values. If the entity the tables describe is a "plant" and not all Form 1 tables describe plants, then I think we should include "plant" in the table name so they can be differentiated.
Electric
I believe all of the tables with the suggestion for
Every respondent to FERC1 is an electric utility, but some of the tables like the income statement table include income from non-electric portions of their business. Out of our current FERC tables, 7 of them include non-electric portions of the FERC1 respondents.
Plant
I agree with this for the plant in service table. But for the three depreciation tables, it seems duplicative because depreciation is necessarily about plant assets.
Desires (some of which are conflicting)
We had a synchronous call:
Decisions:
thanks y'all for the chat yesterday to come to a good decision on these names! Here are the new name suggestions:
okay two output table name questions regarding the schedule # suffixes:
These table names look great thank you!
I say we keep the schedule numbers around until the tables are combined with other tables from different datasets or schedules.
That makes total sense to me @bendnorman and is my inclination as well. These are the only FERC names that I think had any weirdness with regard to the schedule name, and here is where I think this lands:
oop i realized there is still one downstream ferc1 naming question:
Options:
basically how do we deal w/ double datasets?
By double datasets, do you mean assets that come from multiple sources? We typically make |
Yes double datasets as in two or more input sources.
The two (?) analogous tables I can think of are
How about using PUDL as the source name?
How about using FERC as the source name?
This table effectively is the because of
if we end up making an EIA generators table with FERC data scaled down (re #2946) we could name that |
We made some I think it's important to include the names of the entities and sources that an Is each entity in these tables a plant or a plant part? |
I think in many of our assets there's more than one input source and at that point it's okay for the notion of the "source" to get diluted and left out of the name. But in the case of association / glue tables whose whole point is to link two tables from different data sources together it's probably good to refer to both of those data sources somehow and include

I think that whatever this table is named it should definitely include:
So maybe that's If we are going to use
Also for the love of god I wish we could standardize on the ORDERING of multiple data sources. Like if there's more than one, always do them in alphabetical order.
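A minimal sketch of what enforcing that ordering could look like, so nobody has to remember whether it is `ferc1_eia` or `eia_ferc1`. The helper name, the `layer` prefix, and the double-underscore layout are assumptions made for this example, not the convention the thread settled on.

```python
def multi_source_asset_name(sources: list[str], entity: str, layer: str = "out") -> str:
    """Build an asset name with its data sources in alphabetical order.

    Hypothetical helper: the "<layer>_<sources>__<entity>" layout is an
    assumption for illustration only.
    """
    if not sources:
        raise ValueError("at least one data source is required")
    # Lowercase and de-duplicate, then sort so the order is deterministic.
    ordered = sorted({source.lower() for source in sources})
    return f"{layer}_{'_'.join(ordered)}__{entity}"


# The same name comes out regardless of the order the sources are given in.
print(multi_source_asset_name(["ferc1", "eia"], "example_entity"))
print(multi_source_asset_name(["eia", "ferc1"], "example_entity"))
```

Generating names (or at least validating them) through one function like this makes the ordering a property of the tooling rather than a habit people have to keep.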
@cmgosnell and I just chatted about how to name this asset. We settled on @cmgosnell can probably better describe why we chose |
okay we're going with:
why:
okay i love the alphabetization. that's a good call. we had ferc1_eia... in lots and lots of places! I preserved the
eia/epa
this isn't FERC1 related but there are lots of
for these, I think that
For our first pass at renaming the FERC Form 1 core and output assets, we just applied the new naming convention to the existing asset names. We later realized some of the original asset names were not consistent and did not accurately describe the data. Our Form 1 masters @zaneselvans and @cmgosnell created more descriptive and consistent names in the Naming Conventions spreadsheet.
We should make these name changes before we merge in #2818 so we don't have two rounds of widespread name changes.
I started to pluralize some of the FERC 1 core tables in #2914, but it would probably be easier to cut a new branch off of #2818 and replace the asset names in the "Asset name in rename-core-asset branch" column of the spreadsheet with y'all's suggested names.
Don't forget to update the asset names in the transformation metadata csvs! I forgot to do this in the first round of renaming and ran into some unexpected errors.
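One way to avoid missing the metadata CSVs is to run the same old-to-new mapping over them that is used for the code. Below is a minimal sketch, assuming the CSVs keep the asset name in a single column. The column name `table_name` and the asset names in `RENAMES` are hypothetical placeholders, not the real metadata layout.

```python
import csv
import io

# Hypothetical old -> new asset names, for illustration only.
RENAMES = {"old_asset_name": "new_asset_name"}


def rename_assets_in_csv(
    text: str, column: str, renames: dict[str, str]
) -> tuple[str, int]:
    """Rewrite one column of CSV text using an exact-match rename map.

    Returns the rewritten CSV text and the number of cells that changed.
    """
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column {column!r} not found")
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    changed = 0
    for row in reader:
        old = row[column]
        if old in renames:
            row[column] = renames[old]
            changed += 1
        writer.writerow(row)
    return out.getvalue(), changed


example = "table_name,value\nold_asset_name,1\nuntouched_name,2\n"
new_text, n_changed = rename_assets_in_csv(example, "table_name", RENAMES)
print(n_changed)
print(new_text)
```

Returning the change count makes it easy to spot a CSV where nothing matched, which is usually a sign that the file was skipped or the mapping is stale.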
Tasks