-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature: option to load wind and solar resource data from HPC #414
Conversation
…dated upstream functions
…year being set to 2012
… and updated doc strings
nsrdb_source_path (Union[str,Path], optional): directory where NSRDB data is hosted on HPC. Defaults to "". | ||
filepath (str, optional): filepath to NSRDB h5 file on HPC. Defaults to "". | ||
- should be formatted as: /path/to/file/name_of_file.h5 | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@elenya-grant in the windtoolkit file you have verbiage for Raises: FileNotFoundError: in the docstrings. For consistency I'd add it into this files docstrings as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed this!
|
||
if filepath == "" and wtk_source_path=="": | ||
# use default filepaths based on resource year | ||
if self.year < 2014: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This all looks good, I had a thought when looking at the if, elif statements for the year < or = 2014. Is it worth raising an error if the year is > 2014? I don't believe the normal WindData class has that functionality, but could be a good item to add just in case? no strong feelings either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a value error if the resource year is outside of valid options for that dataset (also added tests for this)
This looks good to me, made two comments with possible additions / edits, other than that I think this can be approved and merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to merge from my perspective, all comments from previous review have been addressed.
tests/hopp/test_resource_download.py
Outdated
def test_wtk_resource_filenotfound(): | ||
wtk_fake_dir = str(ROOT_DIR) | ||
resource_year = 2012 | ||
wtk_fake_fpath = os.path.join(str(ROOT_DIR),f"wtk_conus_{resource_year}.h5") | ||
with pytest.raises(FileNotFoundError) as err: | ||
HPCWindData(lat = 35.201, lon = -101.945, year = resource_year, wind_turbine_hub_ht = 110, wtk_source_path=wtk_fake_dir) | ||
assert str(err.value) == f"Cannot find Wind Toolkit .h5 file, filepath {wtk_fake_fpath} does not exist" | ||
|
||
def test_wtk_resource_invalid_year(): | ||
wtk_fake_dir = str(ROOT_DIR) | ||
resource_year = 2006 | ||
wtk_fake_fpath = os.path.join(str(ROOT_DIR),f"wtk_conus_{resource_year}.h5") | ||
with pytest.raises(ValueError) as err: | ||
HPCWindData(lat = 35.201, lon = -101.945, year = resource_year, wind_turbine_hub_ht = 110, wtk_source_path=wtk_fake_dir) | ||
assert str(err.value) == f"Resource year for WIND Toolkit Data must be between 2007 and 2014 but {resource_year} was provided" | ||
|
||
def test_nsrdb_resource_filenotfound(): | ||
nsrdb_fake_dir = str(ROOT_DIR) | ||
resource_year = 2012 | ||
nsrdb_fake_fpath = os.path.join(str(ROOT_DIR),f"nsrdb_{resource_year}.h5") | ||
with pytest.raises(FileNotFoundError) as err: | ||
HPCSolarData(lat = 35.201, lon = -101.945, year = resource_year, nsrdb_source_path=nsrdb_fake_dir) | ||
assert str(err.value) == f"Cannot find NSRDB .h5 file, filepath {nsrdb_fake_fpath} does not exist" | ||
|
||
def test_nsrdb_resource_invalid_year(): | ||
nsrdb_fake_dir = str(ROOT_DIR) | ||
resource_year = 2023 | ||
nsrdb_fake_fpath = os.path.join(str(ROOT_DIR),f"nsrdb_{resource_year}.h5") | ||
with pytest.raises(ValueError) as err: | ||
HPCSolarData(lat = 35.201, lon = -101.945, year = resource_year, nsrdb_source_path=nsrdb_fake_dir) | ||
assert str(err.value) == f"Resource year for NSRDB Data must be between 1998 and 2022 but {resource_year} was provided" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love these tests, thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of my comments are recommendations, including those involving the changes to the initialization routines to allow for more testing. As it stands this looks pretty good to me and could be merged as-is.
NSRDB_DEP = "/datasets/NSRDB/deprecated_v3/nsrdb_" | ||
|
||
# NOTE: Current version of PSM v3.2.2 which corresponds to /api/nsrdb/v2/solar/psm3-2-2-download | ||
NSRDB_NEW = "/datasets/NSRDB/current/nsrdb_" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend converting these to Path
objects (Path("file_path")
) to make some later manipulations easier.
|
||
if filepath == "" and nsrdb_source_path=="": | ||
# use default filepath | ||
self.nsrdb_file = NSRDB_NEW + "{}.h5".format(self.year) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend the use of f-strings in cases like this so it reads a bit easier. f"{self.year}.h5"
. If you decide to make NSRDB_NEW
a path, this becomes NSRDB_NEW / f"{self.year}.h5"
as an example
self.nsrdb_file = filepath | ||
elif filepath == "" and nsrdb_source_path != "": | ||
# directory of h5 files (nsrdb_source_path) is provided by user | ||
self.nsrdb_file = os.path.join(str(nsrdb_source_path),"nsrdb_{}.h5".format(self.year)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, for using path this would look like: Path(nsrdb_source_path) / f"nsrdb_{self.year}.h5"
. I also passed nsrdb_source_path
to Path()
because it could either be a Path or string, but passing a Path object to Path again won't cause any errors, so it's not worth checking first.
# extract remaining datapoints: year, month, day, hour, minute, dn, df, gh, wspd,tdry, pres, tdew | ||
# NOTE: datasets have readings at 0 and 30 minutes each hour, HOPP/SAM workflow requires only 30 minute reading values -> filter 0 minute readings with [1::2] | ||
# NOTE: datasets are not auto shifted by timezone offset -> wrap extraction in SAMResource.roll_timeseries(input_array, timezone, #steps in an hour=1) to roll timezones | ||
# NOTE: solar_resource.py code references solar_zenith_angle and RH = relative_humidity but I couldn't find them actually being utilized. Captured them below just in case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you move these extra long comments to multiple lines? I know we don't have any enforcement on this, but it's nice to be able to read comments without sideways scrolling.
|
||
# round to desired precision and convert to desired data type | ||
# NOTE: unsure if SAM/PySAM is sensitive to data types and decimal precision. | ||
# If not sensitive, can remove .astype() and round() to increase computational efficiency |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a good way to find out? It would be nice to have it settled prior to merging, but if this is pretty small potatoes, I also get it.
import os | ||
from hopp.simulation.technologies.resource.resource import Resource | ||
|
||
WTK_V10_BASE = "/datasets/WIND/conus/v1.0.0/wtk_conus_" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about Path
: you don't need to but it's recommended to use them in place of strings to make manipulations easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not done this yet because the base paths include the first part of the filename as well. I am also not sure whether rex (the tool used to load the resource data from .h5 files) is compatible with Path
objects or if it requires strings (I will need to check on this).
self.format_data() | ||
|
||
|
||
def calculate_heights_to_download(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an example of a good method to check in the test suite. Basically anything leading up to the download_resources()
because we won't have access in the CI pipeline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this test should be added when the ability to not download resource data is added to these classes (to avoid the file not found error)
self.download_resource() | ||
|
||
# Set wind resource data into SAM/PySAM digestible format | ||
self.format_data() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want these in the __init__()
method? I would assume it's that it's initialized, then downloaded, or would this pattern be more helpful to the intended workflow? An alternative is to have a flag in the __init__
that triggers downloading in the initialization or not, such as download: bool = True
. This would also for passing false and testing paths and other surrounding items for correctness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you. I have these in the __init__()
method because it follows the same workflow as solar_resource.py and wind_resource.py. I think that if we want this change - this may be better suited for PR #415 since that includes the ability to initialize solar_resource.py and wind_resource.py with resource data (and therefore it's not downloaded). Or - I think this could be part of a future PR that includes other development to SiteInfo and the resource functions.
I created an issue (issue #423) to address this
if filepath == "" and wtk_source_path=="": | ||
# use default filepaths based on resource year | ||
if self.year < 2014 and self.year>=2007: | ||
self.wtk_file = WTK_V10_BASE + "{}.h5".format(self.year) | ||
elif self.year == 2014: | ||
self.wtk_file = WTK_V11_BASE + "{}.h5".format(self.year) | ||
elif filepath != "" and wtk_source_path == "": | ||
# filepath (full h5 filepath) is provided by user | ||
if ".h5" not in filepath: | ||
filepath = filepath + ".h5" | ||
self.wtk_file = filepath | ||
elif filepath == "" and wtk_source_path != "": | ||
# directory of h5 files (wtk_source_path) is provided by user | ||
self.wtk_file = os.path.join(str(wtk_source_path),"wtk_conus_{}.h5".format(self.year)) | ||
else: | ||
# use default filepaths | ||
if self.year < 2014: | ||
self.wtk_file = WTK_V10_BASE + "{}.h5".format(self.year) | ||
elif self.year == 2014: | ||
self.wtk_file = WTK_V11_BASE + "{}.h5".format(self.year) | ||
|
||
# Check for valid filepath for Wind Toolkit file | ||
if not os.path.isfile(self.wtk_file): | ||
raise FileNotFoundError(f"Cannot find Wind Toolkit .h5 file, filepath {self.wtk_file} does not exist") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking the produced file paths are what's expected would also be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is partially accomplished in the tests starting with test_wtk_resource_filenotfound_
and test_nsrdb_resource_filenotfound_
. This would also be easier to do if (like you suggested) - resource data was not automatically loaded in the __init__()
function which will always throw a FileNotFound error when running tests locally
if not os.path.isfile(self.nsrdb_file): | ||
raise FileNotFoundError(f"Cannot find NSRDB .h5 file, filepath {self.nsrdb_file} does not exist") | ||
|
||
# Pull data from HPC NSRDB dataset | ||
self.download_resource() | ||
|
||
# Set solar resource data into SAM/PySAM digestible format | ||
self.format_data() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment on these that I left in the WTK class.
Load wind and solar resource data from h5 datasets
Added the option and functionality to load wind and solar resource data from Wind Toolkit and NSRDB data files (usually hosted on HPC but can be downloaded locally too). This new feature is primarily based on work previously done by @dakotaramos. To use this functionality, the user must input
renewable_resource_origin = HPC
when creating theSiteInfo
object. Did not add new tests because this can only be used if the h5 type datafile are downloaded locally or code is run on an HPC that hosts these datasets.Also fixed bug in
SiteInfo
when settingyear
from input argument rather than from the inputdata
dictionary.PR Checklist
RELEASE.md
has been updated to describe the changes made in this PRdocs/
files are up-to-date, or added when necessaryRelated issues
Impacted areas of the software
hopp/simulation/technologies/resource/wind_toolkit_data.py
HPCWindData
: newResource
type class to load wind resource data from Wind Toolkit data files hosted on HPChopp/simulation/technologies/resource/nsrdb_data.py
HPCSolarData
: newResource
type class to load solar resource data from NSRDB data files hosted on HPChopp/simulation/technologies/resource/__init__.py
HPCSolarData
andHPCWindData
hopp/simulation/sites/site_info.py
renewable_resource_origin
attribute to specify whether to download resource data from API call or load from HPC datasetssolar_resource
andwind_resource
to optional inputs to prevent redundant resource file loadingwtk_source_path
andnsrdb_source_path
as optional inputs to specify WTK or NSRDB data set directoriesSiteInfo
as argumentpyproject.toml
Additional supporting information
Test results, if applicable