Pin dependency versions #43

Open
thwllms opened this issue Jun 3, 2024 · 1 comment
Labels: testing (Related to unit tests, functional testing, etc.)

Comments

thwllms commented Jun 3, 2024

As of early June 2024, rashdf relies on three major dependencies:

  • h5py
  • geopandas
  • pyarrow

We should determine the minimum supported version of each of these dependencies and set those minimums in pyproject.toml. GeoPandas is probably the most version-sensitive of the three.

Pinning dependency versions for tests and docs would be good, too.
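
As a starting point for choosing minimums, it may help to record which versions are installed in an environment where rashdf is known to work. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical and not part of rashdf.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version string.

    Packages that are not installed map to None. (Hypothetical helper,
    not part of rashdf.)
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # The three dependencies named in this issue, plus the two
    # libraries involved in the frequency-parsing error.
    names = ["h5py", "geopandas", "pyarrow", "pandas", "numpy"]
    for name, ver in installed_versions(names).items():
        print(f"{name}: {ver or 'not installed'}")
```

Running this in both a working and a failing environment gives a concrete pair of version sets to compare before committing lower bounds to pyproject.toml.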

thwllms added the testing label Jun 3, 2024
thwllms changed the title from "Set minumum dependency versions" to "Pin dependency versions" Jun 4, 2024
thwllms commented Jul 16, 2024

Relevant issue with the following file from the Kanawha bucket via @zherbz:

.../sims/ressim/383/ras/LowerKanawha/LowerKanawha.p01.hdf

When rashdf is installed with pandas==2.1.4 and numpy==1.26.0, there's an error extracting mesh cell points.

Python 3.10.14 (main, Apr 18 2024, 16:25:28) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from src.rashdf import RasPlanHdf
>>> plan_hdf = RasPlanHdf("LowerKanawha.p01.hdf")
>>> plan_hdf.mesh_cell_points()
Traceback (most recent call last):
  File "offsets.pyx", line 4548, in pandas._libs.tslibs.offsets.to_offset
ValueError: invalid literal for int() with base 10: '0.1'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/thomaswilliams/dev/rashdf/src/rashdf/plan.py", line 719, in mesh_cell_points
    return self._mesh_summary_outputs_gdf(
  File "/home/thomaswilliams/dev/rashdf/src/rashdf/plan.py", line 686, in _mesh_summary_outputs_gdf
    df = self.mesh_summary_output(var, round_to=round_to)
  File "/home/thomaswilliams/dev/rashdf/src/rashdf/plan.py", line 602, in mesh_summary_output
    df = methods_with_times[var](round_to=round_to)
  File "/home/thomaswilliams/dev/rashdf/src/rashdf/plan.py", line 567, in mesh_max_ws_err
    df = self._mesh_summary_output_min_max(
  File "/home/thomaswilliams/dev/rashdf/src/rashdf/plan.py", line 359, in _mesh_summary_output_min_max
    times = self._mesh_summary_output_min_max_times(
  File "/home/thomaswilliams/dev/rashdf/src/rashdf/plan.py", line 323, in _mesh_summary_output_min_max_times
    max_ws_times = ras_timesteps_to_datetimes(
  File "/home/thomaswilliams/dev/rashdf/src/rashdf/utils.py", line 307, in ras_timesteps_to_datetimes
    return [
  File "/home/thomaswilliams/dev/rashdf/src/rashdf/utils.py", line 308, in <listcomp>
    start_time + pd.Timedelta(timestep, unit=time_unit).round(round_to)
  File "timedeltas.pyx", line 1949, in pandas._libs.tslibs.timedeltas.Timedelta.round
  File "timedeltas.pyx", line 1912, in pandas._libs.tslibs.timedeltas.Timedelta._round
  File "offsets.pyx", line 4460, in pandas._libs.tslibs.offsets.to_offset
  File "offsets.pyx", line 4557, in pandas._libs.tslibs.offsets.to_offset
ValueError: Invalid frequency: 0.1 s

Sure enough, pandas 2.1.x doesn't seem to accept frequency strings with a non-integer multiple:

>>> import pandas as pd
>>> pd.tseries.frequencies.to_offset("1s")
<Second>
>>> pd.tseries.frequencies.to_offset("10s")
<10 * Seconds>
>>> pd.tseries.frequencies.to_offset("1.0s")
Traceback (most recent call last):
  File "offsets.pyx", line 4548, in pandas._libs.tslibs.offsets.to_offset
ValueError: invalid literal for int() with base 10: '1.0'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "offsets.pyx", line 4460, in pandas._libs.tslibs.offsets.to_offset
  File "offsets.pyx", line 4557, in pandas._libs.tslibs.offsets.to_offset
ValueError: Invalid frequency: 1.0s
>>> pd.tseries.frequencies.to_offset("0.001s")
Traceback (most recent call last):
  File "offsets.pyx", line 4548, in pandas._libs.tslibs.offsets.to_offset
ValueError: invalid literal for int() with base 10: '0.001'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "offsets.pyx", line 4460, in pandas._libs.tslibs.offsets.to_offset
  File "offsets.pyx", line 4557, in pandas._libs.tslibs.offsets.to_offset
ValueError: Invalid frequency: 0.001s

But things work fine with pandas==2.2.2 and numpy==2.0.0:

>>> import pandas as pd
>>> pd.tseries.frequencies.to_offset("1.0s")
<Second>
>>> pd.tseries.frequencies.to_offset("0.1s")
<100 * Millis>
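
If supporting pandas 2.1.x turns out to be desirable, one possible workaround (an assumption, not something rashdf does today) is to rewrite fractional-second strings such as "0.1 s" into an equivalent integer-multiple form such as "100ms" before passing them to pandas. A minimal stdlib-only sketch, where `normalize_seconds_freq` is a hypothetical helper name:

```python
import re
from decimal import Decimal

# Units from coarsest to finest, with their size in nanoseconds.
_UNITS = (("s", 10**9), ("ms", 10**6), ("us", 10**3), ("ns", 1))
# Accepts a non-negative number of seconds such as "10s", "1.0s", "0.1 s".
_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*s\s*$")


def normalize_seconds_freq(freq: str) -> str:
    """Rewrite a seconds frequency string with an integer multiple.

    Hypothetical helper, not part of rashdf or pandas. Only handles
    strings expressed in seconds.
    """
    match = _PATTERN.match(freq)
    if match is None:
        raise ValueError(f"Unsupported frequency string: {freq!r}")
    # Decimal avoids binary floating-point error for values like 0.1.
    nanos = Decimal(match.group(1)) * 10**9
    if nanos != nanos.to_integral_value():
        raise ValueError(f"Frequency finer than 1 ns: {freq!r}")
    nanos = int(nanos)
    if nanos == 0:
        raise ValueError(f"Frequency must be positive: {freq!r}")
    # Pick the coarsest unit that divides the value evenly; "ns"
    # always does, so the loop always returns.
    for unit, size in _UNITS:
        if nanos % size == 0:
            return f"{nanos // size}{unit}"
```

The returned strings use the "s", "ms", "us", and "ns" aliases. Setting a minimum pandas version, as this issue proposes, would avoid the need for such a helper entirely.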
