Downcasting temporal units silently overflows #21493

Open
2 tasks done
FHTMitchell opened this issue Feb 27, 2025 · 0 comments
Labels
bug (Something isn't working), needs triage (Awaiting prioritization by a maintainer), python (Related to Python Polars)

Comments

FHTMitchell commented Feb 27, 2025

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

In [13]: df = pl.DataFrame(
    ...:     [pl.Series("dt", [datetime.max, datetime.now()], dtype=pl.Datetime("us"))]
    ...: ).with_columns(td=pl.col("dt") - datetime.fromtimestamp(0))

In [14]: df
Out[14]: 
shape: (2, 2)
┌────────────────────────────┬───────────────────────────────┐
│ dt                         ┆ td                            │
│ ---                        ┆ ---                           │
│ datetime[μs]               ┆ duration[μs]                  │
╞════════════════════════════╪═══════════════════════════════╡
│ 9999-12-31 23:59:59.999999 ┆ 2932896d 22h 59m 59s 999999µs │
│ 2025-02-27 12:06:06.403591 ┆ 20146d 11h 6m 6s 403591µs     │
└────────────────────────────┴───────────────────────────────┘

In [15]: df.select(
    ...:     pl.col("dt").cast(pl.Datetime("ns"), strict=False),
    ...:     pl.col("td").cast(pl.Duration("ns"), strict=False),
    ...: )
Out[15]: 
shape: (2, 2)
┌───────────────────────────────┬─────────────────────────────────┐
│ dt                            ┆ td                              │
│ ---                           ┆ ---                             │
│ datetime[ns]                  ┆ duration[ns]                    │
╞═══════════════════════════════╪═════════════════════════════════╡
│ 1816-03-30 05:56:08.066276376 ┆ -56158d -19h -3m -51s -9337236… │
│ 2025-02-27 12:06:06.403591    ┆ 20146d 11h 6m 6s 403591µs       │
└───────────────────────────────┴─────────────────────────────────┘

In [16]: df.select(
    ...:     pl.col("dt").cast(pl.Datetime("ns"), strict=True),
    ...:     pl.col("td").cast(pl.Duration("ns"), strict=True),
    ...: )
Out[16]: 
shape: (2, 2)
┌───────────────────────────────┬─────────────────────────────────┐
│ dt                            ┆ td                              │
│ ---                           ┆ ---                             │
│ datetime[ns]                  ┆ duration[ns]                    │
╞═══════════════════════════════╪═════════════════════════════════╡
│ 1816-03-30 05:56:08.066276376 ┆ -56158d -19h -3m -51s -9337236… │
│ 2025-02-27 12:06:06.403591    ┆ 20146d 11h 6m 6s 403591µs       │
└───────────────────────────────┴─────────────────────────────────┘

In [17]: df.select(pl.col("dt").dt.cast_time_unit("ns"), pl.col("td").dt.cast_time_unit("ns"))
Out[17]: 
shape: (2, 2)
┌───────────────────────────────┬─────────────────────────────────┐
│ dt                            ┆ td                              │
│ ---                           ┆ ---                             │
│ datetime[ns]                  ┆ duration[ns]                    │
╞═══════════════════════════════╪═════════════════════════════════╡
│ 1816-03-30 05:56:08.066276376 ┆ -56158d -19h -3m -51s -9337236… │
│ 2025-02-27 12:11:25.432015    ┆ 20146d 11h 11m 25s 432015µs     │
└───────────────────────────────┴─────────────────────────────────┘

Log output

Issue description

When casting from pl.Datetime("us") to pl.Datetime("ns") (or from "ms" to "us" or "ns"), Polars does not follow its own documentation for invalid values: with strict=True the cast above should raise an error, and with strict=False it should return null for the out-of-range value. Instead, values that are out of bounds for the more precise time_unit silently overflow and wrap around.

The same can be seen for pl.Duration.
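
The wrapped value in the output above is consistent with plain signed 64-bit overflow. A minimal sketch in pure Python (no Polars), assuming the cast simply multiplies the underlying int64 microsecond count by 1000; the helper name wrap_i64 is made up for this illustration:

```python
from datetime import datetime, timedelta

I64_MAX = 2**63 - 1
EPOCH = datetime(1970, 1, 1)


def wrap_i64(value: int) -> int:
    """Reduce a Python int to a signed 64-bit two's-complement value."""
    return (value + 2**63) % 2**64 - 2**63


# datetime.max as microseconds since the Unix epoch (naive, no time zone).
max_us = (datetime.max - EPOCH) // timedelta(microseconds=1)

# Scaling µs -> ns multiplies by 1000, which no longer fits in int64.
exact_ns = max_us * 1000
wrapped_ns = wrap_i64(exact_ns)

print(exact_ns > I64_MAX)  # True
print(wrapped_ns)          # -4852116231933723624
# Python datetimes only hold microseconds, so the last three digits are dropped.
print(EPOCH + timedelta(microseconds=wrapped_ns // 1000))  # 1816-03-30 05:56:08.066276
```

The result lands on the same 1816-03-30 05:56:08.066276... timestamp shown in the ns tables above, which suggests an unchecked multiplication rather than some other conversion error.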

Expected behavior

For temporal types, Polars should raise an error, return null, or wrap depending on the options passed to .cast, rather than always wrapping.
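
To make the three outcomes concrete, here is a scalar model of the behavior I would expect, written in pure Python. This is not Polars code: cast_us_to_ns and its strict/wrap keyword arguments are hypothetical names used only for this sketch.

```python
I64_MIN, I64_MAX = -(2**63), 2**63 - 1


def cast_us_to_ns(value_us, *, strict=True, wrap=False):
    """Hypothetical scalar model of a µs -> ns cast (not the Polars implementation)."""
    exact = value_us * 1000
    if I64_MIN <= exact <= I64_MAX:
        return exact  # fits in int64: every mode agrees
    if wrap:
        # Explicitly requested two's-complement wraparound.
        return (exact + 2**63) % 2**64 - 2**63
    if strict:
        raise OverflowError(f"{value_us} µs does not fit in int64 nanoseconds")
    return None  # non-strict: out-of-range values become null


in_range_us = 1_740_657_966_403_591      # 2025-02-27 12:06:06.403591
out_of_range_us = 253_402_300_799_999_999  # datetime.max

print(cast_us_to_ns(in_range_us))                    # 1740657966403591000
print(cast_us_to_ns(out_of_range_us, strict=False))  # None
print(cast_us_to_ns(out_of_range_us, wrap=True))     # -4852116231933723624
```

With the default strict=True, the out-of-range value raises OverflowError instead of returning a wrapped timestamp.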

Installed versions

pl.show_versions()
--------Version info---------
Polars:              1.22.0
Index type:          UInt32
Platform:            Linux-4.18.0-553.30.1.el8_10.x86_64-x86_64-with-glibc2.28
Python:              3.12.5 | packaged by conda-forge | (main, Aug  8 2024, 18:36:51) [GCC 12.4.0]
LTS CPU:             False

----Optional dependencies----
Azure CLI            <not installed>
adbc_driver_manager  1.2.0
altair               <not installed>
azure.identity       <not installed>
boto3                <not installed>
cloudpickle          2.2.1
connectorx           <not installed>
deltalake            <not installed>
fastexcel            <not installed>
fsspec               2025.2.0
gevent               <not installed>
google.auth          <not installed>
great_tables         <not installed>
matplotlib           3.8.4
numpy                1.26.4
openpyxl             <not installed>
pandas               2.1.4
pyarrow              16.1.0
pydantic             2.7.1
pyiceberg            <not installed>
sqlalchemy           1.4.49
torch                <not installed>
xlsx2csv             <not installed>
FHTMitchell added the bug, needs triage, and python labels on Feb 27, 2025