I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of Polars.
Reproducible example
import polars as pl

d = pl.DataFrame({
    'year': [1970],
    'value': [None]
}, schema={'year': pl.Int32(), 'value': pl.Int32()})

# This line doesn't affect things at all
# d = d.lazy()

d = d.filter(pl.col('value').is_not_null())

final = d.select(time=pl.datetime(pl.col('year'), 1, 1), value=pl.col('value'))

print(final)
print(final.collect())
Log output
---------------------------------------------------------------------------
ShapeError Traceback (most recent call last)
<ipython-input-24-2a2004cbf8f4> in <cell line: 16>()
14
15
---> 16 final = d.select(time=pl.datetime(pl.col('year'), 1, 1), value=pl.col('value'))
17
18
1 frames
/usr/local/lib/python3.10/dist-packages/polars/lazyframe/frame.py in collect(self, type_coercion, predicate_pushdown, projection_pushdown, simplify_expression, slice_pushdown, comm_subplan_elim, comm_subexpr_elim, cluster_with_columns, collapse_joins, no_optimization, streaming, engine, background, _eager, **_kwargs)
2053 # Only for testing purposes
2054 callback = _kwargs.get("post_opt_callback", callback)
-> 2055 return wrap_df(ldf.collect(callback))
2056
2057 @overload
ShapeError: Series time, length 1 doesn't match the DataFrame height of 0

If you want expression: col("year").dt.datetime([dyn int: 1, dyn int: 1, dyn int: 0, dyn int: 0, dyn int: 0, dyn int: 0, String(raise)]) to be broadcasted, ensure it is a scalar (for instance by adding '.first()').
Issue description
Filtering a DataFrame and then building a column with pl.datetime in a select fails with the error shown above. In the example the filter removes every row, so the frame has height 0, yet pl.datetime produces a Series of length 1.
This happens in both lazy and eager mode; the trigger seems to be the combination of filter and pl.datetime.
https://colab.research.google.com/drive/1N6ByBcHuOKYuE5YNJwtPCzAlv5N4U8qF?usp=sharing is an interactive example of the bug.
Expected behavior
The expected behavior is to get an empty dataframe result.
Installed versions
--------Version info---------
Polars: 1.12.0
Index type: UInt32
Platform: Linux-6.1.85+-x86_64-with-glibc2.35
Python: 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0]
LTS CPU: False
----Optional dependencies----
adbc_driver_manager
altair 4.2.2
cloudpickle 3.1.0
connectorx
deltalake
fastexcel
fsspec 2024.10.0
gevent
great_tables
matplotlib 3.8.0
nest_asyncio 1.6.0
numpy 1.26.4
openpyxl 3.1.5
pandas 2.2.2
pyarrow 17.0.0
pydantic 2.9.2
pyiceberg
sqlalchemy 2.0.36
torch 2.5.0+cu121
xlsx2csv
xlsxwriter