Coalesce with null dtype column #21506

jesusestevez · 2025-02-27T17:25:09Z

Checks

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of Polars.

Reproducible example

Minimal reproducible example by cmdlineuser:

import polars as pl

(pl.DataFrame({"x": [[None, None]], "y": [[1, 2]]})
    #.cast(pl.List(pl.Int64))
    .group_by(1)
    .agg(pl.coalesce(pl.all().flatten()))
)

# shape: (1, 2)
# ┌─────────┬────────────┐
# │ literal ┆ x          │
# │ ---     ┆ ---        │
# │ i32     ┆ list[null] │
# ╞═════════╪════════════╡
# │ 1       ┆ null       │  # [1, 2] is the output with .cast()
# └─────────┴────────────┘

Full example:

import polars as pl

df = pl.DataFrame(
    {
        "missing": [[None, None], [None, None], [None, None]],
        "values1": [[1, None], [3, 4], [5, 6]],
        "values2": [[1, 2], [3, 4], [5, 6]],
    }
)

comparison = lambda x, y: x.eq_missing(y)
coalesce = lambda x: pl.coalesce(x)

# Window over a per-row index so each expression is evaluated row by row.
row_index = pl.int_range(pl.len())

df.select(
    compare=comparison(
        pl.col("values1").explode(), pl.col("values2").explode()
    ).implode().over(row_index),
    coalesce_values=coalesce(
        (pl.col("values1").explode(), pl.col("values2").explode())
    ).implode().over(row_index),
    coalesce_missing=coalesce(
        (pl.col("missing").explode(), pl.col("values2").explode())
    ).implode().over(row_index),
)

Log output

Issue description

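Following the discussion at https://discord.com/channels/908022250106667068/957930511999832064/1344616117204680744, we noticed what appears to be a bug in how pl.coalesce treats columns with the Null dtype. In the minimal example above, coalescing inside group_by().agg() returns null with dtype list[null] instead of [1, 2]. Casting to pl.List(pl.Int64) first (the commented-out .cast line) gives [1, 2].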

Expected behavior

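I would expect pl.coalesce to accept a Null dtype column and fall through to the first non-null value from the remaining columns, as it does once the column has been cast explicitly.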

Installed versions

--------Version info---------
Polars:              1.22.0
Index type:          UInt32
Platform:            Windows-11-10.0.22621-SP0
Python:              3.12.3 | packaged by conda-forge | (main, Apr 15 2024, 18:20:11) [MSC v.1938 64 bit (AMD64)]
LTS CPU:             False

----Optional dependencies----
Azure CLI            <not installed>
adbc_driver_manager  <not installed>
altair               5.4.1
azure.identity       <not installed>
boto3                <not installed>
cloudpickle          <not installed>
connectorx           <not installed>
deltalake            <not installed>
fastexcel            0.10.4
fsspec               2024.10.0
gevent               <not installed>
google.auth          <not installed>
great_tables         0.16.1
matplotlib           3.9.1
numpy                2.0.1
openpyxl             3.1.3
pandas               2.2.2
pyarrow              16.1.0
pydantic             2.10.6
pyiceberg            <not installed>
sqlalchemy           1.4.52
torch                <not installed>
xlsx2csv             0.8.2
xlsxwriter           3.1.9

jesusestevez added the bug, needs triage, and python labels on Feb 27, 2025