Panic when creating simple nested dataframe with numpy #17531

Open
2 tasks done
coastalwhite opened this issue Jul 9, 2024 · 1 comment
Labels
A-interop-numpy Area: interoperability with NumPy A-panic Area: code that results in panic exceptions bug Something isn't working P-low Priority: low python Related to Python Polars

Comments

@coastalwhite
Collaborator

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl
import numpy as np

arr2 = np.random.randint(0, 32, size=(10, 1))
arr2 = np.append(arr2, [[None]], axis=0)
df = pl.DataFrame({ 'x': arr2 }, schema={'x': pl.List(pl.Int8)})

Log output

No response

Issue description

Creating this simple DataFrame always panics with an error about FixedSizeList arrays:

thread '<unnamed>' panicked at crates/polars-core/src/series/ops/reshape.rs:159:26:
called `Result::unwrap()` on an `Err` value: ComputeError(ErrString("FixedSizeListArray's child's DataType must match. However, the expected DataType is Unknown while it got FixedSizeBinary(8)."))
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "/home/johndoe/Projects/polars/t.py", line 12, in <module>
    df = pl.DataFrame({ 'x': arr2 }, schema={'x': pl.List(pl.Int8)})
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/johndoe/Projects/polars/py-polars/polars/dataframe/frame.py", line 360, in __init__
    self._df = dict_to_pydf(
               ^^^^^^^^^^^^^
  File "/home/johndoe/Projects/polars/py-polars/polars/_utils/construction/dataframe.py", line 159, in dict_to_pydf
    for s in _expand_dict_values(
             ^^^^^^^^^^^^^^^^^^^^
  File "/home/johndoe/Projects/polars/py-polars/polars/_utils/construction/dataframe.py", line 388, in _expand_dict_values
    updated_data[name] = pl.Series(
                         ^^^^^^^^^^
  File "/home/johndoe/Projects/polars/py-polars/polars/series/series.py", line 300, in __init__
    self._s = numpy_to_pyseries(
              ^^^^^^^^^^^^^^^^^^
  File "/home/johndoe/Projects/polars/py-polars/polars/_utils/construction/series.py", line 465, in numpy_to_pyseries
    return wrap_s(py_s).reshape(original_shape)._s
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/johndoe/Projects/polars/py-polars/polars/series/series.py", line 6790, in reshape
    return self._from_pyseries(self._s.reshape(dimensions))
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: ComputeError(ErrString("FixedSizeListArray's child's DataType must match. However, the expected DataType is Unknown while it got FixedSizeBinary(8)."))
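
One detail that may help when triaging: appending `[[None]]` to an integer array makes NumPy fall back to the `object` dtype, so Polars receives a 2D array of Python objects rather than integers. The sketch below only inspects the NumPy side. Whether this is what leads to the `FixedSizeBinary(8)` child type in the panic message is an assumption, not something confirmed in this report.

```python
import numpy as np

# Same construction as in the reproducible example.
arr2 = np.random.randint(0, 32, size=(10, 1))
print(arr2.dtype)  # an integer dtype; the exact width is platform-dependent

# Appending a row containing None forces NumPy to use the generic object dtype.
arr2 = np.append(arr2, [[None]], axis=0)
print(arr2.dtype, arr2.shape)

# The elements are now plain Python objects (ints and None), not fixed-width integers.
print(type(arr2[0, 0]), arr2[-1, 0])
```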

Expected behavior

No panic

Installed versions

--------Version info---------
Polars:               1.1.0
Index type:           UInt32
Platform:             Linux-6.6.32-x86_64-with-glibc2.39
Python:               3.11.9 (main, Apr  2 2024, 08:25:04) [GCC 13.2.0]

----Optional dependencies----
adbc_driver_manager:  0.11.0
cloudpickle:          3.0.0
connectorx:           0.3.3
deltalake:            0.17.4
fastexcel:            <not installed>
fsspec:               2024.3.0
gevent:               24.2.1
great_tables:         <not installed>
hvplot:               0.9.2
matplotlib:           3.8.4
nest_asyncio:         1.6.0
numpy:                1.26.4
openpyxl:             3.1.2
pandas:               2.2.1
pyarrow:              16.0.0
pydantic:             2.6.3
pyiceberg:            <not installed>
sqlalchemy:           2.0.30
torch:                <not installed>
xlsx2csv:             0.8.2
xlsxwriter:           3.2.0
@coastalwhite changed the title from "Panic when creating simple dataframe with numpy" to "Panic when creating simple nested dataframe with numpy" Jul 9, 2024
@coastalwhite
Collaborator Author

Another, more minimal, example:

import polars as pl
import numpy as np

arr = np.array([[None]])
df = pl.DataFrame({ 'x': arr })
