write_delta failed pyarrow.lib.ArrowInvalid: Integer value 133 not in range: 0 to 127 #3086

Closed
MDGSF opened this issue Dec 26, 2024 · 2 comments
Labels
binding/python Issues for the Python package bug Something isn't working mre-needed Whether an MRE needs to be provided

Comments

@MDGSF

MDGSF commented Dec 26, 2024

Environment

Delta-rs version:
deltalake 0.22.3

Binding:

Environment: python 3.12

  • Cloud provider:
  • OS: wsl ubuntu20.04
  • Other:

Bug

What happened:

import rerun as rr
from deltalake import write_deltalake

recording = rr.dataframe.load_recording("test.rrd")
view = recording.view(index="log_time", contents="/**")
df = view.select().read_pandas()

for col in df.select_dtypes(include=["datetime64[ns]"]).columns:
    df[col] = df[col].astype("datetime64[us]")

write_deltalake("tmp/demo", df)

error log:

Traceback (most recent call last):
  File "/home/dev/git/dev/bicv/rerun/demos/demo03_rrd_to_deltalake_pandas/demo03.py", line 11, in <module>
    write_deltalake("tmp/demo", df)
  File "/home/dev/anaconda3/envs/rerun/lib/python3.12/site-packages/deltalake/writer.py", line 317, in write_deltalake
    data, schema = _convert_data_and_schema(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dev/anaconda3/envs/rerun/lib/python3.12/site-packages/deltalake/writer.py", line 679, in _convert_data_and_schema
    data = convert_pyarrow_table(pa.Table.from_pandas(data), conversion_mode)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dev/anaconda3/envs/rerun/lib/python3.12/site-packages/deltalake/schema.py", line 187, in convert_pyarrow_table
    data = data.cast(schema).to_reader()
           ^^^^^^^^^^^^^^^^^
  File "pyarrow/table.pxi", line 4683, in pyarrow.lib.Table.cast
  File "pyarrow/table.pxi", line 593, in pyarrow.lib.ChunkedArray.cast
  File "/home/dev/anaconda3/envs/rerun/lib/python3.12/site-packages/pyarrow/compute.py", line 405, in cast
    return call_function("cast", [arr], options, memory_pool)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pyarrow/_compute.pyx", line 598, in pyarrow._compute.call_function
  File "pyarrow/_compute.pyx", line 393, in pyarrow._compute.Function.call
  File "pyarrow/error.pxi", line 155, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Integer value 133 not in range: 0 to 127

What you expected to happen:

I expected the DataFrame to be written to the Delta table successfully.

How to reproduce it:

More details:

@MDGSF MDGSF added the bug Something isn't working label Dec 26, 2024
@ion-elgreco
Collaborator

Please share an MRE that can be executed on its own instead of uploading a zip file.

@ion-elgreco ion-elgreco added binding/python Issues for the Python package mre-needed Whether an MRE needs to be provided labels Dec 26, 2024
@MDGSF
Author

MDGSF commented Dec 27, 2024

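I think I know why: Delta Lake does not support unsigned integers, so the unsigned column is apparently cast to a signed 8-bit integer (maximum 127) and the value 133 does not fit. Thanks so much.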
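One possible workaround, sketched under the assumption that the offending columns are plain unsigned NumPy integer columns in the pandas DataFrame: widen each one to a signed type large enough for its full range before calling `write_deltalake`. The helper name `widen_unsigned_columns` and the sample DataFrame are hypothetical. Unsigned values nested inside list or struct columns show up as `object` dtype in pandas and are not touched by this sketch.

```python
import numpy as np
import pandas as pd


def widen_unsigned_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with unsigned integer columns cast to signed types."""
    out = df.copy()
    for col in out.columns:
        dtype = out[col].dtype
        # Skip anything that is not a plain NumPy unsigned integer column.
        if not (isinstance(dtype, np.dtype) and dtype.kind == "u"):
            continue
        if dtype.itemsize < 8:
            # uint8 -> int16, uint16 -> int32, uint32 -> int64
            target = np.dtype(f"int{dtype.itemsize * 16}")
        else:
            # uint64 has no wider signed type; only cast if the values fit.
            if len(out[col]) and int(out[col].max()) > np.iinfo(np.int64).max:
                raise OverflowError(f"column {col!r} does not fit in int64")
            target = np.dtype("int64")
        out[col] = out[col].astype(target)
    return out


# Hypothetical data standing in for the DataFrame from the report.
df = pd.DataFrame({"small": np.array([1, 133, 255], dtype="uint8"),
                   "name": ["a", "b", "c"]})
fixed = widen_unsigned_columns(df)
print(fixed.dtypes)
# write_deltalake("tmp/demo", fixed)  # the call from the original report
```

Widening rather than casting to the same width keeps every value intact, at the cost of a larger on-disk type.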

@MDGSF MDGSF closed this as completed Dec 27, 2024

2 participants