BUG: incorrect iloc behavior in modin when assigning index values based on row indices #7405

Open
2 of 3 tasks
SchwurbeI opened this issue Oct 14, 2024 · 1 comment
Labels
bug 🦗 Something isn't working Triage 🩹 Issues that need triage

SchwurbeI commented Oct 14, 2024

Modin version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest released version of Modin.

  • I have confirmed this bug exists on the main branch of Modin. (In order to do this you can follow this guide.)

Reproducible Example

import pandas as pd
import modin.pandas as mpd

dict1 = {
    'index_test': [-1, -1, -1]
}
df1 = pd.DataFrame(dict1)
mdf1 = mpd.DataFrame(dict1)

row_indices = [2, 0]
df1.iloc[row_indices, 0] = df1.iloc[row_indices].index
mdf1.iloc[row_indices, 0] = mdf1.iloc[row_indices].index

print(df1)  # as expected: 0, -1, 2
print('-------------')
print(mdf1)  # NOT as expected: 2, -1, 0

#    index_test
# 0           0
# 1          -1
# 2           2
# -------------
#    index_test
# 0           2
# 1          -1
# 2           0

Issue Description

Assigning values with iloc in Modin gives a different result than pandas. Specifically, assigning index values to a subset of rows works correctly in pandas, but Modin assigns the values in the wrong order: in the example above, with row_indices = [2, 0], row 0 receives 2 and row 2 receives 0.

Expected Behavior

Modin should mirror pandas: each value should be written to the row position it was selected from, in the order given by row_indices. Instead, Modin consistently assigns the values in a different order when iloc is used to assign values based on row indices.

expected output produced with pandas:

    index_test
 0           0
 1          -1
 2           2

actual output produced with modin:

    index_test
 0           2
 1          -1
 2           0
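
The expected pairing can be spelled out with pandas alone. The following is a minimal sketch, not part of the original report: it assumes the intended semantics are "the i-th position in row_indices receives the i-th value", and compares the vectorized iloc assignment against an explicit element-by-element loop. The names `vectorized` and `looped` are illustrative only.

```python
import pandas as pd

row_indices = [2, 0]

# Vectorized assignment, as in the reproducible example.
vectorized = pd.DataFrame({"index_test": [-1, -1, -1]})
values = vectorized.iloc[row_indices].index  # labels in selection order: 2, 0
vectorized.iloc[row_indices, 0] = values

# Explicit loop: write the i-th value to the i-th position, one cell at a time.
looped = pd.DataFrame({"index_test": [-1, -1, -1]})
for pos, val in zip(row_indices, values):
    looped.iat[pos, 0] = val

result_vectorized = vectorized["index_test"].tolist()
result_looped = looped["index_test"].tolist()
print(result_vectorized)
print(result_looped)
```

In pandas both variants agree, which is the behavior Modin would be expected to match.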

Error Logs

No response

Installed Versions

PyDev console: using IPython 8.23.0
INSTALLED VERSIONS

commit : 3e951a6
python : 3.11.8
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.22631
machine : AMD64
processor : Intel64 Family 6 Model 186 Stepping 2, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_Austria.1252
Modin dependencies

modin : 0.32.0
ray : 2.20.0
dask : 2024.5.2
distributed : 2024.5.2
pandas dependencies

pandas : 2.2.3
numpy : 1.26.4
pytz : 2024.1
dateutil : 2.9.0.post0
pip : 24.2
Cython : 3.0.10
sphinx : None
IPython : 8.23.0
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.3
blosc : None
bottleneck : 1.3.7
dataframe-api-compat : None
fastparquet : None
fsspec : 2023.10.0
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : 3.1.3
lxml.etree : None
matplotlib : 3.8.2
numba : None
numexpr : 2.8.7
odfpy : None
openpyxl : None
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : 14.0.2
pyreadstat : None
pytest : 8.1.1
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.14.0
sqlalchemy : 2.0.25
tables : 3.9.2
tabulate : 0.9.0
xarray : None
xlrd : None
xlsxwriter : None
zstandard : 0.23.0
tzdata : 2024.1
qtpy : 2.4.1
pyqt5 : None

SchwurbeI added the "bug 🦗 Something isn't working" and "Triage 🩹 Issues that need triage" labels on Oct 14, 2024
@Liquidmasl

This seems to happen only when the row indices are not in sorted order.
If the same row indices are sorted before the indexing operation, the result is what's expected.
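
Based on that observation, a possible workaround is to sort the positions and reorder the values with the same permutation before assigning. The sketch below is an assumption, not a confirmed fix: `iloc_assign_sorted` is a hypothetical helper name, and the snippet runs against pandas so it stays self-contained. Swapping the import for `import modin.pandas as pd` would be the way to try it with Modin; that has not been verified here.

```python
import numpy as np
import pandas as pd  # assumption: replace with `import modin.pandas as pd` to try this on Modin


def iloc_assign_sorted(df, row_indices, col, values):
    """Hypothetical helper: assign `values` to the given row positions of column
    `col`, after sorting the positions and reordering the values to match."""
    positions = np.asarray(row_indices)
    order = np.argsort(positions, kind="stable")  # permutation that sorts the positions
    df.iloc[positions[order], col] = np.asarray(values)[order]
    return df


df = pd.DataFrame({"index_test": [-1, -1, -1]})
row_indices = [2, 0]
iloc_assign_sorted(df, row_indices, 0, df.iloc[row_indices].index)

result = df["index_test"].tolist()
print(result)
```

Reordering the values together with the positions keeps each value paired with its original row, so the outcome does not depend on the order in which row_indices was given.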

Finding this issue was a huge pain in the ass
