Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WASM group by 32bit/64bit conversion bugfix #17793

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

daBlesr
Copy link

@daBlesr daBlesr commented Jul 22, 2024

On WASB, group_by breaks because usize is by default 32 bit. For encoding rows to i64 type, elements need only be casted, but on WASM it also needs creation of new vector.
Fixes: #17192

@@ -87,8 +93,13 @@ impl RowsEncoded {

unsafe {
let (_, values, _) = mmap::slice(&self.values).into_inner();
let offsets = bytemuck::cast_slice::<usize, i64>(self.offsets.as_slice());
let (_, offsets, _) = mmap::slice(offsets).into_inner();
let offsets = match bytemuck::try_cast_slice::<usize, i64>(self.offsets.as_slice()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should then just keep a Vec<u64> instead of a Vec<usize>.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will update the code 👍

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ritchie46 , I updated the code, I am not all too happy about it because I introduced lots of casting from u64 to usize and vice versa. Imo there is more significant refactoring needed to reduce the casting - by abstracting the RowsEncoded such that the other structs are no longer able to directly write on the offset vector. If you agree with that, I can do it, let me know.

Also, on more point, the WASM build also fails on utf8_to module. I misled the compiler, but it isn't a real fix. Please take a look at that part as well.

@daBlesr
Copy link
Author

daBlesr commented Jul 26, 2024

Hi @ritchie46 , would you mind taking another look at the PR? Much appreciated 😄 .

@kgv
Copy link
Contributor

kgv commented Oct 21, 2024

Hi, @ritchie46, is there any reason why this PR has not yet been merged? Is it sufficient to update it to its current state?

@ritchie46
Copy link
Member

Ci was never green. It was never in a state for me to consider it merging.

# Conflicts:
#	Cargo.lock
#	crates/polars-row/src/encode.rs
#	crates/polars-row/src/fixed.rs
#	crates/polars-row/src/row.rs
#	crates/polars-row/src/variable.rs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Group By no longer supported in Rust WASM
3 participants