Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[stdlib] SIMD.from_bytes() and SIMD.as_bytes() #3966

Closed
wants to merge 9 commits into from

Conversation

msaelices
Copy link
Contributor

Similar to the Python's int.from_bytes() and int.to_bytes() one.

@@ -100,6 +100,10 @@ what we publish.

### Standard library changes

- New `SIMD.from_bytes()` and `SIMD.as_bytes()` functions to convert a list of bytes
to an scalars and vice versa, accepting the endianess as an argument. Similar
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to an scalars -> to scalars or to a list of scalars

Copy link
Contributor Author

@msaelices msaelices Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Fixed: msaelices@399b460

@@ -1956,6 +1959,60 @@ struct SIMD[type: DType, size: Int](

return bitcast[_integral_type_of[type](), size](self).cast[int_dtype]()

@staticmethod
fn from_bytes[
big_endian: Bool = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably default to is_big_endian().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, this is should be sys.info.is_big_endian()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion! Done: msaelices@5852cb9

return value

fn as_bytes[
big_endian: Bool = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably default to is_big_endian().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@staticmethod
fn from_bytes[
big_endian: Bool = False
](bytes: InlineArray[Byte, Self.type_len]) -> Scalar[type]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want this to be an InlineArray? Those are annoying to construct from a List or other type, thoughts on a Span variant?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That can be a follow up, for now the stricter one is fine with me

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to Span for the API type here.

Copy link
Contributor Author

@msaelices msaelices Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can we approach this? The Python int.from_bytes() allows an arbitrary large input because its output is an infinite sized int, like so:

>>> int.from_bytes(b'12321323123212133211213',"big")
4712049637989858926791184397276765665808674789488996659

If we allow an arbitrary large input, should the output stop being an scalar but a n-sized SIMD, being n power or two?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would only consider the first Self.type_len bytes, ignoring everything else, and erroring if not enough bytes are present.

Copy link
Contributor Author

@msaelices msaelices Feb 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me try this and open a new PR if I feel useful. I don't like the current ergonomics using InlineArray

value = byte_swap(value)

var ptr = UnsafePointer.address_of(value)
var array = InlineArray[Byte, Self.type_len](fill=0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be better to leave this uninitalized so that LLVM can clearly see it can remove the zeroing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems like more of an optimization, I'm not going to block on it right now

@JoeLoser
Copy link
Collaborator

@lsh do you mind TAL at this since you reviewed the original one before it got reverted internally? Thanks!

@@ -268,6 +269,8 @@ struct SIMD[type: DType, size: Int](
alias _Mask = SIMD[DType.bool, size]

alias element_type = type
alias type_len = type.sizeof()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quibble: I think this alias makes the methods more confusing rather than more clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@skongum02 skongum02 deleted the branch modular:main January 29, 2025 18:59
@skongum02 skongum02 closed this Jan 29, 2025
@skongum02 skongum02 reopened this Jan 29, 2025
@skongum02 skongum02 changed the base branch from nightly to main January 29, 2025 20:31
@msaelices msaelices force-pushed the simd-int-from-bytes branch from 6a6ad76 to 67ba5cd Compare January 30, 2025 22:35
Signed-off-by: Manuel Saelices <[email protected]>
@msaelices msaelices force-pushed the simd-int-from-bytes branch from 67ba5cd to 399b460 Compare January 30, 2025 22:36
@lsh
Copy link
Contributor

lsh commented Jan 30, 2025

!sync

@modularbot modularbot added the imported-internally Signals that a given pull request has been imported internally. label Jan 30, 2025
@modularbot
Copy link
Collaborator

✅🟣 This contribution has been merged 🟣✅

Your pull request has been merged to the internal upstream Mojo sources. It will be reflected here in the Mojo repository on the main branch during the next Mojo nightly release, typically within the next 24-48 hours.

We use Copybara to merge external contributions, click here to learn more.

@modularbot modularbot added the merged-internally Indicates that this pull request has been merged internally label Feb 1, 2025
@modularbot
Copy link
Collaborator

Landed in eb56605! Thank you for your contribution 🎉

@modularbot modularbot added the merged-externally Merged externally in public mojo repo label Feb 2, 2025
@modularbot modularbot closed this in eb56605 Feb 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
imported-internally Signals that a given pull request has been imported internally. merged-externally Merged externally in public mojo repo merged-internally Indicates that this pull request has been merged internally
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants