[Tentative] Adding new intrinsics for gemm. #98

Open: wants to merge 6 commits into base: main


Narsil commented Jul 6, 2023

Hi there.

I am attempting to port ggml's matrix multiplication into a standalone crate: https://github.com/Narsil/ggblas

For most of the operations, I was able to leverage the existing intrinsics: https://doc.rust-lang.org/core/arch/arm/index.html
However, for the M1 (ARM aarch64), some SIMD f16 intrinsics are missing.

https://developer.arm.com/documentation/101028/0012/13--Advanced-SIMD--Neon--intrinsics
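For readers unfamiliar with what an f16 widening intrinsic computes, here is a minimal scalar sketch of a lane-wise f16 to f32 conversion, roughly the operation ACLE names `vcvt_f32_f16`. This is a portable reference only, not the code in this PR; the function names `f16_bits_to_f32` and `f16x4_to_f32x4` are hypothetical, and a real NEON path would perform the conversion in hardware.

```rust
/// Convert one IEEE 754 binary16 value, given as raw bits, to f32.
/// Scalar reference only; names here are illustrative assumptions.
pub fn f16_bits_to_f32(h: u16) -> f32 {
    let sign = ((h >> 15) as u32) << 31;
    let exp = ((h >> 10) & 0x1f) as u32;
    let frac = (h & 0x03ff) as u32;

    let bits = match (exp, frac) {
        // Signed zero.
        (0, 0) => sign,
        // Subnormal: the value is frac * 2^-24, which f32 represents exactly.
        (0, _) => {
            let magnitude = frac as f32 / 16_777_216.0;
            magnitude.to_bits() | sign
        }
        // Infinity or NaN: all-ones exponent, payload shifted into place.
        (0x1f, _) => sign | 0x7f80_0000 | (frac << 13),
        // Normal: rebias the exponent from 15 to 127 and widen the fraction.
        _ => sign | ((exp + 112) << 23) | (frac << 13),
    };
    f32::from_bits(bits)
}

/// Lane-wise conversion of four f16 values (hypothetical helper).
pub fn f16x4_to_f32x4(lanes: [u16; 4]) -> [f32; 4] {
    lanes.map(f16_bits_to_f32)
}
```

A scalar fallback like this can also serve as a test oracle for an intrinsic-based implementation, since every f16 value converts to f32 exactly.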

I'm not sure whether the approach I suggest here is viable; my understanding of low-level primitives such as these is fairly limited.

Happy to run a more complete set of operations if this is indeed deemed interesting.

It seems the proper implementation in the compiler itself would be something like rust-lang/stdarch#344.

That's why I felt the intrinsics would have their place here.

Cheers!

Other refs: rust-lang/rfcs#3451

HuggingFace-MacMini-Wozniak and others added 5 commits July 6, 2023 22:00
Narsil changed the title from "[Tentative] Adding new intrinsics for ggblas." to "[Tentative] Adding new intrinsics for gemm." on Aug 1, 2023

starkat99 (Owner) commented Aug 5, 2023

I'm fine with putting these in the crate. Maybe make sure they don't overlap with the existing aarch64 assembly in the crate, though, and if there is any overlap, make the existing code use the new names.

However, I don't want to publicly expose the binary16 module; that's an internal structural implementation detail. Perhaps just expose these at half::arch::aarch64?
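One way to read this suggestion is as a re-export: the intrinsics stay defined inside a private `binary16` module and are surfaced through a public `arch::aarch64` path. The sketch below is only an illustration of that layout under assumptions. The wrapper module `half_sketch` stands in for the crate root, `placeholder_intrinsic` is a made-up function, and a real version would presumably gate the module with `#[cfg(target_arch = "aarch64")]`.

```rust
// Hypothetical stand-in for the crate root; every name here is illustrative.
pub mod half_sketch {
    // Private module: an internal detail that never appears in the public API.
    mod binary16 {
        pub(super) mod arch {
            // A real crate would likely add #[cfg(target_arch = "aarch64")] here.
            pub mod aarch64 {
                /// Made-up placeholder standing in for an f16 intrinsic wrapper.
                pub fn placeholder_intrinsic(bits: u16) -> u16 {
                    bits
                }
            }
        }
    }

    // Public path that users see, analogous to half::arch::aarch64.
    pub mod arch {
        pub mod aarch64 {
            pub use super::super::binary16::arch::aarch64::placeholder_intrinsic;
        }
    }
}
```

With this layout, callers would write `half_sketch::arch::aarch64::placeholder_intrinsic(..)`, while `half_sketch::binary16` remains inaccessible from outside.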
