[Tentative] Adding new intrinsics for gemm. #98
Hi there.
I am attempting to port ggml's matrix multiplication into a standalone crate: https://github.com/Narsil/ggblas
For most of the operations, I was able to use the existing intrinsics: https://doc.rust-lang.org/core/arch/arm/index.html
However, for M1 (ARM aarch64), some of the SIMD f16 intrinsics are missing.
https://developer.arm.com/documentation/101028/0012/13--Advanced-SIMD--Neon--intrinsics
I'm not sure whether the approach I suggest here is viable; my understanding of low-level primitives such as these is fairly limited.
Happy to run a more complete set of operations if this is indeed deemed interesting.
It seems the proper implementation in the compiler itself would be something like rust-lang/stdarch#344.
That's why I felt the intrinsics would have their place here.
Cheers!
Other refs: rust-lang/rfcs#3451
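
For context, here is a minimal scalar sketch of the kind of f16 work the missing SIMD intrinsics would speed up inside a gemm inner loop. It is not the code in this PR and not how ggml or ggblas do it: the function names `f16_bits_to_f32` and `dot_f16` are hypothetical, the f16 values are assumed to be stored as raw IEEE 754 binary16 bit patterns in `u16`, and accumulation is assumed to happen in `f32`.

```rust
/// Hypothetical helper: widen one IEEE 754 binary16 value, given as raw bits,
/// to f32. Every f16 value is exactly representable in f32.
fn f16_bits_to_f32(h: u16) -> f32 {
    let sign = ((h >> 15) & 0x1) as u32;
    let exp = ((h >> 10) & 0x1f) as u32;
    let frac = (h & 0x3ff) as u32;

    if exp == 0 {
        // Zero or subnormal: value = frac * 2^-24 (exact in f32).
        let v = (frac as f32) * (1.0 / 16_777_216.0);
        return if sign == 1 { -v } else { v };
    }

    let bits = if exp == 0x1f {
        // Infinity or NaN: keep the payload, use the all-ones f32 exponent.
        (sign << 31) | 0x7f80_0000 | (frac << 13)
    } else {
        // Normal number: rebias the exponent from 15 to 127 (add 112)
        // and shift the 10-bit fraction into the 23-bit f32 fraction.
        (sign << 31) | ((exp + 112) << 23) | (frac << 13)
    };
    f32::from_bits(bits)
}

/// Hypothetical scalar reference: dot product of two f16 slices,
/// accumulated in f32. A SIMD version would process several lanes at once.
fn dot_f16(a: &[u16], b: &[u16]) -> f32 {
    assert_eq!(a.len(), b.len(), "slices must have the same length");
    a.iter()
        .zip(b)
        .map(|(&x, &y)| f16_bits_to_f32(x) * f16_bits_to_f32(y))
        .sum()
}
```

A scalar reference like this could also serve as a check for any intrinsic-based version: both should agree on small inputs such as `[1.0, 2.0, 0.5]` (bits `0x3C00, 0x4000, 0x3800`) against `[2.0, 2.0, 2.0]`.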