[AArch64] NEON, SVE2 and SME2 instruction support with tests #439

Open
wants to merge 65 commits into
base: sme2-support


FinnWilkinson
Contributor

This PR adds a wide range of NEON, SVE2, and SME2 instructions with regression tests. They enable a subset of internal SME-based GEMM and GEMV codes.

There is some prototype BF16 instruction support, which is disabled by default (via a new build option and an if statement in each relevant switch statement case). It is disabled because it uses __bf16, which is not compiler agnostic, relies on a hacky memcpy to reinterpret uint16_t values, and lacks regression tests for the BF16 instructions in question.

These BF16 instructions can be enabled with a new CMake option, -DSIMENG_ENABLE_BF16=ON. I have deliberately left this out of the documentation, given the possible instability of the BF16 implementation and to keep it (mainly) for internal use.
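For reviewers unfamiliar with the pattern, here is a minimal sketch of the memcpy-based reinterpretation and the compile-time gating described above. It is not the code in this PR: the macro `EXAMPLE_ENABLE_BF16` and the functions `bf16BitsToFloat`, `floatToBf16BitsTruncate`, and `bf16Add` are hypothetical names, and narrowing by truncation is a simplifying assumption rather than the rounding any real instruction uses.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical macro: the PR names only the CMake option
// -DSIMENG_ENABLE_BF16=ON, not how it is exposed to the C++ sources.
#ifndef EXAMPLE_ENABLE_BF16
#define EXAMPLE_ENABLE_BF16 1
#endif

// Widen a raw BF16 bit pattern to a float. BF16 has the same layout as the
// top 16 bits of an IEEE-754 binary32 value, so shifting left by 16 yields
// the float's bit pattern. memcpy avoids the compiler-specific __bf16 type.
inline float bf16BitsToFloat(uint16_t bits) {
  const uint32_t wide = static_cast<uint32_t>(bits) << 16;
  float out;
  static_assert(sizeof(out) == sizeof(wide), "float must be 32 bits");
  std::memcpy(&out, &wide, sizeof(out));
  return out;
}

// Narrow a float to BF16 bits by dropping the low 16 bits (truncation).
// Assumption for illustration only; real instructions define their own
// rounding behaviour.
inline uint16_t floatToBf16BitsTruncate(float value) {
  uint32_t wide;
  std::memcpy(&wide, &value, sizeof(wide));
  return static_cast<uint16_t>(wide >> 16);
}

// Hypothetical gated operation: adds two BF16 values when support is
// compiled in, and reports failure (leaving result untouched) otherwise.
inline bool bf16Add(uint16_t a, uint16_t b, uint16_t& result) {
#if EXAMPLE_ENABLE_BF16
  result = floatToBf16BitsTruncate(bf16BitsToFloat(a) + bf16BitsToFloat(b));
  return true;
#else
  (void)a;
  (void)b;
  (void)result;
  return false;
#endif
}
```

Working on raw uint16_t bit patterns keeps the sketch portable across compilers, at the cost of doing the arithmetic in float and narrowing afterwards.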

This branch is based on sme2-support (PR #429) and so should be merged after that branch has been merged into dev.

Some SME2 instructions that use multi-vector operands can be non-trivial to read or understand. Please ask for clarification and suggest any additional comments that may help future understanding.

@FinnWilkinson FinnWilkinson added the enhancement New feature or request label Nov 4, 2024
@FinnWilkinson FinnWilkinson self-assigned this Nov 4, 2024
@FinnWilkinson FinnWilkinson changed the base branch from dev to sme2-support November 4, 2024 18:12
@FinnWilkinson FinnWilkinson force-pushed the sme-loops-support branch 2 times, most recently from f9a759f to f2b86fa Compare November 7, 2024 19:58
…ged address generation logic for ST2W and ST4W.
Labels
enhancement New feature or request
Projects
Status: ToDo
1 participant