Integrate KleidiAI for MatMulNBits via MlasQNBitGemm #23627

Open: wants to merge 20 commits into base: main

Conversation

MichaelTylerArm
Contributor

Description

This PR integrates Arm® KleidiAI™ to provide optimized assembly kernels for matrix multiplication with 4-bit quantized weights. The changes target the MlasQNBitGemm functions and can be used through the MatMulNBits operator.
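For readers unfamiliar with the operation being accelerated, here is a minimal scalar sketch of a matrix multiplication with 4-bit block-quantized weights. It is not the KleidiAI or MLAS implementation. The function name `MatMulInt4Reference`, the packing order (low nibble first), the fixed zero point of 8, and the per-column block layout are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference (not the MLAS or KleidiAI API).
// Computes C[m x n] = A[m x k] * dequant(B), where:
//  - A is row-major float.
//  - B holds 4-bit weights, two per byte, low nibble first (assumption),
//    laid out column by column: the weight for (p, j) has index j * k + p.
//  - Each block of `block_len` consecutive K values in a column shares one
//    float scale; the zero point is fixed at 8 (assumption).
// Requires k to be even and divisible by block_len.
std::vector<float> MatMulInt4Reference(const std::vector<float>& a,
                                       const std::vector<uint8_t>& b_packed,
                                       const std::vector<float>& scales,
                                       size_t m, size_t n, size_t k,
                                       size_t block_len) {
    const size_t blocks_per_col = k / block_len;
    std::vector<float> c(m * n, 0.0f);
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (size_t p = 0; p < k; ++p) {
                const size_t idx = j * k + p;
                const uint8_t byte = b_packed[idx / 2];
                // Unpack one 4-bit value from the byte.
                const int q = (idx % 2 == 0) ? (byte & 0x0F) : (byte >> 4);
                const float scale = scales[j * blocks_per_col + p / block_len];
                // Dequantize on the fly and accumulate.
                acc += a[i * k + p] * (static_cast<float>(q - 8) * scale);
            }
            c[i * n + j] = acc;
        }
    }
    return c;
}
```

Optimized kernels typically avoid this per-element dequantization by operating on packed data with vector instructions; the sketch only pins down the arithmetic being computed.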

Motivation and Context

These optimized assembly kernels lead to significant performance improvements on Arm-based devices.

MichaelTylerArm requested a review from a team as a code owner on February 10, 2025, 10:53
@@ -99,6 +100,10 @@ function(setup_mlas_source_for_windows)
${MLAS_SRC_DIR}/halfgemm_kernel_neon_fp16.cpp
)

setup_kleidiai()
Member

This means that KleidiAI will be a new dependency for all ONNX Runtime build configurations. For such a change, the ONNX Runtime team needs to hold an internal discussion with the leadership of this project.

Member

snnn left a comment

Cannot move forward until the internal review is complete, since this PR adds a new dependency.

@snnn
Member

snnn commented Feb 10, 2025

Please fix the iOS build errors.

Signed-off-by: Michael Tyler <[email protected]>
Signed-off-by: Michael Tyler <[email protected]>
Signed-off-by: Michael Tyler <[email protected]>
@jywu-msft
Member

/azp run Linux CPU CI Pipeline, Windows CPU CI Pipeline, Linux QNN CI Pipeline


Azure Pipelines successfully started running 3 pipeline(s).

@MichaelTylerArm
Contributor Author

Can the workflows be retriggered please?

@jywu-msft
Member

/azp run Linux CPU CI Pipeline, Windows CPU CI Pipeline, Linux QNN CI Pipeline


Azure Pipelines successfully started running 3 pipeline(s).

Signed-off-by: Michael Tyler <[email protected]>
Signed-off-by: Michael Tyler <[email protected]>
@MichaelTylerArm
Contributor Author

Can the workflows be retriggered please?

@edgchen1
Contributor

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Linux CPU CI Pipeline

@edgchen1
Contributor

/azp run Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Linux Android Emulator QNN CI Pipeline,Android CI Pipeline

@edgchen1
Contributor

/azp run iOS CI Pipeline,ONNX Runtime React Native CI Pipeline,CoreML CI Pipeline,Linux DNNL CI Pipeline,Linux MIGraphX CI Pipeline,Linux ROCm CI Pipeline


Azure Pipelines successfully started running 6 pipeline(s).


Azure Pipelines successfully started running 10 pipeline(s).


Azure Pipelines successfully started running 10 pipeline(s).

MichaelTylerArm and others added 10 commits February 20, 2025 16:05
Signed-off-by: Michael Tyler <[email protected]>
Signed-off-by: Michael Tyler <[email protected]>
Signed-off-by: Michael Tyler <[email protected]>
Signed-off-by: Michael Tyler <[email protected]>
Signed-off-by: Michael Tyler <[email protected]>
Co-authored-by: Edward Chen <[email protected]>
Co-authored-by: Edward Chen <[email protected]>
Signed-off-by: Michael Tyler <[email protected]>