add renormalize param for FusedMOE #671

tangleintel · 2025-01-09T06:22:34Z

Add a renormalize parameter to FusedMOE, because whether routing_weights should be normalized depends on the norm_topk_prob attribute in the model's config.json file. Some models, such as Qwen2-MoE, set it to false.

Note: This PR depends on this one in vllm-hpu-extension, so please help review that one first.
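To illustrate what the flag controls, here is a minimal pure-Python sketch of top-k routing for a single token, assuming softmax routing over the expert logits. The function `route_top_k`, the sample logits, and the `config` dict are hypothetical and only for illustration; they are not the vLLM or vllm-hpu-extension implementation.

```python
import math


def route_top_k(router_logits, top_k, renormalize):
    """Return (expert_index, weight) pairs for one token (illustrative only)."""
    # Numerically stable softmax over all experts.
    peak = max(router_logits)
    exps = [math.exp(x - peak) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the top_k most probable experts.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    weights = [probs[i] for i in top]

    # With renormalize=True the kept weights are rescaled to sum to 1;
    # with renormalize=False they keep their original softmax values.
    if renormalize:
        kept = sum(weights)
        weights = [w / kept for w in weights]
    return list(zip(top, weights))


# Hypothetical stand-in for the relevant entry of a model's config.json.
config = {"norm_topk_prob": False}
logits = [2.0, 1.0, 0.5, -1.0]

raw = route_top_k(logits, top_k=2, renormalize=config["norm_topk_prob"])
normed = route_top_k(logits, top_k=2, renormalize=True)
```

With `renormalize=False` the two selected weights sum to less than 1, since the probability mass of the dropped experts is not redistributed. That is why the flag has to follow the model's `norm_topk_prob` setting rather than being hard-coded.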

michalkuligowski · 2025-01-14T09:35:43Z

vllm/model_executor/layers/fused_moe/layer.py

@@ -235,7 +235,7 @@ def __init__(
            from vllm.model_executor.layers.quantization.inc import INCConfig
            selected_fused_moe = (StaticFusedMOE if isinstance(
                quant_config, INCConfig) else DynamicFusedMOE)
-            self.hpu_static_fused_moe = selected_fused_moe(self.num_experts)
+            self.hpu_static_fused_moe = selected_fused_moe(self.num_experts, renormalize=renormalize)


Hi, you will need to update the SHA for vllm-hpu-extension in requirements-hpu.txt once the additional PR is merged.

OK, thanks for the reminder. I will do it.

add renormalize param for FusedMOE

53a92c1

tangleintel requested review from kzawora-intel, madamczykhabana, michalkuligowski, mgawarkiewicz and vivekgoe as code owners January 9, 2025 06:22

michalkuligowski requested a review from kwisniewski98 January 13, 2025 09:17

kwisniewski98 approved these changes Jan 13, 2025

michalkuligowski requested changes Jan 14, 2025
