[QST] When to use MainloopSm90TmaGmmaWarpSpecializedFP8? #2001

Closed
ginowu opened this issue Dec 19, 2024 · 3 comments

ginowu commented Dec 19, 2024

Hi there,
In sm90_mma_tma_gmma_ss_warpspecialized_fp8.hpp, mma_promotion_interval=4 means that the partial sum of every 4 MMAs is added to the final accumulator. My question is: how does this non-default behavior improve FP8 accuracy? Also, could you share some best practices for using this specialized implementation, such as how to set mma_promotion_interval according to the activation input range?

Thanks!
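
A minimal sketch of why periodic promotion can help, under stated assumptions: the MMA accumulator is modeled as keeping only acc_bits significant bits (10 in the usage note, an arbitrary illustrative value and not a hardware specification), the main accumulator is plain FP32, and each entry of mma_results stands in for the contribution of one MMA. The helper names truncate_to_bits and accumulate_with_promotion are hypothetical and are not part of CUTLASS.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Keep only `bits` significant binary digits of x by truncation.
// This is a stand-in for a reduced-precision accumulator (assumption).
inline double truncate_to_bits(double x, int bits) {
    assert(bits > 0);
    if (x == 0.0) return 0.0;
    int exp = 0;
    double mant = std::frexp(x, &exp);       // x = mant * 2^exp, 0.5 <= |mant| < 1
    double scale = std::ldexp(1.0, bits);    // 2^bits
    return std::ldexp(std::trunc(mant * scale) / scale, exp);
}

// Accumulate per-MMA contributions in a reduced-precision partial accumulator
// and add ("promote") it into an FP32 main accumulator every `interval` MMAs.
// interval == 0 means: never promote until the very end.
inline double accumulate_with_promotion(const std::vector<double>& mma_results,
                                        int acc_bits,
                                        std::size_t interval) {
    float main_acc = 0.0f;    // full-precision (FP32) accumulator
    double partial = 0.0;     // value held by the reduced-precision accumulator
    std::size_t since_promotion = 0;

    for (double r : mma_results) {
        partial = truncate_to_bits(partial + r, acc_bits);
        if (interval != 0 && ++since_promotion == interval) {
            main_acc += static_cast<float>(partial);  // promotion step
            partial = 0.0;
            since_promotion = 0;
        }
    }
    main_acc += static_cast<float>(partial);          // flush the remainder
    return main_acc;
}
```

For 4096 contributions of 1.0 and acc_bits = 10, calling accumulate_with_promotion(ones, 10, 0) stalls at 1024, because 1024 + 1 is not representable in 10 significant bits and truncates back to 1024. With an interval of 4 the partial sum never exceeds 4, so the result is the exact 4096. In this model a shorter interval keeps the partial sum small relative to the accumulator's resolution, at the cost of more additions into the main accumulator.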

ginowu commented Dec 19, 2024

By the way, all the collective mainloop specializations under include/cutlass/gemm/collective/ have an mma_promotion_interval member in Arguments. As I understand it, this is what makes a uniform mainloop Arguments structure possible on the host side. Is it therefore unintended that mma_promotion_interval is missing from the file below?

include/cutlass/gemm/collective/sm90_mma_array_tma_gmma_ss_warpspecialized.hpp
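
To illustrate the uniformity concern, here is a hedged C++17 sketch of host code that sets the interval only when the Arguments type has that member. Fp8MainloopArgs, ArrayMainloopArgs, has_promotion_interval, and make_args are all hypothetical stand-ins written for this example; they are not the CUTLASS definitions, and the real Arguments structs carry additional fields such as strides.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a mainloop Arguments struct WITH the member.
struct Fp8MainloopArgs {
    const void* ptr_A = nullptr;
    const void* ptr_B = nullptr;
    std::uint32_t mma_promotion_interval = 4;
};

// Hypothetical stand-in for a mainloop Arguments struct WITHOUT the member.
struct ArrayMainloopArgs {
    const void* const* ptr_A = nullptr;
    const void* const* ptr_B = nullptr;
};

// Detection trait: true when T has a member named mma_promotion_interval.
template <class T, class = void>
struct has_promotion_interval : std::false_type {};

template <class T>
struct has_promotion_interval<
    T, std::void_t<decltype(std::declval<T&>().mma_promotion_interval)>>
    : std::true_type {};

// Generic host-side helper: writes the interval only if the member exists,
// so the same code compiles for both kinds of Arguments.
template <class Args>
Args make_args(std::uint32_t interval) {
    assert(interval > 0);
    Args args{};
    if constexpr (has_promotion_interval<Args>::value) {
        args.mma_promotion_interval = interval;
    }
    return args;
}

static_assert(has_promotion_interval<Fp8MainloopArgs>::value, "member present");
static_assert(!has_promotion_interval<ArrayMainloopArgs>::value, "member absent");
```

Without a guard like this, generic host code that assigns or brace-initializes mma_promotion_interval fails to compile for any Arguments type that lacks the member, which is why a uniform member set is convenient.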

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

ginowu commented Jan 22, 2025

I found the reason for this MMA promotion scheme by reading the DeepSeek-V3 paper.

ginowu closed this as completed Jan 22, 2025