[Issue]: GEMM throw unspecified run-time error #1652

Open
cognaiger9 opened this issue Nov 11, 2024 · 2 comments

@cognaiger9

Problem Description

I need to run GEMM with the configuration M×N×K = 216000×4608×1152 in the example/01_gemm directory. To get this configuration to compile, I changed the block size to 384. The build succeeds, but running it fails with an "unspecified launch failure." It appears that any block size that is not 256 or a larger power of two triggers this error. Could you confirm whether this limitation exists and, if so, provide an example of parameters that would work for this configuration? Here are my current parameter settings:

using DeviceGemmInstance = ck::tensor_operation::device::DeviceGemm_Xdl_CShuffle<
    ALayout, BLayout, CLayout,
    ADataType, BDataType, CDataType, AccDataType, CShuffleDataType,
    AElementOp, BElementOp, CElementOp, GemmDefault,
    1, 384, 384, 128, 32, 8, 2, 32, 32, 4, 2,
    S<4, 64, 1>, S<1, 0, 2>, S<1, 0, 2>, 2, 2, 2, 1,
    S<8, 32, 1>, S<0, 2, 1>, S<0, 2, 1>, 1, 4, 2, 1,
    2, 2, S<1, 16, 1, 16>, 8,
    ck::LoopScheduler::Interwave, ck::PipelineVersion::v1>;
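
For readers trying to reason about the numbers in this instance, the following is a minimal sketch of some plain arithmetic checks on the posted values. It is not Composable Kernel code: the struct and function names are hypothetical, and the mapping of the positional arguments to BlockSize, MPerBlock, NPerBlock, KPerBlock, MPerXDL, NPerXDL, MXdlPerWave and NXdlPerWave is an assumption based on the usual parameter order of DeviceGemm_Xdl_CShuffle. The wavefront size of 64 is likewise an assumption for the MI250X, as is the idea that GemmDefault expects M, N and K to be multiples of the per-block tile sizes.

```cpp
// Hypothetical helpers for sanity-checking tile parameters.
// None of these names belong to the Composable Kernel API.
struct TileConfig {
    int block_size;
    int m_per_block, n_per_block, k_per_block;
    int m_per_xdl, n_per_xdl;
    int m_xdl_per_wave, n_xdl_per_wave;
};

// Assumption: 64 threads per wavefront (wave64).
constexpr int kWaveSize = 64;

// Number of waves implied by the block tile and the per-wave XDL tiling.
constexpr int waves_per_block(const TileConfig& c) {
    return (c.m_per_block / (c.m_xdl_per_wave * c.m_per_xdl)) *
           (c.n_per_block / (c.n_xdl_per_wave * c.n_per_xdl));
}

// Thread count implied by the tiling; compare against block_size.
constexpr int threads_from_tiling(const TileConfig& c) {
    return waves_per_block(c) * kWaveSize;
}

// True if the problem size is a whole number of block tiles
// (assumed to matter when no padding specialization is used).
constexpr bool tiles_divide_problem(const TileConfig& c, long m, long n, long k) {
    return m % c.m_per_block == 0 &&
           n % c.n_per_block == 0 &&
           k % c.k_per_block == 0;
}

// Threads covered by a thread-cluster shape such as S<4, 64, 1>.
constexpr int cluster_threads(int d0, int d1, int d2, int d3 = 1) {
    return d0 * d1 * d2 * d3;
}

// Values read off the instance posted above (positional mapping assumed).
constexpr TileConfig kPosted{384, 384, 128, 32, 32, 32, 4, 2};
```

Under these assumptions, the tiling implies 3 × 2 = 6 waves, or 384 threads, which matches the chosen block size. The three thread-cluster shapes S<4, 64, 1>, S<8, 32, 1> and S<1, 16, 1, 16> each cover 256 threads, and M = 216000 leaves a remainder of 192 when divided by an MPerBlock of 384. Whether any of these points is related to the launch failure is not established in this thread.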

Operating System

Ubuntu 20.04.6 LTS (Focal Fossa)

CPU

AMD EPYC 7413 24-Core Processor

GPU

AMD Instinct MI250X

Other

No response

ROCm Version

ROCm 6.0.0

ROCm Component

Composable Kernel

Steps to Reproduce

No response

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response

@ppanchad-amd

Hi @cognaiger9. An internal ticket has been created to investigate your issue. Thanks!

@tcgu-amd

Hi @cognaiger9. Yes. Although parameters such as block size are exposed, many of them are not meant to be modified by end users, since they are hardware- and algorithm-specific (see #630). Please use the block size and the M, N, K per-block parameters from existing kernel instances. Thanks!
