profiling execute_multipass issues #2238
Comments
rjodinchr added a commit to rjodinchr/OpenCL-CTS that referenced this issue (Jan 22, 2025):
- fix clGetDeviceInfo(CL_DEVICE_MAX_WORK_ITEM_SIZES) by using the proper size
- clamp localThreads[2] as for localThreads[0] and localThreads[2]
- clamp all localThreads elements in regard of CL_MAX_WORK_GROUP_SIZE
- fix the size using to create/read the output buffer
Fix KhronosGroup#2238
rjodinchr added a commit to rjodinchr/OpenCL-CTS that referenced this issue (Jan 22, 2025):
- fix clGetDeviceInfo(CL_DEVICE_MAX_WORK_ITEM_SIZES) by using the proper size
- clamp all localThreads elements with regard to CL_MAX_WORK_GROUP_SIZE
- fix the size using to create/read the output buffer
Fix KhronosGroup#2238
That test is failing using clvk and various Vulkan drivers (llvmpipe, swiftshader, mesa-based drivers).

I am seeing multiple issues in the test itself:

1. OpenCL-CTS/test_conformance/profiling/execute_multipass.cpp, line 107 in 5b35180: the type of localThreads is size_t[3]. Thus the size given is not right, and a proper implementation should end up with undefined values in localThreads[1] and localThreads[2] at least. Most probably also in localThreads[0], as cl_uint might be smaller than size_t.
2. Each element of localThreads is clamped to its max value, except for localThreads[2]. I'm not sure I understand why, and would also clamp it.
3. We end up with a workgroup using the maximum work-item sizes per dimension (CL_DEVICE_MAX_WORK_ITEM_SIZES). Their product can be higher than CL_DEVICE_MAX_WORK_GROUP_SIZE on some devices, thus we would also need to make sure not to get higher than it.
4. The input and output buffers are declared as cl_uchar[w * h * d * nChannels]. But:
   4.1. OpenCL-CTS/test_conformance/profiling/execute_multipass.cpp, line 136 in 5b35180: the output is created with a bigger size (*sizeof(cl_float)).
   4.2. OpenCL-CTS/test_conformance/profiling/execute_multipass.cpp, line 240 in 5b35180: the output is read with a bigger size (*4).
   This triggers a segfault because the host buffer where the output should be stored is smaller than the CL buffer.
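
To make points 2 to 4 concrete, here is a minimal sketch of what the clamping and the buffer sizing could look like. It is not the code from the CTS or from the referenced commits: clampLocalThreads and outputBytes are hypothetical helpers, and halving the largest dimension is only one possible way to get under CL_DEVICE_MAX_WORK_GROUP_SIZE. The sketch avoids the OpenCL headers so that it stays self-contained; the comment at the top shows how the limits would presumably be queried with the proper size.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// The per-dimension limits would be queried into a size_t[3], passing the
// size of the whole array (assumed variable names):
//   size_t maxItemSizes[3];
//   clGetDeviceInfo(device, CL_DEVICE_MAX_WORK_ITEM_SIZES,
//                   sizeof(maxItemSizes), maxItemSizes, NULL);

// Hypothetical helper: clamp a requested local size against both device limits.
std::array<std::size_t, 3> clampLocalThreads(
    std::array<std::size_t, 3> local,
    const std::array<std::size_t, 3>& maxItemSizes,
    std::size_t maxWorkGroupSize)
{
    // Clamp every dimension, including index 2, to its per-dimension limit,
    // and keep each dimension at 1 or more.
    for (std::size_t i = 0; i < local.size(); ++i)
    {
        local[i] = std::max<std::size_t>(1, std::min(local[i], maxItemSizes[i]));
    }

    // Halve the largest dimension until the total number of work-items fits
    // in a work-group (assumed strategy, other reductions are possible).
    while (local[0] * local[1] * local[2] > maxWorkGroupSize)
    {
        auto largest = std::max_element(local.begin(), local.end());
        if (*largest == 1) break; // cannot shrink any further
        *largest /= 2;
    }
    return local;
}

// Hypothetical helper: one byte count shared by the host allocation, the
// buffer creation and the read-back, so that the three cannot disagree.
std::size_t outputBytes(std::size_t w, std::size_t h, std::size_t d,
                        std::size_t nChannels)
{
    return w * h * d * nChannels * sizeof(unsigned char);
}
```

Note that the caller would still have to make sure the resulting local size divides the global size on devices without non-uniform work-group support; the sketch does not handle that.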