-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCLomatic-test] Block store test #725
base: SYCLomatic
Are you sure you want to change the base?
[SYCLomatic-test] Block store test #725
Conversation
Co-authored-by: Dan Hoeflinger <[email protected]>
thread_data[i] = dacc_read[global_index * 4 + i]; | ||
} | ||
|
||
auto *d_r = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you clarify why we are storing to buffer
here and then storing the thread_data
to buffer_out
? To me, it looks like we are not verifying the group_store
operation since buffer_out
is being passed to the verification routine which is just the value of thread_data
.
item.get_local_id(2); // Each thread_data has 4 elements | ||
#pragma unroll | ||
for (int i = 0; i < 4; ++i) { | ||
dacc_write[global_index * 4 + i] = thread_data[i]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar question to above about the purpose of this and how we are verifying the store routines at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the recent changes, we are now doing group_store(tmp).store(item, d_w, thread_data);
where d_w
is referring to buffer_out
. This part seems correct, but my comment about these lines still apply.
It seems like the contents of buffer_out
after the group_store
operation are being overwritten by the writes in this loop. Can we just remove this entire loop? I am not sure why it is here unless I misunderstand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes modified to prevent the overwrite, could you please take a look . Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this looks to resolve the issue
int expected[512]; | ||
int num_threads = 128; | ||
int items_per_thread = 4; | ||
uint32_t sg_sz_val = *sg_sz; | ||
for (int i = 0; i < num_threads; ++i) { | ||
for (int j = 0; j < items_per_thread; ++j) { | ||
expected[items_per_thread * i + j] = | ||
(i / sg_sz_val) * sg_sz_val * items_per_thread + sg_sz_val * j + | ||
i % sg_sz_val; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we are setting up expected
and then never using it. Am I missing something?
} | ||
|
||
for (int i = 0; i < 512; ++i) { | ||
if (ptr[i] != i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the actual check is just against the index i
.
That also seems to be the source data for the input buffer, does this pass?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the test is passing since we are overwriting the results of the subgroup store operation in test_store_subgroup_striped_standalone
sg_sz_acc[0] = item.get_sub_group().get_local_linear_range(); | ||
} | ||
dpct::group::store_subgroup_striped<4, int>(item, d_w, thread_data); | ||
// reapply global mapping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are overwriting the result of the sub-group store here from lines 173 to 179. If we remove this the test fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting... seems like this should not be here.
Co-authored-by: Dan Hoeflinger <[email protected]>
Co-authored-by: Dan Hoeflinger <[email protected]>
Co-authored-by: Dan Hoeflinger <[email protected]>
In line with block store PR (oneapi-src/SYCLomatic#1819), this tests only store api.
WIP. Linked to composite PR : #680
cc @danhoeflinger @mmichel11 @yihanwg