Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL] Fix the barrier dependency for OOO profiling tags #16112

Merged

Conversation

steffenlarsen
Copy link
Contributor

@steffenlarsen steffenlarsen commented Nov 18, 2024

This commit fixes an issue where the barrier before the timestamp enqueued for the profiling tag in an out-of-order queue did not prevent future work from being enqueued prior to the start/end of the profiling tag.

This commit fixes an issue where the barrier before the timestamp
enqueued for the profiling tag in an out-of-order queue did not prevent
future work from being enqueued prior to the start/end of the profiling
tag.

Since the implementation now needs a marker, a barrier and a timestamp
enqueued onto the queue, this patch also optimizes the case of profiling
tags on out-of-order queues with profiling enabled, as we can achieve
the same with just a single barrier, similar to the fallback case for
devices that do not support timestamps natively.

Signed-off-by: Larsen, Steffen <[email protected]>
Copy link
Contributor

@againull againull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good to me, though I am not sure what may cause memory leak detected by CI.

@steffenlarsen
Copy link
Contributor Author

steffenlarsen commented Nov 19, 2024

Changes look good to me, though I am not sure what may cause memory leak detected by CI.

Looks like there is a bug in the L0 adapter for marker events, requiring 2 frees. For now I think we can use a barrier and switch to markers when the problem has been fixed. See oneapi-src/unified-runtime#2347.

Note also that there was a failing test on Gen12, which seems to be due to the timestamps related to barriers. We already knew that there were such issues for barrier events in OpenCL, so it may be related. That said, it was just an optimization and not as urgent as the rest of the changes in this patch, so I will split it into a separate patch and investigate further. (#16116)

Signed-off-by: Larsen, Steffen <[email protected]>
@againull againull merged commit b7607f0 into intel:sycl Nov 19, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants