Add asynchronous copy #45

Open
wants to merge 1 commit into master

PearCoding
Contributor

Add asynchronous copy operation anydsl_copy_async.

The "async" is only a hint and is honored only on CUDA and OpenCL. I did not find a suitable method for HSA.
The CPU could support async copies, but the host is usually treated as a single unit without async capabilities, so it was intentionally left out.

Tested with Rodent (Artic).
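
To illustrate the intended usage pattern (enqueue the copy, then wait at a device-wide barrier), here is a minimal host-only sketch. `MockDevice`, `copy_async`, and `synchronize` are hypothetical names made up for this example; they are not the runtime's API, and the queue only imitates a device that defers work until it is synchronized.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical stand-in for one device; not part of the AnyDSL runtime.
class MockDevice {
public:
    // Enqueue a copy and return immediately. The caller gets no handle back,
    // so the only way to know the copy is done is synchronize().
    void copy_async(const void* src, void* dst, std::size_t size) {
        assert(src != nullptr && dst != nullptr);
        queue_.push_back([=] { std::memcpy(dst, src, size); });
    }

    // Device-wide barrier: run every enqueued operation, then return.
    void synchronize() {
        for (auto& op : queue_) op();
        queue_.clear();
    }

    // Number of operations that have been enqueued but not yet executed.
    std::size_t pending() const { return queue_.size(); }

private:
    std::vector<std::function<void()>> queue_;
};
```

In this model the destination buffer must not be read until `synchronize()` has returned, which mirrors the constraint on the real asynchronous copy.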

@Hugobros3
Contributor

If the copy is asynchronous, how do you know it's finished? With a device-wide barrier?

@PearCoding
Contributor Author

Yes. Unfortunately, the API gives no access to streams or other finer-grained barriers. Finding a common set of features across all the device types we support is quite difficult, especially because of OpenCL. :/

If you have an idea for finer-grained barriers, feel free to mention it. I am very interested in that :D
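
One possible shape for a finer-grained barrier is an opaque event returned by each asynchronous copy. The sketch below is only an assumption about how that could look on an in-order queue. `EventQueue`, `Event`, `wait`, and `is_done` are hypothetical names, not an existing or proposed AnyDSL interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <deque>
#include <functional>

// Hypothetical event handle: the sequence number of the submitted operation.
using Event = std::uint64_t;

// Hypothetical in-order queue; operations complete in submission order.
class EventQueue {
public:
    // Enqueue a copy and return an event that identifies it.
    Event copy_async(const void* src, void* dst, std::size_t size) {
        assert(src != nullptr && dst != nullptr);
        queue_.push_back([=] { std::memcpy(dst, src, size); });
        return ++submitted_;
    }

    // Wait only until the given event has completed. Later operations
    // stay enqueued, unlike with a device-wide barrier.
    void wait(Event e) {
        while (completed_ < e && !queue_.empty()) {
            queue_.front()();
            queue_.pop_front();
            ++completed_;
        }
    }

    // True once the operation identified by e has completed.
    bool is_done(Event e) const { return completed_ >= e; }

private:
    std::deque<std::function<void()>> queue_;
    Event submitted_ = 0;
    Event completed_ = 0;
};
```

A plain integer handle like this needs no backend-specific type in the public interface, which may make it easier to map onto CUDA, OpenCL, and HSA alike.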

@richardmembarth
Member

richardmembarth commented Sep 13, 2023

For HSA, you can use hsa_amd_memory_async_copy on AMD GPUs.

@PearCoding
Contributor Author

The HSA function requires signals (which might also be useful for events [other PR]). What would be the best practice for providing them on each call without exposing them to the AnyDSL user? A platform- or device-specific list of current signals?
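
The per-device list of signals asked about here could be sketched as a small pool that the runtime owns internally. This is only an illustration under stated assumptions: `MockSignal` and `SignalPool` are hypothetical, no HSA call is made, and a real backend would wait on its own signal handles where the mock simply marks them completed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a backend signal handle.
struct MockSignal {
    long value = 0;  // 1 while a copy is in flight, 0 once it has completed
};

// Hypothetical per-device pool, owned by the runtime and hidden from the user.
class SignalPool {
public:
    // Take a free signal (growing the pool if none is free) and mark it
    // in flight. An async copy would pass it as its completion signal.
    std::size_t acquire() {
        if (free_.empty()) {
            signals_.push_back(MockSignal{});
            free_.push_back(signals_.size() - 1);
        }
        const std::size_t idx = free_.back();
        free_.pop_back();
        signals_[idx].value = 1;
        in_flight_.push_back(idx);
        return idx;
    }

    // Called from the device-wide synchronize. A real backend would block on
    // each signal here; the mock marks it completed and recycles it.
    void wait_all() {
        for (std::size_t idx : in_flight_) {
            assert(idx < signals_.size());
            signals_[idx].value = 0;
            free_.push_back(idx);
        }
        in_flight_.clear();
    }

    std::size_t size() const { return signals_.size(); }
    std::size_t in_flight() const { return in_flight_.size(); }

private:
    std::vector<MockSignal> signals_;
    std::vector<std::size_t> free_;
    std::vector<std::size_t> in_flight_;
};
```

Because signals are recycled after each synchronization, the pool grows only to the largest number of copies in flight between two barriers.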
