[WIP] introduce CUDA managed memory and use it for a matching function #157
Description
Introduce CUDA Managed Memory (CMM) and use it in a new feature matching function.
The main change of this PR is the addition of the malloc_mgd and free_mgd functions and the malloc_mgdT template (a sketch of possible signatures is shown below).
As a secondary change, the PR adds a new feature matching function that uses CMM.
The demo program popsift-match has been changed to use the new function, making it easy to print results on screen.
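For orientation, here is a minimal sketch of what the new allocation helpers might look like. The names are taken from the PR description, but the signatures and error handling shown here are assumptions and may differ from the actual code.

```cpp
#include <cuda_runtime.h>

// Hypothetical sketch of the managed-memory helpers added by this PR;
// the real malloc_mgd/free_mgd/malloc_mgdT may have different signatures.
inline void* malloc_mgd( size_t bytes )
{
    void* ptr = nullptr;
    cudaError_t err = cudaMallocManaged( &ptr, bytes, cudaMemAttachGlobal );
    return ( err == cudaSuccess ) ? ptr : nullptr;
}

inline void free_mgd( void* ptr )
{
    cudaFree( ptr );
}

// Typed convenience wrapper allocating count elements of T.
template<typename T>
inline T* malloc_mgdT( size_t count )
{
    return static_cast<T*>( malloc_mgd( count * sizeof(T) ) );
}
```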
Features list
Implementation remarks
CMM allows the programmer to allocate flat 1D memory that is accessible to both the CPU and the GPU. The CUDA device driver guesses on which side the memory is needed next and performs the transfer in the background. On devices where CPU and GPU share physical memory, this is even better because memory copies can be avoided altogether. Using CMM started to make sense with CUDA compute capability 6.0 ("Pascal").
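As an illustration (not code from this PR), a managed buffer can be written by a kernel and then read directly on the host without an explicit cudaMemcpy:

```cpp
#include <cstdio>
#include <cuda_runtime.h>

__global__ void fill( float* data, int n )
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if( i < n ) data[i] = 2.0f * i;
}

int main()
{
    const int n = 1024;
    float* data = nullptr;
    cudaMallocManaged( &data, n * sizeof(float) );  // visible to CPU and GPU

    fill<<<(n + 255) / 256, 256>>>( data, n );
    cudaDeviceSynchronize();                        // hand control back to the CPU

    printf( "data[10] = %f\n", data[10] );          // read directly on the host
    cudaFree( data );
    return 0;
}
```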
Using CMM is purportedly safe, but in practice it is not. If the programmer doesn't keep track of which side controls the memory at any time, there will be race conditions. On the NVIDIA Tegra, a shared memory architecture, we have seen race conditions spanning several allocated memory regions when those regions are so small that they fit into the same memory page, e.g. control structures. The simple way of preventing race conditions is to call cudaDeviceSynchronize. It is more efficient to use cudaMemAdvise to tell the driver that the CPU will use the memory next (don't forget to unset the location hint after the CPU is finished with it!), and cudaMemPrefetchAsync to inform the driver which GPU stream will need the memory, so the other streams don't have to wait for it.
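A sketch of this hint-based approach (illustrative only, not code from the PR; the buffer, device and stream names are placeholders):

```cpp
#include <cuda_runtime.h>

// Illustrative sketch: steer a managed buffer between the host and one GPU stream.
void cpu_then_gpu( float* buf, size_t bytes, int device, cudaStream_t stream )
{
    // Tell the driver the CPU will touch the buffer next, and migrate it there.
    cudaMemAdvise( buf, bytes, cudaMemAdviseSetPreferredLocation, cudaCpuDeviceId );
    cudaMemPrefetchAsync( buf, bytes, cudaCpuDeviceId, stream );
    cudaStreamSynchronize( stream );

    // ... CPU reads/writes buf here ...

    // Undo the CPU hint once the host is done with the buffer.
    cudaMemAdvise( buf, bytes, cudaMemAdviseUnsetPreferredLocation, cudaCpuDeviceId );

    // Prefetch to the GPU on the stream that will consume the buffer,
    // so other streams do not have to wait for the migration.
    cudaMemPrefetchAsync( buf, bytes, device, stream );
    // Kernels launched into `stream` can now use buf without page faults.
}
```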