-
Notifications
You must be signed in to change notification settings - Fork 367
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add roctx ranges to TinyProfiler #2057
Add roctx ranges to TinyProfiler #2057
Conversation
On spock, one can also compile with |
Src/Base/AMReX_GpuDevice.cpp
Outdated
@@ -343,6 +353,9 @@ Device::Initialize () | |||
amrex::Print() << "HIP initialized.\n"; | |||
} | |||
} | |||
#if defined(AMREX_USE_ROCTX) | |||
roctracer_start(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some reason, we call profileStart
twice in ifdef AMREX_USE_CUDA
. I assume that's uncessary. (@maxpkatz) So I suggest we move roctracer_start
into profileStart
, remove the two calls to profileStart
in #ifdef AMREX_USE_CUDA
, and put a call to profileStart
at the end of Device::Initialize
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know why it's there twice but I agree that we only need one of those two., and it makes sense to put it at the end of Initialize.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might not be remembering this right, but did we try to put a nvtx region around the Initialization to ID if there was unneeded start-up at some point? Is that why it's in there twice, we tried to move the init back to allow wrapping Init?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are only calls to Print() between the two calls to cudaProfilerStart.
Src/Base/AMReX_GpuDevice.cpp
Outdated
@@ -358,6 +371,11 @@ Device::Finalize () | |||
cudaProfilerStop(); | |||
#endif | |||
|
|||
#if defined(AMREX_USE_HIP) && defined(AMREX_USE_ROCTX) | |||
roctracer_stop(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest we move roctracer_stop to Device::profilerStop
, remove the call to cudaProfilerStop
above, and call Device::profileStop()
.
Src/Base/AMReX_TinyProfiler.cpp
Outdated
@@ -98,6 +98,12 @@ TinyProfiler::start () noexcept | |||
} | |||
nvtxRangePush(fname.c_str()); | |||
#endif | |||
#if defined(AMREX_USE_HIP) && defined(AMREX_USE_ROCTX) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be better to use #elif defined()
Src/Base/AMReX_TinyProfiler.cpp
Outdated
@@ -178,6 +184,12 @@ TinyProfiler::stop () noexcept | |||
amrex::Gpu::Device::synchronize(); | |||
} | |||
nvtxRangePop(); | |||
#endif | |||
#if defined(AMREX_USE_HIP) && defined(AMREX_USE_ROCTX) | |||
if (device_synchronize_around_region) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also we probably should move
if (device_synchronize_around_region) {
amrex::Gpu::Device::synchronize();
}
outside the ifdefs so that it's called for dpcpp too.
Once the rocm issue is resolved on spock, we should remove AMREX_USE_ROCTX to be consistent with how we do things in CUDA. |
* Make nvtx, roctx, and roctracer_ext header includes have no quotes * Rearrange profiler calls in Initialize and Finalize * Change to elif and move input-flag controlled Gpu synchronize in TinyProfiler * Remove hip version warning as the latest version should always be used
Is there already an
|
roctracer-dev should be a dependency of rocm-dev, which we already include |
Co-authored-by: Axel Huebl <[email protected]>
Ok, feel free to add |
@@ -473,6 +473,8 @@ The list of available options is reported in the :ref:`table <tab:cmakevar>` bel | |||
+------------------------------+-------------------------------------------------+-------------------------+-----------------------+ | |||
| AMReX_PROFPARSER | Build with profile parser support | NO | YES, NO | | |||
+------------------------------+-------------------------------------------------+-------------------------+-----------------------+ | |||
| AMReX_ROCTX | Build with roctx markup profiling support | NO | YES, NO | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depending on default choice above:
| AMReX_ROCTX | Build with roctx markup profiling support | NO | YES, NO | | |
| AMReX_ROCTX | Build with roctx markup profiling support | ON (if HIP) | YES, NO | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My view is we should simply remove this option in the future once it's fixed on spock.
@@ -114,6 +115,15 @@ ifeq ($(HIP_COMPILER),clang) | |||
# rocThrust - Header only | |||
# SYSTEM_INCLUDE_LOCATIONS += $(ROC_PATH)/rocthrust/include | |||
|
|||
ifeq ($(USE_ROCTX),TRUE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also turn this on by default with GNUmake builds, as soon as HIP is found.
HIPCC_FLAGS += -DAMREX_USE_ROCTX | ||
SYSTEM_INCLUDE_LOCATIONS += $(ROC_PATH)/roctracer/include $(ROC_PATH)/rocprofiler/include | ||
LIBRARY_LOCATIONS += $(ROC_PATH)/roctracer/lib $(ROC_PATH)/rocprofiler/lib | ||
LIBRARIES += -Wl,--rpath=${ROC_PATH}/roctracer/lib -lroctracer64 -lroctx64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we add RPATHS automatically these days from the PR I added in #590 ?
We should probably rely on the USE_RPATH
GNUmake option for this lib as well, so users can explicitly control RPATHs with a central option alone:
LIBRARIES += -Wl,--rpath=${ROC_PATH}/roctracer/lib -lroctracer64 -lroctx64 | |
LIBRARIES += -lroctracer64 -lroctx64 |
Co-authored-by: Axel Huebl <[email protected]>
Co-authored-by: Axel Huebl <[email protected]>
Co-authored-by: Axel Huebl <[email protected]>
Just rely on USE_RPATH instead for GNUmake. Follow-up to #2057
Just rely on USE_RPATH instead for GNUmake. Follow-up to #2057
Summary
Adds roctx markup similarly to existing nvtx markup.
Additional background
New features are controlled by added compile flag USE_ROCTX. This assumes the location of roctracer/libroctracer64 and roctracer/libroctx64 are similar to other ROCm library installations such as rocrand.
Example run to generate results.json
This results.json file can be viewed with a browser as described in the documentation using chrome://tracing/
https://github.com/ROCm-Developer-Tools/rocprofiler/blob/amd-master/doc/rocprof.md#43rd-party-visualization-tools
TODO:
Add CMake support for the new flag
Minimum ROCm version enforcement should be updated (or removed)
Checklist
The proposed changes: