Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add roctx ranges to TinyProfiler #2057

Merged
merged 15 commits into from
May 27, 2021

Conversation

jmsexton03
Copy link
Contributor

@jmsexton03 jmsexton03 commented May 26, 2021

Summary

Adds roctx markup similarly to existing nvtx markup.

Additional background

New features are controlled by added compile flag USE_ROCTX. This assumes the location of roctracer/libroctracer64 and roctracer/libroctx64 are similar to other ROCm library installations such as rocrand.

Example run to generate results.json

export HIP_PATH=/path/to/rocm/root
cd amrex/Tests/GPU/CNS/Exec/Sod
make USE_HIP=TRUE TINY_PROFILE=TRUE USE_ROCTX=TRUE USE_MPI=FALSE NO_CONFIG_CHECKING=TRUE
srun -n 1 ${HIP_PATH}/bin/rocprof  --hsa-trace --stats --timestamp on --roctx-trace  ./CNS3d.*.TPROF.HIP.ex inputs

This results.json file can be viewed with a browser as described in the documentation using chrome://tracing/
https://github.com/ROCm-Developer-Tools/rocprofiler/blob/amd-master/doc/rocprof.md#43rd-party-visualization-tools

TODO:
Add CMake support for the new flag
Minimum ROCm version enforcement should be updated (or removed)

Checklist

The proposed changes:

  • fix a bug or incorrect behavior in AMReX
  • add new capabilities to AMReX
  • changes answers in the test suite to more than roundoff level
  • are likely to significantly affect the results of downstream AMReX users
  • are described in the proposed changes to the AMReX documentation, if appropriate

@jmsexton03 jmsexton03 marked this pull request as ready for review May 26, 2021 21:46
@jmsexton03 jmsexton03 requested review from WeiqunZhang and ax3l May 26, 2021 21:46
@WeiqunZhang
Copy link
Member

On spock, one can also compile with make USE_HIP=TRUE TINY_PROFILE=TRUE USE_ROCTX=TRUE HIP_PATH=/opt/rocm-4.1.0/hip without using NO_CONFIG_CHECKING=TRUE.

@@ -343,6 +353,9 @@ Device::Initialize ()
amrex::Print() << "HIP initialized.\n";
}
}
#if defined(AMREX_USE_ROCTX)
roctracer_start();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason, we call profileStart twice in ifdef AMREX_USE_CUDA. I assume that's uncessary. (@maxpkatz) So I suggest we move roctracer_start into profileStart, remove the two calls to profileStart in #ifdef AMREX_USE_CUDA, and put a call to profileStart at the end of Device::Initialize.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why it's there twice but I agree that we only need one of those two., and it makes sense to put it at the end of Initialize.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might not be remembering this right, but did we try to put a nvtx region around the Initialization to ID if there was unneeded start-up at some point? Is that why it's in there twice, we tried to move the init back to allow wrapping Init?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are only calls to Print() between the two calls to cudaProfilerStart.

@@ -358,6 +371,11 @@ Device::Finalize ()
cudaProfilerStop();
#endif

#if defined(AMREX_USE_HIP) && defined(AMREX_USE_ROCTX)
roctracer_stop();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we move roctracer_stop to Device::profilerStop, remove the call to cudaProfilerStop above, and call Device::profileStop().

@@ -98,6 +98,12 @@ TinyProfiler::start () noexcept
}
nvtxRangePush(fname.c_str());
#endif
#if defined(AMREX_USE_HIP) && defined(AMREX_USE_ROCTX)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be better to use #elif defined()

@@ -178,6 +184,12 @@ TinyProfiler::stop () noexcept
amrex::Gpu::Device::synchronize();
}
nvtxRangePop();
#endif
#if defined(AMREX_USE_HIP) && defined(AMREX_USE_ROCTX)
if (device_synchronize_around_region) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also we probably should move

        if (device_synchronize_around_region) {
            amrex::Gpu::Device::synchronize();
        }

outside the ifdefs so that it's called for dpcpp too.

@WeiqunZhang
Copy link
Member

Once the rocm issue is resolved on spock, we should remove AMREX_USE_ROCTX to be consistent with how we do things in CUDA.

 * Make nvtx, roctx, and roctracer_ext header includes have no quotes
 * Rearrange profiler calls in Initialize and Finalize
 * Change to elif and move input-flag controlled Gpu synchronize in TinyProfiler
 * Remove hip version warning as the latest version should always be used
@ax3l
Copy link
Member

ax3l commented May 27, 2021

Is there already an apt package that we can install in CI for roctracer?

@jmsexton03
Copy link
Contributor Author

Is there already an apt package that we can install in CI for roctracer?

roctracer-dev should be a dependency of rocm-dev, which we already include

@ax3l
Copy link
Member

ax3l commented May 27, 2021

Ok, feel free to add roctracer-dev explicitly anyway to make the dependency more visible.

Src/Base/AMReX_TinyProfiler.cpp Outdated Show resolved Hide resolved
Tools/CMake/AMReXOptions.cmake Outdated Show resolved Hide resolved
@@ -473,6 +473,8 @@ The list of available options is reported in the :ref:`table <tab:cmakevar>` bel
+------------------------------+-------------------------------------------------+-------------------------+-----------------------+
| AMReX_PROFPARSER | Build with profile parser support | NO | YES, NO |
+------------------------------+-------------------------------------------------+-------------------------+-----------------------+
| AMReX_ROCTX | Build with roctx markup profiling support | NO | YES, NO |
Copy link
Member

@ax3l ax3l May 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on default choice above:

Suggested change
| AMReX_ROCTX | Build with roctx markup profiling support | NO | YES, NO |
| AMReX_ROCTX | Build with roctx markup profiling support | ON (if HIP) | YES, NO |

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My view is we should simply remove this option in the future once it's fixed on spock.

@@ -114,6 +115,15 @@ ifeq ($(HIP_COMPILER),clang)
# rocThrust - Header only
# SYSTEM_INCLUDE_LOCATIONS += $(ROC_PATH)/rocthrust/include

ifeq ($(USE_ROCTX),TRUE)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also turn this on by default with GNUmake builds, as soon as HIP is found.

HIPCC_FLAGS += -DAMREX_USE_ROCTX
SYSTEM_INCLUDE_LOCATIONS += $(ROC_PATH)/roctracer/include $(ROC_PATH)/rocprofiler/include
LIBRARY_LOCATIONS += $(ROC_PATH)/roctracer/lib $(ROC_PATH)/rocprofiler/lib
LIBRARIES += -Wl,--rpath=${ROC_PATH}/roctracer/lib -lroctracer64 -lroctx64
Copy link
Member

@ax3l ax3l May 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we add RPATHS automatically these days from the PR I added in #590 ?
We should probably rely on the USE_RPATH GNUmake option for this lib as well, so users can explicitly control RPATHs with a central option alone:

Suggested change
LIBRARIES += -Wl,--rpath=${ROC_PATH}/roctracer/lib -lroctracer64 -lroctx64
LIBRARIES += -lroctracer64 -lroctx64

@WeiqunZhang WeiqunZhang merged commit 2c2c699 into AMReX-Codes:development May 27, 2021
ax3l added a commit that referenced this pull request May 27, 2021
Just rely on USE_RPATH instead for GNUmake.
Follow-up to #2057
@ax3l ax3l mentioned this pull request May 27, 2021
5 tasks
WeiqunZhang pushed a commit that referenced this pull request Jun 1, 2021
Just rely on USE_RPATH instead for GNUmake. Follow-up to #2057
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants