Expected Behavior
Octo-Tiger completes.
Actual Behavior
Octo-Tiger (or HPX) occasionally complains that some performance counters are not found.
Steps to Reproduce the Problem
Run Octo-Tiger with the following counters enabled:
--hpx:print-counter=/octotiger*/compute/gpu*kokkos* --hpx:print-counter=/arithmetics/add@/octotiger*/compute/gpu/hydro_kokkos --hpx:print-counter=/arithmetics/add@/octotiger*/compute/gpu/hydro_kokkos_aggregated
Specifications
Since these counters are created by Octo-Tiger, I think this is an Octo-Tiger problem rather than an HPX problem.
I suspect there is a data race between counter registration and counter usage.
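The suspected failure mode can be sketched with the C++ standard library alone. This is not HPX or Octo-Tiger code: `CounterRegistry`, `register_type`, and `create_counter` are hypothetical stand-ins for a per-locality table of counter types. The sketch only shows why ordering matters: the mutex makes each call thread-safe, but nothing forces registration to happen before the first lookup.

```cpp
#include <cassert>
#include <mutex>
#include <set>
#include <string>

// Hypothetical stand-in for a per-locality registry of counter types.
// Assumption for illustration only; not the HPX implementation.
class CounterRegistry {
public:
    // Registration step: make a counter type known on this locality.
    void register_type(const std::string& type_name) {
        std::lock_guard<std::mutex> lock(mutex_);
        types_.insert(type_name);
    }

    // Usage step: succeeds only if the type was registered beforehand.
    bool create_counter(const std::string& type_name) const {
        std::lock_guard<std::mutex> lock(mutex_);
        return types_.count(type_name) != 0;
    }

private:
    mutable std::mutex mutex_;
    std::set<std::string> types_;
};

// Lookup first, registration second: the ordering suspected in this issue.
bool query_before_registration() {
    CounterRegistry registry;
    const std::string type = "/octotiger/compute/gpu/multipole_kokkos";
    const bool created = registry.create_counter(type);  // type not yet known
    registry.register_type(type);
    return created;
}

// Registration first, lookup second: the ordering that always works.
bool query_after_registration() {
    CounterRegistry registry;
    const std::string type = "/octotiger/compute/gpu/multipole_kokkos";
    registry.register_type(type);
    return registry.create_counter(type);
}
```

If the two steps run on different threads without synchronization between them, either ordering can occur, which would match the intermittent nature of the error. Whether this is what actually happens here is unconfirmed.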
{stack-trace}: 11 frames:
0x7ffbe76bc29a : /u/jiakuny/workspace/spack/opt/spack/linux-rhel8-zen3/gcc-11.2.1/hpx-master-ivytitn2twobtla6duwsdubshisph5z4/lib64/libhpx.so.1(+0x4b629a) [0x7ffbe76bc29a] in /u/jiakuny/workspace/spack/opt/spack/linux-rhel8-zen3/gcc-11.2.1/hpx-master-ivytitn2twobtla6duwsdubshisph5z4/lib64/libhpx.so.1
0x7ffbe6e15d65 : std::__exception_ptr::exception_ptr hpx::detail::get_exception<hpx::exception>(hpx::exception const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) [0x95] in /u/jiakuny/workspace/spack/opt/spack/linux-rhel8-zen3/gcc-11.2.1/hpx-master-ivytitn2twobtla6duwsdubshisph5z4/lib64/libhpx_core.so
0x7ffbe6e15e35 : void hpx::detail::throw_exception<hpx::exception>(hpx::exception const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, long) [0x55] in /u/jiakuny/workspace/spack/opt/spack/linux-rhel8-zen3/gcc-11.2.1/hpx-master-ivytitn2twobtla6duwsdubshisph5z4/lib64/libhpx_core.so
0x7ffbe6e0b854 : hpx::detail::throw_exception(hpx::error, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, long) [0x84] in /u/jiakuny/workspace/spack/opt/spack/linux-rhel8-zen3/gcc-11.2.1/hpx-master-ivytitn2twobtla6duwsdubshisph5z4/lib64/libhpx_core.so
0x7ffbe77be0f8 : hpx::performance_counters::detail::create_counter_local(hpx::performance_counters::counter_info const&) [0x3f8] in /u/jiakuny/workspace/spack/opt/spack/linux-rhel8-zen3/gcc-11.2.1/hpx-master-ivytitn2twobtla6duwsdubshisph5z4/lib64/libhpx.so.1
0x7ffbe77f80fd : hpx::components::server::runtime_support::create_performance_counter(hpx::performance_counters::counter_info const&) [0xd] in /u/jiakuny/workspace/spack/opt/spack/linux-rhel8-zen3/gcc-11.2.1/hpx-master-ivytitn2twobtla6duwsdubshisph5z4/lib64/libhpx.so.1
0x7ffbe785c8fc : /u/jiakuny/workspace/spack/opt/spack/linux-rhel8-zen3/gcc-11.2.1/hpx-master-ivytitn2twobtla6duwsdubshisph5z4/lib64/libhpx.so.1(+0x6568fc) [0x7ffbe785c8fc] in /u/jiakuny/workspace/spack/opt/spack/linux-rhel8-zen3/gcc-11.2.1/hpx-master-ivytitn2twobtla6duwsdubshisph5z4/lib64/libhpx.so.1
0x7ffbe78113bd : /u/jiakuny/workspace/spack/opt/spack/linux-rhel8-zen3/gcc-11.2.1/hpx-master-ivytitn2twobtla6duwsdubshisph5z4/lib64/libhpx.so.1(+0x60b3bd) [0x7ffbe78113bd] in /u/jiakuny/workspace/spack/opt/spack/linux-rhel8-zen3/gcc-11.2.1/hpx-master-ivytitn2twobtla6duwsdubshisph5z4/lib64/libhpx.so.1
0x7ffbe6e03866 : hpx::threads::coroutines::detail::coroutine_impl::operator()() [0xd6] in /u/jiakuny/workspace/spack/opt/spack/linux-rhel8-zen3/gcc-11.2.1/hpx-master-ivytitn2twobtla6duwsdubshisph5z4/lib64/libhpx_core.so
0x7ffbe6e02a29 : /u/jiakuny/workspace/spack/opt/spack/linux-rhel8-zen3/gcc-11.2.1/hpx-master-ivytitn2twobtla6duwsdubshisph5z4/lib64/libhpx_core.so(+0x113a29) [0x7ffbe6e02a29] in /u/jiakuny/workspace/spack/opt/spack/linux-rhel8-zen3/gcc-11.2.1/hpx-master-ivytitn2twobtla6duwsdubshisph5z4/lib64/libhpx_core.so
{locality-id}: 2
{hostname}: [ (mpi:2) ]
{process-id}: 68905
{os-thread}: 2, locality#0/worker-thread#16
{thread-id}: 0000000008897540
{thread-description}: <unknown>
{state}: state::startup
{auxinfo}:
{file}: /u/jiakuny/workspace/hpx-lcw/libs/full/performance_counters/src/counters.cpp
{line}: 808
{function}: create_counter_local
{what}: no create function for performance counter found: /octotiger{locality#2/total}/compute/gpu/multipole_kokkos (counter type /octotiger/compute/gpu/multipole_kokkos is not defined, known counter types: /agas/count/allocate /agas/count/begin_migration
/agas/count/bind /agas/count/bind_gid /agas/count/cache/entries /agas/count/cache/erase_entry /agas/count/cache/evictions /agas/count/cache/get_entry /agas/count/cache/hits /agas/count/cache/insert_entry /agas/count/cache/insertions /agas/count/cache/misses /agas/count/cache/update_entry /agas/count/decrement_credit /agas/count/end_migration /agas/count/increment_credit /agas/count/iterate_names /agas/count/on_symbol_namespace_event /agas/count/resolve /agas/count/resolve_gid /agas/count/route /agas/count/unbind /agas/count/unbind_gid /agas/primary/count /agas/primary/time /agas/symbol/count /agas/symbol/time /agas/time/allocate /agas/time/begin_migration /agas/time/bind /agas/time/bind_gid /agas/time/cache/erase_entry /agas/time/cache/get_entry /agas/time/cache/insert_entry /agas/time/cache/update_entry /agas/time/decrement_credit /agas/time/end_migration /agas/time/increment_credit /agas/time/iterate_names /agas/time/on_symbol_namespace_event /agas/time/resolve /agas/time/resolve_gid /agas/time/route /agas/time/unbind /agas/time/unbind_gid /arithmetics/add /arithmetics/count /arithmetics/divide /arithmetics/max /arithmetics/mean /arithmetics/median /arithmetics/min /arithmetics/multiply /arithmetics/subtract /arithmetics/variance /octotiger/amr_bounds /octotiger/compute/cpu/hydro_kokkos /octotiger/compute/cpu/hydro_kokkos_aggregated /octotiger/compute/cpu/hydro_kokkos_aggregation_rate /octotiger/compute/cpu/hydro_legacy /octotiger/compute/cpu/p2p_kokkos /octotiger/compute/gpu/hydro_cuda /octotiger/compute/gpu/hydro_cuda_aggregated /octotiger/compute/gpu/hydro_cuda_aggregation_rate /octotiger/compute/gpu/hydro_kokkos /octotiger/compute/gpu/hydro_kokkos_aggregated /octotiger/compute/gpu/hydro_kokkos_aggregation_rate /octotiger/compute/gpu/p2p_cuda /octotiger/compute/gpu/p2p_kokkos /octotiger/subgrid_leaves /octotiger/subgrids /parcelport/count/mpi/cache-evictions /parcelport/count/mpi/cache-hits /parcelport/count/mpi/cache-insertions 
/parcelport/count/mpi/cache-misses /parcelport/count/mpi/cache-reclaims /parcelqueue/length/receive /parcelqueue/length/send /parcels/count/routed /runtime/count/action-invocation /runtime/count/component /runtime/count/remote-action-invocation /runtime/uptime /scheduler/utilization/instantaneous /statistics/average /statistics/max /statistics/median /statistics/min /statistics/rolling_average /statistics/rolling_max /statistics/rolling_min /statistics/rolling_stddev /statistics/stddev /threadqueue/length /threads/busy-loop-count/instantaneous /threads/count/cumulative /threads/count/cumulative-phases /threads/count/instantaneous/active /threads/count/instantaneous/all /threads/count/instantaneous/pending /threads/count/instantaneous/staged /threads/count/instantaneous/suspended /threads/count/instantaneous/terminated /threads/idle-loop-count/instantaneous /threads/time/overall : HPX(bad_parameter)): HPX(bad_parameter):
This is on NCSA Delta. I think @diehlpk also encountered this problem on Ookami.
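To make the `{what}` message easier to read: the failing name `/octotiger{locality#2/total}/compute/gpu/multipole_kokkos` is a counter instance, and the type looked up is the same path without the `{...}` part. The helper below is a hypothetical illustration written for this thread, not an HPX API, and the list of known types is only a subset copied from the log above.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not an HPX API): derive the counter type name from a
// full counter instance name by dropping the "{...}" instance part.
std::string counter_type_of(const std::string& instance_name) {
    const auto open = instance_name.find('{');
    if (open == std::string::npos) return instance_name;   // already a type name
    const auto close = instance_name.find('}', open);
    if (close == std::string::npos) return instance_name;  // malformed, leave as-is
    return instance_name.substr(0, open) + instance_name.substr(close + 1);
}

// True if the type name appears in the list of registered counter types.
bool is_registered(const std::string& type_name,
                   const std::vector<std::string>& known_types) {
    return std::find(known_types.begin(), known_types.end(), type_name) !=
           known_types.end();
}

// Repeats the lookup from the log using a subset of the reported known types.
bool multipole_counter_is_registered() {
    const std::vector<std::string> known_types = {
        "/octotiger/compute/gpu/hydro_kokkos",
        "/octotiger/compute/gpu/hydro_kokkos_aggregated",
        "/octotiger/compute/gpu/p2p_kokkos",
    };
    const std::string requested =
        "/octotiger{locality#2/total}/compute/gpu/multipole_kokkos";
    return is_registered(counter_type_of(requested), known_types);
}
```

Here `multipole_counter_is_registered()` returns false: the log lists `hydro_kokkos` and `p2p_kokkos` types under `/octotiger/compute/gpu/`, but no `multipole_kokkos`, at the moment locality 2 tried to create the counter.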
@G-071 What should we do about that?
@G-071 can we close that pull request?