i#7113 decode cache: Add analyzer library for decode_cache_t (#7114)
Adds a new drmemtrace_decode_cache library that caches information about
decoded instructions using decode_cache_t. Analysis tools that need to
decode the instruction encodings in a trace can use it to avoid the
overhead of redundant decodes, which can get expensive, and to avoid
duplicating the related logic.

The library lets each tool decide what information it needs to cache
and implement how that information is obtained. It also uses
instr_noalloc_t when possible to reduce heap usage and
allocation/deallocation overhead.
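
The idea of a cache parameterized by a tool-chosen payload can be sketched as follows. This is a minimal, self-contained illustration and not the decode_cache_t API: `toy_instr_record`, `first_byte_info`, `toy_decode_cache`, and `lookup()` are hypothetical names, and the "decode" step merely copies a byte where a real tool would call a decoder. The `encoding_is_new` flag here only mimics the invalidation described in the docs change further down.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a trace instruction record; not a DynamoRIO type.
struct toy_instr_record {
    std::uint64_t pc;
    std::vector<std::uint8_t> encoding;
    bool encoding_is_new; // True when the bytes at this pc have changed.
};

// Tool-chosen payload: this toy tool only wants the first byte and the size.
struct first_byte_info {
    std::uint8_t first_byte = 0;
    std::size_t size = 0;
    // Stands in for the decode step; a real tool would call a decoder here.
    void fill(const toy_instr_record &rec)
    {
        first_byte = rec.encoding.empty() ? 0 : rec.encoding[0];
        size = rec.encoding.size();
    }
};

// Toy cache templated on the payload type the tool wants to keep.
template <typename Info> class toy_decode_cache {
public:
    // Returns the cached payload for rec.pc, decoding only on a miss or
    // when the record reports that its encoding is new.
    const Info &lookup(const toy_instr_record &rec)
    {
        auto it = cache_.find(rec.pc);
        if (it == cache_.end() || rec.encoding_is_new) {
            Info info;
            info.fill(rec);
            ++decode_count_;
            cache_[rec.pc] = info;
            it = cache_.find(rec.pc);
        }
        return it->second;
    }
    // Number of times the (expensive) decode step actually ran.
    std::size_t decode_count() const { return decode_count_; }

private:
    std::unordered_map<std::uint64_t, Info> cache_;
    std::size_t decode_count_ = 0;
};
```

A tool would instantiate `toy_decode_cache<first_byte_info>` and call `lookup()` for every instruction record; repeated records at the same pc then cost one hash lookup instead of a decode.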

If the trace does not include embedded encodings or if the user wants to
get encodings from the app binaries using module_mapper_t instead, they
can provide the module file path to the init API on the decode_cache_t
object. decode_cache_t keeps a single initialized module_mapper_t at any
time, which is shared between all decode_cache_t objects (even the ones
of different template types); this is done by tracking the count of
active objects that use the module mapper.
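
One way to share a single resource across objects of different template types while tracking the number of active users is to keep the shared state in a non-template base class. The sketch below illustrates only that pattern; `toy_module_mapper`, `toy_cache_base`, and `toy_cache` are hypothetical names, and rejecting a second, different path is an assumption of this sketch rather than documented decode_cache_t behavior.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-in for an expensive, shareable module mapper.
struct toy_module_mapper {
    explicit toy_module_mapper(const std::string &path)
        : module_file_path(path)
    {
    }
    std::string module_file_path;
};

// Non-template base, so every instantiation shares the same state.
class toy_cache_base {
public:
    toy_cache_base() = default;
    toy_cache_base(const toy_cache_base &) = delete;
    toy_cache_base &operator=(const toy_cache_base &) = delete;
    ~toy_cache_base() { release(); }

    // Registers this object as a user of the shared mapper, creating the
    // mapper for the first user. Assumption of this sketch: a request for a
    // different path while a mapper is alive is rejected.
    bool init(const std::string &module_file_path)
    {
        shared_state &s = shared();
        std::lock_guard<std::mutex> guard(s.mutex);
        if (s.user_count > 0 && s.mapper->module_file_path != module_file_path)
            return false;
        if (uses_mapper_)
            return true;
        if (s.user_count == 0)
            s.mapper.reset(new toy_module_mapper(module_file_path));
        ++s.user_count;
        uses_mapper_ = true;
        return true;
    }
    static int active_users()
    {
        shared_state &s = shared();
        std::lock_guard<std::mutex> guard(s.mutex);
        return s.user_count;
    }
    static bool mapper_exists()
    {
        shared_state &s = shared();
        std::lock_guard<std::mutex> guard(s.mutex);
        return s.mapper != nullptr;
    }

private:
    struct shared_state {
        std::mutex mutex;
        std::unique_ptr<toy_module_mapper> mapper;
        int user_count = 0;
    };
    static shared_state &shared()
    {
        static shared_state s;
        return s;
    }
    // Drops this object's use; the last user destroys the mapper.
    void release()
    {
        shared_state &s = shared();
        std::lock_guard<std::mutex> guard(s.mutex);
        if (!uses_mapper_)
            return;
        if (--s.user_count == 0)
            s.mapper.reset();
        uses_mapper_ = false;
    }
    bool uses_mapper_ = false;
};

// Instantiations with different payload types still share one mapper.
template <typename Info> class toy_cache : public toy_cache_base {};
```

With this layout, a `toy_cache<int>` and a `toy_cache<double>` initialized with the same path bump one shared counter, and the mapper is destroyed when the last of them goes out of scope.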

decode_cache_t provides the clear_cache() API, which can be called in
parallel_shard_exit() to keep memory consumption in check by freeing
cached decode info that may not be needed to compute results in the
later print_results(), which has to wait until all shards are done.
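
The memory pattern described above, dropping per-shard decode info at shard exit while keeping the aggregated results for the final report, can be sketched like this. All names (`toy_shard`, `count_instr`, `shard_exit`, `aggregate`) are hypothetical; the sketch does not call clear_cache() or any other DynamoRIO API, and the mnemonic string stands in for real decode info.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical per-shard state; not the DynamoRIO shard data layout.
struct toy_shard {
    // Cached decode info, reduced here to pc -> mnemonic.
    std::unordered_map<std::uint64_t, std::string> decode_cache;
    // Results that must survive until the final report.
    std::map<std::string, int> opcode_counts;
};

// Counts one instruction, "decoding" (storing the mnemonic) only on a miss.
void count_instr(toy_shard &shard, std::uint64_t pc, const std::string &mnemonic)
{
    auto it = shard.decode_cache.find(pc);
    if (it == shard.decode_cache.end())
        it = shard.decode_cache.emplace(pc, mnemonic).first;
    ++shard.opcode_counts[it->second];
}

// Plays the role of a shard-exit hook: drops the cache, keeps the counts.
void shard_exit(toy_shard &shard)
{
    // Swapping with an empty map also releases the bucket array.
    std::unordered_map<std::uint64_t, std::string>().swap(shard.decode_cache);
}

// Plays the role of the final report, run after every shard has exited.
std::map<std::string, int> aggregate(const std::vector<toy_shard> &shards)
{
    std::map<std::string, int> total;
    for (const toy_shard &shard : shards) {
        for (const auto &entry : shard.opcode_counts)
            total[entry.first] += entry.second;
    }
    return total;
}
```

The point of the split is that the final report only needs `opcode_counts`, so the potentially much larger cache does not have to stay alive while the remaining shards finish.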

Refactors the invariant checker and opcode mix tools to use this
library.

Modifies add_encodings_to_memrefs to support a mode where encodings are
not set in the generated test memrefs but only the instr addr and size
fields are set.

Makes the opcode cache in opcode_mix_t per-shard instead of per-worker.
Decode info must not be cached per-worker, as that may cause stale
encodings for non-first shards processed by the worker. This means the
worker init and worker exit APIs can now be removed from opcode_mix_t.
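
The staleness hazard can be illustrated with a toy model in which one worker processes two shards back to back and the same pc holds different instructions in each. This is a simplified sketch under stated assumptions: the cache is keyed by pc alone with no invalidation, each shard is reduced to a single record, and `toy_record`, `pc_cache`, `lookup`, and `run_worker` are hypothetical names rather than opcode_mix_t internals.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// One record per shard for brevity; mnemonic is what a fresh decode yields.
struct toy_record {
    std::uint64_t pc;
    std::string mnemonic;
};

using pc_cache = std::unordered_map<std::uint64_t, std::string>;

// Returns the mnemonic the tool would see, decoding only on a cache miss.
std::string lookup(pc_cache &cache, const toy_record &rec)
{
    auto it = cache.find(rec.pc);
    if (it == cache.end())
        it = cache.emplace(rec.pc, rec.mnemonic).first;
    return it->second;
}

// One worker runs the shards in order and reports what it saw for each.
std::vector<std::string> run_worker(const std::vector<toy_record> &shards,
                                    bool per_shard_cache)
{
    std::vector<std::string> seen;
    pc_cache worker_cache; // Lives across all shards of this worker.
    for (const toy_record &rec : shards) {
        pc_cache shard_cache; // Fresh for every shard.
        seen.push_back(lookup(per_shard_cache ? shard_cache : worker_cache, rec));
    }
    return seen;
}
```

In this model the worker-scoped cache answers the second shard with the first shard's entry, while the shard-scoped cache decodes again and returns the right instruction.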

Adds decode_cache_test and opcode_mix_test unit tests that verify
operation of the decode_cache_t.

Issue: #7113
abhinav92003 authored Jan 24, 2025
1 parent 6549e88 commit 3c6c167
Showing 13 changed files with 1,574 additions and 281 deletions.
5 changes: 5 additions & 0 deletions api/docs/release.dox
@@ -136,6 +136,11 @@ changes:
Further non-compatibility-affecting changes include:
- Added support for reading a single drmemtrace trace file from stdin
via "-infile -".
+ - Added the #dynamorio::drmemtrace::decode_cache_t library to make it easier and more
+ efficient for drmemtrace analysis tools to obtain decoded information about
+ instructions in the trace. This works for traces that have embedded instruction
+ encodings in them, and also for legacy traces without embedded encodings where the
+ encodings are obtained from the application binaries instead.

**************************************************
<hr>
38 changes: 35 additions & 3 deletions clients/drcachesim/CMakeLists.txt
@@ -163,19 +163,27 @@ add_exported_library(drmemtrace_reuse_distance STATIC tools/reuse_distance.cpp)
add_exported_library(drmemtrace_histogram STATIC tools/histogram.cpp)
add_exported_library(drmemtrace_reuse_time STATIC tools/reuse_time.cpp)
add_exported_library(drmemtrace_basic_counts STATIC tools/basic_counts.cpp)
- add_exported_library(drmemtrace_opcode_mix STATIC
- tools/opcode_mix.cpp tracer/raw2trace_shared.cpp)
+ add_exported_library(drmemtrace_opcode_mix STATIC tools/opcode_mix.cpp)
add_exported_library(drmemtrace_syscall_mix STATIC tools/syscall_mix.cpp)
add_exported_library(drmemtrace_view STATIC
tools/view.cpp tracer/raw2trace_shared.cpp)
add_exported_library(drmemtrace_func_view STATIC tools/func_view.cpp)
add_exported_library(drmemtrace_invariant_checker STATIC tools/invariant_checker.cpp)
add_exported_library(drmemtrace_schedule_stats STATIC tools/schedule_stats.cpp)
+ add_exported_library(drmemtrace_decode_cache STATIC
+ tools/common/decode_cache.cpp
+ # XXX: Possibly create a library for raw2trace_shared, to avoid
+ # multiple build overhead.
+ tracer/raw2trace_shared.cpp)
add_exported_library(drmemtrace_schedule_file STATIC common/schedule_file.cpp)
add_exported_library(drmemtrace_mutex_dbg_owned STATIC common/mutex_dbg_owned.cpp)

- target_link_libraries(drmemtrace_invariant_checker drdecode drmemtrace_schedule_file)
+ target_link_libraries(drmemtrace_invariant_checker drdecode drmemtrace_schedule_file
+ drmemtrace_decode_cache)
+ target_link_libraries(drmemtrace_decode_cache drcovlib_static)
+ target_link_libraries(drmemtrace_opcode_mix drmemtrace_decode_cache)

+ configure_DynamoRIO_standalone(drmemtrace_decode_cache)
configure_DynamoRIO_standalone(drmemtrace_opcode_mix)
configure_DynamoRIO_standalone(drmemtrace_view)
configure_DynamoRIO_standalone(drmemtrace_invariant_checker)
@@ -322,6 +330,7 @@ include_directories(${CMAKE_CURRENT_SOURCE_DIR}/reader)
# so that we can more cleanly separate tracer and raw2trace code.
include_directories(${CMAKE_CURRENT_SOURCE_DIR}/tracer)
include_directories(${CMAKE_CURRENT_SOURCE_DIR}/scheduler)
+ include_directories(${CMAKE_CURRENT_SOURCE_DIR}/tools/common)
include_directories(${CMAKE_CURRENT_SOURCE_DIR})

if (BUILD_PT_POST_PROCESSOR)
@@ -369,6 +378,7 @@ install_client_nonDR_header(drmemtrace reader/reader.h)
install_client_nonDR_header(drmemtrace reader/record_file_reader.h)
install_client_nonDR_header(drmemtrace analysis_tool.h)
install_client_nonDR_header(drmemtrace analyzer.h)
+ install_client_nonDR_header(drmemtrace tools/common/decode_cache.h)
install_client_nonDR_header(drmemtrace tools/reuse_distance_create.h)
install_client_nonDR_header(drmemtrace tools/histogram_create.h)
install_client_nonDR_header(drmemtrace tools/reuse_time_create.h)
@@ -633,6 +643,7 @@ restore_nonclient_flags(drmemtrace_analyzer)
restore_nonclient_flags(drmemtrace_invariant_checker)
restore_nonclient_flags(drmemtrace_schedule_stats)
restore_nonclient_flags(drmemtrace_schedule_file)
+ restore_nonclient_flags(drmemtrace_decode_cache)

# We need to pass /EHsc and we pull in libcmtd into drcachesim from a dep lib.
# Thus we need to override the /MT with /MTd.
@@ -706,6 +717,7 @@ add_win32_flags(drmemtrace_analyzer)
add_win32_flags(drmemtrace_invariant_checker)
add_win32_flags(drmemtrace_schedule_stats)
add_win32_flags(drmemtrace_schedule_file)
+ add_win32_flags(drmemtrace_decode_cache)
add_win32_flags(directory_iterator)
add_win32_flags(test_helpers)
add_win32_flags(drmemtrace_mutex_dbg_owned)
@@ -964,6 +976,26 @@ if (BUILD_TESTS)
add_win32_flags(tool.drcacheoff.burst_aarch64_sys)
endif ()

+ add_executable(tool.drcachesim.decode_cache_test tests/decode_cache_test.cpp)
+ configure_DynamoRIO_standalone(tool.drcachesim.decode_cache_test)
+ add_win32_flags(tool.drcachesim.decode_cache_test)
+ target_link_libraries(tool.drcachesim.decode_cache_test
+ drmemtrace_decode_cache test_helpers)
+ add_test(NAME tool.drcachesim.decode_cache_test
+ COMMAND tool.drcachesim.decode_cache_test)
+ set_tests_properties(tool.drcachesim.decode_cache_test PROPERTIES
+ TIMEOUT ${test_seconds})
+
+ add_executable(tool.drcacheoff.opcode_mix_test tests/opcode_mix_test.cpp)
+ configure_DynamoRIO_standalone(tool.drcacheoff.opcode_mix_test)
+ add_win32_flags(tool.drcacheoff.opcode_mix_test)
+ target_link_libraries(tool.drcacheoff.opcode_mix_test
+ drmemtrace_opcode_mix drmemtrace_decode_cache test_helpers)
+ add_test(NAME tool.drcacheoff.opcode_mix_test
+ COMMAND tool.drcacheoff.opcode_mix_test)
+ set_tests_properties(tool.drcacheoff.opcode_mix_test PROPERTIES
+ TIMEOUT ${test_seconds})
+
# XXX i#1997: dynamorio_static is not supported on Mac yet
# FIXME i#2949: gcc 7.3 fails to link certain configs
# TODO i#3544: Port tests to RISC-V 64
9 changes: 8 additions & 1 deletion clients/drcachesim/docs/drcachesim.dox.in
@@ -1,5 +1,5 @@
/* **********************************************************
- * Copyright (c) 2015-2024 Google, Inc. All rights reserved.
+ * Copyright (c) 2015-2025 Google, Inc. All rights reserved.
* **********************************************************/

/*
@@ -142,6 +142,13 @@ copies of the binaries and reading the raw bytes for each instruction
in order to obtain the opcode and full operand information.
See also \ref sec_drcachesim_core.

+ drmemtrace analysis tools may use the
+ #dynamorio::drmemtrace::decode_cache_t library, which handles the
+ heavy lifting of decoding the trace instructions using either the embedded
+ encodings in the trace or the encodings from the app binaries, and manages
+ caching their decode info (including invalidating stale decode info based on
+ the `encoding_is_new` field for embedded encodings).
+
Whether conditional branches are taken or untaken is indicated by the
instruction types #dynamorio::drmemtrace::TRACE_TYPE_INSTR_TAKEN_JUMP
and #dynamorio::drmemtrace::TRACE_TYPE_INSTR_UNTAKEN_JUMP. The target