libipt decode fails due to missing mapped addresses #6486
Adds encodings for kernel system call instructions to the trace in raw2trace. Kernel system call traces are decoded using libipt which also provides the instruction encodings. We add support to drir_t to write these encodings to a new buffer which is re-used for all dynamic instances of that instr even across multiple system call traces.

Fixes taken/not-taken detection for conditional branches in the syscall trace.

Adds support in the syscall_mix tool to report the counts of each system call's traces also. Adds sysnum to system call trace start and end markers to achieve this.

Ran all Intel-PT tests locally:
```
$ ctest -VV -R 'SUDO'
...
The following tests passed:
    code_api|client.drpttracer_SUDO-test
    code_api|tool.drcachesim.phys_SUDO # not really PT. Just included because of ctest -R.
    code_api|tool.drcachesim.phys-threads_SUDO # not really PT. Just included because of ctest -R.
    code_api|tool.drcacheoff.phys_SUDO # not really PT. Just included because of ctest -R.
    code_api|tool.drcacheoff.kernel.simple_SUDO
    code_api|tool.drcacheoff.kernel.opcode-mix_SUDO
    code_api|tool.drcacheoff.kernel.syscall-mix_SUDO
100% tests passed, 0 tests failed out of 7
```

Found some flakiness due to #6486 in local runs of the kernel sudo tests, which will be addressed separately.

Issue: #5505
I have successfully tested the main branch on my local system and can confirm that it is functioning as expected.
I hit this missing address error in every run in these 3 tests on my machine:
It is a Debian-ish 6.5.6 kernel. Should be the same as @abhinav92003's machine.
FTR the following is the local workaround I'm using. It adds two addresses that showed up in error messages. They may be different on different machines of course.
There are now 4 tests and all fail every time on my machine:
They all fail for the same underlying reason: the PT raw traces contain some instr that is not in the range of any module listed in /proc/modules.
On the same system where the test failures reproduce, I tried using perf to trace the same test app (suite/tests/bin/simple_app), and it didn't fail (see https://perf.wiki.kernel.org/index.php/Perf_tools_support_for_Intel%C2%AE_Processor_Trace#Kernel-only_tracing for the detailed steps).
I was able to find the address that failed in our tests in the perf output:
So perf is doing it right. We need to see what we're missing.
Here's the documented logic used by perf for copying kcore: https://github.com/torvalds/linux/blob/6d0dc8559c847e2dcd66c5dd93dbab3d3d887ff5/tools/perf/util/symbol-elf.c#L2473. Particularly,
Our kcore_copy looks only at
Based on the contents of my /proc/kallsyms, I see that if we also considered the address of the lowest and highest t,w,T,W symbols, the unmapped address reported by libipt would be covered.
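To make the range idea above concrete, here is a minimal Python sketch. It is not DynamoRIO's kcore_copy code or perf's logic: `text_symbol_range` is a hypothetical helper, the sample lines are illustrative, and it assumes the usual `address type name [module]` layout of /proc/kallsyms.

```python
TEXT_TYPES = {"t", "T", "w", "W"}


def text_symbol_range(kallsyms_text):
    """Return (lowest, highest) address over t/T/w/W symbols, or None if there are none."""
    addrs = []
    for line in kallsyms_text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # Skip blank or malformed lines.
        addr, sym_type = int(parts[0], 16), parts[1]
        # Unprivileged reads of /proc/kallsyms show every address as zero; ignore those.
        if sym_type in TEXT_TYPES and addr != 0:
            addrs.append(addr)
    return (min(addrs), max(addrs)) if addrs else None


# Illustrative sample only; real input would be the contents of /proc/kallsyms.
SAMPLE = (
    "ffffffffaf800000 T _stext\n"
    "ffffffffb0600000 T _etext\n"
    "ffffffffb1000000 D example_data\n"
    "ffffffffc0030000 t example_func\t[example_mod]\n"
)

low, high = text_symbol_range(SAMPLE)
print(hex(low), hex(high))
```

An address reported as unmapped by libipt can then be compared against `low` and `high` to see whether widening the copied range would have covered it.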
Following up on our offline discussion: @dolanzhao Can you provide details of the kernel version you were able to reproduce this issue with? Do you have a fix ready that we can review and commit?
Yes. The kernel version is 6.2.0-39.
# uname -a
Linux dolan-ubuntu 6.2.0-39-generic #40~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 16 10:53:04 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
I have a draft patch, but it's not yet complete. I attempted to test this solution on Linux kernel version 6.5.
E.g., in this case, the missing IP is 0xffffffffc002bb08. The kernel map _stext, _etext is 0xffffffffaf800000, 0xffffffffb0600000, which does not cover the addr.
Our kcore_copy also differs from perf in that we copy the specific ranges for each module, instead of everything between the lowest and highest module symbols range. IIRC this was intended as an optimization to reduce the size of the dumped kcore. @dolanzhao Let me know if you have any comments on this. It may be easier for me to patch this issue because I can actually reproduce this error on my workstation. I gather from our last discussion that your previous comment actually refers to a different issue that affects newer hardware (fixed by #6552 by updating the libipt version we use), and that your system does not reproduce this particular issue. Edit: struck out some incorrect observations. The missing symbol is actually not covered between the lowest and highest module-related function symbols from
As a side note: perf also has this logic to copy the "entry trampolines" that we don't do. I didn't observe any failures in our tracing because of missing this, so will just note it in a code comment for now.
I tried using DR's dumped kcore with
Also, just looking at perf's kcore copy logic, I couldn't find how it dumps the missing IP additionally. As noted in the comment above, the missing IP does lie between the lowest and highest function symbols in /proc/kallsyms; but perf does not dump that region if it finds stext and etext (which are indeed present in my /proc/kallsyms). Also, perf's kcore does not show the missing IP (0xffffffffc01c5754) in it when I use readelf:
Now I'm thinking: even though additionally dumping that missing IP in our kcore_copy helps work around this issue, the real difference between perf and us may not be in the kcore copy logic. When I grep for "bpf_prog_fb33d7816e42d685_file_monitoring_open" (the perf script output says this is the symbol at the missing IP), I could find it in the perf.data/data binary, but not in perf.data/kcore_dir. Maybe perf dumps more information in the trace, which helps it decode later. Speculation: I'm reading that eBPF code may be JIT compiled (also the symbol above has that hex number in it, which seems weird), and JIT code probably doesn't have symbols in /proc/kallsyms. I don't know how we can reliably identify and dump such JIT regions though.
I was able to find the symbol for the missing IP in /proc/kallsyms after relaxing /proc/sys/net/core/bpf_jit_harden. One possible workaround is that we special-case bpf, and copy the /proc/kcore regions that correspond to all "[bpf]"-related symbols in /proc/kallsyms. Still curious what perf does that allows it to get the BPF JIT code symbols/encodings in its trace. I verified that /proc/modules hasn't changed even after relaxing bpf_jit_harden.
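The "[bpf]" special case proposed above could look roughly like the following Python sketch. It is only an illustration under stated assumptions, not the fix that was committed: /proc/kallsyms carries no symbol sizes, so this sketch assumes 4 KiB pages and simply selects the page containing each BPF symbol's start address. `bpf_symbol_pages` and `merge_pages` are hypothetical helpers, and the sample lines (including the second symbol name and both start addresses) are made up.

```python
PAGE_SIZE = 4096  # Assumption: 4 KiB pages.


def bpf_symbol_pages(kallsyms_text):
    """Return the sorted page-aligned addresses that contain a "[bpf]" symbol start."""
    pages = set()
    for line in kallsyms_text.splitlines():
        parts = line.split()
        # BPF JIT symbols are listed with "[bpf]" in the module column.
        if len(parts) == 4 and parts[3] == "[bpf]":
            addr = int(parts[0], 16)
            if addr != 0:  # Addresses read as zero without sufficient privileges.
                pages.add(addr & ~(PAGE_SIZE - 1))
    return sorted(pages)


def merge_pages(pages):
    """Merge sorted, adjacent pages into half-open (start, end) regions."""
    regions = []
    for page in pages:
        if regions and regions[-1][1] == page:
            regions[-1] = (regions[-1][0], page + PAGE_SIZE)
        else:
            regions.append((page, page + PAGE_SIZE))
    return regions


# Illustrative sample only; real input would be the contents of /proc/kallsyms.
SAMPLE = (
    "ffffffffaf800000 T _stext\n"
    "ffffffffc01c5700 t bpf_prog_fb33d7816e42d685_file_monitoring_open\t[bpf]\n"
    "ffffffffc01c6010 t bpf_prog_0000000000000000_example\t[bpf]\n"
    "ffffffffc0030000 t example_func\t[example_mod]\n"
)

for start, end in merge_pages(bpf_symbol_pages(SAMPLE)):
    print(hex(start), hex(end))
```

A program whose JIT image spans more than one page would need a size from another source, so a real implementation would have to decide how much to copy past each symbol start.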
Fixes missing instruction encodings for some kernel code execution captured using Intel-PT.

The root-cause seemed to be that JIT code executed by the kernel, eBPF code in this case, does not have entries in /proc/kallsyms, so our kcore dump logic did not include them. This fix looks for BPF related symbols in /proc/kallsyms and includes them in the copied regions from /proc/kcore.

Note that BPF JIT symbols are not included in /proc/kallsyms by default. One needs to set /proc/sys/net/core/bpf_jit_harden and /proc/sys/net/core/bpf_jit_kallsyms appropriately (see https://docs.kernel.org/admin-guide/sysctl/net.html#proc-sys-net-core-network-core-options for more details). Added this suggestion to documentation.

Tested PT tracing related tests locally on a machine that supports Intel-PT:
$ ctest -R 'drpttracer|drcacheoff.kernel'
...
Start 213: code_api|client.drpttracer_SUDO-test
[sudo] password for sharmaabhinav:
1/5 Test #213: code_api|client.drpttracer_SUDO-test ..................... Passed 4.29 sec
Start 412: code_api|tool.drcacheoff.kernel.simple_SUDO
2/5 Test #412: code_api|tool.drcacheoff.kernel.simple_SUDO .............. Passed 4.66 sec
Start 413: code_api|tool.drcacheoff.kernel.opcode-mix_SUDO
3/5 Test #413: code_api|tool.drcacheoff.kernel.opcode-mix_SUDO .......... Passed 4.71 sec
Start 414: code_api|tool.drcacheoff.kernel.syscall-mix_SUDO
4/5 Test #414: code_api|tool.drcacheoff.kernel.syscall-mix_SUDO ......... Passed 4.59 sec
Start 415: code_api|tool.drcacheoff.kernel.invariant-checker_SUDO
5/5 Test #415: code_api|tool.drcacheoff.kernel.invariant-checker_SUDO ... Passed 5.75 sec
100% tests passed, 0 tests failed out of 5

Issue: #6486
Fixes drmemtrace kernel trace libipt post-processing failures caused by missing instruction encodings for some kernel code execution captured using Intel-PT.

The root-cause seems to be that JIT code executed by the kernel, BPF code in this case, does not have entries in `/proc/modules`. So, our kcore dump logic did not include them. This fix looks for BPF related symbols in `/proc/kallsyms` and includes them in the copied regions from `/proc/kcore`.

Note that BPF JIT symbols are not included in `/proc/kallsyms` by default. One needs to set `/proc/sys/net/core/bpf_jit_harden` and `/proc/sys/net/core/bpf_jit_kallsyms` appropriately (see https://docs.kernel.org/admin-guide/sysctl/net.html#proc-sys-net-core-network-core-options for more details). Added this suggestion to documentation. It may be better to not automatically make this possibly-too-intrusive change to the user's machine in cmake. This is probably fine because the issue is not widespread (not reproduced on public Linux distributions).

Tested PT tracing related tests locally on a machine that supports Intel-PT:
```
$ ctest -R 'drpttracer|drcacheoff.kernel'
...
Start 213: code_api|client.drpttracer_SUDO-test
[sudo] password for sharmaabhinav:
1/5 Test #213: code_api|client.drpttracer_SUDO-test ..................... Passed 4.29 sec
Start 412: code_api|tool.drcacheoff.kernel.simple_SUDO
2/5 Test #412: code_api|tool.drcacheoff.kernel.simple_SUDO .............. Passed 4.66 sec
Start 413: code_api|tool.drcacheoff.kernel.opcode-mix_SUDO
3/5 Test #413: code_api|tool.drcacheoff.kernel.opcode-mix_SUDO .......... Passed 4.71 sec
Start 414: code_api|tool.drcacheoff.kernel.syscall-mix_SUDO
4/5 Test #414: code_api|tool.drcacheoff.kernel.syscall-mix_SUDO ......... Passed 4.59 sec
Start 415: code_api|tool.drcacheoff.kernel.invariant-checker_SUDO
5/5 Test #415: code_api|tool.drcacheoff.kernel.invariant-checker_SUDO ... Passed 5.75 sec
100% tests passed, 0 tests failed out of 5
```

Unfortunately the decode errors do not go away completely even after this fix, but they have become much less frequent now (tool.kernel.simple in release build failed after 40 successful runs with this fix, whereas it failed on every run before).

Issue: #6486
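Before running the PT tests, one could check whether the two sysctls mentioned above are in a state that exposes BPF JIT symbols. The Python sketch below is a hedged illustration, not part of the fix: `read_sysctl` and `bpf_symbols_exported` are hypothetical helpers, and the rule encoded here (`bpf_jit_harden` equal to 0 and `bpf_jit_kallsyms` equal to 1) is my reading of the linked kernel sysctl documentation, so verify it against the kernel in use.

```python
from pathlib import Path

SYSCTL_DIR = Path("/proc/sys/net/core")


def bpf_symbols_exported(harden, kallsyms):
    """Assumed rule: JIT symbols reach /proc/kallsyms only with hardening off and export on."""
    return harden == 0 and kallsyms == 1


def read_sysctl(name, base=SYSCTL_DIR):
    """Return the integer value of a sysctl file, or None if it cannot be read."""
    try:
        return int((base / name).read_text().strip())
    except (OSError, ValueError):
        return None


harden = read_sysctl("bpf_jit_harden")
kallsyms = read_sysctl("bpf_jit_kallsyms")
if harden is None or kallsyms is None:
    print("BPF JIT sysctls are not readable on this system")
else:
    print("BPF JIT symbols exported:", bpf_symbols_exported(harden, kallsyms))
```

The script only reads the values; changing them is left to the user, in line with the note above about not modifying the user's machine automatically.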
I'm seeing errors like the following happen sporadically during decoding of Intel-PT kernel traces:
When it happens, it's always the same address that's unmapped. When I hard-coded that address (and another that was revealed when this error was fixed) to be copied in kcore_copy, the error went away. I couldn't find details of the address in /proc/modules or /proc/kallsyms but it is a part of the kcore code section (we copy only the memory that corresponds to the live kernel modules from /proc/modules). It's presumably some kernel code that's executing during system calls.
The errors don't happen every time but they seem to have started happening frequently enough. Not sure if it was some change in my machine's kernel that caused the unmapped instrs to be executed.
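To illustrate why such an address ends up unmapped when only live module ranges are copied, here is a minimal Python sketch. It is not the kcore_copy implementation: `parse_module_ranges` and `is_mapped` are hypothetical helpers, the module names and addresses in the sample are made up, and it assumes the usual /proc/modules layout of `name size refcount deps state address`.

```python
def parse_module_ranges(modules_text):
    """Return sorted half-open (start, end) ranges for modules in the Live state."""
    ranges = []
    for line in modules_text.splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[4] != "Live":
            continue  # Skip malformed lines and modules that are loading or unloading.
        size = int(fields[1])
        start = int(fields[5], 16)
        if start == 0:
            continue  # Addresses read as zero without sufficient privileges.
        ranges.append((start, start + size))
    return sorted(ranges)


def is_mapped(ranges, addr):
    """Return True if addr falls inside any of the copied ranges."""
    return any(start <= addr < end for start, end in ranges)


# Illustrative sample only; real input would be the contents of /proc/modules.
SAMPLE = (
    "example_mod_a 16384 1 - Live 0xffffffffc0020000\n"
    "example_mod_b 8192 0 example_mod_a, Live 0xffffffffc0040000\n"
)

ranges = parse_module_ranges(SAMPLE)
print(is_mapped(ranges, 0xffffffffc0021000))
print(is_mapped(ranges, 0xffffffffc002bb08))
```

With this sample, the second address lies in the gap between the two modules, which is the kind of instruction address that a module-only copy of /proc/kcore leaves out.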