Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ilc doesn't match expectations from SourceBuiltSdkContainsNativeDebugSymbols. #4472

Closed
tmds opened this issue Jun 20, 2024 · 26 comments · Fixed by dotnet/sdk#41752
Closed

ilc doesn't match expectations from SourceBuiltSdkContainsNativeDebugSymbols. #4472

tmds opened this issue Jun 20, 2024 · 26 comments · Fixed by dotnet/sdk#41752
Assignees
Labels
area-testing Improvements in CI and testing

Comments

@tmds
Copy link
Member

tmds commented Jun 20, 2024

With dotnet/sdk#41198, we've added the ilc ILCompiler as part of the .NET installation.

ilc is .NET application that is NativeAOT compiled.

The Microsoft.DotNet.SourceBuild.SmokeTests include a test, SourceBuiltSdkContainsNativeDebugSymbols, which checks the native binaries in the .NET installation.

The test is failing for the ilc binary with:

[xUnit.net 00:00:19.93]     Microsoft.DotNet.SourceBuild.SmokeTests.DebugTests.SourceBuiltSdkContainsNativeDebugSymbols [FAIL]
[xUnit.net 00:00:19.93]       missing .debug_info section in /home/tmds/repos/dotnet/artifacts/bin/Microsoft.DotNet.SourceBuild.SmokeTests/Release/extracted-sdk/packs/runtime.fedora.39-x64.Microsoft.DotNet.ILCompiler/9.0.0-preview.6.24307.2/tools/ilc
[xUnit.net 00:00:19.93]       missing .debug_abbrev section in /home/tmds/repos/dotnet/artifacts/bin/Microsoft.DotNet.SourceBuild.SmokeTests/Release/extracted-sdk/packs/runtime.fedora.39-x64.Microsoft.DotNet.ILCompiler/9.0.0-preview.6.24307.2/tools/ilc
[xUnit.net 00:00:19.93]       missing FILE symbols in /home/tmds/repos/dotnet/artifacts/bin/Microsoft.DotNet.SourceBuild.SmokeTests/Release/extracted-sdk/packs/runtime.fedora.39-x64.Microsoft.DotNet.ILCompiler/9.0.0-preview.6.24307.2/tools/ilc
[xUnit.net 00:00:19.93]       
[xUnit.net 00:00:19.93]       Stack Trace:
[xUnit.net 00:00:19.93]         /home/tmds/repos/dotnet/test/Microsoft.DotNet.SourceBuild.SmokeTests/DebugTests.cs(64,0): at Microsoft.DotNet.SourceBuild.SmokeTests.DebugTests.SourceBuiltSdkContainsNativeDebugSymbols()
[xUnit.net 00:00:19.93]         
[xUnit.net 00:00:19.93]            at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
[xUnit.net 00:00:19.93]            at System.Reflection.MethodBaseInvoker.InvokeWithNoArgs(Object obj, BindingFlags invokeAttr)

We've updated the test to skip ilc: https://github.com/dotnet/sdk/blob/01aab3448c9185f2a5b7e66e7e327fe4b7841408/src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/DebugTests.cs#L77-L78

Are there things we can/should do to make ilc pass the test?

cc @MichalStrehovsky @filipnavara @jkotas @omajid @ViktorHofer @MichaelSimons

Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

1 similar comment
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@jkotas
Copy link
Member

jkotas commented Jun 20, 2024

What is the raw output of eu-readelf that the test is trying to check?

@tmds
Copy link
Member Author

tmds commented Jun 20, 2024

Running on my system against a source-built ilc gives:

$ eu-readelf -S ilc 
There are 39 section headers, starting at offset 0xd66d00:

Section Headers:
[Nr] Name                 Type         Addr             Off      Size     ES Flags Lk Inf Al
[ 0]                      NULL         0000000000000000 00000000 00000000  0        0   0  0
[ 1] .interp              PROGBITS     0000000000000350 00000350 0000001c  0 A      0   0  1
[ 2] .note.gnu.property   NOTE         0000000000000370 00000370 00000020  0 A      0   0  8
[ 3] .note.gnu.build-id   NOTE         0000000000000390 00000390 00000024  0 A      0   0  4
[ 4] .note.ABI-tag        NOTE         00000000000003b4 000003b4 00000020  0 A      0   0  4
[ 5] .gnu.hash            GNU_HASH     00000000000003d8 000003d8 00000028  0 A      6   0  8
[ 6] .dynsym              DYNSYM       0000000000000400 00000400 00001320 24 A      7   1  8
[ 7] .dynstr              STRTAB       0000000000001720 00001720 00000915  0 A      0   0  1
[ 8] .gnu.version         GNU_versym   0000000000002036 00002036 00000198  2 A      6   0  2
[ 9] .gnu.version_d       GNU_verdef   00000000000021d0 000021d0 00000038  0 A      7   2  8
[10] .gnu.version_r       GNU_verneed  0000000000002208 00002208 00000140  0 A      7   3  8
[11] .rela.dyn            RELA         0000000000002348 00002348 00012120 24 A      6   0  8
[12] .rela.plt            RELA         0000000000014468 00014468 00001200 24 AI     6  29  8
[13] .init                PROGBITS     0000000000016000 00016000 0000001b  0 AX     0   0  4
[14] .plt                 PROGBITS     0000000000016020 00016020 00000c10 16 AX     0   0 16
[15] .text                PROGBITS     0000000000016c30 00016c30 000f3ec2  0 AX     0   0 16
[16] __managedcode        PROGBITS     000000000010ab00 0010ab00 0067ee0b  0 AX     0   0 32
[17] __unbox              PROGBITS     000000000078990c 0078990c 00004be1  0 AX     0   0  4
[18] .fini                PROGBITS     000000000078e4f0 0078e4f0 0000000d  0 AX     0   0  4
[19] .rodata              PROGBITS     000000000078f000 0078f000 0033f340  0 A      0   0 128
[20] .dotnet_eh_table     PROGBITS     0000000000ace340 00ace340 000d1ddf  0 A      0   0  8
[21] .eh_frame_hdr        PROGBITS     0000000000ba0120 00ba0120 00047ffc  0 A      0   0  4
[22] .eh_frame            PROGBITS     0000000000be8120 00be8120 00155aa8  0 A      0   0  8
[23] .tdata               PROGBITS     0000000000d3eb00 00d3eb00 00000018  0 WAT    0   0  8
[24] .tbss                NOBITS       0000000000d3eb18 00d3eb18 00000100  0 WAT    0   0  8
[25] .init_array          INIT_ARRAY   0000000000d3eb18 00d3eb18 00000050  8 WA     0   0  8
[26] .fini_array          FINI_ARRAY   0000000000d3eb68 00d3eb68 00000008  8 WA     0   0  8
[27] .data.rel.ro         PROGBITS     0000000000d3eb70 00d3eb70 00000b38  0 WA     0   0 16
[28] .dynamic             DYNAMIC      0000000000d3f6a8 00d3f6a8 00000250 16 WA     7   0  8
[29] .got                 PROGBITS     0000000000d3f8f8 00d3f8f8 00000708  8 WA     0   0  8
[30] .data                PROGBITS     0000000000d40000 00d40000 0002692c  0 WA     0   0 16
[31] __modules            PROGBITS     0000000000d66930 00d66930 00000008  0 WA     0   0  8
[32] .bss                 NOBITS       0000000000d66980 00d66938 00036890  0 WA     0   0 128
[33] .hydrated            NOBITS       0000000000d9d210 00d66938 002f9a00  0 WA     0   0 16
[34] .comment             PROGBITS     0000000000000000 00d66938 00000071  1 MS     0   0  1
[35] .annobin.notes       STRTAB       0000000000000000 00d669a9 00000001  1 MS     0   0  1
[36] .gnu.build.attributes NOTE         0000000001098c10 00d669ac 000001b0  0        0   0  4
[37] .shstrtab            STRTAB       0000000000000000 00d66b5c 00000195  0        0   0  1
[38] .gnu_debuglink       PROGBITS     0000000000000000 00d66cf4 0000000c  0        0   0  4

@jkotas
Copy link
Member

jkotas commented Jun 20, 2024

Naot publish strip symbols by default. Can you try to set <StripSymbols>false</StripSymbols> for source build to see whether it will make this test pass?

@tmds
Copy link
Member Author

tmds commented Jun 20, 2024

I'll give that a try.

@tmds
Copy link
Member Author

tmds commented Jun 20, 2024

Can you try to set <StripSymbols>false</StripSymbols> for source build to see whether it will make this test pass?

This makes things better, and I'll open up a PR for it.

The test is still failing for ilc not meeting this expectation:

https://github.com/dotnet/sdk/blob/01aab3448c9185f2a5b7e66e7e327fe4b7841408/src/SourceBuild/content/test/Microsoft.DotNet.SourceBuild.SmokeTests/DebugTests.cs#L99C1-L102

@MichalStrehovsky
Copy link
Member

The test is still failing for ilc not meeting this expectation:

What is the output of readelf on the unstripped ilc? The line is looking for c|cc|cpp|cxx file extension, but despite of that I would expect we have at least some of those since we link with some unmanaged code.

@tmds
Copy link
Member Author

tmds commented Jun 20, 2024

What is the output of readelf on the unstripped ilc?

readelf.zip

@MichalStrehovsky
Copy link
Member

Is that readelf -S? I don't see the other strings the test is looking for either.

@MichalStrehovsky
Copy link
Member

Oh, never mind, the test runs both -s and -S. I'm surprised FILE is part of -s output.

@MichaelSimons
Copy link
Member

cc @NikolaMilosavljevic who is familiar with these tests.

@MichaelSimons MichaelSimons added area-testing Improvements in CI and testing and removed untriaged labels Jun 20, 2024
@MichaelSimons MichaelSimons moved this from Backlog to 9.0 in .NET Source Build Jun 20, 2024
@MichaelSimons MichaelSimons moved this from 9.0 to In Progress in .NET Source Build Jun 20, 2024
@MichalStrehovsky
Copy link
Member

The FILE symbols are dropped because we pass -Wl,--gc-sections to the clang invocation that links the output executable. Removing that switch restores FILE entries for various cpp files.

This switch is a 5% size saving on an unstripped hello world.

My question would be why we don't specify --gc-sections for the rest of the product. It would break this test, and then we can delete it :).

@NikolaMilosavljevic NikolaMilosavljevic removed their assignment Jun 20, 2024
@tmds
Copy link
Member Author

tmds commented Jun 21, 2024

I don't know what the effects are of dropping the FILE symbols, and why the test goes looking for those specifically.

Perhaps it would make sense to use -Wl,--gc-sections conditionally based on StripSymbols?

@MichalStrehovsky
Copy link
Member

--gc-sections is not related to symbol stripping. This would break an invariant that one can build an executable with StripSymbols=false and run the symbol stripping tool manually and get identical output to StripSymbols=true.

cc @am11 who did a lot of the work on StripSymbols.

@am11
Copy link
Member

am11 commented Jun 21, 2024

--gc-sections is not related to symbol stripping.

Yup, it is also required by lld (llvm linker) 13+ regardless of symbol stripping.

@tmds, does the test pass with -p:LinkerFlavor=gold (after e.g. dnf install binutils-gold)? It's not a solution, just to understand the impact.

@am11
Copy link
Member

am11 commented Jun 21, 2024

I think the test is not very useful. We should export a symbol in test (via [UnmanagedCallersOnly(EntryPoint = "SymbolUnderTest")]) and then expect/assert that symbol to be exported by the binary instead. If the FILE symbol is not kept and the app runs without a problem, then expectation that the symbol would be exported is incorrect (it depends on linker, version, platform which is non-deterministic and doesn't indicate whether the app actually works).

@tmds
Copy link
Member Author

tmds commented Jun 21, 2024

Yup, it is also required by lld (llvm linker) 13+ regardless of symbol stripping.

Is it required to add --gc-sections, or to omit --gc-sections? And what breaks if we don't?

If the FILE symbol is not kept and the app runs without a problem, then expectation that the symbol would be exported is incorrect.

I don't know what uses the FILE symbols are for. I imagine it is related to debugging rather than running.
The test itself is focused on debugging.

@omajid you authored this test, any thoughts on omitting the FILE check?

@am11
Copy link
Member

am11 commented Jun 21, 2024

Is it required to add --gc-sections, or to omit --gc-sections? And what breaks if we don't?

Targets has <CustomLinkerArg Include="-Wl,--gc-sections" Condition="'$(LinkerFlavor)' == '' or '$(LinkerFlavor)' == 'bfd' or '$(LinkerFlavor)' == 'lld'" />, so certainly not an omission. As to what breaks, I don't remember but there was some issue getting this right with the linker script we emit for lld 13+.

I think we should not base these decisions off of FILE symbol test (alone), which is not a tangible way to validate the functionality of AOT applications.

@MichalStrehovsky
Copy link
Member

The test is trying to ensure we have the debugging symbols in the executable (SourceBuiltSdkContainsNativeDebugSymbols()). It's already testing for .debug_info and .debug_abbrev sections. I'm not sure what scenario the additional check for FILE symbols is covering and whether there's a better way to cover the additional thing. We don't emit FILE symbols and source stepping works, so these are not used for that. I think it's just trying to test a side effect that happens to be not present for executables built with gc-sections.

@am11
Copy link
Member

am11 commented Jun 21, 2024

DotNetRuntimeDebugHeader is probably a better symbol to test (which dotnet/diagnostics code actually uses).

@MichalStrehovsky
Copy link
Member

SourceBuiltSdkContainsNativeDebugSymbols() should ideally work for both C++ generated and ILC generated files. There's a possibility CoreCLR will also have a DotNetRuntimeDebugHeader export in the future, so it's not a great replacement of the FILE record check.

I think the key is understanding what is the FILE check protecting from. I would think checking for .debug_info and .debug_abbrev sections is sufficient, but there must have been a reason for the extra check.

@omajid
Copy link
Member

omajid commented Jun 21, 2024

I think the key is understanding what is the FILE check protecting from. I would think checking for .debug_info and .debug_abbrev sections is sufficient, but there must have been a reason for the extra check.

I don't really have more context than the comment in the test:

        // Test FILE symbols. These will most likely be removed by anything that
        // manipulates symbol tables because it's generally useless. So a nice test
        // that nothing has messed with symbols.

The comment pretty much summarizes what a gdb developer explained to me ~10 years ago. The test is just trying to making sure nothing has stripped symbols or otherwise manipulated debug-info for unmanaged binaries. In other words, that the debuginfo is generated by the original (C/C++) compiler and has not be modified by any other tool. This seemed like a good invariant and has held up true for .NET Core pretty much since 1.0.

If this is not a good invariant for managed/NativeAOT generated code, I would be okay with disabling it for those cases (which is just ilc for now?).

@am11
Copy link
Member

am11 commented Jun 21, 2024

After some experiments, I haven't found FILE symbol being exported on Ubuntu or Alpine with .NET 8 and .NET 9 P6 with/without symbol stripping, which suggests we don't export it and yet gdb and lldb debugging is working fine. .gnu_debuglink is how debuggers link .dbg file in stripped mode.

@MichalStrehovsky
Copy link
Member

If this is not a good invariant for managed/NativeAOT generated code, I would be okay with disabling it for those cases (which is just ilc for now?).

It's not a good invariant for anything compiled with -Wl,--gc-sections -Wl,--discard-all. Native AOT passes both arguments. I don't observe any effect on the debuggability of the output with these switches. It the purpose of the test is to check for debuggability, it's possible to not have these FILE symbol and still be debuggable.

You can try with:

clang -g test.c -o test -Wl,--gc-sections -Wl,--discard-all

and run readelf -s on the output.

@am11
Copy link
Member

am11 commented Jun 22, 2024

ChatGPT:

Effect on Debugging:

While FILE symbols are helpful for debugging, their absence does not prevent debugging entirely. Debuggers can still work with other symbols (such as function and variable names) and line number information if it is preserved. However, without FILE symbols, the debugger might lack detailed source file context, making it harder to navigate and understand the code during a debug session.

and with DWARF format (which we are using), it captures the filename and line number info in standard format, rendering FILE symbol redundant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-testing Improvements in CI and testing
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

7 participants