Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: VS Code extension support #8

Open
MathiasMagnus opened this issue Aug 30, 2021 · 6 comments
Open

Feature request: VS Code extension support #8

MathiasMagnus opened this issue Aug 30, 2021 · 6 comments

Comments

@MathiasMagnus
Copy link

It would be mighty useful if there were visual debugging support akin to nSight VS Code Edition allowing VS Code users to set breakpoints in kernels and inspect variable values like on the host.

I don't know if this repo or the rocr_debug_agent repo is more appropriate to enable this functionality.

@t-tye
Copy link
Contributor

t-tye commented Aug 30, 2021

ROCgdb supports the standard gdb MI interface so should function with existing GUI tools. Have you tried it?

I believe someone working on Theia which is using the same debugger interface as VS Code did get ROCgdb working with it.

Currently the ROCm compiler is not generating symbolic variable information so variable inspection is not yet possible. But a future release is planned to add that support.

@MathiasMagnus
Copy link
Author

@t-tye Thanks for the reply. I have tried using rocgdb as a drop-in replacement to GDB and it worked well, although the primary reason I'd do so is kernel debugging. That's something currently I rely on HIP-CPU for, and while it is unquestionably valuable, it requires all my dependencies to compile with it, which is still a long way from happening (rocFFT, rocBLAS, etc.), or I have to write small dispatch shims to all such libraries to a host implementation, which is a lot of work. Currently it's comfort zone is very limited, hence my request.

@bohemondcouka
Copy link

Hi, I'm the one working with Theia on a debug view for ROCgdb.
As @t-tye said, it's quite limited for the moment due to ROCm compiler limitation and it's more a proof of concept than a finished product. As Theia extension work on VSCode it should be able to work, but I haven't tested it yet.
As ROCgdb uses the same protocol as gdb, when ROCm compiler will be updated, you should be able to use a C++ GDB extension and use ROCgdb instead of GDB to get the job done.

@t-tye
Copy link
Contributor

t-tye commented Aug 31, 2021

@t-tye Thanks for the reply. I have tried using rocgdb as a drop-in replacement to GDB and it worked well, although the primary reason I'd do so is kernel debugging. That's something currently I rely on HIP-CPU for, and while it is unquestionably valuable, it requires all my dependencies to compile with it, which is still a long way from happening (rocFFT, rocBLAS, etc.), or I have to write small dispatch shims to all such libraries to a host implementation, which is a lot of work. Currently it's comfort zone is very limited, hence my request.

@MathiasMagnus not sure I am following. ROCgdb debugs both host and device code in a unified way. So it should work with however you compile the program. Where are you having a problem?

lmoriche pushed a commit that referenced this issue Oct 28, 2021
When loading the debug info package
libLLVM.so.10-10.0.1-lp152.30.4.x86_64.debug from openSUSE Leap 15.2, we
run into a dwarf error:
...
$ gdb -q -batch libLLVM.so.10-10.0.1-lp152.30.4.x86_64.debug
Dwarf Error: Cannot not find DIE at 0x18a936e7 \
  [from module libLLVM.so.10-10.0.1-lp152.30.4.x86_64.debug]
...
The DIE @ 0x18a936e7 does in fact exist, and is part of a CU @ 0x18a23e52.
No error message is printed when using -readnow.

What happens is the following:
- a dwarf2_per_cu_data P is created for the CU.
- a dwarf2_cu A is created for the same CU.
- another dwarf2_cu B is created for the same CU.
- the dwarf2_cu B is set in per_objfile->m_dwarf2_cus, such that
  per_objfile->get_cu (P) returns B.
- P->load_all_dies is set to 1.
- all dies are read into the A->partial_dies htab
- dwarf2_cu A is destroyed.
- we try to find the partial_die for the DIE @ 0x18a936e7 in B->partial_dies.
  We can't find it, but do not try to load all dies, because P->load_all_dies
  is already set to 1.
- an error message is generated.

The question is why we're creating dwarf2_cu A and B for the same CU.

The dwarf2_cu A is created here:
...
 (gdb) bt
 #0  dwarf2_cu::dwarf2_cu (this=0x79a9660, per_cu=0x23c0b30,
     per_objfile=0x1ad01b0) at dwarf2/cu.c:38
 #1  0x0000000000675799 in cutu_reader::cutu_reader (this=0x7fffffffd040,
     this_cu=0x23c0b30, per_objfile=0x1ad01b0, abbrev_table=0x0,
     existing_cu=0x0, skip_partial=false) at dwarf2/read.c:6487
 #2  0x0000000000676eb3 in process_psymtab_comp_unit (this_cu=0x23c0b30,
      per_objfile=0x1ad01b0, want_partial_unit=false,
      pretend_language=language_minimal) at dwarf2/read.c:7028
...

And the dwarf2_cu B is created here:
...
 (gdb) bt
 #0  dwarf2_cu::dwarf2_cu (this=0x885e8c0, per_cu=0x23c0b30,
     per_objfile=0x1ad01b0) at dwarf2/cu.c:38
 #1  0x0000000000675799 in cutu_reader::cutu_reader (this=0x7fffffffcc50,
     this_cu=0x23c0b30, per_objfile=0x1ad01b0, abbrev_table=0x0,
     existing_cu=0x0, skip_partial=false) at dwarf2/read.c:6487
 #2  0x0000000000678118 in load_partial_comp_unit (this_cu=0x23c0b30,
     per_objfile=0x1ad01b0, existing_cu=0x0) at dwarf2/read.c:7436
 #3  0x000000000069721d in find_partial_die (sect_off=(unknown: 0x18a55054),
     offset_in_dwz=0, cu=0x0) at dwarf2/read.c:19391
 #4  0x000000000069755b in partial_die_info::fixup (this=0x9096900,
     cu=0xa6a85f0) at dwarf2/read.c:19512
 #5  0x0000000000697586 in partial_die_info::fixup (this=0x8629bb0,
     cu=0xa6a85f0) at dwarf2/read.c:19516
 #6  0x00000000006787b1 in scan_partial_symbols (first_die=0x8629b40,
     lowpc=0x7fffffffcf58, highpc=0x7fffffffcf50, set_addrmap=0, cu=0x79a9660)
     at dwarf2/read.c:7563
 #7  0x0000000000678878 in scan_partial_symbols (first_die=0x796ebf0,
     lowpc=0x7fffffffcf58, highpc=0x7fffffffcf50, set_addrmap=0, cu=0x79a9660)
     at dwarf2/read.c:7580
 #8  0x0000000000676b82 in process_psymtab_comp_unit_reader
     (reader=0x7fffffffd040, info_ptr=0x7fffc1b3f29b, comp_unit_die=0x6ea90f0,
     pretend_language=language_minimal) at dwarf2/read.c:6954
 #9  0x0000000000676ffd in process_psymtab_comp_unit (this_cu=0x23c0b30,
     per_objfile=0x1ad01b0, want_partial_unit=false,
     pretend_language=language_minimal) at dwarf2/read.c:7057
...

So in frame #9, a cutu_reader is created with dwarf2_cu A.  Then a fixup takes
us to the following CU @ 0x18aa33d6, in frame #5.  And a similar fixup in
frame #4 takes us back to CU @ 0x18a23e52.  At that point, there's no
information available that we're already trying to read that CU, and we end up
creating another cutu_reader with dwarf2_cu B.

It seems that there are two related problems:
- creating two dwarf2_cu's is not optimal
- the unoptimal case is not handled correctly

This patch addresses the last problem, by moving the load_all_dies flag from
dwarf2_per_cu_data to dwarf2_cu, such that it is paired with the partial_dies
field, which ensures that the two can be kept in sync.

Tested on x86_64-linux.

gdb/ChangeLog:

2021-05-27  Tom de Vries  <[email protected]>

	PR symtab/27898
	* dwarf2/cu.c (dwarf2_cu::dwarf2_cu): Add load_all_dies init.
	* dwarf2/cu.h (dwarf2_cu): Add load_all_dies field.
	* dwarf2/read.c (load_partial_dies, find_partial_die): Update.
	* dwarf2/read.h (dwarf2_per_cu_data::dwarf2_per_cu_data): Remove
	load_all_dies init.
	(dwarf2_per_cu_data): Remove load_all_dies field.
lmoriche pushed a commit that referenced this issue Oct 28, 2021
The past commit d1e93af ("gdb: set current thread in
sparc_{fetch,collect}_inferior_registers (PR gdb/27147)") changed
sparc_fetch_inferior_registers and sparc_store_inferior_registers to
look up the thread corresponding to the regcache's ptid and make it the
current thread.  The reason being that down the call chain, some
functions (like sparc_supply_rwindow) can do some memory reads or write,
through target_read_memory/target_write_memory, and those rely on the
current global context.

There is one small problem with this approach: when debugging a
multi-threaded program, the regcache for a new thread is created just
before the corresponding thread_info is created.  In fact, the regcache
is created somewhere during the call to thread_from_lwp, which is
responsible for creating the thread_info:

    #8  0x0000010000ab9968 in internal_error (file=0x10000bfca20 "/home/simark/src/binutils-gdb/gdb/thread.c", line=1346, fmt=0x10000bfc918 "%s: Assertion `%s' failed.") at /home/simark/src/binutils-gdb/gdbsupport/errors.cc:55
    #9  0x0000010000827f3c in switch_to_thread (thr=0x0) at /home/simark/src/binutils-gdb/gdb/thread.c:1346
    #10 0x0000010000753444 in sparc_fetch_inferior_registers (proc_target=0x10000fa8cb0 <the_sparc64_linux_nat_target>, regcache=0x10000ff03c0, regnum=-1) at /home/simark/src/binutils-gdb/gdb/sparc-nat.c:175
    #11 0x000001000075b908 in sparc64_linux_nat_target::fetch_registers (this=0x10000fa8cb0 <the_sparc64_linux_nat_target>, regcache=0x10000ff03c0, regnum=-1) at /home/simark/src/binutils-gdb/gdb/sparc64-linux-nat.c:38
    #12 0x00000100007fe6f4 in target_ops::fetch_registers (this=0x10000f7feb0 <the_thread_db_target>, arg0=0x10000ff03c0, arg1=-1) at /home/simark/src/binutils-gdb/gdb/target-delegates.c:496
    #13 0x00000100008162a0 in target_fetch_registers (regcache=0x10000ff03c0, regno=-1) at /home/simark/src/binutils-gdb/gdb/target.c:3287
    #14 0x000001000060a4bc in ps_lgetregs (ph=0x10001264368, lwpid=458727, gregset=0x7feff97d388) at /home/simark/src/binutils-gdb/gdb/proc-service.c:158
    #15 0xffff800103e32420 in __td_ta_lookup_th_unique (ta_arg=0x100012d7080, lwpid=<optimized out>, th=0x7feff97d7c8) at td_ta_map_lwp2thr.c:119
    #16 0xffff800103e32604 in td_ta_map_lwp2thr (ta_arg=0x100012d7080, lwpid=<optimized out>, th=0x7feff97d7c8) at td_ta_map_lwp2thr.c:207
    #17 0x000001000051fee8 in thread_from_lwp (stopped=0x100011a3650, ptid=...) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:415
    #18 0x0000010000520150 in thread_db_notice_clone (parent=..., child=...) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:446
    #19 0x00000100005068a8 in linux_handle_extended_wait (lp=0x10001230700, status=4479) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:1978
    #20 0x000001000050a278 in linux_nat_filter_event (lwpid=458724, status=198015) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:2913
    #21 0x000001000050b818 in linux_nat_wait_1 (ptid=..., ourstatus=0x7feff97e8d0, target_options=...) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:3194
    #22 0x000001000050ca4c in linux_nat_target::wait (this=0x10000fa8cb0 <the_sparc64_linux_nat_target>, ptid=..., ourstatus=0x7feff97e8d0, target_options=...) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:3432
    #23 0x00000100005237ec in thread_db_target::wait (this=0x10000f7feb0 <the_thread_db_target>, ptid=..., ourstatus=0x7feff97e8d0, options=...) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1379
    #24 0x00000100007fa668 in target_wait (ptid=..., status=0x7feff97e8d0, options=...) at /home/simark/src/binutils-gdb/gdb/target.c:2000
    #25 0x00000100004adb0c in do_target_wait_1 (inf=0x10001173170, ptid=..., status=0x7feff97e8d0, options=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:3464
    #26 0x00000100004add48 in operator() (__closure=0x7feff97e658, inf=0x10001173170) at /home/simark/src/binutils-gdb/gdb/infrun.c:3527
    #27 0x00000100004ae15c in do_target_wait (wait_ptid=..., ecs=0x7feff97e8a8, options=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:3540
    #28 0x00000100004af254 in fetch_inferior_event () at /home/simark/src/binutils-gdb/gdb/infrun.c:3880
    #29 0x0000010000486ef8 in inferior_event_handler (event_type=INF_REG_EVENT) at /home/simark/src/binutils-gdb/gdb/inf-loop.c:42

The problem is that while sparc_fetch_inferior_registers runs and is
asked to read the registers of a given ptid, there isn't a thread_info
with that ptid yet.  So, find_thread_ptid returns nullptr, and
switch_to_thread gives an internal error.

Fix this by only setting inferior_ptid, instead of the whole global
context, as switch_to_thread does.  This is sufficient for
target_read_memory / target_write_memory to work down the line.

Ideally, it would be nice to be able to pass the ptid down the whole
call chain and to target_read_memory / target_write_memory, so that this
setting of inferior_ptid would not be necessary.  But this is not going
to happen soon.

This fixes running a multi-threaded program, which would hit the
internal error show in the call stack above.

gdb/ChangeLog:

	PR gdb/27899
	* sparc-nat.c (sparc_fetch_inferior_registers): Set
	inferior_ptid instead of using switch_to_thread.
	(sparc_store_inferior_registers): Likewise.

Change-Id: I0b6ddb3af9b11f67b10ee46a734fb82ecc6462d5
lmoriche pushed a commit that referenced this issue Oct 28, 2021
… when attaching / handling a fork child

When trying to attach to a pthread process on a Linux system with glibc 2.33,
we get:

    $ ./gdb -q -nx --data-directory=data-directory -p 1472010
    Attaching to process 1472010
    [New LWP 1472013]
    [New LWP 1472014]
    [New LWP 1472015]
    Error while reading shared library symbols for /usr/lib/libpthread.so.0:
    Cannot find user-level thread for LWP 1472015: generic error
    0x00007ffff6d3637f in poll () from /usr/lib/libc.so.6
    (gdb)

When attaching to a process (or handling a fork child, an operation very
similar to attaching), GDB reads the shared library list from the
process.  For each shared library (if "set auto-solib-add" is on), it
reads its symbols and calls the "new_objfile" observable.

The libthread-db code monitors this observable, and if it sees an
objfile named somewhat like "libpthread.so" go by, it tries to load
libthread_db.so in the GDB process itself.  libthread_db knows how to
navigate libpthread's data structures to get information about the
existing threads.

To locate these data structures, libthread_db calls ps_pglobal_lookup
(implemented in proc-service.c), passing in a symbol name and expecting
an address in return.

Before glibc 2.33, libthread_db always asked for symbols found in
libpthread.  There was no ordering problem: since we were always trying
to load libthread_db in reaction to processing libpthread (and reading
in its symbols) and libthread_db only asked symbols from libpthread, the
requested symbols could always be found.  Starting with glibc 2.33,
libthread_db now asks for a symbol name that can be found in
/lib/ld-linux-x86-64.so.2 (_rtld_global).  And the ordering in which GDB
reads the shared libraries from the inferior when attaching is
unfortunate, in that libpthread is processed before ld-linux.  So when
loading libthread_db in reaction to processing libpthread, and
libthread_db requests the symbol that is from ld-linux, GDB is not yet
able to supply it.

That problematic symbol lookup happens in the thread_from_lwp function,
when we call td_ta_map_lwp2thr_p, and an exception is thrown at this
point:

    #0  0x00007ffff6681012 in __cxxabiv1::__cxa_throw (obj=0x60e000006100, tinfo=0x555560033b50 <typeinfo for gdb_exception_error>, dest=0x55555d9404bc <gdb_exception_error::~gdb_exception_error()>) at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_throw.cc:78
    #1  0x000055555e5d3734 in throw_it(return_reason, errors, const char *, typedef __va_list_tag __va_list_tag *) (reason=RETURN_ERROR, error=GENERIC_ERROR, fmt=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s", ap=0x7fffffffaae0) at /home/simark/src/binutils-gdb/gdbsupport/common-exceptions.cc:200
    #2  0x000055555e5d37d4 in throw_verror (error=GENERIC_ERROR, fmt=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s", ap=0x7fffffffaae0) at /home/simark/src/binutils-gdb/gdbsupport/common-exceptions.cc:208
    #3  0x000055555e0b0ed2 in verror (string=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s", args=0x7fffffffaae0) at /home/simark/src/binutils-gdb/gdb/utils.c:171
    #4  0x000055555e5e898a in error (fmt=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s") at /home/simark/src/binutils-gdb/gdbsupport/errors.cc:43
    #5  0x000055555d06b4bc in thread_from_lwp (stopped=0x617000035d80, ptid=...) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:418
    #6  0x000055555d07040d in try_thread_db_load_1 (info=0x60c000011140) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:912
    #7  0x000055555d071103 in try_thread_db_load (library=0x55555f0c62a0 "libthread_db.so.1", check_auto_load_safe=false) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1014
    #8  0x000055555d072168 in try_thread_db_load_from_sdir () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1091
    #9  0x000055555d072d1c in thread_db_load_search () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1146
    #10 0x000055555d07365c in thread_db_load () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1203
    #11 0x000055555d07373e in check_for_thread_db () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1246
    #12 0x000055555d0738ab in thread_db_new_objfile (objfile=0x61300000c0c0) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1275
    #13 0x000055555bd10740 in std::__invoke_impl<void, void (*&)(objfile*), objfile*> (__f=@0x616000068d88: 0x55555d073745 <thread_db_new_objfile(objfile*)>) at /usr/include/c++/10.2.0/bits/invoke.h:60
    #14 0x000055555bd02096 in std::__invoke_r<void, void (*&)(objfile*), objfile*> (__fn=@0x616000068d88: 0x55555d073745 <thread_db_new_objfile(objfile*)>) at /usr/include/c++/10.2.0/bits/invoke.h:153
    #15 0x000055555bce0392 in std::_Function_handler<void (objfile*), void (*)(objfile*)>::_M_invoke(std::_Any_data const&, objfile*&&) (__functor=..., __args#0=@0x7fffffffb4a0: 0x61300000c0c0) at /usr/include/c++/10.2.0/bits/std_function.h:291
    #16 0x000055555d3595c0 in std::function<void (objfile*)>::operator()(objfile*) const (this=0x616000068d88, __args#0=0x61300000c0c0) at /usr/include/c++/10.2.0/bits/std_function.h:622
    #17 0x000055555d356b7f in gdb::observers::observable<objfile*>::notify (this=0x555566727020 <gdb::observers::new_objfile>, args#0=0x61300000c0c0) at /home/simark/src/binutils-gdb/gdb/../gdbsupport/observable.h:106
    #18 0x000055555da3f228 in symbol_file_add_with_addrs (abfd=0x61200001ccc0, name=0x6190000d9090 "/usr/lib/libpthread.so.0", add_flags=..., addrs=0x7fffffffbc10, flags=..., parent=0x0) at /home/simark/src/binutils-gdb/gdb/symfile.c:1131
    #19 0x000055555da3f763 in symbol_file_add_from_bfd (abfd=0x61200001ccc0, name=0x6190000d9090 "/usr/lib/libpthread.so.0", add_flags=<error reading variable: Cannot access memory at address 0xffffffffffffffb0>, addrs=0x7fffffffbc10, flags=<error reading variable: Cannot access memory at address 0xffffffffffffffc0>, parent=0x0) at /home/simark/src/binutils-gdb/gdb/symfile.c:1167
    #20 0x000055555d95f9fa in solib_read_symbols (so=0x6190000d8e80, flags=...) at /home/simark/src/binutils-gdb/gdb/solib.c:681
    #21 0x000055555d96233d in solib_add (pattern=0x0, from_tty=0, readsyms=1) at /home/simark/src/binutils-gdb/gdb/solib.c:987
    #22 0x000055555d93646e in enable_break (info=0x608000008f20, from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:2238
    #23 0x000055555d93cfc0 in svr4_solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:3049
    #24 0x000055555d96610d in solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib.c:1195
    #25 0x000055555cdee318 in post_create_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:318
    #26 0x000055555ce00e6e in setup_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:2439
    #27 0x000055555ce59c34 in handle_one (event=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:4887
    #28 0x000055555ce5cd00 in stop_all_threads () at /home/simark/src/binutils-gdb/gdb/infrun.c:5064
    #29 0x000055555ce7f0da in stop_waiting (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:8006
    #30 0x000055555ce67f5c in handle_signal_stop (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:6062
    #31 0x000055555ce63653 in handle_inferior_event (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:5727
    #32 0x000055555ce4f297 in fetch_inferior_event () at /home/simark/src/binutils-gdb/gdb/infrun.c:4105
    #33 0x000055555cdbe3bf in inferior_event_handler (event_type=INF_REG_EVENT) at /home/simark/src/binutils-gdb/gdb/inf-loop.c:42
    #34 0x000055555d018047 in handle_target_event (error=0, client_data=0x0) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:4060
    #35 0x000055555e5ea77e in handle_file_event (file_ptr=0x60600008b1c0, ready_mask=1) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:575
    #36 0x000055555e5eb09c in gdb_wait_for_event (block=0) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:701
    #37 0x000055555e5e8d19 in gdb_do_one_event () at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:212
    #38 0x000055555dd6e0d4 in wait_sync_command_done () at /home/simark/src/binutils-gdb/gdb/top.c:528
    #39 0x000055555dd6e372 in maybe_wait_sync_command_done (was_sync=0) at /home/simark/src/binutils-gdb/gdb/top.c:545
    #40 0x000055555d0ec7c8 in catch_command_errors (command=0x55555ce01bb8 <attach_command(char const*, int)>, arg=0x7fffffffe28d "1472010", from_tty=1, do_bp_actions=false) at /home/simark/src/binutils-gdb/gdb/main.c:452
    #41 0x000055555d0f03ad in captured_main_1 (context=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1149
    #42 0x000055555d0f1239 in captured_main (data=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1232
    #43 0x000055555d0f1315 in gdb_main (args=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1257
    #44 0x000055555bb70cf9 in main (argc=7, argv=0x7fffffffde88) at /home/simark/src/binutils-gdb/gdb/gdb.c:32

The exception is caught here:

    #0  __cxxabiv1::__cxa_begin_catch (exc_obj_in=0x60e0000060e0) at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_catch.cc:84
    #1  0x000055555d95fded in solib_read_symbols (so=0x6190000d8e80, flags=...) at /home/simark/src/binutils-gdb/gdb/solib.c:689
    #2  0x000055555d96233d in solib_add (pattern=0x0, from_tty=0, readsyms=1) at /home/simark/src/binutils-gdb/gdb/solib.c:987
    #3  0x000055555d93646e in enable_break (info=0x608000008f20, from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:2238
    #4  0x000055555d93cfc0 in svr4_solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:3049
    #5  0x000055555d96610d in solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib.c:1195
    #6  0x000055555cdee318 in post_create_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:318
    #7  0x000055555ce00e6e in setup_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:2439
    #8  0x000055555ce59c34 in handle_one (event=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:4887
    #9  0x000055555ce5cd00 in stop_all_threads () at /home/simark/src/binutils-gdb/gdb/infrun.c:5064
    #10 0x000055555ce7f0da in stop_waiting (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:8006
    #11 0x000055555ce67f5c in handle_signal_stop (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:6062
    #12 0x000055555ce63653 in handle_inferior_event (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:5727
    #13 0x000055555ce4f297 in fetch_inferior_event () at /home/simark/src/binutils-gdb/gdb/infrun.c:4105
    #14 0x000055555cdbe3bf in inferior_event_handler (event_type=INF_REG_EVENT) at /home/simark/src/binutils-gdb/gdb/inf-loop.c:42
    #15 0x000055555d018047 in handle_target_event (error=0, client_data=0x0) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:4060
    #16 0x000055555e5ea77e in handle_file_event (file_ptr=0x60600008b1c0, ready_mask=1) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:575
    #17 0x000055555e5eb09c in gdb_wait_for_event (block=0) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:701
    #18 0x000055555e5e8d19 in gdb_do_one_event () at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:212
    #19 0x000055555dd6e0d4 in wait_sync_command_done () at /home/simark/src/binutils-gdb/gdb/top.c:528
    #20 0x000055555dd6e372 in maybe_wait_sync_command_done (was_sync=0) at /home/simark/src/binutils-gdb/gdb/top.c:545
    #21 0x000055555d0ec7c8 in catch_command_errors (command=0x55555ce01bb8 <attach_command(char const*, int)>, arg=0x7fffffffe28d "1472010", from_tty=1, do_bp_actions=false) at /home/simark/src/binutils-gdb/gdb/main.c:452
    #22 0x000055555d0f03ad in captured_main_1 (context=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1149
    #23 0x000055555d0f1239 in captured_main (data=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1232
    #24 0x000055555d0f1315 in gdb_main (args=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1257
    #25 0x000055555bb70cf9 in main (argc=7, argv=0x7fffffffde88) at /home/simark/src/binutils-gdb/gdb/gdb.c:32

Catching the exception at this point means that the thread_db_info
object for this inferior will be left in place, despite the failure to
load libthread_db.  This means that there won't be further attempts at
loading libthread_db, because thread_db_load will think that
libthread_db is already loaded for this inferior and will always exit
early.  To fix this, add a try/catch around calling try_thread_db_load_1
in try_thread_db_load, such that if some exception is thrown while
trying to load libthread_db, we reset / delete the thread_db_info for
that inferior.  That alone makes attach work fine again, because
check_for_thread_db is called again in the thread_db_inferior_created
observer (that happens after we learned about all shared libraries and
their symbols), and libthread_db is successfully loaded then.

When attaching, I think that the inferior_created observer is a good
place to try to load libthread_db: it is called once everything has
stabilized, when we learned about all shared libraries.

The only problem then is that when we first try (and fail) to load
libthread_db, in reaction to learning about libpthread, we show this
warning:

    warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.

This is misleading, because we do succeed in loading it later.  So when
attaching, I think we shouldn't try to load libthread_db in reaction to
the new_objfile events, we should wait until we have learned about all
shared libraries (using the inferior_created observable).  To do so, add
an `in_initial_library_scan` flag to struct inferior.  This flag is used
to postpone loading libthread_db if we are attaching or handling a fork
child.

When debugging remotely with GDBserver, the same problem happens, except
that the qSymbol mechanism (allowing the remote side to ask GDB for
symbols values) is involved.  The fix there is the same idea, we make
GDB wait until all shared libraries and their symbols are known before
sending out a qSymbol packet.  This way, we never present the remote
side a state where libpthread.so's symbols are known but ld-linux's
symbols aren't.

gdb/ChangeLog:

	* inferior.h (class inferior) <in_initial_library_scan>: New.
	* infcmd.c (post_create_inferior): Set in_initial_library_scan.
	* infrun.c (follow_fork_inferior): Likewise.
	* linux-thread-db.c (try_thread_db_load): Catch exception thrown
	by try_thread_db_load_1
	(thread_db_load): Return early if in_initial_library_scan is
	set.
	* remote.c (remote_new_objfile): Return early if
	in_initial_library_scan is set.

Change-Id: I7a279836cfbb2b362b4fde11b196b4aab82f5efb
lmoriche pushed a commit that referenced this issue Oct 28, 2021
When loading a mach-o (macOS) executable and trying to set a breakpoint,
a GDB built with ASan or -D_GLIBCXX_DEBUG will crash with an
out-of-bound vector access.  This can be reproduced on Linux using the
repro files in bug 28017 [1]:

    $ ./gdb -nx --data-directory=data-directory -q repro/test -ex "b main" -batch
    /usr/include/c++/11.1.0/debug/vector:445:
    In function:
        std::__debug::vector<_Tp, _Allocator>::const_reference
        std::__debug::vector<_Tp,
        _Allocator>::operator[](std::__debug::vector<_Tp,
        _Allocator>::size_type) const [with _Tp = long unsigned int; _Allocator
        = std::allocator<long unsigned int>; std::__debug::vector<_Tp,
        _Allocator>::const_reference = const long unsigned int&;
        std::__debug::vector<_Tp, _Allocator>::size_type = long unsigned int]

    Error: attempt to subscript container with out-of-bounds index 13, but
    container only holds 13 elements.

    Objects involved in the operation:
        sequence "this" @ 0x0x61300000a590 {
          type = std::__debug::vector<unsigned long, std::allocator<unsigned long> >;
        }

The out-of-bound access happens here:

    #0  0x00007ffff6405d22 in raise () from /usr/lib/libc.so.6
    #1  0x00007ffff63ef862 in abort () from /usr/lib/libc.so.6
    #2  0x00007ffff664e21e in __gnu_debug::_Error_formatter::_M_error() const [clone .cold] from /usr/lib/libstdc++.so.6
    #3  0x000055555699e5ff in std::__debug::vector<unsigned long, std::allocator<unsigned long> >::operator[] (this=0x61300000a590, __n=13) at /usr/include/c++/11.1.0/debug/vector:445
    #4  0x0000555556a58c17 in objfile::section_offset (this=0x61300000a4c0, section=0x55555bbe4ac0 <_bfd_std_section>) at /home/simark/src/binutils-gdb/gdb/objfiles.h:644
    #5  0x0000555556a58cac in obj_section::offset (this=0x62100016d2a8) at /home/simark/src/binutils-gdb/gdb/objfiles.h:838
    #6  0x0000555556a58cfa in obj_section::addr (this=0x62100016d2a8) at /home/simark/src/binutils-gdb/gdb/objfiles.h:850
    #7  0x000055555779f5f7 in sort_cmp (sect1=0x62100016d2a8, sect2=0x62100016d170) at /home/simark/src/binutils-gdb/gdb/objfiles.c:902
    #8  0x00005555577aae35 in __gnu_cxx::__ops::_Iter_comp_iter<bool (*)(obj_section const*, obj_section const*)>::operator()<obj_section**, obj_section**> (this=0x7fffffffa9e0, __it1=0x60c000015970, __it2=0x60c000015940) at /usr/include/c++/11.1.0/bits/predefined_ops.h:158
    #9  0x00005555577aa2b8 in std::__insertion_sort<obj_section**, __gnu_cxx::__ops::_Iter_comp_iter<bool (*)(obj_section const*, obj_section const*)> > (__first=0x60c000015940, __last=0x60c0000159c0, __comp=...) at /usr/include/c++/11.1.0/bits/stl_algo.h:1826
    #10 0x00005555577a8e26 in std::__final_insertion_sort<obj_section**, __gnu_cxx::__ops::_Iter_comp_iter<bool (*)(obj_section const*, obj_section const*)> > (__first=0x60c000015940, __last=0x60c0000159c0, __comp=...) at /usr/include/c++/11.1.0/bits/stl_algo.h:1871
    #11 0x00005555577a723c in std::__sort<obj_section**, __gnu_cxx::__ops::_Iter_comp_iter<bool (*)(obj_section const*, obj_section const*)> > (__first=0x60c000015940, __last=0x60c0000159c0, __comp=...) at /usr/include/c++/11.1.0/bits/stl_algo.h:1957
    #12 0x00005555577a50f4 in std::sort<obj_section**, bool (*)(obj_section const*, obj_section const*)> (__first=0x60c000015940, __last=0x60c0000159c0, __comp=0x55555779f4e7 <sort_cmp(obj_section const*, obj_section const*)>) at /usr/include/c++/11.1.0/bits/stl_algo.h:4875
    #13 0x00005555577a147e in update_section_map (pspace=0x61200001d2c0, pmap=0x6030000d40b0, pmap_size=0x6030000d40b8) at /home/simark/src/binutils-gdb/gdb/objfiles.c:1165
    #14 0x00005555577a19a0 in find_pc_section (pc=0x100003fa0) at /home/simark/src/binutils-gdb/gdb/objfiles.c:1212
    #15 0x00005555576dd39e in lookup_minimal_symbol_by_pc_section (pc_in=0x100003fa0, section=0x0, prefer=lookup_msym_prefer::TEXT, previous=0x0) at /home/simark/src/binutils-gdb/gdb/minsyms.c:750
    #16 0x00005555576de552 in lookup_minimal_symbol_by_pc (pc=0x100003fa0) at /home/simark/src/binutils-gdb/gdb/minsyms.c:986
    #17 0x0000555557d44b54 in find_pc_sect_line (pc=0x100003fa0, section=0x62100016d170, notcurrent=0) at /home/simark/src/binutils-gdb/gdb/symtab.c:3163
    #18 0x0000555557d489fa in find_function_start_sal_1 (func_addr=0x100003fa0, section=0x62100016d170, funfirstline=true) at /home/simark/src/binutils-gdb/gdb/symtab.c:3650
    #19 0x0000555557d49015 in find_function_start_sal (sym=0x621000191670, funfirstline=true) at /home/simark/src/binutils-gdb/gdb/symtab.c:3706
    #20 0x0000555557485283 in symbol_to_sal (result=0x7fffffffbb30, funfirstline=1, sym=0x621000191670) at /home/simark/src/binutils-gdb/gdb/linespec.c:4460
    #21 0x00005555574728c2 in convert_linespec_to_sals (state=0x7fffffffc390, ls=0x7fffffffc3e0) at /home/simark/src/binutils-gdb/gdb/linespec.c:2335
    #22 0x0000555557475a8e in parse_linespec (parser=0x7fffffffc360, arg=0x60200007a550 "main", match_type=symbol_name_match_type::WILD) at /home/simark/src/binutils-gdb/gdb/linespec.c:2716
    #23 0x0000555557479027 in event_location_to_sals (parser=0x7fffffffc360, location=0x606000097be0) at /home/simark/src/binutils-gdb/gdb/linespec.c:3173
    #24 0x00005555574798f7 in decode_line_full (location=0x606000097be0, flags=1, search_pspace=0x0, default_symtab=0x0, default_line=0, canonical=0x7fffffffcca0, select_mode=0x0, filter=0x0) at /home/simark/src/binutils-gdb/gdb/linespec.c:3253
    #25 0x0000555556b4949f in parse_breakpoint_sals (location=0x606000097be0, canonical=0x7fffffffcca0) at /home/simark/src/binutils-gdb/gdb/breakpoint.c:9134
    #26 0x0000555556b6ce95 in create_sals_from_location_default (location=0x606000097be0, canonical=0x7fffffffcca0, type_wanted=bp_breakpoint) at /home/simark/src/binutils-gdb/gdb/breakpoint.c:13819
    #27 0x0000555556b645a6 in bkpt_create_sals_from_location (location=0x606000097be0, canonical=0x7fffffffcca0, type_wanted=bp_breakpoint) at /home/simark/src/binutils-gdb/gdb/breakpoint.c:12631
    #28 0x0000555556b4badf in create_breakpoint (gdbarch=0x621000152d10, location=0x606000097be0, cond_string=0x0, thread=0, extra_string=0x0, force_condition=false, parse_extra=1, tempflag=0, type_wanted=bp_breakpoint, ignore_count=0, pending_break_support=AUTO_BOOLEAN_AUTO, ops=0x55555bd728a0 <bkpt_breakpoint_ops>, from_tty=0, enabled=1, internal=0, flags=0) at /home/simark/src/binutils-gdb/gdb/breakpoint.c:9410
    #29 0x0000555556b4d3b1 in break_command_1 (arg=0x7fffffffe291 "", flag=0, from_tty=0) at /home/simark/src/binutils-gdb/gdb/breakpoint.c:9590
    #30 0x0000555556b4dc1b in break_command (arg=0x7fffffffe28d "main", from_tty=0) at /home/simark/src/binutils-gdb/gdb/breakpoint.c:9660
    #31 0x0000555556d24ca9 in do_const_cfunc (c=0x61100003a240, args=0x7fffffffe28d "main", from_tty=0) at /home/simark/src/binutils-gdb/gdb/cli/cli-decode.c:102
    #32 0x0000555556d2fcd3 in cmd_func (cmd=0x61100003a240, args=0x7fffffffe28d "main", from_tty=0) at /home/simark/src/binutils-gdb/gdb/cli/cli-decode.c:2160
    #33 0x0000555557e84e93 in execute_command (p=0x7fffffffe290 "n", from_tty=0) at /home/simark/src/binutils-gdb/gdb/top.c:674
    #34 0x00005555575a9933 in catch_command_errors (command=0x555557e84043 <execute_command(char const*, int)>, arg=0x7fffffffe28b "b main", from_tty=0, do_bp_actions=true) at /home/simark/src/binutils-gdb/gdb/main.c:523
    #35 0x00005555575a9fdb in execute_cmdargs (cmdarg_vec=0x7fffffffd910, file_type=CMDARG_FILE, cmd_type=CMDARG_COMMAND, ret=0x7fffffffd5b0) at /home/simark/src/binutils-gdb/gdb/main.c:618
    #36 0x00005555575ad48a in captured_main_1 (context=0x7fffffffdd00) at /home/simark/src/binutils-gdb/gdb/main.c:1322
    #37 0x00005555575ada9c in captured_main (data=0x7fffffffdd00) at /home/simark/src/binutils-gdb/gdb/main.c:1343
    #38 0x00005555575adb31 in gdb_main (args=0x7fffffffdd00) at /home/simark/src/binutils-gdb/gdb/main.c:1368
    #39 0x000055555681e179 in main (argc=8, argv=0x7fffffffde78) at /home/simark/src/binutils-gdb/gdb/gdb.c:32

The section being dealt with at that moment is the special *COM*
section:

    (top-gdb) p section.name
    $1 = 0x55555a1bbe60 "*COM*"
    (top-gdb) p section
    $2 = (bfd_section *) 0x55555bbe4ac0 <_bfd_std_section>

I'm not too sure what this section is for, but this is one of four
special BFD sections that GDB puts after the regular sections in the
objfile::sections and objfile::section_offsets lists.  You can check
gdb_bfd_section_index to see how they are handled.
gdb_bfd_count_sections returns "+ 4" to account for those sections.

The problem is that macho_symfile_offsets uses bfd_count_sections
instead of gdb_bfd_count_sections when allocating the
objfile::section_offsets vector.  The vector will therefore contain,
say, 13 elements instead of 17.  When trying to access the section
offset of the *COM* section, the first after the regular sections, we
access section_offsets[13], which is out of bounds.

Fix that by using gdb_bfd_count_sections instead of bfd_count_sections.
I'm fairly confident that this is correct, as this is what
default_symfile_offsets does.

With this patch, the command shown above terminates normally:

    $ ./gdb -nx --data-directory=data-directory -q repro/test -ex "b main" -batch
    Breakpoint 1 at 0x100003fad: file test.c, line 2.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=28017

gdb/ChangeLog:

	PR gdb/28017
	* machoread.c (macho_symfile_offsets): Use
	gdb_bfd_count_sections to allocate objfile::section_offsets.

Change-Id: Ic3a56f46f7232e9f24581f8255fc1ab981935450
lmoriche pushed a commit that referenced this issue Oct 28, 2021
When loading a file using the file command on macOS, we get:

    $ ./gdb -nx --data-directory=data-directory -q -ex "file ./test"
    Reading symbols from ./test...
    Reading symbols from /Users/smarchi/build/binutils-gdb/gdb/test.dSYM/Contents/Resources/DWARF/test...
    /Users/smarchi/src/binutils-gdb/gdb/thread.c:72: internal-error: struct thread_info *inferior_thread(): Assertion `current_thread_ != nullptr' failed.
    A problem internal to GDB has been detected,
    further debugging may prove unreliable.
    Quit this debugging session? (y or n)

The backtrace is:

    * frame #0: 0x0000000101fcb826 gdb`internal_error(file="/Users/smarchi/src/binutils-gdb/gdb/thread.c", line=72, fmt="%s: Assertion `%s' failed.") at errors.cc:52:3
      frame #1: 0x00000001018a2584 gdb`inferior_thread() at thread.c:72:3
      frame #2: 0x0000000101469c09 gdb`get_current_regcache() at regcache.c:421:31
      frame #3: 0x00000001015f9812 gdb`darwin_solib_get_all_image_info_addr_at_init(info=0x0000603000006d00) at solib-darwin.c:464:34
      frame #4: 0x00000001015f7a04 gdb`darwin_solib_create_inferior_hook(from_tty=1) at solib-darwin.c:515:5
      frame #5: 0x000000010161205e gdb`solib_create_inferior_hook(from_tty=1) at solib.c:1200:3
      frame #6: 0x00000001016d8f76 gdb`symbol_file_command(args="./test", from_tty=1) at symfile.c:1650:7
      frame #7: 0x0000000100abab17 gdb`file_command(arg="./test", from_tty=1) at exec.c:555:3
      frame #8: 0x00000001004dc799 gdb`do_const_cfunc(c=0x000061100000c340, args="./test", from_tty=1) at cli-decode.c:102:3
      frame #9: 0x00000001004ea042 gdb`cmd_func(cmd=0x000061100000c340, args="./test", from_tty=1) at cli-decode.c:2160:7
      frame #10: 0x00000001018d4f59 gdb`execute_command(p="t", from_tty=1) at top.c:674:2
      frame #11: 0x0000000100eee430 gdb`catch_command_errors(command=(gdb`execute_command(char const*, int) at top.c:561), arg="file ./test", from_tty=1, do_bp_actions=true)(char const*, int), char const*, int, bool) at main.c:523:7
      frame #12: 0x0000000100eee902 gdb`execute_cmdargs(cmdarg_vec=0x00007ffeefbfeba0 size=1, file_type=CMDARG_FILE, cmd_type=CMDARG_COMMAND, ret=0x00007ffeefbfec20) at main.c:618:9
      frame #13: 0x0000000100eed3a4 gdb`captured_main_1(context=0x00007ffeefbff780) at main.c:1322:3
      frame #14: 0x0000000100ee810d gdb`captured_main(data=0x00007ffeefbff780) at main.c:1343:3
      frame #15: 0x0000000100ee8025 gdb`gdb_main(args=0x00007ffeefbff780) at main.c:1368:7
      frame #16: 0x00000001000044f1 gdb`main(argc=6, argv=0x00007ffeefbff8a0) at gdb.c:32:10
      frame #17: 0x00007fff20558f5d libdyld.dylib`start + 1

The solib_create_inferior_hook call in symbol_file_command was added by
commit ea142fb ("Fix breakpoints on file reloads for PIE
binaries").  It causes solib_create_inferior_hook to be called while
the inferior is not running, which darwin_solib_create_inferior_hook
does not expect.  darwin_solib_get_all_image_info_addr_at_init, in
particular, assumes that there is a current thread, as it tries to get
the current thread's regcache.

Fix it by adding a target_has_execution check and returning early.  Note
that there is a similar check in svr4_solib_create_inferior_hook.

gdb/ChangeLog:

	* solib-darwin.c (darwin_solib_create_inferior_hook): Return
	early if no execution.

Change-Id: Ia11dd983a1e29786e5ce663d0fcaa6846dc611bb
lmoriche pushed a commit that referenced this issue Oct 28, 2021
Commit 408f668 ("detach in all-stop
with threads running") regressed "detach" with "target remote":

 (gdb) detach
 Detaching from program: target:/any/program, process 3671843
 Detaching from process 3671843
 Ending remote debugging.
 [Inferior 1 (process 3671843) detached]
 In main
 terminate called after throwing an instance of 'gdb_exception_error'
 Aborted (core dumped)

Here's the exception above being thrown:

 (top-gdb) bt
 #0  throw_error (error=TARGET_CLOSE_ERROR, fmt=0x555556035588 "Remote connection closed") at src/gdbsupport/common-exceptions.cc:222
 #1  0x0000555555bbaa46 in remote_target::readchar (this=0x555556a11040, timeout=10000) at src/gdb/remote.c:9440
 #2  0x0000555555bbb9e5 in remote_target::getpkt_or_notif_sane_1 (this=0x555556a11040, buf=0x555556a11058, forever=0, expecting_notif=0, is_notif=0x0) at src/gdb/remote.c:9928
 #3  0x0000555555bbbda9 in remote_target::getpkt_sane (this=0x555556a11040, buf=0x555556a11058, forever=0) at src/gdb/remote.c:10030
 #4  0x0000555555bc0e75 in remote_target::remote_hostio_send_command (this=0x555556a11040, command_bytes=13, which_packet=14, remote_errno=0x7fffffffcfd0, attachment=0x0, attachment_len=0x0) at src/gdb/remote.c:12137
 #5  0x0000555555bc1b6c in remote_target::remote_hostio_close (this=0x555556a11040, fd=8, remote_errno=0x7fffffffcfd0) at src/gdb/remote.c:12455
 #6  0x0000555555bc1bb4 in remote_target::fileio_close (During symbol reading: .debug_line address at offset 0x64f417 is 0 [in module build/gdb/gdb]
 this=0x555556a11040, fd=8, remote_errno=0x7fffffffcfd0) at src/gdb/remote.c:12462
 #7  0x0000555555c9274c in target_fileio_close (fd=3, target_errno=0x7fffffffcfd0) at src/gdb/target.c:3365
 #8  0x000055555595a19d in gdb_bfd_iovec_fileio_close (abfd=0x555556b9f8a0, stream=0x555556b11530) at src/gdb/gdb_bfd.c:439
 #9  0x0000555555e09e3f in opncls_bclose (abfd=0x555556b9f8a0) at src/bfd/opncls.c:599
 #10 0x0000555555e0a2c7 in bfd_close_all_done (abfd=0x555556b9f8a0) at src/bfd/opncls.c:847
 #11 0x0000555555e0a27a in bfd_close (abfd=0x555556b9f8a0) at src/bfd/opncls.c:814
 #12 0x000055555595a9d3 in gdb_bfd_close_or_warn (abfd=0x555556b9f8a0) at src/gdb/gdb_bfd.c:626
 #13 0x000055555595ad29 in gdb_bfd_unref (abfd=0x555556b9f8a0) at src/gdb/gdb_bfd.c:715
 #14 0x0000555555ae4730 in objfile::~objfile (this=0x555556515540, __in_chrg=<optimized out>) at src/gdb/objfiles.c:573
 #15 0x0000555555ae955a in std::_Sp_counted_ptr<objfile*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x555556c20db0) at /usr/include/c++/9/bits/shared_ptr_base.h:377
 #16 0x000055555572b7c8 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x555556c20db0) at /usr/include/c++/9/bits/shared_ptr_base.h:155
 #17 0x00005555557263c3 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x555556bf0588, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr_base.h:730
 #18 0x0000555555ae745e in std::__shared_ptr<objfile, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x555556bf0580, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr_base.h:1169
 #19 0x0000555555ae747e in std::shared_ptr<objfile>::~shared_ptr (this=0x555556bf0580, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr.h:103
 #20 0x0000555555b1c1dc in __gnu_cxx::new_allocator<std::_List_node<std::shared_ptr<objfile> > >::destroy<std::shared_ptr<objfile> > (this=0x5555564cdd60, __p=0x555556bf0580) at /usr/include/c++/9/ext/new_allocator.h:153
 #21 0x0000555555b1bb1d in std::allocator_traits<std::allocator<std::_List_node<std::shared_ptr<objfile> > > >::destroy<std::shared_ptr<objfile> > (__a=..., __p=0x555556bf0580) at /usr/include/c++/9/bits/alloc_traits.h:497
 #22 0x0000555555b1b73e in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::_M_erase (this=0x5555564cdd60, __position=std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x555556515540}) at /usr/include/c++/9/bits/stl_list.h:1921
 #23 0x0000555555b1afeb in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::erase (this=0x5555564cdd60, __position=std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x555556515540}) at /usr/include/c++/9/bits/list.tcc:158
 #24 0x0000555555b19576 in program_space::remove_objfile (this=0x5555564cdd20, objfile=0x555556515540) at src/gdb/progspace.c:210
 #25 0x0000555555ae4502 in objfile::unlink (this=0x555556515540) at src/gdb/objfiles.c:487
 #26 0x0000555555ae5a12 in objfile_purge_solibs () at src/gdb/objfiles.c:875
 #27 0x0000555555c09686 in no_shared_libraries (ignored=0x0, from_tty=1) at src/gdb/solib.c:1236
 #28 0x00005555559e3f5f in detach_command (args=0x0, from_tty=1) at src/gdb/infcmd.c:2769

So frame #28 already detached the remote process, and then we're
purging the shared libraries.  GDB had opened remote shared libraries
via the target: sysroot, so it tries closing them.  GDBserver is
tearing down already, so remote communication breaks down and we close
the remote target and throw TARGET_CLOSE_ERROR.

Note frame #14:

 #14 0x0000555555ae4730 in objfile::~objfile (this=0x555556515540, __in_chrg=<optimized out>) at src/gdb/objfiles.c:573

That's a dtor, thus noexcept.  That's the reason for the
std::terminate.

Stepping back a bit, why do we still have open remote files if we've
managed to detach already, and, we're debugging with "target remote"?
The reason is that commit 408f668
makes detach_command hold a reference to the target, so the remote
target won't be finally closed until frame #28 returns.  It's closing
the target that invalidates target file I/O handles.

This commit fixes the issue by not relying on target_close to
invalidate the target file I/O handles, instead invalidate them
immediately in remote_unpush_target.  So when GDB purges the solibs,
and we end up in target_fileio_close (frame #7 above), there's nothing
to do, and we don't try to talk with the remote target anymore.

The regression isn't seen when testing with
--target_board=native-gdbserver, because that does "set sysroot" to
disable the "target:" sysroot, for test run speed reasons.  So this
commit adds a testcase that explicitly tests detach with "set sysroot
target:".

gdb/ChangeLog:
yyyy-mm-dd  Pedro Alves  <[email protected]>

	PR gdb/28080
	* remote.c (remote_unpush_target): Invalidate file I/O target
	handles.
	* target.c (fileio_handles_invalidate_target): Make extern.
	* target.h (fileio_handles_invalidate_target): Declare.

gdb/testsuite/ChangeLog:
yyyy-mm-dd  Pedro Alves  <[email protected]>

	PR gdb/28080
	* gdb.base/detach-sysroot-target.exp: New.
	* gdb.base/detach-sysroot-target.c: New.

Reported-By: Jonah Graham <[email protected]>

Change-Id: I851234910172f42a1b30e731161376c344d2727d
lmoriche pushed a commit that referenced this issue Oct 28, 2021
…080)

Before PR gdb/28080 was fixed by the previous patch, GDB was crashing
like this:

 (gdb) detach
 Detaching from program: target:/any/program, process 3671843
 Detaching from process 3671843
 Ending remote debugging.
 [Inferior 1 (process 3671843) detached]
 In main
 terminate called after throwing an instance of 'gdb_exception_error'
 Aborted (core dumped)

Here's the exception above being thrown:

 (top-gdb) bt
 #0  throw_error (error=TARGET_CLOSE_ERROR, fmt=0x555556035588 "Remote connection closed") at src/gdbsupport/common-exceptions.cc:222
 #1  0x0000555555bbaa46 in remote_target::readchar (this=0x555556a11040, timeout=10000) at src/gdb/remote.c:9440
 #2  0x0000555555bbb9e5 in remote_target::getpkt_or_notif_sane_1 (this=0x555556a11040, buf=0x555556a11058, forever=0, expecting_notif=0, is_notif=0x0) at src/gdb/remote.c:9928
 #3  0x0000555555bbbda9 in remote_target::getpkt_sane (this=0x555556a11040, buf=0x555556a11058, forever=0) at src/gdb/remote.c:10030
 #4  0x0000555555bc0e75 in remote_target::remote_hostio_send_command (this=0x555556a11040, command_bytes=13, which_packet=14, remote_errno=0x7fffffffcfd0, attachment=0x0, attachment_len=0x0) at src/gdb/remote.c:12137
 #5  0x0000555555bc1b6c in remote_target::remote_hostio_close (this=0x555556a11040, fd=8, remote_errno=0x7fffffffcfd0) at src/gdb/remote.c:12455
 #6  0x0000555555bc1bb4 in remote_target::fileio_close (During symbol reading: .debug_line address at offset 0x64f417 is 0 [in module build/gdb/gdb]
 this=0x555556a11040, fd=8, remote_errno=0x7fffffffcfd0) at src/gdb/remote.c:12462
 #7  0x0000555555c9274c in target_fileio_close (fd=3, target_errno=0x7fffffffcfd0) at src/gdb/target.c:3365
 #8  0x000055555595a19d in gdb_bfd_iovec_fileio_close (abfd=0x555556b9f8a0, stream=0x555556b11530) at src/gdb/gdb_bfd.c:439
 #9  0x0000555555e09e3f in opncls_bclose (abfd=0x555556b9f8a0) at src/bfd/opncls.c:599
 #10 0x0000555555e0a2c7 in bfd_close_all_done (abfd=0x555556b9f8a0) at src/bfd/opncls.c:847
 #11 0x0000555555e0a27a in bfd_close (abfd=0x555556b9f8a0) at src/bfd/opncls.c:814
 #12 0x000055555595a9d3 in gdb_bfd_close_or_warn (abfd=0x555556b9f8a0) at src/gdb/gdb_bfd.c:626
 #13 0x000055555595ad29 in gdb_bfd_unref (abfd=0x555556b9f8a0) at src/gdb/gdb_bfd.c:715
 #14 0x0000555555ae4730 in objfile::~objfile (this=0x555556515540, __in_chrg=<optimized out>) at src/gdb/objfiles.c:573
 #15 0x0000555555ae955a in std::_Sp_counted_ptr<objfile*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x555556c20db0) at /usr/include/c++/9/bits/shared_ptr_base.h:377
 #16 0x000055555572b7c8 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x555556c20db0) at /usr/include/c++/9/bits/shared_ptr_base.h:155
 #17 0x00005555557263c3 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x555556bf0588, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr_base.h:730
 #18 0x0000555555ae745e in std::__shared_ptr<objfile, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x555556bf0580, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr_base.h:1169
 #19 0x0000555555ae747e in std::shared_ptr<objfile>::~shared_ptr (this=0x555556bf0580, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr.h:103
 #20 0x0000555555b1c1dc in __gnu_cxx::new_allocator<std::_List_node<std::shared_ptr<objfile> > >::destroy<std::shared_ptr<objfile> > (this=0x5555564cdd60, __p=0x555556bf0580) at /usr/include/c++/9/ext/new_allocator.h:153
 #21 0x0000555555b1bb1d in std::allocator_traits<std::allocator<std::_List_node<std::shared_ptr<objfile> > > >::destroy<std::shared_ptr<objfile> > (__a=..., __p=0x555556bf0580) at /usr/include/c++/9/bits/alloc_traits.h:497
 #22 0x0000555555b1b73e in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::_M_erase (this=0x5555564cdd60, __position=std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x555556515540}) at /usr/include/c++/9/bits/stl_list.h:1921
 #23 0x0000555555b1afeb in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::erase (this=0x5555564cdd60, __position=std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x555556515540}) at /usr/include/c++/9/bits/list.tcc:158
 #24 0x0000555555b19576 in program_space::remove_objfile (this=0x5555564cdd20, objfile=0x555556515540) at src/gdb/progspace.c:210
 #25 0x0000555555ae4502 in objfile::unlink (this=0x555556515540) at src/gdb/objfiles.c:487
 #26 0x0000555555ae5a12 in objfile_purge_solibs () at src/gdb/objfiles.c:875
 #27 0x0000555555c09686 in no_shared_libraries (ignored=0x0, from_tty=1) at src/gdb/solib.c:1236
 #28 0x00005555559e3f5f in detach_command (args=0x0, from_tty=1) at src/gdb/infcmd.c:2769

Note frame #14:

 #14 0x0000555555ae4730 in objfile::~objfile (this=0x555556515540, __in_chrg=<optimized out>) at src/gdb/objfiles.c:573

That's a dtor, thus noexcept.  That's the reason for the
std::terminate.

The previous patch fixed things such that the exception above isn't
thrown anymore.  However, it's possible that e.g., the remote
connection drops just while a user types "nosharedlibrary", or some
other reason that leads to objfile::~objfile, and then we end up the
same std::terminate problem.

Also notice that frames #9-#11 are BFD frames:

 #9  0x0000555555e09e3f in opncls_bclose (abfd=0x555556bc27e0) at src/bfd/opncls.c:599
 #10 0x0000555555e0a2c7 in bfd_close_all_done (abfd=0x555556bc27e0) at src/bfd/opncls.c:847
 #11 0x0000555555e0a27a in bfd_close (abfd=0x555556bc27e0) at src/bfd/opncls.c:814

BFD is written in C and thus throwing exceptions over such frames may
either not clean up properly, or, may abort if bfd is not compiled
with -fasynchronous-unwind-tables (x86-64 defaults that on, but not
all GCC ports do).

Thus frame #8 seems like a good place to swallow exceptions.  More so
since in this spot we already ignore target_fileio_close return
errors.  That's what this commit does.  Without the previous fix, we'd
see:

 (gdb) detach
 Detaching from program: target:/any/program, process 2197701
 Ending remote debugging.
 [Inferior 1 (process 2197701) detached]
 warning: cannot close "target:/lib64/ld-linux-x86-64.so.2": Remote connection closed

Note it prints a warning, which would still be a regression compared
to GDB 10, if it weren't for the previous fix.

gdb/ChangeLog:
yyyy-mm-dd  Pedro Alves  <[email protected]>

	PR gdb/28080
	* gdb_bfd.c (gdb_bfd_close_warning): New.
	(gdb_bfd_iovec_fileio_close): Wrap target_fileio_close in
	try/catch and print warning on exception.
	(gdb_bfd_close_or_warn): Use gdb_bfd_close_warning.

Change-Id: Ic7a26ddba0a4444e3377b0e7c1c89934a84545d7
lmoriche pushed a commit that referenced this issue Oct 28, 2021
We don't handle well a few scenarios involving vfork.  The reason is
that dbgapi can't attach to a vfork child when it is still attached to
the parent.

For example, with a simple program that just calls "vfork":

    $ ./gdb -q -nx --data-directory=data-directory vfork -ex "set follow-fork-mode child" -ex r
    Reading symbols from vfork...
    Starting program: /home/master/smarchi/build/binutils-gdb/gdb/vfork
    [Attaching after process 14346 vfork to child process 14354]
    [New inferior 2 (process 14354)]
    amd-dbgapi: fatal error: enable_debug failed (rc=-1)
    Backtrace:
        #0 0x00007f40337a2a14 amd::dbgapi::process_t::attach() in /home/master/smarchi/src/ROCdbgapi/src/process.cpp:1504
        #1 0x00007f40337a2be2 operator() in /home/master/smarchi/src/ROCdbgapi/src/process.cpp:2142
        #2 0x00007f40337a2e46 tracer_closure in /home/master/smarchi/src/ROCdbgapi/src/logging.h:538
        #3 0x00007f40337a2e46 enter<amd::dbgapi::detail::parameter_t<amd_dbgapi_client_process_s*, (& amd::dbgapi::utils::string_literal<'c', 'l', 'i', 'e', 'n', 't', '_', 'p', 'r', 'o', 'c', 'e', 's', 's', '_', 'i', 'd'>::value), (amd::dbgapi::detail::parameter_kind_t)0>, amd::dbgapi::detail::parameter_t<amd_dbgapi_process_id_t*, (& amd::dbgapi::utils::string_literal<'p', 'r', 'o', 'c', 'e', 's', 's', '_', 'i', 'd'>::value), (amd::dbgapi::detail::parameter_kind_t)0>, amd_dbgapi_process_attach(amd_dbgapi_client_process_id_t, amd_dbgapi_process_id_t*)::<lambda()> > in /home/master/smarchi/src/ROCdbgapi/src/logging.h:585
        #4 0x00007f40337a2e46 amd_dbgapi_process_attach in /home/master/smarchi/src/ROCdbgapi/src/process.cpp:2154
        #5 0x0000000000a63164 rocm_enable(inferior*) in /home/master/smarchi/src/binutils-gdb/gdb/rocm-tdep.c:1700
        #6 0x0000000000a63553 rocm_target_ops::follow_fork(inferior*, ptid_t, target_waitkind, bool, bool) in /home/master/smarchi/src/binutils-gdb/gdb/rocm-tdep.c:2139
        #7 0x0000000000b411d4 target_follow_fork(inferior*, ptid_t, target_waitkind, bool, bool) in /home/master/smarchi/src/binutils-gdb/gdb/target.c:2756
        #8 0x0000000000865137 follow_fork_inferior(bool, bool) in /home/master/smarchi/src/binutils-gdb/gdb/infrun.c:588
        #9 0x0000000000856c10 follow_fork() in /home/master/smarchi/src/binutils-gdb/gdb/infrun.c:760
        #10 0x000000000085e0e2 handle_inferior_event(execution_control_state*) in /home/master/smarchi/src/binutils-gdb/gdb/infrun.c:5549
        #11 0x000000000085c685 fetch_inferior_event() in /home/master/smarchi/src/binutils-gdb/gdb/infrun.c:4077
        #12 0x000000000083b1d3 inferior_event_handler(inferior_event_type) in /home/master/smarchi/src/binutils-gdb/gdb/inf-loop.c:41
        #13 0x00000000008ab957 handle_target_event(int, void*) in /home/master/smarchi/src/binutils-gdb/gdb/linux-nat.c:4208
        #14 0x0000000000dccc02 handle_file_event(file_handler*, int) in /home/master/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:575
        #15 0x0000000000dcbaeb gdb_wait_for_event(int) in /home/master/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:701
        #16 0x0000000000dcb57c gdb_do_one_event() in /home/master/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:212
        #17 0x0000000000b71b7c wait_sync_command_done() in /home/master/smarchi/src/binutils-gdb/gdb/top.c:526
        #18 0x0000000000b71c3d maybe_wait_sync_command_done(int) in /home/master/smarchi/src/binutils-gdb/gdb/top.c:543
        #19 0x0000000000b72431 execute_command(char const*, int) in /home/master/smarchi/src/binutils-gdb/gdb/top.c:674
        #20 0x00000000008e10da catch_command_errors(void (*)(char const*, int), char const*, int, bool) in /home/master/smarchi/src/binutils-gdb/gdb/main.c:523
        #21 0x00000000008e1274 execute_cmdargs(std::vector<cmdarg, std::allocator<cmdarg> > const*, cmdarg_kind, cmdarg_kind, int*) in /home/master/smarchi/src/binutils-gdb/gdb/main.c:618
        #22 0x00000000008e0d65 captured_main_1(captured_main_args*) in /home/master/smarchi/src/binutils-gdb/gdb/main.c:1317
        #23 0x00000000008de70c captured_main(void*) in /home/master/smarchi/src/binutils-gdb/gdb/main.c:1338
        #24 0x00000000008de664 gdb_main(captured_main_args*) in /home/master/smarchi/src/binutils-gdb/gdb/main.c:1363
        #25 0x0000000000427406 main in /home/master/smarchi/src/binutils-gdb/gdb/gdb.c:32
        #26 0x00007f40331f80b2 __libc_start_main
        #27 0x00000000004272ed _start
        #28 0xffffffffffffffff
    Could not attach to process 14354 (rc=-2)

 - With "detach-on-fork on" and "follow-fork-mode child", we
   try to attach the child before detaching the parent, which fails.  This
   might be a consequence of the recent follow_fork changes: before that,
   GDB used to detach from the parent before attaching the child.

 - With "detach-on-fork on" and "follow-fork-mode parent", there's not
   problem as we never try to attach to the child.

 - With "detach-on-fork off" and "follow-fork-mode child", we try to attach
   the child while staying attached to the parent, so that fails.

 - With "detach-on-fork off" and "follow-fork-mode parent" (and "set
   schedule-multiple on", to get around the message that GDB prints), same
   thing.

To avoid these vfork-related problems, I propose to not enable / push
the rocm target for vfork children.  A vfork child is only allowed to do
a limited set of thing, presumably to avoid messing up its parent's
address space.  As documented in vfork(2):

    (From POSIX.1) The vfork() function has the same effect as fork(2), except that the be‐
    havior is undefined if the process created by vfork() either modifies  any  data  other
    than  a  variable of type pid_t used to store the return value from vfork(), or returns
    from the function in which vfork() was called, or calls any other function before  suc‐
    cessfully calling _exit(2) or one of the exec(3) family of functions.

Clearly, a vfork child won't run GPU programs, as that would require
doing things that are not allowed at that time.  So we won't miss
anything by not attaching dbgapi to the fork child.  If the program
execs, though, then it might run some GPU programs in the new address
space, so we need to make sure to push the rocm target and attach dbgapi
when we catch the exec.

The following modifications are made:

  - in rocm_enable, don't push / attach if inf->vfork_parent is set.
    That is useful in the case of "detach-on-fork off"
  - in rocm_target_ops::follow_fork, don't call rocm_enable if the fork
    kind is vfork.  This change might seem redundant with the previous
    one, but both are needed to cover all cases.
  - A vfork child will not have the rocm target pushed, so if it
    subsequently execs, rocm_target_ops::follow_exec won't be called.
    To catch that event and push the rocm target at that point,
    implement the inferior_execd observer.

With just these changes, we hit this internal error:

    (gdb) PASS: gdb.base/foll-vfork.exp: exec: vfork child follow, finish after tcatch vfork: continue to vfork
    finish^M
    Run till exit from #0  0x00007ffff7eb325c in vfork () from /lib/x86_64-linux-gnu/libc.so.6^M
    [Attaching after process 7816 vfork to child process 7824]^M
    [New inferior 2 (process 7824)]^M
    /home/master/smarchi/src/binutils-gdb/gdb/rocm-tdep.c:559: internal-error: void async_event_handler_clear(): Assertion `rocm_async_event_handler != nullptr' failed.^M
    A problem internal to GDB has been detected,^M
    further debugging may prove unreliable.^M
    Quit this debugging session? (y or n) FAIL: gdb.base/foll-vfork.exp: exec: vfork child follow, finish after tcatch vfork: finish (GDB internal error)

Because we now don't push the rocm target in vfork child inferiors, we
can end up with two inferiors, with the following targets:

 - inferior 1, rocm target + linux nat target
 - inferior 2, linux nat target

If we call target_async(true) while inferior 2 is the current inferior,
the linux nat target (which is shared by the two inferiors) will enable
its async mode, but the rocm target won't.  If we then call target_wait
with inferior 1 as the current inferior (so end up in
rocm_target_ops::wait), the rocm target thinks it is async, when in
reality it is not, so we end up calling async_event_handler_clear while
rocm_async_event_handler is unset.

I think that the problem is that we can have target stack that are
partially async, with some targets that have received the order to
become async and other targets that haven't.  I don't think this is
specific to this patch.  We could have hit this problem if we had
decided that the the ROCm target would be pushed after a specific
library load, for example.

To fix that, modify target_async such we call target_async on all the
inferiors that have the current process target as their process target.
This ensures that all targets in all target stacks that have the linux
nat target in them (in the example above) get the memo that they should
enable their async mode.

Change-Id: I4d8102c65cc6b1663a46a49b5490c0f2c91f6279
lmoriche pushed a commit that referenced this issue Oct 28, 2021
As documented in bug 28086, test gdb.btrace/enable-new-thread.exp
started failing with commit 0618ae4 ("gdb: optimize
all_matching_threads_iterator"):

    (gdb) record btrace^M
    (gdb) PASS: gdb.btrace/enable-new-thread.exp: record btrace
    break 24^M
    Breakpoint 2 at 0x555555555175: file /home/smarchi/src/binutils-gdb/gdb/testsuite/gdb.btrace/enable-new-thread.c, line 24.^M
    (gdb) continue^M
    Continuing.^M
    /home/smarchi/src/binutils-gdb/gdb/inferior.c:303: internal-error: inferior* find_inferior_pid(process_stratum_target*, int): Assertion `pid != 0' failed.^M
    A problem internal to GDB has been detected,^M
    further debugging may prove unreliable.^M
    Quit this debugging session? (y or n) FAIL: gdb.btrace/enable-new-thread.exp: continue to breakpoint: cont to bp.1 (GDB internal error)

Note that I only see the failure if GDB is compiled without libipt
support.  This is because GDB then makes use BTS instead of PT, so
exercises different code paths.

I think that the commit above just exposed an existing problem.  The
stack trace of the internal error is:

    #8  0x0000561cb81e404e in internal_error (file=0x561cb83aa2f8 "/home/smarchi/src/binutils-gdb/gdb/inferior.c", line=303, fmt=0x561cb83aa099 "%s: Assertion `%s' failed.") at /home/smarchi/src/binutils-gdb/gdbsupport/errors.cc:55
    #9  0x0000561cb7b5c031 in find_inferior_pid (targ=0x561cb8aafb60 <the_amd64_linux_nat_target>, pid=0) at /home/smarchi/src/binutils-gdb/gdb/inferior.c:303
    #10 0x0000561cb7b5c102 in find_inferior_ptid (targ=0x561cb8aafb60 <the_amd64_linux_nat_target>, ptid=...) at /home/smarchi/src/binutils-gdb/gdb/inferior.c:317
    #11 0x0000561cb7f1d1c3 in find_thread_ptid (targ=0x561cb8aafb60 <the_amd64_linux_nat_target>, ptid=...) at /home/smarchi/src/binutils-gdb/gdb/thread.c:487
    #12 0x0000561cb7f1b921 in all_matching_threads_iterator::all_matching_threads_iterator (this=0x7ffc4ee34678, filter_target=0x561cb8aafb60 <the_amd64_linux_nat_target>, filter_ptid=...) at /home/smarchi/src/binutils-gdb/gdb/thread-iter.c:125
    #13 0x0000561cb77bc462 in filtered_iterator<all_matching_threads_iterator, non_exited_thread_filter>::filtered_iterator<process_stratum_target* const&, ptid_t const&> (this=0x7ffc4ee34670) at /home/smarchi/src/binutils-gdb/gdb/../gdbsupport/filtered-iterator.h:42
    #14 0x0000561cb77b97cb in all_non_exited_threads_range::begin (this=0x7ffc4ee34650) at /home/smarchi/src/binutils-gdb/gdb/thread-iter.h:243
    #15 0x0000561cb7d8ba30 in record_btrace_target::record_is_replaying (this=0x561cb8aa6250 <record_btrace_ops>, ptid=...) at /home/smarchi/src/binutils-gdb/gdb/record-btrace.c:1411
    #16 0x0000561cb7d8bb83 in record_btrace_target::xfer_partial (this=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_MEMORY, annex=0x0, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0, offset=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/record-btrace.c:1437
    #17 0x0000561cb7ef73a9 in raw_memory_xfer_partial (ops=0x561cb8aa6250 <record_btrace_ops>, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0, memaddr=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/target.c:1504
    #18 0x0000561cb7ef77da in memory_xfer_partial_1 (ops=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0, memaddr=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/target.c:1635
    #19 0x0000561cb7ef78b5 in memory_xfer_partial (ops=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0, memaddr=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/target.c:1664
    #20 0x0000561cb7ef7ba4 in target_xfer_partial (ops=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, annex=0x0, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0, offset=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/target.c:1721
    #21 0x0000561cb7ef8503 in target_read_partial (ops=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, annex=0x0, buf=0x7ffc4ee34c58 "\260g\343N\374\177", offset=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/target.c:1974
    #22 0x0000561cb7ef861f in target_read (ops=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, annex=0x0, buf=0x7ffc4ee34c58 "\260g\343N\374\177", offset=140737352774277, len=1) at /home/smarchi/src/binutils-gdb/gdb/target.c:2014
    #23 0x0000561cb7ef809f in target_read_code (memaddr=140737352774277, myaddr=0x7ffc4ee34c58 "\260g\343N\374\177", len=1) at /home/smarchi/src/binutils-gdb/gdb/target.c:1869
    #24 0x0000561cb7937f4d in gdb_disassembler::dis_asm_read_memory (memaddr=140737352774277, myaddr=0x7ffc4ee34c58 "\260g\343N\374\177", len=1, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/gdb/disasm.c:139
    #25 0x0000561cb80ab66d in fetch_data (info=0x7ffc4ee34e88, addr=0x7ffc4ee34c59 "g\343N\374\177") at /home/smarchi/src/binutils-gdb/opcodes/i386-dis.c:194
    #26 0x0000561cb80ab7e2 in ckprefix () at /home/smarchi/src/binutils-gdb/opcodes/i386-dis.c:8628
    #27 0x0000561cb80adbd8 in print_insn (pc=140737352774277, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/opcodes/i386-dis.c:9587
    #28 0x0000561cb80abe4f in print_insn_i386 (pc=140737352774277, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/opcodes/i386-dis.c:8894
    #29 0x0000561cb7744a19 in default_print_insn (memaddr=140737352774277, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/gdb/arch-utils.c:1029
    #30 0x0000561cb7b33067 in i386_print_insn (pc=140737352774277, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/gdb/i386-tdep.c:4013
    #31 0x0000561cb7acd8f4 in gdbarch_print_insn (gdbarch=0x561cbae2fb60, vma=140737352774277, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/gdb/gdbarch.c:3478
    #32 0x0000561cb793a32d in gdb_disassembler::print_insn (this=0x7ffc4ee34e80, memaddr=140737352774277, branch_delay_insns=0x0) at /home/smarchi/src/binutils-gdb/gdb/disasm.c:795
    #33 0x0000561cb793a5b0 in gdb_print_insn (gdbarch=0x561cbae2fb60, memaddr=140737352774277, stream=0x561cb8ac99f8 <null_stream>, branch_delay_insns=0x0) at /home/smarchi/src/binutils-gdb/gdb/disasm.c:850
    #34 0x0000561cb793a631 in gdb_insn_length (gdbarch=0x561cbae2fb60, addr=140737352774277) at /home/smarchi/src/binutils-gdb/gdb/disasm.c:859
    #35 0x0000561cb77f53f4 in btrace_compute_ftrace_bts (tp=0x561cbba11210, btrace=0x7ffc4ee35188, gaps=...) at /home/smarchi/src/binutils-gdb/gdb/btrace.c:1107
    #36 0x0000561cb77f55f5 in btrace_compute_ftrace_1 (tp=0x561cbba11210, btrace=0x7ffc4ee35180, cpu=0x0, gaps=...) at /home/smarchi/src/binutils-gdb/gdb/btrace.c:1527
    #37 0x0000561cb77f5705 in btrace_compute_ftrace (tp=0x561cbba11210, btrace=0x7ffc4ee35180, cpu=0x0) at /home/smarchi/src/binutils-gdb/gdb/btrace.c:1560
    #38 0x0000561cb77f583b in btrace_add_pc (tp=0x561cbba11210) at /home/smarchi/src/binutils-gdb/gdb/btrace.c:1589
    #39 0x0000561cb77f5a86 in btrace_enable (tp=0x561cbba11210, conf=0x561cb8ac6878 <record_btrace_conf>) at /home/smarchi/src/binutils-gdb/gdb/btrace.c:1629
    #40 0x0000561cb7d88d26 in record_btrace_enable_warn (tp=0x561cbba11210) at /home/smarchi/src/binutils-gdb/gdb/record-btrace.c:294
    #41 0x0000561cb7c603dc in std::__invoke_impl<void, void (*&)(thread_info*), thread_info*> (__f=@0x561cbb6c4878: 0x561cb7d88cdc <record_btrace_enable_warn(thread_info*)>) at /usr/include/c++/10/bits/invoke.h:60
    #42 0x0000561cb7c5e5a6 in std::__invoke_r<void, void (*&)(thread_info*), thread_info*> (__fn=@0x561cbb6c4878: 0x561cb7d88cdc <record_btrace_enable_warn(thread_info*)>) at /usr/include/c++/10/bits/invoke.h:153
    #43 0x0000561cb7c5dc92 in std::_Function_handler<void (thread_info*), void (*)(thread_info*)>::_M_invoke(std::_Any_data const&, thread_info*&&) (__functor=..., __args#0=@0x7ffc4ee35310: 0x561cbba11210) at /usr/include/c++/10/bits/std_function.h:291
    #44 0x0000561cb7f2600f in std::function<void (thread_info*)>::operator()(thread_info*) const (this=0x561cbb6c4878, __args#0=0x561cbba11210) at /usr/include/c++/10/bits/std_function.h:622
    #45 0x0000561cb7f23dc8 in gdb::observers::observable<thread_info*>::notify (this=0x561cb8ac5aa0 <gdb::observers::new_thread>, args#0=0x561cbba11210) at /home/smarchi/src/binutils-gdb/gdb/../gdbsupport/observable.h:150
    #46 0x0000561cb7f1c436 in add_thread_silent (targ=0x561cb8aafb60 <the_amd64_linux_nat_target>, ptid=...) at /home/smarchi/src/binutils-gdb/gdb/thread.c:263
    #47 0x0000561cb7f1c479 in add_thread_with_info (targ=0x561cb8aafb60 <the_amd64_linux_nat_target>, ptid=..., priv=0x561cbb3f7ab0) at /home/smarchi/src/binutils-gdb/gdb/thread.c:272
    #48 0x0000561cb7bfa1d0 in record_thread (info=0x561cbb0413a0, tp=0x0, ptid=..., th_p=0x7ffc4ee35610, ti_p=0x7ffc4ee35620) at /home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:1380
    #49 0x0000561cb7bf7a2a in thread_from_lwp (stopped=0x561cba81db20, ptid=...) at /home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:429
    #50 0x0000561cb7bf7ac5 in thread_db_notice_clone (parent=..., child=...) at /home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:447
    #51 0x0000561cb7bdc9a2 in linux_handle_extended_wait (lp=0x561cbae25720, status=4991) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:1981
    #52 0x0000561cb7bdf0f3 in linux_nat_filter_event (lwpid=435403, status=198015) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:2920
    #53 0x0000561cb7bdfed6 in linux_nat_wait_1 (ptid=..., ourstatus=0x7ffc4ee36398, target_options=...) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:3202
    #54 0x0000561cb7be0b68 in linux_nat_target::wait (this=0x561cb8aafb60 <the_amd64_linux_nat_target>, ptid=..., ourstatus=0x7ffc4ee36398, target_options=...) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:3440
    #55 0x0000561cb7bfa2fc in thread_db_target::wait (this=0x561cb8a9acd0 <the_thread_db_target>, ptid=..., ourstatus=0x7ffc4ee36398, options=...) at /home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:1412
    #56 0x0000561cb7d8e356 in record_btrace_target::wait (this=0x561cb8aa6250 <record_btrace_ops>, ptid=..., status=0x7ffc4ee36398, options=...) at /home/smarchi/src/binutils-gdb/gdb/record-btrace.c:2547
    #57 0x0000561cb7ef996d in target_wait (ptid=..., status=0x7ffc4ee36398, options=...) at /home/smarchi/src/binutils-gdb/gdb/target.c:2608
    #58 0x0000561cb7b6d297 in do_target_wait_1 (inf=0x561cba6d8780, ptid=..., status=0x7ffc4ee36398, options=...) at /home/smarchi/src/binutils-gdb/gdb/infrun.c:3640
    #59 0x0000561cb7b6d43e in operator() (__closure=0x7ffc4ee36190, inf=0x561cba6d8780) at /home/smarchi/src/binutils-gdb/gdb/infrun.c:3701
    #60 0x0000561cb7b6d7b2 in do_target_wait (ecs=0x7ffc4ee36370, options=...) at /home/smarchi/src/binutils-gdb/gdb/infrun.c:3720
    #61 0x0000561cb7b6e67d in fetch_inferior_event () at /home/smarchi/src/binutils-gdb/gdb/infrun.c:4069
    #62 0x0000561cb7b4659b in inferior_event_handler (event_type=INF_REG_EVENT) at /home/smarchi/src/binutils-gdb/gdb/inf-loop.c:41
    #63 0x0000561cb7be25f7 in handle_target_event (error=0, client_data=0x0) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:4227
    #64 0x0000561cb81e4ee2 in handle_file_event (file_ptr=0x561cbae24e10, ready_mask=1) at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:575
    #65 0x0000561cb81e5490 in gdb_wait_for_event (block=0) at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:701
    #66 0x0000561cb81e41be in gdb_do_one_event () at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:212
    #67 0x0000561cb7c18096 in start_event_loop () at /home/smarchi/src/binutils-gdb/gdb/main.c:421
    #68 0x0000561cb7c181e0 in captured_command_loop () at /home/smarchi/src/binutils-gdb/gdb/main.c:481
    #69 0x0000561cb7c19d7e in captured_main (data=0x7ffc4ee366a0) at /home/smarchi/src/binutils-gdb/gdb/main.c:1353
    #70 0x0000561cb7c19df0 in gdb_main (args=0x7ffc4ee366a0) at /home/smarchi/src/binutils-gdb/gdb/main.c:1368
    #71 0x0000561cb7693186 in main (argc=11, argv=0x7ffc4ee367b8) at /home/smarchi/src/binutils-gdb/gdb/gdb.c:32

At frame 45, the new_thread observable is fired.  At this moment, the
new thread isn't the current thread, inferior_ptid is null_ptid.  I
think this is ok: the new_thread observable doesn't give any guarantee
on the global context when observers are invoked.  Frame 35,
btrace_compute_ftrace_bts, calls gdb_insn_length.  gdb_insn_length
doesn't have a thread_info or other parameter what could indicate where
to read memory from, it implicitly uses the global context
(inferior_ptid).

So we reach the all_non_exited_threads_range in
record_btrace_target::record_is_replaying with a null inferior_ptid.
The previous implemention of all_non_exited_threads_range didn't care,
but the new one does.  The problem of calling gdb_insn_length and
ultimately trying to read memory with a null inferior_ptid already
existed, but the commit mentioned above made it visible.

Something between frames 40 (record_btrace_enable_warn) and 35
(btrace_compute_ftrace_bts) needs to be switching the global context to
make TP the current thread.  Since btrace_compute_ftrace_bts takes the
thread_info to work with as a parameter, that typically means that it
doesn't require its caller to also set the global current context
(current thread) when calling.  If it needs to call other functions
that do require the global current thread to be set, then it needs to
temporarily change the current thread while calling these other
functions.  Therefore, switch and restore the current thread in
btrace_compute_ftrace_bts.

By inspection, it looks like btrace_compute_ftrace_pt may also call
functions sensitive to the global context: it installs the
btrace_pt_readmem_callback callback in the PT instruction decoder.  When
this function gets called, inferior_ptid must be set appropriately.  Add
a switch and restore in there too.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28086
Change-Id: I407fbfe41aab990068bd102491aa3709b0a034b3
lmoriche pushed a commit that referenced this issue Oct 28, 2021
gdb.threads/no-unwaited-for-left.exp is currently tripping on internal
errors:

 FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when thread 2 exits (GDB internal error)
 ...
 FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits (GDB internal error)

The gdb.log shows:

 continue
 Continuing.
 [Thread 0x7ffff77c2700 (LWP 103640) exited]
 No unwaited-for children left.
 /src/rocm-gdb/gdb/thread.c:156: internal-error: thread_info* inferior_thread(): Assertion `current_thread_ != nullptr' failed.
 A problem internal to GDB has been detected,
 further debugging may prove unreliable.
 FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when thread 2 exits (GDB internal error)
 Resyncing due to internal error.
 Quit this debugging session? (y or n) n

Backtrace shows:

 (top-gdb) bt
 #0  internal_error (file=0xb55d4a605 <error: Cannot access memory at address 0xb55d4a605>, line=0, fmt=0x5555565e4240 "en_US.UTF-8") at ../../src/gdbsupport/errors.cc:51
 #1  0x0000555555d47c07 in inferior_thread () at ../../src/gdb/thread.c:156
 #2  0x0000555555c517db in rocm_target_normal_stop (bs_list=0x0, print_frame=0) at ../../src/gdb/rocm-tdep.c:2194
 #3  0x00005555556afff1 in std::_Function_handler<void (bpstats*, int), void (*)(bpstats*, int)>::_M_invoke(std::_Any_data const&, bpstats*&&, int&&) (__functor=..., __args
 #0=@0x7fffffffcc90: 0x0, __args#1=@0x7fffffffcc8c: 0) at /usr/include/c++/9/bits/std_function.h:300
 #4  0x0000555555a78720 in std::function<void (bpstats*, int)>::operator()(bpstats*, int) const (this=0x5555566ab388, __args#0=0x0, __args#1=0) at /usr/include/c++/9/bits/s
 td_function.h:688
 #5  0x0000555555a7767e in gdb::observers::observable<bpstats*, int>::notify (this=0x555556577380 <gdb::observers::normal_stop>, args#0=0x0, args#1=0) at ../../src/gdb/../g
 dbsupport/observable.h:150
 #6  0x0000555555a72477 in normal_stop () at ../../src/gdb/infrun.c:8673
 #7  0x0000555555a66de5 in fetch_inferior_event () at ../../src/gdb/infrun.c:4118
 #8  0x0000555555a46b7b in inferior_event_handler (event_type=INF_REG_EVENT) at ../../src/gdb/inf-loop.c:41

The problem is that rocm_target_normal_stop is assuming that
normal_stop is called with a thread selected, here:

  if (!ptid_is_gpu (inferior_thread ()->ptid))

That's an incorrect assumption.  In this case, we have a
TARGET_WAITKIND_NO_RESUMED event, so there's no thread selected.

I could see several different predicates we could use to fix this:

 #1 - get the last target event using get_last_target_status, and
      return early if we stopped for an event of kind other than
      TARGET_WAITKIND_STOPPED.

 #2 - bail out early if inferior_ptid is null_ptid.

 #3 - bail out early if the passed in BS_LIST is NULL.  Further below
      in the function we iterate over all breakpoints in BS_LIST
      looking for a watchpoint.  So if the list is empty, we know we
      won't print a warning.

 #4 - bail out early if !PRINT_FRAME.  If we've stopped for a
      watchpoint, then GDB prints the frame/location, which is what we
      warn about.

This commit implements #3 and #4.  #3, for being the simplest and
seems like a minor optimization even if we have stopped for
TARGET_WAITKIND_STOPPED anyhow.  #4, just because it would likely be
weird to warn about the stop location being off when we don't print
it.

Bug: https://ontrack-internal.amd.com/browse/SWDEV-301562
Change-Id: Ib126bab8e0eb5325800fa7dd341169b2b8f7af0d
lmoriche pushed a commit that referenced this issue Oct 28, 2021
This comment [1] on bug SWDEV-294225 shows a case where backtrace fails
and a Python error is shown:

    $ ./gdb -ex "set pag off" -q -nx --data-directory=data-directory deep -ex "b 7" -ex r -ex c
    Reading symbols from deep...
    Breakpoint 1 at 0x219a8f: file deep.cpp, line 25.
    Starting program: /home/master/smarchi/build/binutils-gdb/gdb/deep
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

    Breakpoint 1, main () at deep.cpp:25
    25          hipLaunchKernelGGL(HIP_KERNEL_NAME(hip_deep), dim3(1), dim3(1), 0, 0);
    Continuing.
    [New Thread 0x7ffff6350700 (LWP 43023)]
    [New Thread 0x7ffff53ff700 (LWP 43024)]
    [Thread 0x7ffff53ff700 (LWP 43024) exited]
    [New Thread 0x7ffff5b4f700 (LWP 43025)]
    [New Thread 0x7ffff5951700 (LWP 43026)]
    Hello Device
    [Switching to thread 6, lane 0 (AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0])]

    Thread 6 "deep" hit Breakpoint 1, with lane 0, base_case () at deep.cpp:7
    7           return;
    (gdb) bt
    Python Exception <class 'gdb.error'>: access outside bounds of object referenced via synthetic pointer
    #0  base_case () at deep.cpp:7

Python isn't the culprit here, it's just that we fail to compute frame
0's id while trying to apply frame filters, so the Python code catches
and exception and prints that error.  The failure to compute the frame
id is the reason for the backtrace stopping early.

The point where an exception is thrown is:

    #4  0x000056459263d322 in error (fmt=0x5645926f20f8 "access outside bounds of object referenced via synthetic pointer") at /home/master/smarchi/src/binutils-gdb/gdbsupport/errors.cc:43
    #5  0x0000564591fc0627 in invalid_synthetic_pointer () at /home/master/smarchi/src/binutils-gdb/gdb/dwarf2/loc.c:99
    #6  0x0000564591f758bd in dwarf_composite::to_gdb_value (this=0x5645955cc360, frame=0x564594c7c480, type=0x5645957b7a10, subobj_type=0x5645957b7a10, subobj_offset=0) at /home/master/smarchi/src/binutils-gdb/gdb/dwarf2/expr.c:2067
    #7  0x0000564591f774ef in dwarf_expr_context::fetch_result (this=0x7ffdf74c0880, type=0x5645957b7a10, subobj_type=0x5645957b7a10, subobj_offset=0, as_lval=true) at /home/master/smarchi/src/binutils-gdb/gdb/dwarf2/expr.c:2785
    #8  0x0000564591f77689 in dwarf_expr_context::evaluate (this=0x7ffdf74c0880, addr=0x564595d4172a "\220\250\024\026\344\200\004\346\021\224\b\354 @b\020\251\024\016\220\251\024\026\344\200\002\346\021\224\b\354 @b\020\252\024\r\220\252\024", <incomplete sequence \344>, len=14, as_lval=true, per_cu=0x0,
        frame=0x564594c7c480, init_values=0x7ffdf74c0990, addr_info=0x0, type=0x0, subobj_type=0x0, subobj_offset=0) at /home/master/smarchi/src/binutils-gdb/gdb/dwarf2/expr.c:2806
    #9  0x0000564591f7dff2 in dwarf2_evaluate (addr=0x564595d4172a "\220\250\024\026\344\200\004\346\021\224\b\354 @b\020\251\024\016\220\251\024\026\344\200\002\346\021\224\b\354 @b\020\252\024\r\220\252\024", <incomplete sequence \344>, len=14, as_lval=true, per_objfile=0x56459573d730, per_cu=0x0,
        frame=0x564594c7c480, addr_size=8, init_values=0x7ffdf74c0990, addr_info=0x0, type=0x0, subobj_type=0x0, subobj_offset=0) at /home/master/smarchi/src/binutils-gdb/gdb/dwarf2/expr.c:4342
    #10 0x0000564591f95960 in execute_stack_op (exp=0x564595d4172a "\220\250\024\026\344\200\004\346\021\224\b\354 @b\020\251\024\016\220\251\024\026\344\200\002\346\021\224\b\354 @b\020\252\024\r\220\252\024", <incomplete sequence \344>, len=14, addr_size=8, this_frame=0x564594c7c480, initial=432345564227578880,
        initial_in_stack_memory=1, per_objfile=0x56459573d730, type=0x564595804b60, as_lval=true) at /home/master/smarchi/src/binutils-gdb/gdb/dwarf2/frame.c:257
    #11 0x0000564591f98619 in dwarf2_frame_prev_register (this_frame=0x564594c7c480, this_cache=0x564594c7c498, regnum=40) at /home/master/smarchi/src/binutils-gdb/gdb/dwarf2/frame.c:1233
    #12 0x000056459207ff4d in frame_unwind_register_value (next_frame=0x564594c7c480, regnum=40) at /home/master/smarchi/src/binutils-gdb/gdb/frame.c:1234
    #13 0x000056459207f9f0 in frame_register_unwind (next_frame=0x564594c7c480, regnum=40, optimizedp=0x7ffdf74c0e00, unavailablep=0x7ffdf74c0e04, lvalp=0x7ffdf74c0c6c, addrp=0x7ffdf74c0c80, realnump=0x7ffdf74c0c70, bufferp=0x5645957be9b0 "\200ꏕEV") at /home/master/smarchi/src/binutils-gdb/gdb/frame.c:1144
    #14 0x000056459207fcdd in frame_register (frame=0x564594c21310, regnum=40, optimizedp=0x7ffdf74c0e00, unavailablep=0x7ffdf74c0e04, lvalp=0x7ffdf74c0c6c, addrp=0x7ffdf74c0c80, realnump=0x7ffdf74c0c70, bufferp=0x5645957be9b0 "\200ꏕEV") at /home/master/smarchi/src/binutils-gdb/gdb/frame.c:1187
    #15 0x0000564591f723a7 in read_from_register (frame=0x564594c21310, regnum=40, offset=8, buf=..., optimized=0x7ffdf74c0e00, unavailable=0x7ffdf74c0e04) at /home/master/smarchi/src/binutils-gdb/gdb/dwarf2/expr.c:137
    #16 0x0000564591f73e5f in dwarf_register::read (this=0x56459577a0c0, frame=0x564594c21310, buf=0x56459562e470 "", buf_bit_offset=0, bit_size=32, bits_to_skip=0, location_bit_limit=32, big_endian=false, optimized=0x7ffdf74c0e00, unavailable=0x7ffdf74c0e04)
        at /home/master/smarchi/src/binutils-gdb/gdb/dwarf2/expr.c:1242
    #17 0x0000564591f72ead in dwarf_location::write_to_gdb_value (this=0x56459577a0c0, frame=0x564594c21310, value=0x56459562b4d0, value_bit_offset=0, bits_to_skip=0, bit_size=32, location_bit_limit=32) at /home/master/smarchi/src/binutils-gdb/gdb/dwarf2/expr.c:704
    #18 0x0000564591f7551f in dwarf_composite::write_to_gdb_value (this=0x56459574b4a0, frame=0x564594c21310, value=0x56459562b4d0, value_bit_offset=0, bits_to_skip=0, bit_size=64, location_bit_limit=0) at /home/master/smarchi/src/binutils-gdb/gdb/dwarf2/expr.c:1968
    #19 0x0000564591f76234 in rw_closure_value (v=0x56459562b4d0, from=0x0) at /home/master/smarchi/src/binutils-gdb/gdb/dwarf2/expr.c:2248
    #20 0x0000564591f762f0 in read_closure_value (v=0x56459562b4d0) at /home/master/smarchi/src/binutils-gdb/gdb/dwarf2/expr.c:2261
    #21 0x00005645924c8952 in value_fetch_lazy (val=0x56459562b4d0) at /home/master/smarchi/src/binutils-gdb/gdb/value.c:4045
    #22 0x00005645924c2dde in value_optimized_out (value=0x56459562b4d0) at /home/master/smarchi/src/binutils-gdb/gdb/value.c:1431
    #23 0x00005645920804d4 in frame_unwind_register_unsigned (next_frame=0x564594c21310, regnum=624) at /home/master/smarchi/src/binutils-gdb/gdb/frame.c:1331
    #24 0x000056459207c87b in default_unwind_pc (gdbarch=0x5645957bec90, next_frame=0x564594c21310) at /home/master/smarchi/src/binutils-gdb/gdb/frame-unwind.c:240
    #25 0x000056459209bad3 in gdbarch_unwind_pc (gdbarch=0x5645957bec90, next_frame=0x564594c21310) at /home/master/smarchi/src/binutils-gdb/gdb/gdbarch.c:3364

The context described by this backtrace is:

 - We are trying to know the value of the PC register (regno 624) in
   frame 1
 - That value is saved in register v40 (regno 40, a 2048 bit value) in frame 0
 - The failure happens because when reading register v40, the expected
   type is 8 byte (64 bits) long

The cause is that execute_stack_op at frame 10 misses passing down the
expected type it receives as a parameter.  dwarf2_frame_prev_register at
frame 11 passes down the appropriate type for register v40 (a 2048 bit
long type), but this is lost by execute_stack_op.  Further down, when
time comes to convert to a GDB value, the expected type is nullptr, so
the default address type is used (frame 7, fetch_result).  This is where
the 64-bit type comes from.

[1] https://ontrack-internal.amd.com/browse/SWDEV-294225?focusedCommentId=7837574&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-7837574

Bug: https://ontrack-internal.amd.com/browse/SWDEV-294225
Change-Id: I8c7bcbd392b3e0ae839c00749c36e88066cb7f4c
lmoriche pushed a commit that referenced this issue Oct 28, 2021
When loading a file using the file command on macOS, we get:

    $ ./gdb -nx --data-directory=data-directory -q -ex "file ./test"
    Reading symbols from ./test...
    Reading symbols from /Users/smarchi/build/binutils-gdb/gdb/test.dSYM/Contents/Resources/DWARF/test...
    /Users/smarchi/src/binutils-gdb/gdb/thread.c:72: internal-error: struct thread_info *inferior_thread(): Assertion `current_thread_ != nullptr' failed.
    A problem internal to GDB has been detected,
    further debugging may prove unreliable.
    Quit this debugging session? (y or n)

The backtrace is:

    * frame #0: 0x0000000101fcb826 gdb`internal_error(file="/Users/smarchi/src/binutils-gdb/gdb/thread.c", line=72, fmt="%s: Assertion `%s' failed.") at errors.cc:52:3
      frame #1: 0x00000001018a2584 gdb`inferior_thread() at thread.c:72:3
      frame #2: 0x0000000101469c09 gdb`get_current_regcache() at regcache.c:421:31
      frame #3: 0x00000001015f9812 gdb`darwin_solib_get_all_image_info_addr_at_init(info=0x0000603000006d00) at solib-darwin.c:464:34
      frame #4: 0x00000001015f7a04 gdb`darwin_solib_create_inferior_hook(from_tty=1) at solib-darwin.c:515:5
      frame #5: 0x000000010161205e gdb`solib_create_inferior_hook(from_tty=1) at solib.c:1200:3
      frame #6: 0x00000001016d8f76 gdb`symbol_file_command(args="./test", from_tty=1) at symfile.c:1650:7
      frame #7: 0x0000000100abab17 gdb`file_command(arg="./test", from_tty=1) at exec.c:555:3
      frame #8: 0x00000001004dc799 gdb`do_const_cfunc(c=0x000061100000c340, args="./test", from_tty=1) at cli-decode.c:102:3
      frame #9: 0x00000001004ea042 gdb`cmd_func(cmd=0x000061100000c340, args="./test", from_tty=1) at cli-decode.c:2160:7
      frame #10: 0x00000001018d4f59 gdb`execute_command(p="t", from_tty=1) at top.c:674:2
      frame #11: 0x0000000100eee430 gdb`catch_command_errors(command=(gdb`execute_command(char const*, int) at top.c:561), arg="file ./test", from_tty=1, do_bp_actions=true)(char const*, int), char const*, int, bool) at main.c:523:7
      frame #12: 0x0000000100eee902 gdb`execute_cmdargs(cmdarg_vec=0x00007ffeefbfeba0 size=1, file_type=CMDARG_FILE, cmd_type=CMDARG_COMMAND, ret=0x00007ffeefbfec20) at main.c:618:9
      frame #13: 0x0000000100eed3a4 gdb`captured_main_1(context=0x00007ffeefbff780) at main.c:1322:3
      frame #14: 0x0000000100ee810d gdb`captured_main(data=0x00007ffeefbff780) at main.c:1343:3
      frame #15: 0x0000000100ee8025 gdb`gdb_main(args=0x00007ffeefbff780) at main.c:1368:7
      frame #16: 0x00000001000044f1 gdb`main(argc=6, argv=0x00007ffeefbff8a0) at gdb.c:32:10
      frame #17: 0x00007fff20558f5d libdyld.dylib`start + 1

The solib_create_inferior_hook call in symbol_file_command was added by
commit ea142fb ("Fix breakpoints on file reloads for PIE
binaries").  It causes solib_create_inferior_hook to be called while
the inferior is not running, which darwin_solib_create_inferior_hook
does not expect.  darwin_solib_get_all_image_info_addr_at_init, in
particular, assumes that there is a current thread, as it tries to get
the current thread's regcache.

Fix it by adding a target_has_execution check and returning early.  Note
that there is a similar check in svr4_solib_create_inferior_hook.

gdb/ChangeLog:

	* solib-darwin.c (darwin_solib_create_inferior_hook): Return
	early if no execution.

Change-Id: Ia11dd983a1e29786e5ce663d0fcaa6846dc611bb
lmoriche pushed a commit that referenced this issue Oct 28, 2021
Commit 408f668 ("detach in all-stop
with threads running") regressed "detach" with "target remote":

 (gdb) detach
 Detaching from program: target:/any/program, process 3671843
 Detaching from process 3671843
 Ending remote debugging.
 [Inferior 1 (process 3671843) detached]
 In main
 terminate called after throwing an instance of 'gdb_exception_error'
 Aborted (core dumped)

Here's the exception above being thrown:

 (top-gdb) bt
 #0  throw_error (error=TARGET_CLOSE_ERROR, fmt=0x555556035588 "Remote connection closed") at src/gdbsupport/common-exceptions.cc:222
 #1  0x0000555555bbaa46 in remote_target::readchar (this=0x555556a11040, timeout=10000) at src/gdb/remote.c:9440
 #2  0x0000555555bbb9e5 in remote_target::getpkt_or_notif_sane_1 (this=0x555556a11040, buf=0x555556a11058, forever=0, expecting_notif=0, is_notif=0x0) at src/gdb/remote.c:9928
 #3  0x0000555555bbbda9 in remote_target::getpkt_sane (this=0x555556a11040, buf=0x555556a11058, forever=0) at src/gdb/remote.c:10030
 #4  0x0000555555bc0e75 in remote_target::remote_hostio_send_command (this=0x555556a11040, command_bytes=13, which_packet=14, remote_errno=0x7fffffffcfd0, attachment=0x0, attachment_len=0x0) at src/gdb/remote.c:12137
 #5  0x0000555555bc1b6c in remote_target::remote_hostio_close (this=0x555556a11040, fd=8, remote_errno=0x7fffffffcfd0) at src/gdb/remote.c:12455
 #6  0x0000555555bc1bb4 in remote_target::fileio_close (During symbol reading: .debug_line address at offset 0x64f417 is 0 [in module build/gdb/gdb]
 this=0x555556a11040, fd=8, remote_errno=0x7fffffffcfd0) at src/gdb/remote.c:12462
 #7  0x0000555555c9274c in target_fileio_close (fd=3, target_errno=0x7fffffffcfd0) at src/gdb/target.c:3365
 #8  0x000055555595a19d in gdb_bfd_iovec_fileio_close (abfd=0x555556b9f8a0, stream=0x555556b11530) at src/gdb/gdb_bfd.c:439
 #9  0x0000555555e09e3f in opncls_bclose (abfd=0x555556b9f8a0) at src/bfd/opncls.c:599
 #10 0x0000555555e0a2c7 in bfd_close_all_done (abfd=0x555556b9f8a0) at src/bfd/opncls.c:847
 #11 0x0000555555e0a27a in bfd_close (abfd=0x555556b9f8a0) at src/bfd/opncls.c:814
 #12 0x000055555595a9d3 in gdb_bfd_close_or_warn (abfd=0x555556b9f8a0) at src/gdb/gdb_bfd.c:626
 #13 0x000055555595ad29 in gdb_bfd_unref (abfd=0x555556b9f8a0) at src/gdb/gdb_bfd.c:715
 #14 0x0000555555ae4730 in objfile::~objfile (this=0x555556515540, __in_chrg=<optimized out>) at src/gdb/objfiles.c:573
 #15 0x0000555555ae955a in std::_Sp_counted_ptr<objfile*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x555556c20db0) at /usr/include/c++/9/bits/shared_ptr_base.h:377
 #16 0x000055555572b7c8 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x555556c20db0) at /usr/include/c++/9/bits/shared_ptr_base.h:155
 #17 0x00005555557263c3 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x555556bf0588, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr_base.h:730
 #18 0x0000555555ae745e in std::__shared_ptr<objfile, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x555556bf0580, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr_base.h:1169
 #19 0x0000555555ae747e in std::shared_ptr<objfile>::~shared_ptr (this=0x555556bf0580, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr.h:103
 #20 0x0000555555b1c1dc in __gnu_cxx::new_allocator<std::_List_node<std::shared_ptr<objfile> > >::destroy<std::shared_ptr<objfile> > (this=0x5555564cdd60, __p=0x555556bf0580) at /usr/include/c++/9/ext/new_allocator.h:153
 #21 0x0000555555b1bb1d in std::allocator_traits<std::allocator<std::_List_node<std::shared_ptr<objfile> > > >::destroy<std::shared_ptr<objfile> > (__a=..., __p=0x555556bf0580) at /usr/include/c++/9/bits/alloc_traits.h:497
 #22 0x0000555555b1b73e in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::_M_erase (this=0x5555564cdd60, __position=std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x555556515540}) at /usr/include/c++/9/bits/stl_list.h:1921
 #23 0x0000555555b1afeb in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::erase (this=0x5555564cdd60, __position=std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x555556515540}) at /usr/include/c++/9/bits/list.tcc:158
 #24 0x0000555555b19576 in program_space::remove_objfile (this=0x5555564cdd20, objfile=0x555556515540) at src/gdb/progspace.c:210
 #25 0x0000555555ae4502 in objfile::unlink (this=0x555556515540) at src/gdb/objfiles.c:487
 #26 0x0000555555ae5a12 in objfile_purge_solibs () at src/gdb/objfiles.c:875
 #27 0x0000555555c09686 in no_shared_libraries (ignored=0x0, from_tty=1) at src/gdb/solib.c:1236
 #28 0x00005555559e3f5f in detach_command (args=0x0, from_tty=1) at src/gdb/infcmd.c:2769

So frame #28 already detached the remote process, and then we're
purging the shared libraries.  GDB had opened remote shared libraries
via the target: sysroot, so it tries closing them.  GDBserver is
tearing down already, so remote communication breaks down and we close
the remote target and throw TARGET_CLOSE_ERROR.

Note frame #14:

 #14 0x0000555555ae4730 in objfile::~objfile (this=0x555556515540, __in_chrg=<optimized out>) at src/gdb/objfiles.c:573

That's a dtor, thus noexcept.  That's the reason for the
std::terminate.

Stepping back a bit, why do we still have open remote files if we've
managed to detach already, and, we're debugging with "target remote"?
The reason is that commit 408f668
makes detach_command hold a reference to the target, so the remote
target won't be finally closed until frame #28 returns.  It's closing
the target that invalidates target file I/O handles.

This commit fixes the issue by not relying on target_close to
invalidate the target file I/O handles, instead invalidate them
immediately in remote_unpush_target.  So when GDB purges the solibs,
and we end up in target_fileio_close (frame #7 above), there's nothing
to do, and we don't try to talk with the remote target anymore.

The regression isn't seen when testing with
--target_board=native-gdbserver, because that does "set sysroot" to
disable the "target:" sysroot, for test run speed reasons.  So this
commit adds a testcase that explicitly tests detach with "set sysroot
target:".

gdb/ChangeLog:
yyyy-mm-dd  Pedro Alves  <[email protected]>

	PR gdb/28080
	* remote.c (remote_unpush_target): Invalidate file I/O target
	handles.
	* target.c (fileio_handles_invalidate_target): Make extern.
	* target.h (fileio_handles_invalidate_target): Declare.

gdb/testsuite/ChangeLog:
yyyy-mm-dd  Pedro Alves  <[email protected]>

	PR gdb/28080
	* gdb.base/detach-sysroot-target.exp: New.
	* gdb.base/detach-sysroot-target.c: New.

Reported-By: Jonah Graham <[email protected]>

Change-Id: I851234910172f42a1b30e731161376c344d2727d
lmoriche pushed a commit that referenced this issue Oct 28, 2021
…080)

Before PR gdb/28080 was fixed by the previous patch, GDB was crashing
like this:

 (gdb) detach
 Detaching from program: target:/any/program, process 3671843
 Detaching from process 3671843
 Ending remote debugging.
 [Inferior 1 (process 3671843) detached]
 In main
 terminate called after throwing an instance of 'gdb_exception_error'
 Aborted (core dumped)

Here's the exception above being thrown:

 (top-gdb) bt
 #0  throw_error (error=TARGET_CLOSE_ERROR, fmt=0x555556035588 "Remote connection closed") at src/gdbsupport/common-exceptions.cc:222
 #1  0x0000555555bbaa46 in remote_target::readchar (this=0x555556a11040, timeout=10000) at src/gdb/remote.c:9440
 #2  0x0000555555bbb9e5 in remote_target::getpkt_or_notif_sane_1 (this=0x555556a11040, buf=0x555556a11058, forever=0, expecting_notif=0, is_notif=0x0) at src/gdb/remote.c:9928
 #3  0x0000555555bbbda9 in remote_target::getpkt_sane (this=0x555556a11040, buf=0x555556a11058, forever=0) at src/gdb/remote.c:10030
 #4  0x0000555555bc0e75 in remote_target::remote_hostio_send_command (this=0x555556a11040, command_bytes=13, which_packet=14, remote_errno=0x7fffffffcfd0, attachment=0x0, attachment_len=0x0) at src/gdb/remote.c:12137
 #5  0x0000555555bc1b6c in remote_target::remote_hostio_close (this=0x555556a11040, fd=8, remote_errno=0x7fffffffcfd0) at src/gdb/remote.c:12455
 #6  0x0000555555bc1bb4 in remote_target::fileio_close (During symbol reading: .debug_line address at offset 0x64f417 is 0 [in module build/gdb/gdb]
 this=0x555556a11040, fd=8, remote_errno=0x7fffffffcfd0) at src/gdb/remote.c:12462
 #7  0x0000555555c9274c in target_fileio_close (fd=3, target_errno=0x7fffffffcfd0) at src/gdb/target.c:3365
 #8  0x000055555595a19d in gdb_bfd_iovec_fileio_close (abfd=0x555556b9f8a0, stream=0x555556b11530) at src/gdb/gdb_bfd.c:439
 #9  0x0000555555e09e3f in opncls_bclose (abfd=0x555556b9f8a0) at src/bfd/opncls.c:599
 #10 0x0000555555e0a2c7 in bfd_close_all_done (abfd=0x555556b9f8a0) at src/bfd/opncls.c:847
 #11 0x0000555555e0a27a in bfd_close (abfd=0x555556b9f8a0) at src/bfd/opncls.c:814
 #12 0x000055555595a9d3 in gdb_bfd_close_or_warn (abfd=0x555556b9f8a0) at src/gdb/gdb_bfd.c:626
 #13 0x000055555595ad29 in gdb_bfd_unref (abfd=0x555556b9f8a0) at src/gdb/gdb_bfd.c:715
 #14 0x0000555555ae4730 in objfile::~objfile (this=0x555556515540, __in_chrg=<optimized out>) at src/gdb/objfiles.c:573
 #15 0x0000555555ae955a in std::_Sp_counted_ptr<objfile*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x555556c20db0) at /usr/include/c++/9/bits/shared_ptr_base.h:377
 #16 0x000055555572b7c8 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x555556c20db0) at /usr/include/c++/9/bits/shared_ptr_base.h:155
 #17 0x00005555557263c3 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x555556bf0588, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr_base.h:730
 #18 0x0000555555ae745e in std::__shared_ptr<objfile, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x555556bf0580, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr_base.h:1169
 #19 0x0000555555ae747e in std::shared_ptr<objfile>::~shared_ptr (this=0x555556bf0580, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr.h:103
 #20 0x0000555555b1c1dc in __gnu_cxx::new_allocator<std::_List_node<std::shared_ptr<objfile> > >::destroy<std::shared_ptr<objfile> > (this=0x5555564cdd60, __p=0x555556bf0580) at /usr/include/c++/9/ext/new_allocator.h:153
 #21 0x0000555555b1bb1d in std::allocator_traits<std::allocator<std::_List_node<std::shared_ptr<objfile> > > >::destroy<std::shared_ptr<objfile> > (__a=..., __p=0x555556bf0580) at /usr/include/c++/9/bits/alloc_traits.h:497
 #22 0x0000555555b1b73e in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::_M_erase (this=0x5555564cdd60, __position=std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x555556515540}) at /usr/include/c++/9/bits/stl_list.h:1921
 #23 0x0000555555b1afeb in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::erase (this=0x5555564cdd60, __position=std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x555556515540}) at /usr/include/c++/9/bits/list.tcc:158
 #24 0x0000555555b19576 in program_space::remove_objfile (this=0x5555564cdd20, objfile=0x555556515540) at src/gdb/progspace.c:210
 #25 0x0000555555ae4502 in objfile::unlink (this=0x555556515540) at src/gdb/objfiles.c:487
 #26 0x0000555555ae5a12 in objfile_purge_solibs () at src/gdb/objfiles.c:875
 #27 0x0000555555c09686 in no_shared_libraries (ignored=0x0, from_tty=1) at src/gdb/solib.c:1236
 #28 0x00005555559e3f5f in detach_command (args=0x0, from_tty=1) at src/gdb/infcmd.c:2769

Note frame #14:

 #14 0x0000555555ae4730 in objfile::~objfile (this=0x555556515540, __in_chrg=<optimized out>) at src/gdb/objfiles.c:573

That's a dtor, thus noexcept.  That's the reason for the
std::terminate.

The previous patch fixed things such that the exception above isn't
thrown anymore.  However, it's possible that e.g., the remote
connection drops just while a user types "nosharedlibrary", or some
other reason that leads to objfile::~objfile, and then we end up the
same std::terminate problem.

Also notice that frames #9-#11 are BFD frames:

 #9  0x0000555555e09e3f in opncls_bclose (abfd=0x555556bc27e0) at src/bfd/opncls.c:599
 #10 0x0000555555e0a2c7 in bfd_close_all_done (abfd=0x555556bc27e0) at src/bfd/opncls.c:847
 #11 0x0000555555e0a27a in bfd_close (abfd=0x555556bc27e0) at src/bfd/opncls.c:814

BFD is written in C and thus throwing exceptions over such frames may
either not clean up properly, or, may abort if bfd is not compiled
with -fasynchronous-unwind-tables (x86-64 defaults that on, but not
all GCC ports do).

Thus frame #8 seems like a good place to swallow exceptions.  More so
since in this spot we already ignore target_fileio_close return
errors.  That's what this commit does.  Without the previous fix, we'd
see:

 (gdb) detach
 Detaching from program: target:/any/program, process 2197701
 Ending remote debugging.
 [Inferior 1 (process 2197701) detached]
 warning: cannot close "target:/lib64/ld-linux-x86-64.so.2": Remote connection closed

Note it prints a warning, which would still be a regression compared
to GDB 10, if it weren't for the previous fix.

gdb/ChangeLog:
yyyy-mm-dd  Pedro Alves  <[email protected]>

	PR gdb/28080
	* gdb_bfd.c (gdb_bfd_close_warning): New.
	(gdb_bfd_iovec_fileio_close): Wrap target_fileio_close in
	try/catch and print warning on exception.
	(gdb_bfd_close_or_warn): Use gdb_bfd_close_warning.

Change-Id: Ic7a26ddba0a4444e3377b0e7c1c89934a84545d7
lmoriche pushed a commit that referenced this issue Oct 28, 2021
As documented in bug 28086, test gdb.btrace/enable-new-thread.exp
started failing with commit 0618ae4 ("gdb: optimize
all_matching_threads_iterator"):

    (gdb) record btrace^M
    (gdb) PASS: gdb.btrace/enable-new-thread.exp: record btrace
    break 24^M
    Breakpoint 2 at 0x555555555175: file /home/smarchi/src/binutils-gdb/gdb/testsuite/gdb.btrace/enable-new-thread.c, line 24.^M
    (gdb) continue^M
    Continuing.^M
    /home/smarchi/src/binutils-gdb/gdb/inferior.c:303: internal-error: inferior* find_inferior_pid(process_stratum_target*, int): Assertion `pid != 0' failed.^M
    A problem internal to GDB has been detected,^M
    further debugging may prove unreliable.^M
    Quit this debugging session? (y or n) FAIL: gdb.btrace/enable-new-thread.exp: continue to breakpoint: cont to bp.1 (GDB internal error)

Note that I only see the failure if GDB is compiled without libipt
support.  This is because GDB then makes use BTS instead of PT, so
exercises different code paths.

I think that the commit above just exposed an existing problem.  The
stack trace of the internal error is:

    #8  0x0000561cb81e404e in internal_error (file=0x561cb83aa2f8 "/home/smarchi/src/binutils-gdb/gdb/inferior.c", line=303, fmt=0x561cb83aa099 "%s: Assertion `%s' failed.") at /home/smarchi/src/binutils-gdb/gdbsupport/errors.cc:55
    #9  0x0000561cb7b5c031 in find_inferior_pid (targ=0x561cb8aafb60 <the_amd64_linux_nat_target>, pid=0) at /home/smarchi/src/binutils-gdb/gdb/inferior.c:303
    #10 0x0000561cb7b5c102 in find_inferior_ptid (targ=0x561cb8aafb60 <the_amd64_linux_nat_target>, ptid=...) at /home/smarchi/src/binutils-gdb/gdb/inferior.c:317
    #11 0x0000561cb7f1d1c3 in find_thread_ptid (targ=0x561cb8aafb60 <the_amd64_linux_nat_target>, ptid=...) at /home/smarchi/src/binutils-gdb/gdb/thread.c:487
    #12 0x0000561cb7f1b921 in all_matching_threads_iterator::all_matching_threads_iterator (this=0x7ffc4ee34678, filter_target=0x561cb8aafb60 <the_amd64_linux_nat_target>, filter_ptid=...) at /home/smarchi/src/binutils-gdb/gdb/thread-iter.c:125
    #13 0x0000561cb77bc462 in filtered_iterator<all_matching_threads_iterator, non_exited_thread_filter>::filtered_iterator<process_stratum_target* const&, ptid_t const&> (this=0x7ffc4ee34670) at /home/smarchi/src/binutils-gdb/gdb/../gdbsupport/filtered-iterator.h:42
    #14 0x0000561cb77b97cb in all_non_exited_threads_range::begin (this=0x7ffc4ee34650) at /home/smarchi/src/binutils-gdb/gdb/thread-iter.h:243
    #15 0x0000561cb7d8ba30 in record_btrace_target::record_is_replaying (this=0x561cb8aa6250 <record_btrace_ops>, ptid=...) at /home/smarchi/src/binutils-gdb/gdb/record-btrace.c:1411
    #16 0x0000561cb7d8bb83 in record_btrace_target::xfer_partial (this=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_MEMORY, annex=0x0, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0, offset=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/record-btrace.c:1437
    #17 0x0000561cb7ef73a9 in raw_memory_xfer_partial (ops=0x561cb8aa6250 <record_btrace_ops>, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0, memaddr=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/target.c:1504
    #18 0x0000561cb7ef77da in memory_xfer_partial_1 (ops=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0, memaddr=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/target.c:1635
    #19 0x0000561cb7ef78b5 in memory_xfer_partial (ops=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0, memaddr=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/target.c:1664
    #20 0x0000561cb7ef7ba4 in target_xfer_partial (ops=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, annex=0x0, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0, offset=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/target.c:1721
    #21 0x0000561cb7ef8503 in target_read_partial (ops=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, annex=0x0, buf=0x7ffc4ee34c58 "\260g\343N\374\177", offset=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/target.c:1974
    #22 0x0000561cb7ef861f in target_read (ops=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, annex=0x0, buf=0x7ffc4ee34c58 "\260g\343N\374\177", offset=140737352774277, len=1) at /home/smarchi/src/binutils-gdb/gdb/target.c:2014
    #23 0x0000561cb7ef809f in target_read_code (memaddr=140737352774277, myaddr=0x7ffc4ee34c58 "\260g\343N\374\177", len=1) at /home/smarchi/src/binutils-gdb/gdb/target.c:1869
    #24 0x0000561cb7937f4d in gdb_disassembler::dis_asm_read_memory (memaddr=140737352774277, myaddr=0x7ffc4ee34c58 "\260g\343N\374\177", len=1, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/gdb/disasm.c:139
    #25 0x0000561cb80ab66d in fetch_data (info=0x7ffc4ee34e88, addr=0x7ffc4ee34c59 "g\343N\374\177") at /home/smarchi/src/binutils-gdb/opcodes/i386-dis.c:194
    #26 0x0000561cb80ab7e2 in ckprefix () at /home/smarchi/src/binutils-gdb/opcodes/i386-dis.c:8628
    #27 0x0000561cb80adbd8 in print_insn (pc=140737352774277, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/opcodes/i386-dis.c:9587
    #28 0x0000561cb80abe4f in print_insn_i386 (pc=140737352774277, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/opcodes/i386-dis.c:8894
    #29 0x0000561cb7744a19 in default_print_insn (memaddr=140737352774277, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/gdb/arch-utils.c:1029
    #30 0x0000561cb7b33067 in i386_print_insn (pc=140737352774277, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/gdb/i386-tdep.c:4013
    #31 0x0000561cb7acd8f4 in gdbarch_print_insn (gdbarch=0x561cbae2fb60, vma=140737352774277, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/gdb/gdbarch.c:3478
    #32 0x0000561cb793a32d in gdb_disassembler::print_insn (this=0x7ffc4ee34e80, memaddr=140737352774277, branch_delay_insns=0x0) at /home/smarchi/src/binutils-gdb/gdb/disasm.c:795
    #33 0x0000561cb793a5b0 in gdb_print_insn (gdbarch=0x561cbae2fb60, memaddr=140737352774277, stream=0x561cb8ac99f8 <null_stream>, branch_delay_insns=0x0) at /home/smarchi/src/binutils-gdb/gdb/disasm.c:850
    #34 0x0000561cb793a631 in gdb_insn_length (gdbarch=0x561cbae2fb60, addr=140737352774277) at /home/smarchi/src/binutils-gdb/gdb/disasm.c:859
    #35 0x0000561cb77f53f4 in btrace_compute_ftrace_bts (tp=0x561cbba11210, btrace=0x7ffc4ee35188, gaps=...) at /home/smarchi/src/binutils-gdb/gdb/btrace.c:1107
    #36 0x0000561cb77f55f5 in btrace_compute_ftrace_1 (tp=0x561cbba11210, btrace=0x7ffc4ee35180, cpu=0x0, gaps=...) at /home/smarchi/src/binutils-gdb/gdb/btrace.c:1527
    #37 0x0000561cb77f5705 in btrace_compute_ftrace (tp=0x561cbba11210, btrace=0x7ffc4ee35180, cpu=0x0) at /home/smarchi/src/binutils-gdb/gdb/btrace.c:1560
    #38 0x0000561cb77f583b in btrace_add_pc (tp=0x561cbba11210) at /home/smarchi/src/binutils-gdb/gdb/btrace.c:1589
    #39 0x0000561cb77f5a86 in btrace_enable (tp=0x561cbba11210, conf=0x561cb8ac6878 <record_btrace_conf>) at /home/smarchi/src/binutils-gdb/gdb/btrace.c:1629
    #40 0x0000561cb7d88d26 in record_btrace_enable_warn (tp=0x561cbba11210) at /home/smarchi/src/binutils-gdb/gdb/record-btrace.c:294
    #41 0x0000561cb7c603dc in std::__invoke_impl<void, void (*&)(thread_info*), thread_info*> (__f=@0x561cbb6c4878: 0x561cb7d88cdc <record_btrace_enable_warn(thread_info*)>) at /usr/include/c++/10/bits/invoke.h:60
    #42 0x0000561cb7c5e5a6 in std::__invoke_r<void, void (*&)(thread_info*), thread_info*> (__fn=@0x561cbb6c4878: 0x561cb7d88cdc <record_btrace_enable_warn(thread_info*)>) at /usr/include/c++/10/bits/invoke.h:153
    #43 0x0000561cb7c5dc92 in std::_Function_handler<void (thread_info*), void (*)(thread_info*)>::_M_invoke(std::_Any_data const&, thread_info*&&) (__functor=..., __args#0=@0x7ffc4ee35310: 0x561cbba11210) at /usr/include/c++/10/bits/std_function.h:291
    #44 0x0000561cb7f2600f in std::function<void (thread_info*)>::operator()(thread_info*) const (this=0x561cbb6c4878, __args#0=0x561cbba11210) at /usr/include/c++/10/bits/std_function.h:622
    #45 0x0000561cb7f23dc8 in gdb::observers::observable<thread_info*>::notify (this=0x561cb8ac5aa0 <gdb::observers::new_thread>, args#0=0x561cbba11210) at /home/smarchi/src/binutils-gdb/gdb/../gdbsupport/observable.h:150
    #46 0x0000561cb7f1c436 in add_thread_silent (targ=0x561cb8aafb60 <the_amd64_linux_nat_target>, ptid=...) at /home/smarchi/src/binutils-gdb/gdb/thread.c:263
    #47 0x0000561cb7f1c479 in add_thread_with_info (targ=0x561cb8aafb60 <the_amd64_linux_nat_target>, ptid=..., priv=0x561cbb3f7ab0) at /home/smarchi/src/binutils-gdb/gdb/thread.c:272
    #48 0x0000561cb7bfa1d0 in record_thread (info=0x561cbb0413a0, tp=0x0, ptid=..., th_p=0x7ffc4ee35610, ti_p=0x7ffc4ee35620) at /home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:1380
    #49 0x0000561cb7bf7a2a in thread_from_lwp (stopped=0x561cba81db20, ptid=...) at /home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:429
    #50 0x0000561cb7bf7ac5 in thread_db_notice_clone (parent=..., child=...) at /home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:447
    #51 0x0000561cb7bdc9a2 in linux_handle_extended_wait (lp=0x561cbae25720, status=4991) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:1981
    #52 0x0000561cb7bdf0f3 in linux_nat_filter_event (lwpid=435403, status=198015) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:2920
    #53 0x0000561cb7bdfed6 in linux_nat_wait_1 (ptid=..., ourstatus=0x7ffc4ee36398, target_options=...) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:3202
    #54 0x0000561cb7be0b68 in linux_nat_target::wait (this=0x561cb8aafb60 <the_amd64_linux_nat_target>, ptid=..., ourstatus=0x7ffc4ee36398, target_options=...) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:3440
    #55 0x0000561cb7bfa2fc in thread_db_target::wait (this=0x561cb8a9acd0 <the_thread_db_target>, ptid=..., ourstatus=0x7ffc4ee36398, options=...) at /home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:1412
    #56 0x0000561cb7d8e356 in record_btrace_target::wait (this=0x561cb8aa6250 <record_btrace_ops>, ptid=..., status=0x7ffc4ee36398, options=...) at /home/smarchi/src/binutils-gdb/gdb/record-btrace.c:2547
    #57 0x0000561cb7ef996d in target_wait (ptid=..., status=0x7ffc4ee36398, options=...) at /home/smarchi/src/binutils-gdb/gdb/target.c:2608
    #58 0x0000561cb7b6d297 in do_target_wait_1 (inf=0x561cba6d8780, ptid=..., status=0x7ffc4ee36398, options=...) at /home/smarchi/src/binutils-gdb/gdb/infrun.c:3640
    #59 0x0000561cb7b6d43e in operator() (__closure=0x7ffc4ee36190, inf=0x561cba6d8780) at /home/smarchi/src/binutils-gdb/gdb/infrun.c:3701
    #60 0x0000561cb7b6d7b2 in do_target_wait (ecs=0x7ffc4ee36370, options=...) at /home/smarchi/src/binutils-gdb/gdb/infrun.c:3720
    #61 0x0000561cb7b6e67d in fetch_inferior_event () at /home/smarchi/src/binutils-gdb/gdb/infrun.c:4069
    #62 0x0000561cb7b4659b in inferior_event_handler (event_type=INF_REG_EVENT) at /home/smarchi/src/binutils-gdb/gdb/inf-loop.c:41
    #63 0x0000561cb7be25f7 in handle_target_event (error=0, client_data=0x0) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:4227
    #64 0x0000561cb81e4ee2 in handle_file_event (file_ptr=0x561cbae24e10, ready_mask=1) at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:575
    #65 0x0000561cb81e5490 in gdb_wait_for_event (block=0) at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:701
    #66 0x0000561cb81e41be in gdb_do_one_event () at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:212
    #67 0x0000561cb7c18096 in start_event_loop () at /home/smarchi/src/binutils-gdb/gdb/main.c:421
    #68 0x0000561cb7c181e0 in captured_command_loop () at /home/smarchi/src/binutils-gdb/gdb/main.c:481
    #69 0x0000561cb7c19d7e in captured_main (data=0x7ffc4ee366a0) at /home/smarchi/src/binutils-gdb/gdb/main.c:1353
    #70 0x0000561cb7c19df0 in gdb_main (args=0x7ffc4ee366a0) at /home/smarchi/src/binutils-gdb/gdb/main.c:1368
    #71 0x0000561cb7693186 in main (argc=11, argv=0x7ffc4ee367b8) at /home/smarchi/src/binutils-gdb/gdb/gdb.c:32

At frame 45, the new_thread observable is fired.  At this moment, the
new thread isn't the current thread, inferior_ptid is null_ptid.  I
think this is ok: the new_thread observable doesn't give any guarantee
on the global context when observers are invoked.  Frame 35,
btrace_compute_ftrace_bts, calls gdb_insn_length.  gdb_insn_length
doesn't have a thread_info or other parameter what could indicate where
to read memory from, it implicitly uses the global context
(inferior_ptid).

So we reach the all_non_exited_threads_range in
record_btrace_target::record_is_replaying with a null inferior_ptid.
The previous implemention of all_non_exited_threads_range didn't care,
but the new one does.  The problem of calling gdb_insn_length and
ultimately trying to read memory with a null inferior_ptid already
existed, but the commit mentioned above made it visible.

Something between frames 40 (record_btrace_enable_warn) and 35
(btrace_compute_ftrace_bts) needs to be switching the global context to
make TP the current thread.  Since btrace_compute_ftrace_bts takes the
thread_info to work with as a parameter, that typically means that it
doesn't require its caller to also set the global current context
(current thread) when calling.  If it needs to call other functions
that do require the global current thread to be set, then it needs to
temporarily change the current thread while calling these other
functions.  Therefore, switch and restore the current thread in
btrace_compute_ftrace_bts.

By inspection, it looks like btrace_compute_ftrace_pt may also call
functions sensitive to the global context: it installs the
btrace_pt_readmem_callback callback in the PT instruction decoder.  When
this function gets called, inferior_ptid must be set appropriately.  Add
a switch and restore in there too.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28086
Change-Id: I407fbfe41aab990068bd102491aa3709b0a034b3
lmoriche pushed a commit that referenced this issue Oct 28, 2021
Simon Marchi tried gdb on OpenBSD, and it immediately segfaults when
running a program.  Simon tracked down the problem to x86_dr_low.get_status
being nullptr at this point:

    (lldb) print x86_dr_low.get_status
    (unsigned long (*)()) $0 = 0x0000000000000000
    (lldb) bt
    * thread #1, stop reason = step over
      * frame #0: 0x0000033b64b764aa gdb`x86_dr_stopped_data_address(state=0x0000033d7162a310, addr_p=0x00007f7ffffc5688) at x86-dregs.c:645:12
        frame #1: 0x0000033b64b766de gdb`x86_dr_stopped_by_watchpoint(state=0x0000033d7162a310) at x86-dregs.c:687:10
        frame #2: 0x0000033b64ea5f72 gdb`x86_stopped_by_watchpoint() at x86-nat.c:206:10
        frame #3: 0x0000033b64637fbb gdb`x86_nat_target<obsd_nat_target>::stopped_by_watchpoint(this=0x0000033b65252820) at x86-nat.h:100:12
        frame #4: 0x0000033b64d3ff11 gdb`target_stopped_by_watchpoint() at target.c:468:46
        frame #5: 0x0000033b6469b001 gdb`watchpoints_triggered(ws=0x00007f7ffffc61c8) at breakpoint.c:4790:32
        frame #6: 0x0000033b64a8bb8b gdb`handle_signal_stop(ecs=0x00007f7ffffc61a0) at infrun.c:6072:29
        frame #7: 0x0000033b64a7e3a7 gdb`handle_inferior_event(ecs=0x00007f7ffffc61a0) at infrun.c:5694:7
        frame #8: 0x0000033b64a7c1a0 gdb`fetch_inferior_event() at infrun.c:4090:5
        frame #9: 0x0000033b64a51921 gdb`inferior_event_handler(event_type=INF_REG_EVENT) at inf-loop.c:41:7
        frame #10: 0x0000033b64a827c9 gdb`infrun_async_inferior_event_handler(data=0x0000000000000000) at infrun.c:9384:3
        frame #11: 0x0000033b6465bd4f gdb`check_async_event_handlers() at async-event.c:335:4
        frame #12: 0x0000033b65070917 gdb`gdb_do_one_event() at event-loop.cc:216:10
        frame #13: 0x0000033b64af0db1 gdb`start_event_loop() at main.c:421:13
        frame #14: 0x0000033b64aefe9a gdb`captured_command_loop() at main.c:481:3
        frame #15: 0x0000033b64aed5c2 gdb`captured_main(data=0x00007f7ffffc6470) at main.c:1353:4
        frame #16: 0x0000033b64aed4f2 gdb`gdb_main(args=0x00007f7ffffc6470) at main.c:1368:7
        frame #17: 0x0000033b6459d787 gdb`main(argc=5, argv=0x00007f7ffffc6518) at gdb.c:32:10
        frame #18: 0x0000033b6459d521 gdb`___start + 321

On BSDs, get_status is set in _initialize_x86_bsd_nat, but only if
HAVE_PT_GETDBREGS is defined.  PT_GETDBREGS doesn't exist on OpenBSD, so
get_status (and the other fields of x86_dr_low) are left as nullptr.

OpenBSD doesn't support getting or setting the x86 debug registers, so
fix by omitting debug register support entirely on OpenBSD:

- Change x86bsd_nat_target to only inherit from x86_nat_target if
  PT_GETDBREGS is supported.

- Don't include x86-nat.o and nat/x86-dregs.o for OpenBSD/amd64.  They
  were already omitted for OpenBSD/i386.
lmoriche pushed a commit that referenced this issue Oct 28, 2021
We don't handle well a few scenarios involving vfork.  The reason is
that dbgapi can't attach to a vfork child when it is still attached to
the parent.

For example, with a simple program that just calls "vfork":

    $ ./gdb -q -nx --data-directory=data-directory vfork -ex "set follow-fork-mode child" -ex r
    Reading symbols from vfork...
    Starting program: /home/master/smarchi/build/binutils-gdb/gdb/vfork
    [Attaching after process 14346 vfork to child process 14354]
    [New inferior 2 (process 14354)]
    amd-dbgapi: fatal error: enable_debug failed (rc=-1)
    Backtrace:
        #0 0x00007f40337a2a14 amd::dbgapi::process_t::attach() in /home/master/smarchi/src/ROCdbgapi/src/process.cpp:1504
        #1 0x00007f40337a2be2 operator() in /home/master/smarchi/src/ROCdbgapi/src/process.cpp:2142
        #2 0x00007f40337a2e46 tracer_closure in /home/master/smarchi/src/ROCdbgapi/src/logging.h:538
        #3 0x00007f40337a2e46 enter<amd::dbgapi::detail::parameter_t<amd_dbgapi_client_process_s*, (& amd::dbgapi::utils::string_literal<'c', 'l', 'i', 'e', 'n', 't', '_', 'p', 'r', 'o', 'c', 'e', 's', 's', '_', 'i', 'd'>::value), (amd::dbgapi::detail::parameter_kind_t)0>, amd::dbgapi::detail::parameter_t<amd_dbgapi_process_id_t*, (& amd::dbgapi::utils::string_literal<'p', 'r', 'o', 'c', 'e', 's', 's', '_', 'i', 'd'>::value), (amd::dbgapi::detail::parameter_kind_t)0>, amd_dbgapi_process_attach(amd_dbgapi_client_process_id_t, amd_dbgapi_process_id_t*)::<lambda()> > in /home/master/smarchi/src/ROCdbgapi/src/logging.h:585
        #4 0x00007f40337a2e46 amd_dbgapi_process_attach in /home/master/smarchi/src/ROCdbgapi/src/process.cpp:2154
        #5 0x0000000000a63164 rocm_enable(inferior*) in /home/master/smarchi/src/binutils-gdb/gdb/rocm-tdep.c:1700
        #6 0x0000000000a63553 rocm_target_ops::follow_fork(inferior*, ptid_t, target_waitkind, bool, bool) in /home/master/smarchi/src/binutils-gdb/gdb/rocm-tdep.c:2139
        #7 0x0000000000b411d4 target_follow_fork(inferior*, ptid_t, target_waitkind, bool, bool) in /home/master/smarchi/src/binutils-gdb/gdb/target.c:2756
        #8 0x0000000000865137 follow_fork_inferior(bool, bool) in /home/master/smarchi/src/binutils-gdb/gdb/infrun.c:588
        #9 0x0000000000856c10 follow_fork() in /home/master/smarchi/src/binutils-gdb/gdb/infrun.c:760
        #10 0x000000000085e0e2 handle_inferior_event(execution_control_state*) in /home/master/smarchi/src/binutils-gdb/gdb/infrun.c:5549
        #11 0x000000000085c685 fetch_inferior_event() in /home/master/smarchi/src/binutils-gdb/gdb/infrun.c:4077
        #12 0x000000000083b1d3 inferior_event_handler(inferior_event_type) in /home/master/smarchi/src/binutils-gdb/gdb/inf-loop.c:41
        #13 0x00000000008ab957 handle_target_event(int, void*) in /home/master/smarchi/src/binutils-gdb/gdb/linux-nat.c:4208
        #14 0x0000000000dccc02 handle_file_event(file_handler*, int) in /home/master/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:575
        #15 0x0000000000dcbaeb gdb_wait_for_event(int) in /home/master/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:701
        #16 0x0000000000dcb57c gdb_do_one_event() in /home/master/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:212
        #17 0x0000000000b71b7c wait_sync_command_done() in /home/master/smarchi/src/binutils-gdb/gdb/top.c:526
        #18 0x0000000000b71c3d maybe_wait_sync_command_done(int) in /home/master/smarchi/src/binutils-gdb/gdb/top.c:543
        #19 0x0000000000b72431 execute_command(char const*, int) in /home/master/smarchi/src/binutils-gdb/gdb/top.c:674
        #20 0x00000000008e10da catch_command_errors(void (*)(char const*, int), char const*, int, bool) in /home/master/smarchi/src/binutils-gdb/gdb/main.c:523
        #21 0x00000000008e1274 execute_cmdargs(std::vector<cmdarg, std::allocator<cmdarg> > const*, cmdarg_kind, cmdarg_kind, int*) in /home/master/smarchi/src/binutils-gdb/gdb/main.c:618
        #22 0x00000000008e0d65 captured_main_1(captured_main_args*) in /home/master/smarchi/src/binutils-gdb/gdb/main.c:1317
        #23 0x00000000008de70c captured_main(void*) in /home/master/smarchi/src/binutils-gdb/gdb/main.c:1338
        #24 0x00000000008de664 gdb_main(captured_main_args*) in /home/master/smarchi/src/binutils-gdb/gdb/main.c:1363
        #25 0x0000000000427406 main in /home/master/smarchi/src/binutils-gdb/gdb/gdb.c:32
        #26 0x00007f40331f80b2 __libc_start_main
        #27 0x00000000004272ed _start
        #28 0xffffffffffffffff
    Could not attach to process 14354 (rc=-2)

 - With "detach-on-fork on" and "follow-fork-mode child", we
   try to attach the child before detaching the parent, which fails.  This
   might be a consequence of the recent follow_fork changes: before that,
   GDB used to detach from the parent before attaching the child.

 - With "detach-on-fork on" and "follow-fork-mode parent", there's not
   problem as we never try to attach to the child.

 - With "detach-on-fork off" and "follow-fork-mode child", we try to attach
   the child while staying attached to the parent, so that fails.

 - With "detach-on-fork off" and "follow-fork-mode parent" (and "set
   schedule-multiple on", to get around the message that GDB prints), same
   thing.

To avoid these vfork-related problems, I propose to not enable / push
the rocm target for vfork children.  A vfork child is only allowed to do
a limited set of thing, presumably to avoid messing up its parent's
address space.  As documented in vfork(2):

    (From POSIX.1) The vfork() function has the same effect as fork(2), except that the be‐
    havior is undefined if the process created by vfork() either modifies  any  data  other
    than  a  variable of type pid_t used to store the return value from vfork(), or returns
    from the function in which vfork() was called, or calls any other function before  suc‐
    cessfully calling _exit(2) or one of the exec(3) family of functions.

Clearly, a vfork child won't run GPU programs, as that would require
doing things that are not allowed at that time.  So we won't miss
anything by not attaching dbgapi to the fork child.  If the program
execs, though, then it might run some GPU programs in the new address
space, so we need to make sure to push the rocm target and attach dbgapi
when we catch the exec.

The following modifications are made:

  - in rocm_enable, don't push / attach if inf->vfork_parent is set.
    That is useful in the case of "detach-on-fork off"
  - in rocm_target_ops::follow_fork, don't call rocm_enable if the fork
    kind is vfork.  This change might seem redundant with the previous
    one, but both are needed to cover all cases.
  - A vfork child will not have the rocm target pushed, so if it
    subsequently execs, rocm_target_ops::follow_exec won't be called.
    To catch that event and push the rocm target at that point,
    implement the inferior_execd observer.

With just these changes, we hit this internal error:

    (gdb) PASS: gdb.base/foll-vfork.exp: exec: vfork child follow, finish after tcatch vfork: continue to vfork
    finish^M
    Run till exit from #0  0x00007ffff7eb325c in vfork () from /lib/x86_64-linux-gnu/libc.so.6^M
    [Attaching after process 7816 vfork to child process 7824]^M
    [New inferior 2 (process 7824)]^M
    /home/master/smarchi/src/binutils-gdb/gdb/rocm-tdep.c:559: internal-error: void async_event_handler_clear(): Assertion `rocm_async_event_handler != nullptr' failed.^M
    A problem internal to GDB has been detected,^M
    further debugging may prove unreliable.^M
    Quit this debugging session? (y or n) FAIL: gdb.base/foll-vfork.exp: exec: vfork child follow, finish after tcatch vfork: finish (GDB internal error)

Because we now don't push the rocm target in vfork child inferiors, we
can end up with two inferiors, with the following targets:

 - inferior 1, rocm target + linux nat target
 - inferior 2, linux nat target

If we call target_async(true) while inferior 2 is the current inferior,
the linux nat target (which is shared by the two inferiors) will enable
its async mode, but the rocm target won't.  If we then call target_wait
with inferior 1 as the current inferior (so end up in
rocm_target_ops::wait), the rocm target thinks it is async, when in
reality it is not, so we end up calling async_event_handler_clear while
rocm_async_event_handler is unset.

I think that the problem is that we can have target stack that are
partially async, with some targets that have received the order to
become async and other targets that haven't.  I don't think this is
specific to this patch.  We could have hit this problem if we had
decided that the the ROCm target would be pushed after a specific
library load, for example.

To fix that, modify target_async such we call target_async on all the
inferiors that have the current process target as their process target.
This ensures that all targets in all target stacks that have the linux
nat target in them (in the example above) get the memo that they should
enable their async mode.

Change-Id: I4d8102c65cc6b1663a46a49b5490c0f2c91f6279
lmoriche pushed a commit that referenced this issue Oct 28, 2021
gdb.threads/no-unwaited-for-left.exp is currently tripping on internal
errors:

 FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when thread 2 exits (GDB internal error)
 ...
 FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits (GDB internal error)

The gdb.log shows:

 continue
 Continuing.
 [Thread 0x7ffff77c2700 (LWP 103640) exited]
 No unwaited-for children left.
 /src/rocm-gdb/gdb/thread.c:156: internal-error: thread_info* inferior_thread(): Assertion `current_thread_ != nullptr' failed.
 A problem internal to GDB has been detected,
 further debugging may prove unreliable.
 FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when thread 2 exits (GDB internal error)
 Resyncing due to internal error.
 Quit this debugging session? (y or n) n

Backtrace shows:

 (top-gdb) bt
 #0  internal_error (file=0xb55d4a605 <error: Cannot access memory at address 0xb55d4a605>, line=0, fmt=0x5555565e4240 "en_US.UTF-8") at ../../src/gdbsupport/errors.cc:51
 #1  0x0000555555d47c07 in inferior_thread () at ../../src/gdb/thread.c:156
 #2  0x0000555555c517db in rocm_target_normal_stop (bs_list=0x0, print_frame=0) at ../../src/gdb/rocm-tdep.c:2194
 #3  0x00005555556afff1 in std::_Function_handler<void (bpstats*, int), void (*)(bpstats*, int)>::_M_invoke(std::_Any_data const&, bpstats*&&, int&&) (__functor=..., __args
 #0=@0x7fffffffcc90: 0x0, __args#1=@0x7fffffffcc8c: 0) at /usr/include/c++/9/bits/std_function.h:300
 #4  0x0000555555a78720 in std::function<void (bpstats*, int)>::operator()(bpstats*, int) const (this=0x5555566ab388, __args#0=0x0, __args#1=0) at /usr/include/c++/9/bits/s
 td_function.h:688
 #5  0x0000555555a7767e in gdb::observers::observable<bpstats*, int>::notify (this=0x555556577380 <gdb::observers::normal_stop>, args#0=0x0, args#1=0) at ../../src/gdb/../g
 dbsupport/observable.h:150
 #6  0x0000555555a72477 in normal_stop () at ../../src/gdb/infrun.c:8673
 #7  0x0000555555a66de5 in fetch_inferior_event () at ../../src/gdb/infrun.c:4118
 #8  0x0000555555a46b7b in inferior_event_handler (event_type=INF_REG_EVENT) at ../../src/gdb/inf-loop.c:41

The problem is that rocm_target_normal_stop is assuming that
normal_stop is called with a thread selected, here:

  if (!ptid_is_gpu (inferior_thread ()->ptid))

That's an incorrect assumption.  In this case, we have a
TARGET_WAITKIND_NO_RESUMED event, so there's no thread selected.

I could see several different predicates we could use to fix this:

 #1 - get the last target event using get_last_target_status, and
      return early if we stopped for an event of kind other than
      TARGET_WAITKIND_STOPPED.

 #2 - bail out early if inferior_ptid is null_ptid.

 #3 - bail out early if the passed in BS_LIST is NULL.  Further below
      in the function we iterate over all breakpoints in BS_LIST
      looking for a watchpoint.  So if the list is empty, we know we
      won't print a warning.

 #4 - bail out early if !PRINT_FRAME.  If we've stopped for a
      watchpoint, then GDB prints the frame/location, which is what we
      warn about.

This commit implements #3 and #4.  #3, for being the simplest and
seems like a minor optimization even if we have stopped for
TARGET_WAITKIND_STOPPED anyhow.  #4, just because it would likely be
weird to warn about the stop location being off when we don't print
it.

Bug: https://ontrack-internal.amd.com/browse/SWDEV-301562
Change-Id: Ib126bab8e0eb5325800fa7dd341169b2b8f7af0d
lmoriche pushed a commit that referenced this issue Oct 28, 2021
The original reproducer for PR28030 required use of a specific
compiler version - gcc-c++-11.1.1-3.fc34 is mentioned in the PR,
though it seems probable that other gcc versions might also be able to
reproduce the bug as well.  This commit introduces a test case which,
using the DWARF assembler, provides a reproducer which is independent
of the compiler version.  (Well, it'll work with whatever compilers
the DWARF assembler works with.)

To the best of my knowledge, it's also the first test case which uses
the DWARF assembler to provide debug info for a shared object.  That
being the case, I provided more than the usual commentary which should
allow this case to be used as a template when a combo shared
library / DWARF assembler test case is required in the future.

I provide some details regarding the bug in a comment near the
beginning of locexpr-dml.exp.

This problem was difficult to reproduce; I found myself constantly
referring to the backtrace while trying to figure out what (else) I
might be missing while trying to create a reproducer.  Below is a
partial backtrace which I include for posterity.

 #0  internal_error (
    file=0xc50110 "/ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/gdbtypes.c", line=5575,
    fmt=0xc520c0 "Unexpected type field location kind: %d")
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdbsupport/errors.cc:51
 #1  0x00000000006ef0c5 in copy_type_recursive (objfile=0x1635930,
    type=0x274c260, copied_types=0x30bb290)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/gdbtypes.c:5575
 #2  0x00000000006ef382 in copy_type_recursive (objfile=0x1635930,
    type=0x274ca10, copied_types=0x30bb290)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/gdbtypes.c:5602
 #3  0x0000000000a7409a in preserve_one_value (value=0x24269f0,
    objfile=0x1635930, copied_types=0x30bb290)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/value.c:2529
 #4  0x000000000072012a in gdbscm_preserve_values (
    extlang=0xc55720 <extension_language_guile>, objfile=0x1635930,
    copied_types=0x30bb290)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/guile/scm-value.c:94
 #5  0x00000000006a3f82 in preserve_ext_lang_values (objfile=0x1635930,
    copied_types=0x30bb290)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/extension.c:568
 #6  0x0000000000a7428d in preserve_values (objfile=0x1635930)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/value.c:2579
 #7  0x000000000082d514 in objfile::~objfile (this=0x1635930,
    __in_chrg=<optimized out>)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/objfiles.c:549
 #8  0x0000000000831cc8 in std::_Sp_counted_ptr<objfile*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x1654580)
    at /usr/include/c++/11/bits/shared_ptr_base.h:348
 #9  0x00000000004e6617 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x1654580) at /usr/include/c++/11/bits/shared_ptr_base.h:168
 #10 0x00000000004e1d2f in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x190bb88, __in_chrg=<optimized out>)
    at /usr/include/c++/11/bits/shared_ptr_base.h:705
 #11 0x000000000082feee in std::__shared_ptr<objfile, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x190bb80, __in_chrg=<optimized out>)
    at /usr/include/c++/11/bits/shared_ptr_base.h:1154
 #12 0x000000000082ff0a in std::shared_ptr<objfile>::~shared_ptr (
    this=0x190bb80, __in_chrg=<optimized out>)
    at /usr/include/c++/11/bits/shared_ptr.h:122
 #13 0x000000000085ed7e in __gnu_cxx::new_allocator<std::_List_node<std::shared_ptr<objfile> > >::destroy<std::shared_ptr<objfile> > (this=0x114bc00,
    __p=0x190bb80) at /usr/include/c++/11/ext/new_allocator.h:168
 #14 0x000000000085e88d in std::allocator_traits<std::allocator<std::_List_node<std::shared_ptr<objfile> > > >::destroy<std::shared_ptr<objfile> > (__a=...,
    __p=0x190bb80) at /usr/include/c++/11/bits/alloc_traits.h:531
 #15 0x000000000085e50c in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::_M_erase (this=0x114bc00, __position=
  std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x1635930})
    at /usr/include/c++/11/bits/stl_list.h:1925
 #16 0x000000000085df0e in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::erase (this=0x114bc00, __position=
  std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x1635930})
    at /usr/include/c++/11/bits/list.tcc:158
 #17 0x000000000085c748 in program_space::remove_objfile (this=0x114bbc0,
    objfile=0x1635930)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/progspace.c:210
 #18 0x000000000082d3ae in objfile::unlink (this=0x1635930)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/objfiles.c:487
 #19 0x000000000082e68c in objfile_purge_solibs ()
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/objfiles.c:875
 #20 0x000000000092dd37 in no_shared_libraries (ignored=0x0, from_tty=1)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/solib.c:1236
 #21 0x00000000009a37fe in target_pre_inferior (from_tty=1)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/target.c:2496
 #22 0x00000000007454d6 in run_command_1 (args=0x0, from_tty=1,
    run_how=RUN_NORMAL)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/infcmd.c:437

I'll note a few points regarding this backtrace:

Frame #1 is where the internal error occurs.  It's caused by an
unhandled case for FIELD_LOC_KIND_DWARF_BLOCK.  The fix for this bug
adds support for this case.

Frame #22 - it's a partial backtrace - shows that GDB is attempting to
(re)run the program.  You can see the exact command sequence that was
used for reproducing this problem in the PR (at
https://sourceware.org/bugzilla/show_bug.cgi?id=28030), but in a
nutshell, after starting the program and advancing to the appropriate
source line, GDB was asked to step into libstdc++; a "finish" command
was issued, returning a value.  The fact that a value was returned is
very important.  GDB was then used to step back into libstdc++.  A
breakpoint was set on a source line in the library after which a "run"
command was issued.

Frame #19 shows a call to objfile_purge_solibs.  It's aptly named.

Frame #7 is a call to the destructor for one of the objfile solibs; it
turned out to be the one for libstdc++.

Frames #6 thru #3 show various value preservation frames.  If you look
at preserve_values() in gdb/value.c, the value history is preserved
first, followed by internal variables, followed by values for the
extension languages (python and guile).
lmoriche pushed a commit that referenced this issue Feb 10, 2022
This commit fixes Bug 28308, titled "Strange interactions with
dprintf and break/commands":

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28308

Since creating that bug report, I've found a somewhat simpler way of
reproducing the problem.  I've encapsulated it into the GDB test case
which I've created along with this bug fix.  The name of the new test
is gdb.base/dprintf-execution-x-script.exp, I'll demonstrate the
problem using this test case, though for brevity, I've placed all
relevant files in the same directory and have renamed the files to all
start with 'dp-bug' instead of 'dprintf-execution-x-script'.

The script file, named dp-bug.gdb, consists of the following commands:

dprintf increment, "dprintf in increment(), vi=%d\n", vi
break inc_vi
commands
  continue
end
run

Note that the final command in this script is 'run'.  When 'run' is
instead issued interactively, the  bug does not occur.  So, let's look
at the interactive case first in order to see the correct/expected
output:

$ gdb -q -x dp-bug.gdb dp-bug
... eliding buggy output which I'll discuss later ...
(gdb) run
Starting program: /mesquite2/sourceware-git/f34-master/bld/gdb/tmp/dp-bug
vi=0
dprintf in increment(), vi=0

Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26
26	in dprintf-execution-x-script.c
vi=1
dprintf in increment(), vi=1

Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26
26	in dprintf-execution-x-script.c
vi=2
dprintf in increment(), vi=2

Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26
26	in dprintf-execution-x-script.c
vi=3
[Inferior 1 (process 1539210) exited normally]

In this run, in which 'run' was issued from the gdb prompt (instead
of at the end of the script), there are three dprintf messages along
with three 'Breakpoint 2' messages.  This is the correct output.

Now let's look at the output that I snipped above; this is the output
when 'run' is issued from the script loaded via GDB's -x switch:

$ gdb -q -x dp-bug.gdb dp-bug
Reading symbols from dp-bug...
Dprintf 1 at 0x40116e: file dprintf-execution-x-script.c, line 38.
Breakpoint 2 at 0x40113a: file dprintf-execution-x-script.c, line 26.
vi=0
dprintf in increment(), vi=0

Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26
26	dprintf-execution-x-script.c: No such file or directory.
vi=1

Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26
26	in dprintf-execution-x-script.c
vi=2

Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26
26	in dprintf-execution-x-script.c
vi=3
[Inferior 1 (process 1539175) exited normally]

In the output shown above, only the first dprintf message is printed.
The 2nd and 3rd dprintf messages are missing!  However, all three
'Breakpoint 2...' messages are still printed.

Why does this happen?

bpstat_do_actions_1() in gdb/breakpoint.c contains the following
comment and code near the start of the function:

  /* Avoid endless recursion if a `source' command is contained
     in bs->commands.  */
  if (executing_breakpoint_commands)
    return 0;

  scoped_restore save_executing
    = make_scoped_restore (&executing_breakpoint_commands, 1);

Also, as described by this comment prior to the 'async' field
in 'struct ui' in top.h, the main UI starts off in sync mode
when processing command line arguments:

  /* True if the UI is in async mode, false if in sync mode.  If in
     sync mode, a synchronous execution command (e.g, "next") does not
     return until the command is finished.  If in async mode, then
     running a synchronous command returns right after resuming the
     target.  Waiting for the command's completion is later done on
     the top event loop.  For the main UI, this starts out disabled,
     until all the explicit command line arguments (e.g., `gdb -ex
     "start" -ex "next"') are processed.  */

This combination of things, the state of the static global
'executing_breakpoint_commands' plus the state of the async
field in the main UI causes this behavior.

This is a backtrace after hitting the dprintf breakpoint for
the second time when doing 'run' from the script file, i.e.
non-interactively:

Thread 1 "gdb" hit Breakpoint 3, bpstat_do_actions_1 (bsp=0x7fffffffc2b8)
    at /ironwood1/sourceware-git/f34-master/bld/../../worktree-master/gdb/breakpoint.c:4431
4431	  if (executing_breakpoint_commands)

 #0  bpstat_do_actions_1 (bsp=0x7fffffffc2b8)
     at gdb/breakpoint.c:4431
 #1  0x00000000004d8bc6 in dprintf_after_condition_true (bs=0x1538090)
     at gdb/breakpoint.c:13048
 #2  0x00000000004c5caa in bpstat_stop_status (aspace=0x116dbc0, bp_addr=0x40116e, thread=0x137f450, ws=0x7fffffffc718,
     stop_chain=0x1538090) at gdb/breakpoint.c:5498
 #3  0x0000000000768d98 in handle_signal_stop (ecs=0x7fffffffc6f0)
     at gdb/infrun.c:6172
 #4  0x00000000007678d3 in handle_inferior_event (ecs=0x7fffffffc6f0)
     at gdb/infrun.c:5662
 #5  0x0000000000763cd5 in fetch_inferior_event ()
     at gdb/infrun.c:4060
 #6  0x0000000000746d7d in inferior_event_handler (event_type=INF_REG_EVENT)
     at gdb/inf-loop.c:41
 #7  0x00000000007a702f in handle_target_event (error=0, client_data=0x0)
     at gdb/linux-nat.c:4207
 #8  0x0000000000b8cd6e in gdb_wait_for_event (block=block@entry=0)
     at gdbsupport/event-loop.cc:701
 #9  0x0000000000b8d032 in gdb_wait_for_event (block=0)
     at gdbsupport/event-loop.cc:597
 #10 gdb_do_one_event () at gdbsupport/event-loop.cc:212
 #11 0x00000000009d19b6 in wait_sync_command_done ()
     at gdb/top.c:528
 #12 0x00000000009d1a3f in maybe_wait_sync_command_done (was_sync=0)
     at gdb/top.c:545
 #13 0x00000000009d2033 in execute_command (p=0x7fffffffcb18 "", from_tty=0)
     at gdb/top.c:676
 #14 0x0000000000560d5b in execute_control_command_1 (cmd=0x13b9bb0, from_tty=0)
     at gdb/cli/cli-script.c:547
 #15 0x000000000056134a in execute_control_command (cmd=0x13b9bb0, from_tty=0)
     at gdb/cli/cli-script.c:717
 #16 0x00000000004c3bbe in bpstat_do_actions_1 (bsp=0x137f530)
     at gdb/breakpoint.c:4469
 #17 0x00000000004c3d40 in bpstat_do_actions ()
     at gdb/breakpoint.c:4533
 #18 0x00000000006a473a in command_handler (command=0x1399ad0 "run")
     at gdb/event-top.c:624
 #19 0x00000000009d182e in read_command_file (stream=0x113e540)
     at gdb/top.c:443
 #20 0x0000000000563697 in script_from_file (stream=0x113e540, file=0x13bb0b0 "dp-bug.gdb")
     at gdb/cli/cli-script.c:1642
 #21 0x00000000006abd63 in source_gdb_script (extlang=0xc44e80 <extension_language_gdb>, stream=0x113e540,
     file=0x13bb0b0 "dp-bug.gdb") at gdb/extension.c:188
 #22 0x0000000000544400 in source_script_from_stream (stream=0x113e540, file=0x7fffffffd91a "dp-bug.gdb",
     file_to_open=0x13bb0b0 "dp-bug.gdb")
     at gdb/cli/cli-cmds.c:692
 #23 0x0000000000544557 in source_script_with_search (file=0x7fffffffd91a "dp-bug.gdb", from_tty=1, search_path=0)
     at gdb/cli/cli-cmds.c:750
 #24 0x00000000005445cf in source_script (file=0x7fffffffd91a "dp-bug.gdb", from_tty=1)
     at gdb/cli/cli-cmds.c:759
 #25 0x00000000007cf6d9 in catch_command_errors (command=0x5445aa <source_script(char const*, int)>,
     arg=0x7fffffffd91a "dp-bug.gdb", from_tty=1, do_bp_actions=false)
     at gdb/main.c:523
 #26 0x00000000007cf85d in execute_cmdargs (cmdarg_vec=0x7fffffffd1b0, file_type=CMDARG_FILE, cmd_type=CMDARG_COMMAND,
     ret=0x7fffffffd18c) at gdb/main.c:615
 #27 0x00000000007d0c8e in captured_main_1 (context=0x7fffffffd3f0)
     at gdb/main.c:1322
 #28 0x00000000007d0eba in captured_main (data=0x7fffffffd3f0)
     at gdb/main.c:1343
 #29 0x00000000007d0f25 in gdb_main (args=0x7fffffffd3f0)
     at gdb/main.c:1368
 #30 0x00000000004186dd in main (argc=5, argv=0x7fffffffd508)
     at gdb/gdb.c:32

There are two frames for bpstat_do_actions_1(), one at frame #16 and
the other at frame #0.  The one at frame #16 is processing the actions
for Breakpoint 2, which is a 'continue'.  The one at frame #0 is attempting
to process the dprintf breakpoint action.  However, at this point,
the value of 'executing_breakpoint_commands' is 1, forcing an early
return, i.e. prior to executing the command(s) associated with the dprintf
breakpoint.

For the sake of comparison, this is what the stack looks like when hitting
the dprintf breakpoint for the second time when issuing the 'run'
command from the GDB prompt.

Thread 1 "gdb" hit Breakpoint 3, bpstat_do_actions_1 (bsp=0x7fffffffccd8)
    at /ironwood1/sourceware-git/f34-master/bld/../../worktree-master/gdb/breakpoint.c:4431
4431	  if (executing_breakpoint_commands)

 #0  bpstat_do_actions_1 (bsp=0x7fffffffccd8)
     at gdb/breakpoint.c:4431
 #1  0x00000000004d8bc6 in dprintf_after_condition_true (bs=0x16b0290)
     at gdb/breakpoint.c:13048
 #2  0x00000000004c5caa in bpstat_stop_status (aspace=0x116dbc0, bp_addr=0x40116e, thread=0x13f0e60, ws=0x7fffffffd138,
     stop_chain=0x16b0290) at gdb/breakpoint.c:5498
 #3  0x0000000000768d98 in handle_signal_stop (ecs=0x7fffffffd110)
     at gdb/infrun.c:6172
 #4  0x00000000007678d3 in handle_inferior_event (ecs=0x7fffffffd110)
     at gdb/infrun.c:5662
 #5  0x0000000000763cd5 in fetch_inferior_event ()
     at gdb/infrun.c:4060
 #6  0x0000000000746d7d in inferior_event_handler (event_type=INF_REG_EVENT)
     at gdb/inf-loop.c:41
 #7  0x00000000007a702f in handle_target_event (error=0, client_data=0x0)
     at gdb/linux-nat.c:4207
 #8  0x0000000000b8cd6e in gdb_wait_for_event (block=block@entry=0)
     at gdbsupport/event-loop.cc:701
 #9  0x0000000000b8d032 in gdb_wait_for_event (block=0)
     at gdbsupport/event-loop.cc:597
 #10 gdb_do_one_event () at gdbsupport/event-loop.cc:212
 #11 0x00000000007cf512 in start_event_loop ()
     at gdb/main.c:421
 #12 0x00000000007cf631 in captured_command_loop ()
     at gdb/main.c:481
 #13 0x00000000007d0ebf in captured_main (data=0x7fffffffd3f0)
     at gdb/main.c:1353
 #14 0x00000000007d0f25 in gdb_main (args=0x7fffffffd3f0)
     at gdb/main.c:1368
 #15 0x00000000004186dd in main (argc=5, argv=0x7fffffffd508)
     at gdb/gdb.c:32

This relatively short backtrace is due to the current UI's async field
being set to 1.

Yet another thing to be aware of regarding this problem is the
difference in the way that commands associated to dprintf breakpoints
versus regular breakpoints are handled.  While they both use a command
list associated with the breakpoint, regular breakpoints will place
the commands to be run on the bpstat chain constructed in
bp_stop_status().  These commands are run later on.  For dprintf
breakpoints, commands are run via the 'after_condition_true' function
pointer directly from bpstat_stop_status().  (The 'commands' field in
the bpstat is cleared in dprintf_after_condition_true().  This
prevents the dprintf commands from being run again later on when other
commands on the bpstat chain are processed.)

Another thing that I noticed is that dprintf breakpoints are the only
type of breakpoint which use 'after_condition_true'.  This suggests
that one possible way of fixing this problem, that of making dprintf
breakpoints work more like regular breakpoints, probably won't work.
(I must admit, however, that my understanding of this code isn't
complete enough to say why.  I'll trust that whoever implemented it
had a good reason for doing it this way.)

The comment referenced earlier regarding 'executing_breakpoint_commands'
states that the reason for checking this variable is to avoid
potential endless recursion when a 'source' command appears in
bs->commands.  We know that a dprintf command is constrained to either
1) execution of a GDB printf command, 2) an inferior function call of
a printf-like function, or 3) execution of an agent-printf command.
Therefore, infinite recursion due to a 'source' command cannot happen
when executing commands upon hitting a dprintf breakpoint.

I chose to fix this problem by having dprintf_after_condition_true()
directly call execute_control_commands().  This means that it no
longer attempts to go through bpstat_do_actions_1() avoiding the
infinite recursion check for potential 'source' commands on the
command chain.  I think it simplifies this code a little bit too, a
definite bonus.

Summary:

	* breakpoint.c (dprintf_after_condition_true): Don't call
	bpstat_do_actions_1().  Call execute_control_commands()
	instead.
lmoriche pushed a commit that referenced this issue Feb 10, 2022
With a gdb build with CFLAGS "-O2 -g -flto=auto", I run into:
...
 #7  gdb_main (args=0x7fffffffd220) at src/gdb/main.c:1368^M
 #8  main (argc=<optimized out>, argv=<optimized out>) at src/gdb/gdb.c:32^M
 (gdb) FAIL: gdb.gdb/selftest.exp: backtrace through signal handler
...
which means that this regexp in proc test_with_self fails:
...
  -re "#0.*(read|poll).*in main \\(.*\\) at .*gdb\\.c.*$gdb_prompt $" {
...

The problem is that gdb_main has been inlined into main, and consequently the
backtrace uses:
...
 #x  <fn> ...
...
instead of
...
 #x  <address> in <fn> ...
...

Fix this by updating the regexp to not require "in" before " main".

Tested on x86_64-linux.
lmoriche pushed a commit that referenced this issue Mar 30, 2022
Running gdb.rocm/lane-pc-vega20.exp with -D_GLIBCXX_DEBUG=1 enabled, I
hit this internal error:

/home/smarchi/src/wt/amd/gdb/../gdbsupport/array-view.h:190: internal-error: slice: Assertion `start + size <= m_size' failed.

at:

    #0  internal_error (file=0x5555577495e0 "/home/smarchi/src/wt/amd/gdb/../gdbsupport/array-view.h", line=190, fmt=0x55555773d5e0 "%s: Assertion `%s' failed.") at /home/smarchi/src/wt/amd/gdbsupport/errors.cc:51
    #1  0x0000555555b53350 in gdb::array_view<unsigned char>::slice (this=0x7fffffffc2a0, start=8, size=128) at /home/smarchi/src/wt/amd/gdb/../gdbsupport/array-view.h:190
    #2  0x000055555722909c in value_contents_copy_raw (dst=0x6110001d3700, dst_offset=0, src=0x6110001d2f80, src_offset=8, src_bit_offset=0, length=128) at /home/smarchi/src/wt/amd/gdb/value.c:1355
    #3  0x0000555557233571 in value_primitive_field (arg1=0x6110001d2f80, offset=8, fieldno=2, arg_type=0x621000c3ba40) at /home/smarchi/src/wt/amd/gdb/value.c:3142
    #4  0x00005555560ff905 in cp_print_value_fields (val=0x6110001d2f80, stream=0x603000044950, recurse=0, options=0x7fffffffcab0, dont_print_vb=0x0, dont_print_statmem=0) at /home/smarchi/src/wt/amd/gdb/cp-valprint.c:333
    #5  0x0000555555f69189 in c_value_print_struct (val=0x6110001d2f80, stream=0x603000044950, recurse=0, options=0x7fffffffcab0) at /home/smarchi/src/wt/amd/gdb/c-valprint.c:383
    #6  0x0000555555f69878 in c_value_print_inner (val=0x6110001d2f80, stream=0x603000044950, recurse=0, options=0x7fffffffcab0) at /home/smarchi/src/wt/amd/gdb/c-valprint.c:438
    #7  0x00005555566a87f4 in language_defn::value_print_inner (this=0x5555587ed980 <opencl_language_defn>, val=0x6110001d2f80, stream=0x603000044950, recurse=0, options=0x7fffffffcab0) at /home/smarchi/src/wt/amd/gdb/language.c:632
    #8  0x0000555557211916 in do_val_print (value=0x6110001d2f80, stream=0x603000044950, recurse=0, options=0x7fffffffcc50, language=0x5555587ed980 <opencl_language_defn>) at /home/smarchi/src/wt/amd/gdb/valprint.c:1049
    #9  0x0000555557212286 in common_val_print (val=0x6110001d2f80, stream=0x603000044950, recurse=0, options=0x7fffffffcc50, language=0x5555587ed980 <opencl_language_defn>) at /home/smarchi/src/wt/amd/gdb/valprint.c:1152
    #10 0x0000555555f6a387 in c_value_print (val=0x6110001d2f80, stream=0x603000044950, options=0x7fffffffcf00) at /home/smarchi/src/wt/amd/gdb/c-valprint.c:587
    #11 0x00005555566a8799 in language_defn::value_print (this=0x5555587ed980 <opencl_language_defn>, val=0x6110001d2f80, stream=0x603000044950, options=0x7fffffffcf00) at /home/smarchi/src/wt/amd/gdb/language.c:614
    #12 0x0000555557212522 in value_print (val=0x6110001d2f80, stream=0x603000044950, options=0x7fffffffcf00) at /home/smarchi/src/wt/amd/gdb/valprint.c:1190
    #13 0x00005555569ac849 in print_formatted (val=0x6110001d2f80, size=0, options=0x7fffffffcf00, stream=0x603000044950) at /home/smarchi/src/wt/amd/gdb/printcmd.c:338
    #14 0x00005555569b2126 in print_value (val=0x6110001d2f80, opts=...) at /home/smarchi/src/wt/amd/gdb/printcmd.c:1262
    #15 0x00005555569b2a16 in print_command_1 (args=0x7fffffffe52c "const_struct", voidprint=1) at /home/smarchi/src/wt/amd/gdb/printcmd.c:1371
    #16 0x00005555569b32a3 in print_command (exp=0x7fffffffe52c "const_struct", from_tty=1) at /home/smarchi/src/wt/amd/gdb/printcmd.c:1462
    #17 0x0000555555fadffb in do_simple_func (args=0x7fffffffe52c "const_struct", from_tty=1, c=0x61200005ddc0) at /home/smarchi/src/wt/amd/gdb/cli/cli-decode.c:95
    #18 0x0000555555fbb9b4 in cmd_func (cmd=0x61200005ddc0, args=0x7fffffffe52c "const_struct", from_tty=1) at /home/smarchi/src/wt/amd/gdb/cli/cli-decode.c:2490
    #19 0x0000555556fd00b0 in execute_command (p=0x7fffffffe537 "t", from_tty=1) at /home/smarchi/src/wt/amd/gdb/top.c:670
    #20 0x00005555567cee87 in catch_command_errors (command=0x555556fcf25c <execute_command(char const*, int)>, arg=0x7fffffffe526 "print const_struct", from_tty=1, do_bp_actions=true) at /home/smarchi/src/wt/amd/gdb/main.c:523
    #21 0x00005555567cf543 in execute_cmdargs (cmdarg_vec=0x7fffffffda80, file_type=CMDARG_FILE, cmd_type=CMDARG_COMMAND, ret=0x7fffffffd740) at /home/smarchi/src/wt/amd/gdb/main.c:618
    #22 0x00005555567d2b50 in captured_main_1 (context=0x7fffffffdf20) at /home/smarchi/src/wt/amd/gdb/main.c:1317
    #23 0x00005555567d31bb in captured_main (data=0x7fffffffdf20) at /home/smarchi/src/wt/amd/gdb/main.c:1338
    #24 0x00005555567d325c in gdb_main (args=0x7fffffffdf20) at /home/smarchi/src/wt/amd/gdb/main.c:1363
    #25 0x0000555555aa654d in main (argc=19, argv=0x7fffffffe098) at /home/smarchi/src/wt/amd/gdb/gdb.c:32

The root cause is that the hand-written DWARF in lane-pc-vega20.exp says
the test_struct is 133 (0x85) bytes long, when in reality it's 136
(0x88) bytes long.  When fetching the last member, GDB tries to create
an array_view slice that extends to the 136th byte of the value, but its
value is only 133 bytes long.

Change the byte size of test_struct in the test to reflect the actual
size.  I saw the same mistake in
gdb.rocm/aspace-watchpoint-src-vega20.exp as well, so I fixed it there
too, although I did not see that test fail.

Change-Id: Ia8bc78a7260d1b0657bd41d25667606fd3967181
lmoriche pushed a commit that referenced this issue Jul 31, 2022
Fedora Rawhide is now using gcc-12.0.  As part of updating to the
gcc-12.0 package set, Rawhide is also now using a version of libgcc_s
which lacks a .data section.  This causes gdb to fail in the following
fashion while debugging a program (such as gdb) which uses libgcc_s:

    (top-gdb) run
    Starting program: rawhide-master/bld/gdb/gdb
    ...
    objfiles.h:467: internal-error: sect_index_data not initialized
    A problem internal to GDB has been detected,
    further debugging may prove unreliable.
    ...

I snipped the backtrace from the above output.  Instead, here's a
portion of a backtrace obtained using GDB's backtrace command.
(Obviously, in order to obtain it, I used a GDB which has been patched
with this commit.)

    #0  internal_error (
	file=0xc6a508 "gdb/objfiles.h", line=467,
	fmt=0xc6a4e8 "sect_index_data not initialized")
	at gdbsupport/errors.cc:51
    #1  0x00000000005f9651 in objfile::data_section_offset (this=0x4fa48f0)
	at gdb/objfiles.h:467
    #2  0x000000000097c5f8 in relocate_address (address=0x17244, objfile=0x4fa48f0)
	at gdb/stap-probe.c:1333
    #3  0x000000000097c630 in stap_probe::get_relocated_address (this=0xa1a17a0,
	objfile=0x4fa48f0)
	at gdb/stap-probe.c:1341
    #4  0x00000000004d7025 in create_exception_master_breakpoint_probe (
	objfile=0x4fa48f0)
	at gdb/breakpoint.c:3505
    #5  0x00000000004d7426 in create_exception_master_breakpoint ()
	at gdb/breakpoint.c:3575
    #6  0x00000000004efcc1 in breakpoint_re_set ()
	at gdb/breakpoint.c:13407
    #7  0x0000000000956998 in solib_add (pattern=0x0, from_tty=0, readsyms=1)
	at gdb/solib.c:1001
    #8  0x00000000009576a8 in handle_solib_event ()
	at gdb/solib.c:1269
    ...

The function 'relocate_address' in gdb/stap-probe.c attempts to do
its "relocation" by using objfile->data_section_offset().  That
method, data_section_offset() is defined as follows in objfiles.h:

  CORE_ADDR data_section_offset () const
  {
    return section_offsets[SECT_OFF_DATA (this)];
  }

The internal error occurs when the SECT_OFF_DATA macro finds that the
'sect_index_data' field is -1:

    #define SECT_OFF_DATA(objfile) \
	 ((objfile->sect_index_data == -1) \
	  ? (internal_error (__FILE__, __LINE__, \
			     _("sect_index_data not initialized")), -1)	\
	  : objfile->sect_index_data)

relocate_address() is obtaining the section offset in order to compute
a relocated address.  For some ABIs, such as the System V ABI, the
section offsets will all be the same.  So for those ABIs, it doesn't
matter which offset is used.  However, other ABIs, such as the FDPIC
ABI, will have different offsets for the various sections.  Thus, for
those ABIs, it is vital that this and other relocation code use the
correct offset.

In stap_probe::get_relocated_address, the address to which to add the
offset (thus forming the relocated address) is obtained via
this->get_address (); get_address is a getter for m_address in
probe.h.  It's documented/defined as follows (also in probe.h):

  /* The address where the probe is inserted, relative to
     SECT_OFF_TEXT.  */
  CORE_ADDR m_address;

(Thanks to Tom Tromey for this observation.)

So, based on this, the current use of data_section_offset /
SECT_OFF_DATA is wrong.  This relocation code should have been using
text_section_offset / SECT_OFF_TEXT all along.  That being the
case, I've adjusted the stap-probe.c relocation code accordingly.

Searching the sources turned up one other use of data_section_offset,
in gdb/dtrace-probe.c, so I've updated that code as well.  The same
reasoning presented above applies to this case too.

Summary:

	* gdb/dtrace-probe.c (dtrace_probe::get_relocated_address):
	Use method text_section_offset instead of data_section_offset.
	* gdb/stap-probe.c (relocate_address): Likewise.
lmoriche pushed a commit that referenced this issue Jul 31, 2022
g++ 11.1.0 has a bug where it will emit a negative
DW_AT_data_member_location in some cases:

    $ cat test.cpp
    #include <memory>

    int
    main()
    {
      std::unique_ptr<int> ptr;
    }
    $ g++ -g test.cpp
    $ llvm-dwarfdump -F a.out
    ...
    0x00000964:       DW_TAG_member
                        DW_AT_name [DW_FORM_strp]   ("_M_head_impl")
                        DW_AT_decl_file [DW_FORM_data1]     ("/usr/include/c++/11.1.0/tuple")
                        DW_AT_decl_line [DW_FORM_data1]     (125)
                        DW_AT_decl_column [DW_FORM_data1]   (0x27)
                        DW_AT_type [DW_FORM_ref4]   (0x0000067a "default_delete<int>")
                        DW_AT_data_member_location [DW_FORM_sdata]  (-1)
    ...

This leads to a GDB crash (when built with ASan, otherwise probably
garbage results), since it tries to read just before (to the left, in
ASan speak) of the value's buffer:

    ==888645==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020000c52af at pc 0x7f711b239f4b bp 0x7fff356bd470 sp 0x7fff356bcc18
    READ of size 1 at 0x6020000c52af thread T0
        #0 0x7f711b239f4a in __interceptor_memcpy /build/gcc/src/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827
        #1 0x555c4977efa1 in value_contents_copy_raw /home/simark/src/binutils-gdb/gdb/value.c:1347
        #2 0x555c497909cd in value_primitive_field(value*, long, int, type*) /home/simark/src/binutils-gdb/gdb/value.c:3126
        #3 0x555c478f2eaa in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:333
        #4 0x555c478f63b2 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:513
        #5 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161
        #6 0x555c478f63b2 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:513
        #7 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161
        #8 0x555c478f63b2 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:513
        #9 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161
        #10 0x555c4760d45f in c_value_print_struct /home/simark/src/binutils-gdb/gdb/c-valprint.c:383
        #11 0x555c4760df4c in c_value_print_inner(value*, ui_file*, int, value_print_options const*) /home/simark/src/binutils-gdb/gdb/c-valprint.c:438
        #12 0x555c483ff9a7 in language_defn::value_print_inner(value*, ui_file*, int, value_print_options const*) const /home/simark/src/binutils-gdb/gdb/language.c:632
        #13 0x555c49758b68 in do_val_print /home/simark/src/binutils-gdb/gdb/valprint.c:1048
        #14 0x555c49759b17 in common_val_print(value*, ui_file*, int, value_print_options const*, language_defn const*) /home/simark/src/binutils-gdb/gdb/valprint.c:1151
        #15 0x555c478f2fcb in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:335
        #16 0x555c478f63b2 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:513
        #17 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161
        #18 0x555c4760d45f in c_value_print_struct /home/simark/src/binutils-gdb/gdb/c-valprint.c:383
        #19 0x555c4760df4c in c_value_print_inner(value*, ui_file*, int, value_print_options const*) /home/simark/src/binutils-gdb/gdb/c-valprint.c:438
        #20 0x555c483ff9a7 in language_defn::value_print_inner(value*, ui_file*, int, value_print_options const*) const /home/simark/src/binutils-gdb/gdb/language.c:632
        #21 0x555c49758b68 in do_val_print /home/simark/src/binutils-gdb/gdb/valprint.c:1048
        #22 0x555c49759b17 in common_val_print(value*, ui_file*, int, value_print_options const*, language_defn const*) /home/simark/src/binutils-gdb/gdb/valprint.c:1151
        #23 0x555c478f2fcb in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:335
        #24 0x555c4760d45f in c_value_print_struct /home/simark/src/binutils-gdb/gdb/c-valprint.c:383
        #25 0x555c4760df4c in c_value_print_inner(value*, ui_file*, int, value_print_options const*) /home/simark/src/binutils-gdb/gdb/c-valprint.c:438
        #26 0x555c483ff9a7 in language_defn::value_print_inner(value*, ui_file*, int, value_print_options const*) const /home/simark/src/binutils-gdb/gdb/language.c:632
        #27 0x555c49758b68 in do_val_print /home/simark/src/binutils-gdb/gdb/valprint.c:1048
        #28 0x555c49759b17 in common_val_print(value*, ui_file*, int, value_print_options const*, language_defn const*) /home/simark/src/binutils-gdb/gdb/valprint.c:1151
        #29 0x555c4760f04c in c_value_print(value*, ui_file*, value_print_options const*) /home/simark/src/binutils-gdb/gdb/c-valprint.c:587
        #30 0x555c483ff954 in language_defn::value_print(value*, ui_file*, value_print_options const*) const /home/simark/src/binutils-gdb/gdb/language.c:614
        #31 0x555c49759f61 in value_print(value*, ui_file*, value_print_options const*) /home/simark/src/binutils-gdb/gdb/valprint.c:1189
        #32 0x555c48950f70 in print_formatted /home/simark/src/binutils-gdb/gdb/printcmd.c:337
        #33 0x555c48958eda in print_value(value*, value_print_options const&) /home/simark/src/binutils-gdb/gdb/printcmd.c:1258
        #34 0x555c48959891 in print_command_1 /home/simark/src/binutils-gdb/gdb/printcmd.c:1367
        #35 0x555c4895a3df in print_command /home/simark/src/binutils-gdb/gdb/printcmd.c:1458
        #36 0x555c4767f974 in do_simple_func /home/simark/src/binutils-gdb/gdb/cli/cli-decode.c:97
        #37 0x555c47692e25 in cmd_func(cmd_list_element*, char const*, int) /home/simark/src/binutils-gdb/gdb/cli/cli-decode.c:2475
        #38 0x555c4936107e in execute_command(char const*, int) /home/simark/src/binutils-gdb/gdb/top.c:670
        #39 0x555c485f1bff in catch_command_errors /home/simark/src/binutils-gdb/gdb/main.c:523
        #40 0x555c485f249c in execute_cmdargs /home/simark/src/binutils-gdb/gdb/main.c:618
        #41 0x555c485f6677 in captured_main_1 /home/simark/src/binutils-gdb/gdb/main.c:1317
        #42 0x555c485f6c83 in captured_main /home/simark/src/binutils-gdb/gdb/main.c:1338
        #43 0x555c485f6d65 in gdb_main(captured_main_args*) /home/simark/src/binutils-gdb/gdb/main.c:1363
        #44 0x555c46e41ba8 in main /home/simark/src/binutils-gdb/gdb/gdb.c:32
        #45 0x7f71198bcb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
        #46 0x555c46e4197d in _start (/home/simark/build/binutils-gdb-one-target/gdb/gdb+0x77f197d)

    0x6020000c52af is located 1 bytes to the left of 8-byte region [0x6020000c52b0,0x6020000c52b8)
    allocated by thread T0 here:
        #0 0x7f711b2b7459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
        #1 0x555c470acdc9 in xcalloc /home/simark/src/binutils-gdb/gdb/alloc.c:100
        #2 0x555c49b775cd in xzalloc(unsigned long) /home/simark/src/binutils-gdb/gdbsupport/common-utils.cc:29
        #3 0x555c4977bdeb in allocate_value_contents /home/simark/src/binutils-gdb/gdb/value.c:1029
        #4 0x555c4977be25 in allocate_value(type*) /home/simark/src/binutils-gdb/gdb/value.c:1040
        #5 0x555c4979030d in value_primitive_field(value*, long, int, type*) /home/simark/src/binutils-gdb/gdb/value.c:3092
        #6 0x555c478f6280 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:501
        #7 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161
        #8 0x555c478f63b2 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:513
        #9 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161
        #10 0x555c478f63b2 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:513
        #11 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161
        #12 0x555c4760d45f in c_value_print_struct /home/simark/src/binutils-gdb/gdb/c-valprint.c:383
        #13 0x555c4760df4c in c_value_print_inner(value*, ui_file*, int, value_print_options const*) /home/simark/src/binutils-gdb/gdb/c-valprint.c:438
        #14 0x555c483ff9a7 in language_defn::value_print_inner(value*, ui_file*, int, value_print_options const*) const /home/simark/src/binutils-gdb/gdb/language.c:632
        #15 0x555c49758b68 in do_val_print /home/simark/src/binutils-gdb/gdb/valprint.c:1048
        #16 0x555c49759b17 in common_val_print(value*, ui_file*, int, value_print_options const*, language_defn const*) /home/simark/src/binutils-gdb/gdb/valprint.c:1151
        #17 0x555c478f2fcb in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:335
        #18 0x555c478f63b2 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:513
        #19 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161
        #20 0x555c4760d45f in c_value_print_struct /home/simark/src/binutils-gdb/gdb/c-valprint.c:383
        #21 0x555c4760df4c in c_value_print_inner(value*, ui_file*, int, value_print_options const*) /home/simark/src/binutils-gdb/gdb/c-valprint.c:438
        #22 0x555c483ff9a7 in language_defn::value_print_inner(value*, ui_file*, int, value_print_options const*) const /home/simark/src/binutils-gdb/gdb/language.c:632
        #23 0x555c49758b68 in do_val_print /home/simark/src/binutils-gdb/gdb/valprint.c:1048
        #24 0x555c49759b17 in common_val_print(value*, ui_file*, int, value_print_options const*, language_defn const*) /home/simark/src/binutils-gdb/gdb/valprint.c:1151
        #25 0x555c478f2fcb in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:335
        #26 0x555c4760d45f in c_value_print_struct /home/simark/src/binutils-gdb/gdb/c-valprint.c:383
        #27 0x555c4760df4c in c_value_print_inner(value*, ui_file*, int, value_print_options const*) /home/simark/src/binutils-gdb/gdb/c-valprint.c:438
        #28 0x555c483ff9a7 in language_defn::value_print_inner(value*, ui_file*, int, value_print_options const*) const /home/simark/src/binutils-gdb/gdb/language.c:632
        #29 0x555c49758b68 in do_val_print /home/simark/src/binutils-gdb/gdb/valprint.c:1048

Since there are some binaries with this in the wild, I think it would be
useful for GDB to work around this.  I did the obvious simple thing, if
the DW_AT_data_member_location's value is -1, replace it with 0.  I
added a producer check to only apply this fixup for GCC 11.  The idea is
that if some other compiler ever uses a DW_AT_data_member_location value
of -1 by mistake, we don't know (before analyzing the bug at least) if
they did mean 0 or some other value.  So I wouldn't want to apply the
fixup in that case.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28063
Change-Id: Ieef3459b0b9bbce8bdad838ba83b4b64e7269d42
lmoriche pushed a commit that referenced this issue Jul 31, 2022
Starting with commit

  commit 1da5d0e
  Date:   Tue Jan 4 08:02:24 2022 -0700

    Change how Python architecture and language are handled

we see a failure in gdb.threads/killed-outside.exp:

  ...
  Executing on target: kill -9 16622    (timeout = 300)
  builtin_spawn -ignore SIGHUP kill -9 16622
  continue
  Continuing.
  Couldn't get registers: No such process.
  (gdb) [Thread 0x7ffff77c2700 (LWP 16626) exited]

  Program terminated with signal SIGKILL, Killed.
  The program no longer exists.
  FAIL: gdb.threads/killed-outside.exp: prompt after first continue (timeout)

This is not a regression but a failure due to a change in GDB's
output.  Prior to the aforementioned commit, GDB has been printing the
"Couldn't get registers: No such process." message twice.  The second
one came from

  (top-gdb) bt
  #0  amd64_linux_nat_target::fetch_registers (this=0x555557f31440 <the_amd64_linux_nat_target>, regcache=0x555558805ce0, regnum=16) at /gdb-up/gdb/amd64-linux-nat.c:225
  #1  0x000055555640ac5f in target_ops::fetch_registers (this=0x555557d636d0 <the_thread_db_target>, arg0=0x555558805ce0, arg1=16) at /gdb-up/gdb/target-delegates.c:502
  #2  0x000055555641a647 in target_fetch_registers (regcache=0x555558805ce0, regno=16) at /gdb-up/gdb/target.c:3945
  #3  0x0000555556278e68 in regcache::raw_update (this=0x555558805ce0, regnum=16) at /gdb-up/gdb/regcache.c:587
  #4  0x0000555556278f14 in readable_regcache::raw_read (this=0x555558805ce0, regnum=16, buf=0x555558881950 "") at /gdb-up/gdb/regcache.c:601
  #5  0x00005555562792aa in readable_regcache::cooked_read (this=0x555558805ce0, regnum=16, buf=0x555558881950 "") at /gdb-up/gdb/regcache.c:690
  #6  0x000055555627965e in readable_regcache::cooked_read_value (this=0x555558805ce0, regnum=16) at /gdb-up/gdb/regcache.c:748
  #7  0x0000555556352a37 in sentinel_frame_prev_register (this_frame=0x555558181090, this_prologue_cache=0x5555581810a8, regnum=16) at /gdb-up/gdb/sentinel-frame.c:53
  #8  0x0000555555fa4773 in frame_unwind_register_value (next_frame=0x555558181090, regnum=16) at /gdb-up/gdb/frame.c:1235
  #9  0x0000555555fa420d in frame_register_unwind (next_frame=0x555558181090, regnum=16, optimizedp=0x7fffffffd570, unavailablep=0x7fffffffd574, lvalp=0x7fffffffd57c, addrp=0x7fffffffd580,
      realnump=0x7fffffffd578, bufferp=0x7fffffffd5b0 "") at /gdb-up/gdb/frame.c:1143
  #10 0x0000555555fa455f in frame_unwind_register (next_frame=0x555558181090, regnum=16, buf=0x7fffffffd5b0 "") at /gdb-up/gdb/frame.c:1199
  #11 0x00005555560178e2 in i386_unwind_pc (gdbarch=0x5555587c4a70, next_frame=0x555558181090) at /gdb-up/gdb/i386-tdep.c:1972
  #12 0x0000555555cd2b9d in gdbarch_unwind_pc (gdbarch=0x5555587c4a70, next_frame=0x555558181090) at /gdb-up/gdb/gdbarch.c:3007
  #13 0x0000555555fa3a5b in frame_unwind_pc (this_frame=0x555558181090) at /gdb-up/gdb/frame.c:948
  #14 0x0000555555fa7621 in get_frame_pc (frame=0x555558181160) at /gdb-up/gdb/frame.c:2572
  #15 0x0000555555fa7706 in get_frame_address_in_block (this_frame=0x555558181160) at /gdb-up/gdb/frame.c:2602
  #16 0x0000555555fa77d0 in get_frame_address_in_block_if_available (this_frame=0x555558181160, pc=0x7fffffffd708) at /gdb-up/gdb/frame.c:2665
  #17 0x0000555555fa5f8d in select_frame (fi=0x555558181160) at /gdb-up/gdb/frame.c:1890
  #18 0x0000555555fa5bab in lookup_selected_frame (a_frame_id=..., frame_level=-1) at /gdb-up/gdb/frame.c:1720
  #19 0x0000555555fa5e47 in get_selected_frame (message=0x0) at /gdb-up/gdb/frame.c:1810
  #20 0x0000555555cc9c6e in get_current_arch () at /gdb-up/gdb/arch-utils.c:848
  #21 0x000055555625b239 in gdbpy_before_prompt_hook (extlang=0x555557451f20 <extension_language_python>, current_gdb_prompt=0x555557f4d890 <top_prompt+16> "(gdb) ")
      at /gdb-up/gdb/python/python.c:1063
  #22 0x0000555555f7cfbb in ext_lang_before_prompt (current_gdb_prompt=0x555557f4d890 <top_prompt+16> "(gdb) ") at /gdb-up/gdb/extension.c:922
  #23 0x0000555555f7d442 in std::_Function_handler<void (char const*), void (*)(char const*)>::_M_invoke(std::_Any_data const&, char const*&&) (__functor=...,
      __args#0=@0x7fffffffd900: 0x555557f4d890 <top_prompt+16> "(gdb) ") at /usr/include/c++/7/bits/std_function.h:316
  #24 0x0000555555f752dd in std::function<void (char const*)>::operator()(char const*) const (this=0x55555817d838, __args#0=0x555557f4d890 <top_prompt+16> "(gdb) ")
      at /usr/include/c++/7/bits/std_function.h:706
  #25 0x0000555555f75100 in gdb::observers::observable<char const*>::notify (this=0x555557f49060 <gdb::observers::before_prompt>, args#0=0x555557f4d890 <top_prompt+16> "(gdb) ")
      at /gdb-up/gdb/../gdbsupport/observable.h:150
  #26 0x0000555555f736dc in top_level_prompt () at /gdb-up/gdb/event-top.c:444
  #27 0x0000555555f735ba in display_gdb_prompt (new_prompt=0x0) at /gdb-up/gdb/event-top.c:411
  #28 0x00005555564611a7 in tui_on_command_error () at /gdb-up/gdb/tui/tui-interp.c:205
  #29 0x0000555555c2173f in std::_Function_handler<void (), void (*)()>::_M_invoke(std::_Any_data const&) (__functor=...) at /usr/include/c++/7/bits/std_function.h:316
  #30 0x0000555555e10c20 in std::function<void ()>::operator()() const (this=0x5555580f9028) at /usr/include/c++/7/bits/std_function.h:706
  #31 0x0000555555e10973 in gdb::observers::observable<>::notify() const (this=0x555557f48d20 <gdb::observers::command_error>) at /gdb-up/gdb/../gdbsupport/observable.h:150
  #32 0x00005555560e9b3f in start_event_loop () at /gdb-up/gdb/main.c:438
  #33 0x00005555560e9bcc in captured_command_loop () at /gdb-up/gdb/main.c:481
  #34 0x00005555560eb616 in captured_main (data=0x7fffffffddd0) at /gdb-up/gdb/main.c:1348
  #35 0x00005555560eb67c in gdb_main (args=0x7fffffffddd0) at /gdb-up/gdb/main.c:1363
  #36 0x0000555555c1b6b3 in main (argc=12, argv=0x7fffffffded8) at /gdb-up/gdb/gdb.c:32

Commit 1da5d0e eliminated the call to 'get_current_arch'
in 'gdbpy_before_prompt_hook'.  Hence, the second instance of
"Couldn't get registers: No such process." does not appear anymore.

Fix the failure by updating the regular expression in the test.
lmoriche pushed a commit that referenced this issue Jul 31, 2022
…ync."

Commit 14b3360 ("do_target_wait_1: Clear
TARGET_WNOHANG if the target isn't async.") broke some multi-target
tests, such as gdb.multi/multi-target-info-inferiors.exp.  The symptom
is that execution just hangs at some point.  What happens is:

1. One remote inferior is started, and now sits stopped at a breakpoint.
   It is not "async" at this point (but it "can async").

2. We run a native inferior, the event loop gets woken up by the native
   target's fd.

3. In do_target_wait, we randomly choose an inferior to call target_wait
   on first, it happens to be the remote inferior.

4. Because the target is currently not "async", we clear
   TARGET_WNOHANG, resulting in synchronous wait.  We therefore block
   here:

  #0  0x00007fe9540dbb4d in select () from /usr/lib/libc.so.6
  #1  0x000055fc7e821da7 in gdb_select (n=15, readfds=0x7ffdb77c1fb0, writefds=0x0, exceptfds=0x7ffdb77c2050, timeout=0x7ffdb77c1f90) at /home/simark/src/binutils-gdb/gdb/posix-hdep.c:31
  #2  0x000055fc7ddef905 in interruptible_select (n=15, readfds=0x7ffdb77c1fb0, writefds=0x0, exceptfds=0x7ffdb77c2050, timeout=0x7ffdb77c1f90) at /home/simark/src/binutils-gdb/gdb/event-top.c:1134
  #3  0x000055fc7eda58e4 in ser_base_wait_for (scb=0x6250002e4100, timeout=1) at /home/simark/src/binutils-gdb/gdb/ser-base.c:240
  #4  0x000055fc7eda66ba in do_ser_base_readchar (scb=0x6250002e4100, timeout=-1) at /home/simark/src/binutils-gdb/gdb/ser-base.c:365
  #5  0x000055fc7eda6ff6 in generic_readchar (scb=0x6250002e4100, timeout=-1, do_readchar=0x55fc7eda663c <do_ser_base_readchar(serial*, int)>) at /home/simark/src/binutils-gdb/gdb/ser-base.c:444
  #6  0x000055fc7eda718a in ser_base_readchar (scb=0x6250002e4100, timeout=-1) at /home/simark/src/binutils-gdb/gdb/ser-base.c:471
  #7  0x000055fc7edb1ecd in serial_readchar (scb=0x6250002e4100, timeout=-1) at /home/simark/src/binutils-gdb/gdb/serial.c:393
  #8  0x000055fc7ec48b8f in remote_target::readchar (this=0x617000038780, timeout=-1) at /home/simark/src/binutils-gdb/gdb/remote.c:9446
  #9  0x000055fc7ec4da82 in remote_target::getpkt_or_notif_sane_1 (this=0x617000038780, buf=0x6170000387a8, forever=1, expecting_notif=1, is_notif=0x7ffdb77c24f0) at /home/simark/src/binutils-gdb/gdb/remote.c:9928
  #10 0x000055fc7ec4f045 in remote_target::getpkt_or_notif_sane (this=0x617000038780, buf=0x6170000387a8, forever=1, is_notif=0x7ffdb77c24f0) at /home/simark/src/binutils-gdb/gdb/remote.c:10037
  #11 0x000055fc7ec354d4 in remote_target::wait_ns (this=0x617000038780, ptid=..., status=0x7ffdb77c33c8, options=...) at /home/simark/src/binutils-gdb/gdb/remote.c:8147
  #12 0x000055fc7ec38aa1 in remote_target::wait (this=0x617000038780, ptid=..., status=0x7ffdb77c33c8, options=...) at /home/simark/src/binutils-gdb/gdb/remote.c:8337
  #13 0x000055fc7f1409ce in target_wait (ptid=..., status=0x7ffdb77c33c8, options=...) at /home/simark/src/binutils-gdb/gdb/target.c:2612
  #14 0x000055fc7e19da98 in do_target_wait_1 (inf=0x617000038080, ptid=..., status=0x7ffdb77c33c8, options=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:3636
  #15 0x000055fc7e19e26b in operator() (__closure=0x7ffdb77c2f90, inf=0x617000038080) at /home/simark/src/binutils-gdb/gdb/infrun.c:3697
  #16 0x000055fc7e19f0c4 in do_target_wait (ecs=0x7ffdb77c33a0, options=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:3716
  #17 0x000055fc7e1a31f7 in fetch_inferior_event () at /home/simark/src/binutils-gdb/gdb/infrun.c:4061

Before the aforementioned commit, we would not have cleared
TARGET_WNOHANG, the remote target's wait would have returned nothing,
and we would have consumed the native target's event.

After applying this revert, the testsuite state looks as good as before
for me on Ubuntu 20.04 amd64.

Change-Id: Ic17a1642935cabcc16c25cb6899d52e12c2f5c3f
rocm-ci pushed a commit that referenced this issue Sep 30, 2022
…atchpoint-src-vega20}.exp

Running current llvm-dwarfdump on the kernel .so produced by
gdb.rocm/lane-pc-vega20.exp hits an assertion, like:

 /opt/rocm/llvm/bin/llvm-dwarfdump ./testsuite/outputs/gdb.rocm/lane-pc-vega20/lane-pc-vega20-kernel.so

 (... snip ...)
 0x000000b4:   DW_TAG_structure_type
		 DW_AT_name      ("test_struct")
		 DW_AT_byte_size (136)
		 DW_AT_decl_file (llvm-dwarfdump: (... snip ...)/llvm/include/llvm/ADT/Optional.h:197: T& llvm::optional_detail::OptionalStorage<T, true>::getValue() & [with T = long unsigned int]: Assertion `hasVal' failed.
 PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
 Stack dump:
 0.      Program arguments: /opt/rocm/llvm/bin/llvm-dwarfdump ./testsuite/outputs/gdb.rocm/lane-pc-vega20/lane-pc-vega20-kernel.so
  #0 0x000055cc8e78315f PrintStackTraceSignalHandler(void*) Signals.cpp:0:0
  #1 0x000055cc8e780d3d SignalHandler(int) Signals.cpp:0:0
  #2 0x00007f8f2cae8420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
  #3 0x00007f8f2c58d00b raise /build/glibc-SzIz7B/glibc-2.31/signal/../sysdeps/unix/sysv/linux/raise.c:51:1
  #4 0x00007f8f2c56c859 abort /build/glibc-SzIz7B/glibc-2.31/stdlib/abort.c:81:7
  #5 0x00007f8f2c56c729 get_sysdep_segment_value /build/glibc-SzIz7B/glibc-2.31/intl/loadmsgcat.c:509:8
  #6 0x00007f8f2c56c729 _nl_load_domain /build/glibc-SzIz7B/glibc-2.31/intl/loadmsgcat.c:970:34
  #7 0x00007f8f2c57dfd6 (/lib/x86_64-linux-gnu/libc.so.6+0x33fd6)
  #8 0x000055cc8e58ceb9 llvm::DWARFDie::dump(llvm::raw_ostream&, unsigned int, llvm::DIDumpOptions) const (/opt/rocm/llvm/bin/llvm-dwarfdump+0x2e0eb9)
  #9 0x000055cc8e58bec3 llvm::DWARFDie::dump(llvm::raw_ostream&, unsigned int, llvm::DIDumpOptions) const (/opt/rocm/llvm/bin/llvm-dwarfdump+0x2dfec3)
 #10 0x000055cc8e5b28a3 llvm::DWARFCompileUnit::dump(llvm::raw_ostream&, llvm::DIDumpOptions) (.part.21) DWARFCompileUnit.cpp:0:0

The issue here is invalid DWARF that the tool isn't expecting,
particularly, the use of a signed data format for the file index:

 [31] DW_TAG_subprogram  DW_CHILDREN_yes
	 DW_AT_name      DW_FORM_string
	 DW_AT_low_pc    DW_FORM_addr
	 DW_AT_high_pc   DW_FORM_addr
	 DW_AT_decl_file DW_FORM_sdata   << signed
	 DW_AT_decl_line DW_FORM_sdata   << signed

The DWARF spec says:

 Any debugging information entry representing the declaration of an object,
 module, subprogram or type may have DW_AT_decl_file, DW_AT_decl_line and
 DW_AT_decl_column attributes, each of whose value is an unsigned integer
                                                         ^^^^^^^^
 constant.

llvm-dwarfdump should be fixed to error out or warn on invalid input
instead of asserting, but still, we should fix the DWARF.  That's what
this commit does, by using DW_FORM_udata instead of DW_FORM_sdata, for
both DW_AT_decl_file and DW_AT_decl_line.

"git grep "decl_file.*FORM_sdata" shows that
gdb.rocm/aspace-watchpoint-src-vega20.exp has the same issue, so fix
it too.

llvm-dwarfdump is able to parse the lane-pc-vega20-kernel.so file
without crashing after this fix, and the gdb.rocm/lane-pc-vega20.exp
and gdb.rocm/aspace-watchpoint-src-vega20.exp testcases still pass
cleanly.

Change-Id: Ifa14331bbaa7c6ebbac03d534d59268e63663af4
(cherry picked from commit 42cbe2e)
rocm-ci pushed a commit that referenced this issue Aug 1, 2024
Since commit b1da98a ("gdb: remove use of alloca in
new_macro_definition"), if cached_argv is empty, we call macro_bcache
with a nullptr data.  This ends up caught by UBSan deep down in the
bcache code:

    $ ./gdb -nx -q --data-directory=data-directory  /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/macscp/macscp -readnow
    Reading symbols from /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/macscp/macscp...
    Expanding full symbols from /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/macscp/macscp...
    /home/smarchi/src/binutils-gdb/gdb/bcache.c:195:12: runtime error: null pointer passed as argument 2, which is declared to never be null

The backtrace:

    #1  0x00007ffff619a05d in __ubsan::__ubsan_handle_nonnull_arg_abort (Data=<optimized out>) at ../../../../src/libsanitizer/ubsan/ubsan_handlers.cpp:750
    #2  0x000055556337fba2 in gdb::bcache::insert (this=0x62d0000c8458, addr=0x0, length=0, added=0x0) at /home/smarchi/src/binutils-gdb/gdb/bcache.c:195
    #3  0x0000555564b49222 in gdb::bcache::insert<char const*, void> (this=0x62d0000c8458, addr=0x0, length=0, added=0x0) at /home/smarchi/src/binutils-gdb/gdb/bcache.h:158
    #4  0x0000555564b481fa in macro_bcache<char const*> (t=0x62100007ae70, addr=0x0, len=0) at /home/smarchi/src/binutils-gdb/gdb/macrotab.c:117
    #5  0x0000555564b42b4a in new_macro_definition (t=0x62100007ae70, kind=macro_function_like, special_kind=macro_ordinary, argv=std::__debug::vector of length 0, capacity 0, replacement=0x62a00003af3a "__builtin_va_arg_pack ()") at /home/smarchi/src/binutils-gdb/gdb/macrotab.c:573
    #6  0x0000555564b44674 in macro_define_internal (source=0x6210000ab9e0, line=469, name=0x7fffffffa710 "__va_arg_pack", kind=macro_function_like, special_kind=macro_ordinary, argv=std::__debug::vector of length 0, capacity 0, replacement=0x62a00003af3a "__builtin_va_arg_pack ()") at /home/smarchi/src/binutils-gdb/gdb/macrotab.c:777
    #7  0x0000555564b44ae2 in macro_define_function (source=0x6210000ab9e0, line=469, name=0x7fffffffa710 "__va_arg_pack", argv=std::__debug::vector of length 0, capacity 0, replacement=0x62a00003af3a "__builtin_va_arg_pack ()") at /home/smarchi/src/binutils-gdb/gdb/macrotab.c:816
    #8  0x0000555563f62fc8 in parse_macro_definition (file=0x6210000ab9e0, line=469, body=0x62a00003af2a "__va_arg_pack() __builtin_va_arg_pack ()") at /home/smarchi/src/binutils-gdb/gdb/dwarf2/macro.c:203

This can be reproduced by running gdb.base/macscp.exp.  Avoid calling
macro_bcache if the macro doesn't have any arguments.

Change-Id: I33b5a7c3b3a93d5adba98983fcaae9c8522c383d
@ppanchad-amd
Copy link

@MathiasMagnus Is this ticket still relevant? If not, can we close it? Thanks!

@MathiasMagnus
Copy link
Author

This is still a missing ecosystem component, though my own immediate use for this has passed (having moved on from HIP programming to standard APIs), so it's still relevant, just not for me.

rocm-ci pushed a commit that referenced this issue Sep 13, 2024
The commit:

  commit c6b4867
  Date:   Thu Mar 30 19:21:22 2023 +0100

      gdb: parse pending breakpoint thread/task immediately

Introduce a use bug where the value of a temporary variable was being
used after it had gone out of scope.  This was picked up by the
address sanitizer and would result in this error:

  (gdb) maintenance selftest create_breakpoint_parse_arg_string
  Running selftest create_breakpoint_parse_arg_string.
  =================================================================
  ==2265825==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7fbb08046511 at pc 0x000001632230 bp 0x7fff7c2fb770 sp 0x7fff7c2fb768
  READ of size 1 at 0x7fbb08046511 thread T0
      #0 0x163222f in create_breakpoint_parse_arg_string(char const*, std::unique_ptr<char, gdb::xfree_deleter<char> >*, int*, int*, int*, std::unique_ptr<char, gdb::xfree_deleter<char> >*, bool*) ../../src/gdb/break-cond-parse.c:496
      #1 0x1633026 in test ../../src/gdb/break-cond-parse.c:582
      #2 0x163391b in create_breakpoint_parse_arg_string_tests ../../src/gdb/break-cond-parse.c:649
      #3 0x12cfebc in void std::__invoke_impl<void, void (*&)()>(std::__invoke_other, void (*&)()) /usr/include/c++/13/bits/invoke.h:61
      #4 0x12cc8ee in std::enable_if<is_invocable_r_v<void, void (*&)()>, void>::type std::__invoke_r<void, void (*&)()>(void (*&)()) /usr/include/c++/13/bits/invoke.h:111
      #5 0x12c81e5 in std::_Function_handler<void (), void (*)()>::_M_invoke(std::_Any_data const&) /usr/include/c++/13/bits/std_function.h:290
      #6 0x18bb51d in std::function<void ()>::operator()() const /usr/include/c++/13/bits/std_function.h:591
      #7 0x4193ef9 in selftests::run_tests(gdb::array_view<char const* const>, bool) ../../src/gdbsupport/selftest.cc:100
      #8 0x21c2206 in maintenance_selftest ../../src/gdb/maint.c:1172
      ... etc ...

The problem was caused by three lines like this one:

  thread_info *thr
    = parse_thread_id (std::string (t.get_value ()).c_str (), &tmptok);

After parsing the thread-id TMPTOK would be left pointing into the
temporary string which had been created on this line.  When on the
next line we did this:

  gdb_assert (*tmptok == '\0');

The value of *TMPTOK is undefined.

Fix this by creating the std::string earlier in the scope.  Now the
contents of the string will remain valid when we check *TMPTOK.  The
address sanitizer issue is now resolved.
rocm-ci pushed a commit that referenced this issue Sep 13, 2024
The binary provided with bug 32165 [1] has 36139 ELF sections.  GDB
crashes on it with (note that my GDB is build with -D_GLIBCXX_DEBUG=1:

    $ ./gdb  -nx -q --data-directory=data-directory ./vmlinux
    Reading symbols from ./vmlinux...
    (No debugging symbols found in ./vmlinux)
    (gdb) info func
    /usr/include/c++/14.2.1/debug/vector:508:
    In function:
        std::debug::vector<_Tp, _Allocator>::reference std::debug::vector<_Tp,
        _Allocator>::operator[](size_type) [with _Tp = long unsigned int;
        _Allocator = std::allocator<long unsigned int>; reference = long
        unsigned int&; size_type = long unsigned int]

    Error: attempt to subscript container with out-of-bounds index -29445, but
    container only holds 36110 elements.

    Objects involved in the operation:
        sequence "this" @ 0x514000007340 {
          type = std::debug::vector<unsigned long, std::allocator<unsigned long> >;
        }

The crash occurs here:

    #3  0x00007ffff5e334c3 in __GI_abort () at abort.c:79
    #4  0x00007ffff689afc4 in __gnu_debug::_Error_formatter::_M_error (this=<optimized out>) at /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/debug.cc:1320
    #5  0x0000555561119a16 in std::__debug::vector<unsigned long, std::allocator<unsigned long> >::operator[] (this=0x514000007340, __n=18446744073709522171)
        at /usr/include/c++/14.2.1/debug/vector:508
    #6  0x0000555562e288e8 in minimal_symbol::value_address (this=0x5190000bb698, objfile=0x514000007240) at /home/smarchi/src/binutils-gdb/gdb/symtab.c:517
    #7  0x0000555562e5a131 in global_symbol_searcher::expand_symtabs (this=0x7ffff0f5c340, objfile=0x514000007240, preg=std::optional [no contained value])
        at /home/smarchi/src/binutils-gdb/gdb/symtab.c:4983
    #8  0x0000555562e5d2ed in global_symbol_searcher::search (this=0x7ffff0f5c340) at /home/smarchi/src/binutils-gdb/gdb/symtab.c:5189
    #9  0x0000555562e5ffa4 in symtab_symbol_info (quiet=false, exclude_minsyms=false, regexp=0x0, kind=FUNCTION_DOMAIN, t_regexp=0x0, from_tty=1)
        at /home/smarchi/src/binutils-gdb/gdb/symtab.c:5361
    #10 0x0000555562e6131b in info_functions_command (args=0x0, from_tty=1) at /home/smarchi/src/binutils-gdb/gdb/symtab.c:5525

That is, at this line of `minimal_symbol::value_address`, where
`objfile->section_offsets` is an `std::vector`:

    return (CORE_ADDR (this->unrelocated_address ())
	    + objfile->section_offsets[this->section_index ()]);

A section index of -29445 is suspicious.  The minimal_symbol at play
here is:

    (top-gdb) p m_name
    $1 = 0x521001de10af "_sinittext"

So I restarted debugging, breaking on:

   (top-gdb) b general_symbol_info::set_section_index if $_streq("_sinittext", m_name)

And I see that weird -29445 value:

    (top-gdb) frame
    #0  general_symbol_info::set_section_index (this=0x525000082390, idx=-29445) at /home/smarchi/src/binutils-gdb/gdb/symtab.h:611
    611       { m_section = idx; }

But going up one frame, the section index is 36091:

    (top-gdb) frame
    #1  0x0000555562426526 in minimal_symbol_reader::record_full (this=0x7ffff0ead560, name="_sinittext", copy_name=false,
        address=-2111475712, ms_type=mst_text, section=36091) at /home/smarchi/src/binutils-gdb/gdb/minsyms.c:1228
    1228      msymbol->set_section_index (section);

It seems like the problem is just that the type used for the section
index (short) is not big enough.  Change from short to int.  If somebody
insists, we could even go long long / int64_t, but I doubt it's
necessary.

With that fixed, I get:

    (gdb) info func
    All defined functions:

    Non-debugging symbols:
    0xffffffff81000000  _stext
    0xffffffff82257000  _sinittext
    0xffffffff822b4ebb  _einittext

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=32165

Change-Id: Icb1c3de9474ff5adef7e0bbbf5e0b67b279dee04
Reviewed-By: Tom de Vries <[email protected]>
Reviewed-by: Keith Seitz <[email protected]>
rocm-ci pushed a commit that referenced this issue Oct 19, 2024
When building gdb with gcc 12 and -fsanitize=threads while renabling
background dwarf reading by setting dwarf_synchronous to false, I run into:
...
(gdb) file amd64-watchpoint-downgrade
Reading symbols from amd64-watchpoint-downgrade...
(gdb) watch global_var
==================
WARNING: ThreadSanitizer: data race (pid=20124)
  Read of size 8 at 0x7b80000500d8 by main thread:
    #0 cooked_index_entry::full_name(obstack*, bool) const cooked-index.c:220
    #1 cooked_index::get_main_name(obstack*, language*) const cooked-index.c:735
    #2 cooked_index_worker::wait(cooked_state, bool) cooked-index.c:559
    #3 cooked_index::wait(cooked_state, bool) cooked-index.c:631
    #4 cooked_index_functions::wait(objfile*, bool) cooked-index.h:729
    #5 cooked_index_functions::compute_main_name(objfile*) cooked-index.h:806
    #6 objfile::compute_main_name() symfile-debug.c:461
    #7 find_main_name symtab.c:6503
    #8 main_language() symtab.c:6608
    #9 set_initial_language_callback symfile.c:1634
    #10 get_current_language() language.c:96
    ...

  Previous write of size 8 at 0x7b80000500d8 by thread T1:
    #0 cooked_index_shard::finalize(parent_map_map const*) \
         dwarf2/cooked-index.c:409
    #1 operator() cooked-index.c:663
    ...

  ...

SUMMARY: ThreadSanitizer: data race cooked-index.c:220 in \
  cooked_index_entry::full_name(obstack*, bool) const
==================
Hardware watchpoint 1: global_var
(gdb) PASS: gdb.arch/amd64-watchpoint-downgrade.exp: watch global_var
...

This was also reported in PR31715.

This is due do gcc PR110799 [1], generating wrong code with
-fhoist-adjacent-loads, and causing a false positive for
-fsanitize=threads.

Work around the gcc PR by forcing -fno-hoist-adjacent-loads for gcc <= 13
and -fsanitize=threads.

Tested in that same configuration on x86_64-linux.  Remaining ThreadSanitizer
problems are the ones reported in PR31626 (gdb.rust/dwindex.exp) and
PR32247 (gdb.trace/basic-libipa.exp).

PR gdb/31715
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31715

Tested-By: Bernd Edlinger <[email protected]>
Approved-By: Tom Tromey <[email protected]>

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799
rocm-ci pushed a commit that referenced this issue Nov 1, 2024
On Windows gcore is not implemented, and if you try it, you get an
heap-use-after-free error:

(gdb) gcore C:/gdb/build64/gdb-git-python3/gdb/testsuite/outputs/gdb.base/gcore-buffer-overflow/gcore-buffer-overflow.test
warning: cannot close "=================================================================
==10108==ERROR: AddressSanitizer: heap-use-after-free on address 0x1259ea503110 at pc 0x7ff6806e3936 bp 0x0062e01ed990 sp 0x0062e01ed140
READ of size 111 at 0x1259ea503110 thread T0
    #0 0x7ff6806e3935 in strlen C:/gcc/src/gcc-14.2.0/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:391
    #1 0x7ff6807169c4 in __pformat_puts C:/gcc/src/mingw-w64-v12.0.0/mingw-w64-crt/stdio/mingw_pformat.c:558
    #2 0x7ff6807186c1 in __mingw_pformat C:/gcc/src/mingw-w64-v12.0.0/mingw-w64-crt/stdio/mingw_pformat.c:2514
    #3 0x7ff680713614 in __mingw_vsnprintf C:/gcc/src/mingw-w64-v12.0.0/mingw-w64-crt/stdio/mingw_vsnprintf.c:41
    #4 0x7ff67f34419f in vsnprintf(char*, unsigned long long, char const*, char*) C:/msys64/mingw64/x86_64-w64-mingw32/include/stdio.h:484
    #5 0x7ff67f34419f in string_vprintf[abi:cxx11](char const*, char*) C:/gdb/src/gdb.git/gdbsupport/common-utils.cc:106
    #6 0x7ff67b37b739 in cli_ui_out::do_message(ui_file_style const&, char const*, char*) C:/gdb/src/gdb.git/gdb/cli-out.c:227
    #7 0x7ff67ce3d030 in ui_out::call_do_message(ui_file_style const&, char const*, ...) C:/gdb/src/gdb.git/gdb/ui-out.c:571
    #8 0x7ff67ce4255a in ui_out::vmessage(ui_file_style const&, char const*, char*) C:/gdb/src/gdb.git/gdb/ui-out.c:740
    #9 0x7ff67ce2c873 in ui_file::vprintf(char const*, char*) C:/gdb/src/gdb.git/gdb/ui-file.c:73
    #10 0x7ff67ce7f83d in gdb_vprintf(ui_file*, char const*, char*) C:/gdb/src/gdb.git/gdb/utils.c:1881
    #11 0x7ff67ce7f83d in vwarning(char const*, char*) C:/gdb/src/gdb.git/gdb/utils.c:181
    #12 0x7ff67f3530eb in warning(char const*, ...) C:/gdb/src/gdb.git/gdbsupport/errors.cc:33
    #13 0x7ff67baed27f in gdb_bfd_close_warning C:/gdb/src/gdb.git/gdb/gdb_bfd.c:437
    #14 0x7ff67baed27f in gdb_bfd_close_or_warn C:/gdb/src/gdb.git/gdb/gdb_bfd.c:646
    #15 0x7ff67baed27f in gdb_bfd_unref(bfd*) C:/gdb/src/gdb.git/gdb/gdb_bfd.c:739
    #16 0x7ff68094b6f2 in gdb_bfd_ref_policy::decref(bfd*) C:/gdb/src/gdb.git/gdb/gdb_bfd.h:82
    #17 0x7ff68094b6f2 in gdb::ref_ptr<bfd, gdb_bfd_ref_policy>::~ref_ptr() C:/gdb/src/gdb.git/gdbsupport/gdb_ref_ptr.h:91
    #18 0x7ff67badf4d2 in gcore_command C:/gdb/src/gdb.git/gdb/gcore.c:176

0x1259ea503110 is located 16 bytes inside of 4064-byte region [0x1259ea503100,0x1259ea5040e0)
freed by thread T0 here:
    #0 0x7ff6806b1687 in free C:/gcc/src/gcc-14.2.0/libsanitizer/asan/asan_malloc_win.cpp:90
    #1 0x7ff67f2ae807 in objalloc_free C:/gdb/src/gdb.git/libiberty/objalloc.c:187
    #2 0x7ff67d7f56e3 in _bfd_free_cached_info C:/gdb/src/gdb.git/bfd/opncls.c:247
    #3 0x7ff67d7f2782 in _bfd_delete_bfd C:/gdb/src/gdb.git/bfd/opncls.c:180
    #4 0x7ff67d7f5df9 in bfd_close_all_done C:/gdb/src/gdb.git/bfd/opncls.c:960
    #5 0x7ff67d7f62ec in bfd_close C:/gdb/src/gdb.git/bfd/opncls.c:925
    #6 0x7ff67baecd27 in gdb_bfd_close_or_warn C:/gdb/src/gdb.git/gdb/gdb_bfd.c:643
    #7 0x7ff67baecd27 in gdb_bfd_unref(bfd*) C:/gdb/src/gdb.git/gdb/gdb_bfd.c:739
    #8 0x7ff68094b6f2 in gdb_bfd_ref_policy::decref(bfd*) C:/gdb/src/gdb.git/gdb/gdb_bfd.h:82
    #9 0x7ff68094b6f2 in gdb::ref_ptr<bfd, gdb_bfd_ref_policy>::~ref_ptr() C:/gdb/src/gdb.git/gdbsupport/gdb_ref_ptr.h:91
    #10 0x7ff67badf4d2 in gcore_command C:/gdb/src/gdb.git/gdb/gcore.c:176

It happens because gdb_bfd_close_or_warn uses a bfd-internal name for
the failing-close warning, after the close is finished, and the name
already freed:

static int
gdb_bfd_close_or_warn (struct bfd *abfd)
{
  int ret;
  const char *name = bfd_get_filename (abfd);

  for (asection *sect : gdb_bfd_sections (abfd))
    free_one_bfd_section (sect);

  ret = bfd_close (abfd);

  if (!ret)
    gdb_bfd_close_warning (name,
			   bfd_errmsg (bfd_get_error ()));

  return ret;
}

Fixed by making a copy of the name for the warning.

Approved-By: Andrew Burgess <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants