-
Notifications
You must be signed in to change notification settings - Fork 164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gdbremote: Initial (and minimal) support for remote debugging #444
base: main
Are you sure you want to change the base?
Conversation
b81f461
to
2928020
Compare
The automatic tests are now implemented and are passing the CI tests (took me a couple of goes to get things running well on i686 but it's all good now). |
This is super cool, thank you! I'm happy with this functionality as a starting point. I only skimmed the code so far, so I'll have to give it a proper review in the next couple of days. BTW, I have a new guy on my team taking a stab at implementing the vmcoreinfo query packet (https://github.com/osandov/drgn/wiki/gdbstub-protocol-proposal:-linux.vmcoreinfo-query-packet) that will make it possible to support KASLR, so hopefully you'll be seeing patches for that soon. |
2928020
to
b5a2b83
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really appreciate this! I left several comments, mostly questions to help think about how to extend this further.
// gdbremote uses the same binary format as struct user_pt_reg | ||
// so we can just reuse that code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that true in general or only for some architectures?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to admit I'm not entirely sure. kgdb uses lookup tables to convert from gdb format to struct (plain, ordinary, not-user) pt_reg. Hopefully we do that for a reason although I haven't, as yet, gone digging to find out why.
""" | ||
Set the program to the specificed elffile and connect to a gdbserver. | ||
|
||
:param conn: gdb connection string (e.g. localhost:2345) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll likely also want to support serial devices and Unix socket domain sockets natively in the future. This interface should still be sufficient, since we could differentiate by:
- If
conn
starts with/
or.
, it must be a path, and we can check whether the path is a socket or a character device. - If
conn
looks like ahost:port
, assume it is a TCP connection. If a user has a path that looks like ahost:port
, they can prefix it with./
to disambiguate it. - Otherwise, assume it is a path.
(Nothing for you to act on here, I'm just thinking ahead.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'd have to double check but gdb's heuristic looks even simpler than that!
prog->main_thread = | ||
drgn_thread_set_search(&prog->thread_set, &thread.tid).entry; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the thread set is the best fit for this, since the available threads are dynamic based on whatever the remote server tells us, right?
Another thing I'm curious about, does the remote protocol have a notion of a main or current thread? From what I can tell, you're free to start and stop threads at will via protocol commands, right?
(I'm assuming you just did it this way to make it easy for the minimal initial support, so I'm mainly asking to think about how the drgn API may need to be extended to represent remote protocol concepts.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the thread set is the best fit for this, since the available threads are dynamic based on whatever the remote server tells us, right?
I think it would only be dynamic if we put the remote side into non-stop mode. Otherwise when we are talking to the remote then the whole system is stopped. In that case I figured using the thread_set would allow us to cache things (which will probably be desirable on a 115200 baud link). Having said that if we needed to cache more aggressively (including agressive lazyiness) we would probably need a different approach.
Another thing I'm curious about, does the remote protocol have a notion of a main or current thread? From what I can tell, you're free to start and stop threads at will via protocol commands, right?
Likewise, that can happen in non-stop mode but I'd suggest the initial development focus should be for all-stop mode, if only because that "feels" much like a core dump, only with specialist memory and register options.
@@ -1131,6 +1199,9 @@ drgn_thread_iterator_destroy(struct drgn_thread_iterator *it) | |||
if (it->prog->flags & DRGN_PROGRAM_IS_LINUX_KERNEL) { | |||
drgn_object_deinit(&it->entry.object); | |||
linux_helper_task_iterator_deinit(&it->task_iter); | |||
} else if (it->prog->flags & DRGN_PROGRAM_IS_GDBREMOTE) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The various combinations of flags are starting to get out of hand, but I'll leave that as a followup for me to clean up.
Introduce the basic infrastructure to communicate with remote debuggers using the gdbremote protocol. See docs/advanced_usage.rst for both instructions on usages and a summary of the current limitations. Testing is been fairly modest but does cover pretty much all the new code paths: 1. the reported register values were compared between gdb and drgn 2. frame pointer based (fallback) stack tracing 3. x0 (argc) and x1 (argv) were checked and the pointers chased to verify that argv[0] contains the right value Signed-off-by: Daniel Thompson <[email protected]>
b5a2b83
to
cbb6112
Compare
Thanks for the review. I think I've fixed all the actionable bits. Let me know if there is anything more! |
Hi @daniel-thompson, I'm excited to see this after our discussion at LPC! I have yet to dive too far into the code. I actually wanted to take a "testing-first" approach here, so I cherry-picked your changes onto the I launched it like so:
And then tried to attach to it with:
I understand that KASLR is not supported, which is totally fine for now, but think I expected to see the ability to read from arbitrary virtual memory addresses (or physical, which I also tested to similar results). I also went ahead and exited my I'll definitely take a look into the code soon as well, though I'm not nearly the reviewer that @osandov is. But I wanted to share the above so you could try it out or point out the error in my ways :) |
I'll take a look. I suspect what you are seeing is a memory fault that hasn't been translated well (since we don't yet decode error packets). However whatever it is I'll take a look, adding error packet handling was next on my list anyway. I'm deliberately trying to keep each step as small as the next (useful) step can possibly be since I have to work on this in fits and starts. For physical access you should have received an exception since there is no support for physical memory access... but it shouldn't have been cryptic ("Cannot read from physical memory at 0").
drgn doesn't issue a detach packet on exit meaning QEMU should remain in the same run state it was when we connected (drgn also won't stop the target if it is running). Adding detach support would be trivial but it sits outside the minimum useful initial set I was aiming for (and IMHO it makes more sense once there is support for stopping the target during connect). |
Sorry, you're right, I can read valid virtual memory addresses with no issues. And if I boot with I do still see the behavior that as soon as drgn attaches to QEMU's GDB, the VM seems to be halted and it doesn't respond to any inputs. Even after I exit drgn. So I think QEMU may be automatically stopping the guest when a connection is accepted. |
Both usage and limitations are described in docs/advanced_usage.rst.
Testing is been but follows pretty much all the new code paths:
At present I have not implemented any new automatic tests. A mock gdbserver that looks up replies from a python dictionary should be fairly easy to put togther but I wanted to make what I've got available for code review first.