Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improving the DisassembleRequest logic #341

Conversation

asimgunes
Copy link
Contributor

Hi @jonahgraham ,

Here is an update for the disassembleRequest logic in the debug adapter. This implementation is covering the scenarios discussed at the #340 .

Could you please help me on reviewing this update?

Notes:
There was one issue that I was in doubt. The previous implementation was covering a scenario for the endMemoryReference argument. I couldn't find the argument in the DAP specifications. Since this argument doesn't exists on the current DAP protocol, I avoid to make an implementation for this argument. Please let me know if the argument still expected for any clients used by the cdt-gdb-adapter.

@jonahgraham
Copy link
Contributor

Please let me know if the argument still expected for any clients used by the cdt-gdb-adapter.

This was added for disassembly support in the now abandoned front end in Eclipse CDT (the one for traditional Eclipse CDT)

I don't know of any other front end using it.

(rest of review soon)

Copy link
Contributor

@jonahgraham jonahgraham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall the design here looks good - please see individual comments for more details

src/util/disassembly.ts Outdated Show resolved Hide resolved
src/integration-tests/util.spec.ts Show resolved Hide resolved
src/integration-tests/util.spec.ts Show resolved Hide resolved
src/integration-tests/diassemble.spec.ts Outdated Show resolved Hide resolved
src/integration-tests/diassemble.spec.ts Outdated Show resolved Hide resolved
@jonahgraham
Copy link
Contributor

@asimgunes FYI on eclipse-cdt-cloud/cdt-gdb-vscode#120 (comment) - if we can land this soon, I will push the update out in a new vscode version unless you recommend against or we run into any roadblocks.

@asimgunes asimgunes force-pushed the improvement/updating-disassemble-request-implementation branch from 9ffbce3 to daeea95 Compare November 29, 2024 10:47
@asimgunes
Copy link
Contributor Author

Hi @jonahgraham , I made some updates related to the feeback, could you please check it when you are suitable?

PS: tests seems failed due to timeout issues, however all passed in my test pr, I am expecting all of the tests will pass if they are triggered again.

src/util/disassembly.ts Outdated Show resolved Hide resolved
@asimgunes asimgunes force-pushed the improvement/updating-disassemble-request-implementation branch from daeea95 to c7f9185 Compare December 5, 2024 10:02
@asimgunes asimgunes force-pushed the improvement/updating-disassemble-request-implementation branch from c7f9185 to 8cd7292 Compare December 5, 2024 10:13
@asimgunes asimgunes requested a review from jonahgraham December 5, 2024 10:43
Copy link
Contributor

@jonahgraham jonahgraham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was experimenting with the change and I scrolled backwards (lower memory) for a while, then I got to unreadable memory and VSCode won't scroll backwards any further.

I see this in the log:

From client: disassemble({"memoryReference":"0x000055555555534e","offset":0,"instructionOffset":-1200,"instructionCount":50,"resolveSymbols":true})
GDB command: 38 -data-disassemble -s "(0x000055555555534e)-4800" -e "(0x000055555555534e)+0" -- 5
GDB result: 38 done,asm_insns=[{address="0x000055555555408e",opcodes="00 00",inst="add    %al,(%rax)"},{address="0x0000555555554090",opcodes="18 03",inst="sbb    %al,(%rbx)"},{address="0x0000555555554092",opcodes="00 00",inst="add    %al,(%rax)"},{address="0x0000555555554094",opcodes="00 00",inst="add    %al,(%rax)"},{address="0x0000555555554096",opcodes="00 00",inst="add    %al,(%rax)"},{address="0x0000555555554098",opcodes="1c 00",inst="sbb    $0x0,%al"},{address="0x000055555555409a",opcodes="00 00",inst="add    %al,(%rax)"},{address="0x000055555555409c",opcodes="00 00",inst="add    %al,(%rax)"},{address="0x000055555555409e",opcodes="00 00",inst="add    %al,(%rax)"},{address="0x00005555555540a0",opcodes="1c 00",inst="sbb    $0x0,%al"},{address="0x00005555555540a2",opcodes="00 00",inst="add    %al,(%rax)"},{address="0x00005555555540a4",opcodes="00 00",inst="add    %al,(%rax)"},{address="0x00005555555540a6",opcodes="00 00",inst="add    %al,(%rax)"},{address="0x00005555555540a8",opcodes="01 00",inst="
GDB -cont-: 38 add    %eax,(%rax)"},{address="0x00005555555540aa",opcodes="00 00",inst="add    %al,(%rax)"},{address="0x00005555555540ac",opcodes="00 00",inst="add    %al,(%rax)"},{address="0x00005555555540ae",opcodes="00 00",inst="add    %al,(%rax)"},{address="0x00005555555540b0",opcodes="01 00",inst="add    %eax,(%rax)"},{address="0x00005555555540b2",opcodes="00 00",inst="add    %al,(%rax)"},{address="0x00005555555540b4",opcodes="04 00",inst="add    $0x0,%al"},{address="0x00005555555540b6",opcodes="00 00",inst="add    %al,(%rax)"},{address="0x00005555555540b8",opcodes="00 00",inst="add    %al,(%rax)"},{address="0x00005555555540ba",opcodes="00 00",inst="add    %al,(%rax)"},{address="0x00005555555540bc",opcodes="00 00",inst="add    %al,(%rax)"},{address="0x00005555555540be",opcodes="00 00",inst="add    %al,(%rax)"},{address="0x00005555555540c0",opcodes="00 00",inst="add    %al,(%rax)"},{address="0x00005555555540c2",opcodes="00 00",inst="add    %al,(%rax)"},{address="0x00005555555540c4",opcodes="00 00
-- elided many more GDB -cont-: 38 lines --
GDB -cont-: 38 2eb",func-name="test_arr_md",offset="243",opcodes="8b 45 fc",inst="mov    -0x4(%rbp),%eax"},{address="0x00005555555552ee",func-name="test_arr_md",offset="246",opcodes="48 98",inst="cltq"},{address="0x00005555555552f0",func-name="test_arr_md",offset="248",opcodes="8b 55 f8",inst="mov    -0x8(%rbp),%edx"},{address="0x00005555555552f3",func-name="test_arr_md",offset="251",opcodes="48 63 d2",inst="movslq %edx,%rdx"},{address="0x00005555555552f6",func-name="test_arr_md",offset="254",opcodes="48 c1 e2 05",inst="shl    $0x5,%rdx"},{address="0x00005555555552fa",func-name="test_arr_md",offset="258",opcodes="48 01 d0",inst="add    %rdx,%rax"},{address="0x00005555555552fd",func-name="test_arr_md",offset="261",opcodes="48 8d 14 85 00 00 00 00",inst="lea    0x0(,%rax,4),%rdx"},{address="0x0000555555555305",func-name="test_arr_md",offset="269",opcodes="48 8d 05 34 cd 00 00",inst="lea    0xcd34(%rip),%rax        # 0x555555562040 <arr_md>"},{address="0x000055555555530c",func-name="test_arr_md",offset=
GDB -cont-: 38 "276",opcodes="c7 04 02 00 00 00 00",inst="movl   $0x0,(%rdx,%rax,1)"},{address="0x0000555555555313",func-name="test_arr_md",offset="283",opcodes="83 45 fc 01",inst="addl   $0x1,-0x4(%rbp)"},{address="0x0000555555555317",func-name="test_arr_md",offset="287",opcodes="83 7d fc 1f",inst="cmpl   $0x1f,-0x4(%rbp)"},{address="0x000055555555531b",func-name="test_arr_md",offset="291",opcodes="0f 8e 54 ff ff ff",inst="jle    0x555555555275 <test_arr_md+125>"},{address="0x0000555555555321",func-name="test_arr_md",offset="297",opcodes="83 45 f8 01",inst="addl   $0x1,-0x8(%rbp)"},{address="0x0000555555555325",func-name="test_arr_md",offset="301",opcodes="81 7d f8 ff 03 00 00",inst="cmpl   $0x3ff,-0x8(%rbp)"},{address="0x000055555555532c",func-name="test_arr_md",offset="308",opcodes="0f 8e 37 ff ff ff",inst="jle    0x555555555269 <test_arr_md+113>"},{address="0x0000555555555332",func-name="test_arr_md",offset="314",opcodes="8b 05 08 cd 00 00",inst="mov    0xcd08(%rip),%eax        # 0x555555562040 <
GDB -cont-: 38 arr_md>"},{address="0x0000555555555338",func-name="test_arr_md",offset="320",opcodes="83 c0 01",inst="add    $0x1,%eax"},{address="0x000055555555533b",func-name="test_arr_md",offset="323",opcodes="89 05 ff cc 00 00",inst="mov    %eax,0xccff(%rip)        # 0x555555562040 <arr_md>"},{address="0x0000555555555341",func-name="test_arr_md",offset="329",opcodes="e9 17 ff ff ff",inst="jmp    0x55555555525d <test_arr_md+101>"},{address="0x0000555555555346",func-name="main",offset="0",opcodes="f3 0f 1e fa",inst="endbr64"},{address="0x000055555555534a",func-name="main",offset="4",opcodes="55",inst="push   %rbp"},{address="0x000055555555534b",func-name="main",offset="5",opcodes="48 89 e5",inst="mov    %rsp,%rbp"}]
To client: {"seq":0,"type":"response","request_seq":32,"command":"disassemble","success":true,"body":{"instructions":[{"address":"0x000055555555483f","instructionBytes":"00 00","instruction":"add    %al,(%rax)","location":{}},{"address":"0x0000555555554841","instructionBytes":"00 00","instruction":"add    %al,(%rax)","location":{}},{"address":"0x0000555555554843","instructionBytes":"00 00","instruction":"add    %al,(%rax)","location":{}},{"address":"0x0000555555554845","instructionBytes":"00 00","instruction":"add    %al,(%rax)","location":{}},{"address":"0x0000555555554847","instructionBytes":"00 00","instruction":"add    %al,(%rax)","location":{}},{"address":"0x0000555555554849","instructionBytes":"00 00","instruction":"add    %al,(%rax)","location":{}},{"address":"0x000055555555484b","instructionBytes":"00 00","instruction":"add    %al,(%rax)","location":{}},{"address":"0x000055555555484d","instructionBytes":"00 00","instruction":"add    %al,(%rax)","location":{}},{"address":"0x000055555555484f","instructionBytes":"00 00","instruction":"add    %al,(%rax)","location":{}},{"address":"0x0000555555554851","instructionBytes":"00 00","instruction":"add    %al,(%rax)","location":{}},{"address":"0x0000555555554853","instructionBytes":"00 00","instruction":"add    %al,(%rax)","location":{}},{"address":"0x0000555555554855","instructionBytes":"00 00","instruction":"add    %al,(%rax)","location":{}},{"address":"0x0000555555554857","instructionBytes":"00 00","instruction":"add    %al,([...]
From client: disassemble({"memoryReference":"0x000055555555534e","offset":0,"instructionOffset":-1250,"instructionCount":50,"resolveSymbols":true})
GDB command: 39 -data-disassemble -s "(0x000055555555534e)-5000" -e "(0x000055555555534e)+0" -- 5
GDB result: 39 error,msg="Cannot access memory at address 0x555555553fc6"
To client: {"seq":0,"type":"response","request_seq":33,"command":"disassemble","success":true,"body":{"instructions":[]}}
From client: disassemble({"memoryReference":"0x000055555555534e","offset":0,"instructionOffset":-1250,"instructionCount":50,"resolveSymbols":true})
GDB command: 40 -data-disassemble -s "(0x000055555555534e)-5000" -e "(0x000055555555534e)+0" -- 5
GDB result: 40 error,msg="Cannot access memory at address 0x555555553fc6"
To client: {"seq":0,"type":"response","request_seq":34,"command":"disassemble","success":true,"body":{"instructions":[]}}

In here I see two issues:

  1. The -data-disassemble command keeps getting longer, it goes from ref - 4800 to ref - 0, even though instructionCount is 50.
  2. On failing to read memory, the reply to the client contains {"instructions":[]} - i.e. no instructions, rather than a bunch of invalid instructions. You have a comment Fill with empty instructions in case of memory error. so it looks like there is a special case not being handled here.

@asimgunes
Copy link
Contributor Author

Hi @jonahgraham ,

In here I see two issues:

  1. The -data-disassemble command keeps getting longer, it goes from ref - 4800 to ref - 0, even though instructionCount is 50.

This is related to instruction conversion, the request came as:

disassemble({"memoryReference":"0x000055555555534e","offset":0,"instructionOffset":-1200,"instructionCount":50,"resolveSymbols":true})

We need to have all the instructions until the memoryReference point to calculate the instructionOffset and instructionCount. Here you can say we need smaller chunks to query if thats the case, I might need to re-visit the implementation.

  1. On failing to read memory, the reply to the client contains {"instructions":[]} - i.e. no instructions, rather than a bunch of invalid instructions. You have a comment Fill with empty instructions in case of memory error. so it looks like there is a special case not being handled here.

It shouldn't happen, let me check whats missing here.

@jonahgraham
Copy link
Contributor

This is related to instruction conversion, the request came as:

disassemble({"memoryReference":"0x000055555555534e","offset":0,"instructionOffset":-1200,"instructionCount":50,"resolveSymbols":true})

We need to have all the instructions until the memoryReference point to calculate the instructionOffset and instructionCount. Here you can say we need smaller chunks to query if thats the case, I might need to re-visit the implementation.

Thanks for explaining it - the distinction between byte offset and instruction offset will probably keep tripping me (and future developers) up, but your implementation looks correct. I don't think we should bother with chunking unless/until we see some performance hit of doing these large reads.

@jonahgraham
Copy link
Contributor

It shouldn't happen, let me check whats missing here.

👍

@asimgunes asimgunes force-pushed the improvement/updating-disassemble-request-implementation branch from 560a2a7 to 83f2822 Compare December 9, 2024 11:56
@asimgunes
Copy link
Contributor Author

Hi @jonahgraham ,

I updated the logic and sent as a separate commit so you can easily track the changes. You can find the change related to the memory read and empty instruction filling in the last commit. Could you please review this update when you are available?

Additionally, it seems the Linux tests are broken but I believe this is a general issue rather than related to this update. I am leaving the tests to you but please let me know if it is not a general error and related to this update.

More details about this update:

You can spot a different approach than we discussed previously, so let me explain the new approach and its reason here:

When we read the memory, and go to the lower memory values, at some point, we came accross the invalid memory areas where we cannot disassemble the intructions. at this point if we return empty instructions, this will override previously read correct instructions. Because of this purpose, I returned back the error instead of correct number of the empty instructions. With this behaviour VSCode stops the scrolling and shows the error without killing the debug session.

I believe this is a better approach, otherwise I observed suddenly the disassembled instructions turned to empty instructions during my test session.

On the other hand, approach still contains edge cases. Since there is a conversion between the instruction offsets and memory offsets via a lineer conversion logic, in most of the times we are fetching more than we need. This is causing to hit the memory read error much more earlier than it needs. For example, DAP request came for instructionOffset: -1000, however we read memoryOffset: -4000 which contains much more instructions, for instance we read ~1500 instructions. When DAP request came for instructionOffset: -1200 we encounter memory read error while reading memoryOffset: -4800, however, as you can imagine, if we can handle the boundries better, we can still return back the 1200 instructions.

I tried some additional approaches but every different approach that I tried came with other issues. So, I left the implementation as simple as possible. Since our current appoach is a "Greedy" approach we are not keeping track of the areas which are disassembled before, in the future we may think about changing our approach with a "Dynamic Programming" approach where backtracking the successfully disassembled range and split/chunk the memory with a better approach and do not override/erase the already diassembled instructions with empty ones.

@asimgunes asimgunes requested a review from jonahgraham December 9, 2024 14:14
@jonahgraham
Copy link
Contributor

Additionally, it seems the Linux tests are broken

They are because GHA changed version of Ubuntu we are using. See #345 which is close to ready, but not quite there yet.

I'll review the code change once GHA issue is resolved - your description sounds good to me, but I want to run it up to see before I approve it.

Copy link
Contributor

@jonahgraham jonahgraham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to approve this now - I think there are still some corner cases, but seeing as many of them are probably related to vscode and the design to use negative offsets in this way,

Three particular issues I see that I think all need to be solved in vscode side, so if there is interest, someone needs to invest there.

  • vscode often seems to stop fetching instructions when scrolling, it gets worse when the adapter sends an (legitimate) error, or when source code is interleaved
  • refreshing of the view is inconsistent. I can really see it when toggling source code in the disassembly
  • the way vscode handles negative offsets is really frustrating - I think vscode should make subsequent negative offsets using the memory reference that the adapter returned, rather than the current IP.

I have merged the test changes from #345 and if I get green tick I will merge and we can discuss if anything else is wanted in future versions.

@jonahgraham jonahgraham merged commit f790bd3 into eclipse-cdt-cloud:main Dec 10, 2024
4 checks passed
@jonahgraham
Copy link
Contributor

I published this change as 1.0.3 on npm

jonahgraham added a commit to jonahgraham/cdt-gdb-vscode that referenced this pull request Dec 10, 2024
@asimgunes asimgunes deleted the improvement/updating-disassemble-request-implementation branch January 28, 2025 14:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants