Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend support for data breakpoints in VS Code API #226735

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

martin-fleck-at
Copy link

What it does

The breakpoint API in VS Code supports to add

  • Allow the user to manage data breakpoints through vscode.debug API
  • Add methods to gain information about a potential data breakpoint
  • Allow data breakpoints to have different sources
    • Keep use case where data id is already known (current)
    • Add infrastructure for already existing address resolution
    • Extend for dynamic variables resolved for a session
      • Ensure dynamic variables are resolved in the debug model

Communication:

  • Adapt DataBreakpoint with source between extension host and main
  • Expose DataBreakpoint in VS Code API, previously not exposed (should not cause any backwards compatibility issues)

Minor:

  • Make bytes optional in data bytes info, as it is the same in the DAP

Fixes #195151

How to test

To test this change, you need to make sure you have a C++ file that you can debug as the Typescript debugger does not support data breakpoints as far as I can see.

  1. Install VS Code C++ Tools
  2. Create a C++ file, e.g.,
#include <iostream>

int main() {
    int counter = 0;

    for (int i = 0; i < 10; i++) {
        counter++;
        std::cout << "Counter: " << counter << std::endl;
    }

    return 0;
}
  1. Create and start an extension (or modify an existing one that is already started with VS Code) with the code similar to this one:
vscode.debug.onDidChangeBreakpoints(evt => {
    vscode.window.showInformationMessage(JSON.stringify(evt, undefined, 2));
});

vscode.debug.onDidChangeActiveDebugSession(session => {
    setTimeout(async () => {
        try {
            const info = await session?.getDataBreakpointInfo('i', 1000);
            vscode.window.showInformationMessage(JSON.stringify(info, undefined, 2));
        } catch (error) {
            vscode.window.showErrorMessage(error.message);
        }
        try {
            const info = await session?.getDataBytesBreakpointInfo('0x7fffffffd7e8', 4);
            vscode.window.showInformationMessage(JSON.stringify(info, undefined, 2));
        } catch (error) {
            vscode.window.showErrorMessage(error.message);
        }
    }, 2000);
});

vscode.debug.addBreakpoints([
    new vscode.DataBreakpoint({ type: 'variable', dataId: '0x7fffffffd7e8,counter,4' }, 'readWrite', false, '', false, 'condition', 'hitCondition', 'logMessage'),
    new vscode.DataBreakpoint({ type: 'address', address: '0x7fffffffd7e8', bytes: 4 }, 'readWrite', false, '', false, 'condition', 'hitCondition', 'logMessage'),
    new vscode.DataBreakpoint({ type: 'dynamicVariable', name: 'i', variablesReference: 1000 }, 'readWrite', false, '', false, 'condition', 'hitCondition', 'logMessage')
]);

When starting the extension you should already see the three data breakpoints that you added:
image

  1. Enable the data breakpoints (or already produce them enabled in your extension code).
  2. Debug the program with the Debug C/C++ File command.
  3. Depending on your other setup not all breakpoints may resolve correctly but you can see that they are properly evaluated.

@martin-fleck-at
Copy link
Author

@microsoft-github-policy-service agree company="EclipseSource"

@martin-fleck-at
Copy link
Author

@thegecko FYI, I believe you have also shown interest in this issue at some point.

Copy link
Member

@roblourens roblourens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, I just took a quick look at it. How would you use this without exposing more info about variables in the extension API? What scenario do you want to enable specifically?

src/vscode-dts/vscode.d.ts Outdated Show resolved Hide resolved
* @param name The name of the variable's child to obtain data breakpoint information for. If `variablesReference` isn't specified, this can be an expression.
* @param variablesReference Reference to the variable container if the data breakpoint is requested for a child of the container.
*/
getDataBreakpointInfo(name: string, variablesReference?: number): Thenable<DataBreakpointInfo | undefined>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where would variablesReference come from? I don't think this is exposed in API elsewhere.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct that there is no API to directly retrieve the variablesReference at the moment. I made that choice to keep the API changes smaller but we could introduce such methods if we need to. The two main entry points I see that can currently be used to retrieve the reference is to:

  • Register a debug adapter factory vscode.debug.registerDebugAdapterTrackerFactory and listen to the messages that are being sent and tracking the variable information since there is no direct access to the variables through the API.
  • Call the debug adapter directly with vscode.debug.activeDebugSession.customRequest and use the Debug Adapter Protocol to access the variable information. For instance, in the C/C++ debugger, one could send an evaluate request to retrieve information about a variable if the variable name is known.

For now, I did not envision any particular support for the user in this regard but of course we could also look into extending the variable resolution area a bit more.

@roblourens roblourens requested a review from connor4312 August 31, 2024 00:41
@martin-fleck-at
Copy link
Author

@roblourens Thank you very much for your quick feedback, I really appreciate it!

Regarding the retrieval of the variable information, please see my inline comment.

As for the use cases: The original ticket describes a scenario where a data breakpoint feature is needed for a remote debugger. However, I do not have any details on that. The particular use case we are working on has to do with a dedicated memory view where we wanted to support setting data breakpoints through a context menu in our custom UI. The VS Code Debug UIs already have all the necessary access and many bits and pieces were already in place for most of the other breakpoint types. However, we hit a wall particularly with data breakpoints as they are not exposed in the extension API even though most of the functionality was already in place on the "main" side and only a few if-branches were missing on the extension host side. So the core of this change is really rather small and trying to align the support of data breakpoints with the support of other breakpoints. If you have any further questions, please let me know.

* @param name The name of the variable's child to obtain data breakpoint information for. If `variablesReference` isn't specified, this can be an expression.
* @param variablesReference Reference to the variable container if the data breakpoint is requested for a child of the container.
*/
getDataBreakpointInfo(name: string, variablesReference?: number): Thenable<DataBreakpointInfo | undefined>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this method or the one below it should be in the VS Code API. Both are trivial wrappers around DAP calls and could be made via customRequest. We try to avoid taking unnecessary knowledge about DAP into VS Code API.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fully understand your point and agree that these methods are more like convenience wrappers around customRequest calls. The downside, however, is that with custom request calls we lack access to the session's capabilities, which means we cannot perform sanity checks on the client side (like RawDebugSession does) and have to rely on the debug adapter to send us proper errors. Would exposing these capabilities also be considered exposing unnecessary internal knowledge? I believe it could be useful in other scenarios like UI enablement as well.

@@ -646,6 +646,7 @@ export interface IExceptionBreakpoint extends IBaseBreakpoint {
export const enum DataBreakpointSetType {
Variable,
Address,
DynamicVariable
Copy link
Member

@connor4312 connor4312 Sep 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like the addition of DynamicVariable is only convenience for a debug adapter to reference a variable/expression without having to round trip dataBreakpointInfo themselves.

  • This enum expresses the two sources which can be used to set a data breakpoint in DAP. A dynamic variable is not that, it's just a variable that didn't yet get translated viadataBreakpointInfo
  • DynamicVariable adds an extra type to the API and brings in more DAP knowledge into the API that is necessary to implement functionality

Copy link
Author

@martin-fleck-at martin-fleck-at Sep 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this scenario is a bit more nuanced because it involves the revival of data breakpoints on the VS Code side if they are persisted.

Retrieving the dataId via Data Breakpoint Info:

  1. Scoped Variables or Expressions
    By using name and specifying the variable container scope (variablesReference) or frame scope (frameId), the request yields a dataId that is valid only within that specific context.

  2. Address
    When using name to specify an address (by setting the asAddress flag), we get a dataId that is valid indefinitely. Additionally, we can set bytes to retrieve a memory range. This is known as a "data bytes breakpoint" in the codebase and is available only if the supportsDataBreakpointBytes capability is present.

  3. Expressions
    Using name to specify an expression without setting the asAddress flag results in a dataId that is also valid indefinitely. It’s unclear whether bytes should also be supported in this case.

In the first scenario, persisting the breakpoint likely doesn’t make sense. However, in the latter two cases, it might. Ultimately, it is up to the debug adapter to determine if a breakpoint can be persisted by returning the canPersist flag in the data breakpoint info response.

Persisted Data Breakpoints in VS Code (debugModel):

  • Type Variable: We reuse the dataId, assuming it remains consistent across sessions.
  • Type Address: We re-retrieve the dataId by requesting the data breakpoint info with the asAddress flag. It seems we assume this dataId could change across sessions.

Given this, I felt there might be a gap in supporting re-retrieval of the dataId for expressions or potentially scoped variables if the debug adapter allows them to be persisted. Probably the term DynamicVariable was a bit misleading and adding two additional types Expression and Scoped would be more useful, even though I am not sure about the scoped one being very useful as breakpoints are activated at the beginning of a session so we might want to skip that one. What are your thoughts on this? I'm very interested to hear your perspective.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed another commit that showcases the different types a little bit better.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@connor4312 What do you think of this approach?

- Allow the user to manage data breakpoints through vscode.debug API
- Add methods to gain information about a potential data breakpoint
- Allow data breakpoints to have different sources
-- Keep use case where data id is already known (current)
-- Add infrastructure for already existing address resolution
-- Extend for dynamic variables resolved for a session
--- Ensure dynamic variables are resolved in the debug model

Communication:
- Adapt DataBreakpoint with source between extension host and main
- Expose DataBreakpoint in VS Code API, previously not exposed

Minor:
- Make bytes optional in data bytes info, as it is the same in the DAP

Fixes microsoft#195151
@martin-fleck-at martin-fleck-at force-pushed the issues/195151 branch 3 times, most recently from 7e56de5 to 342b2c0 Compare September 4, 2024 14:19
- Properly extract API proposal into it's own file
- Remove data breakpoint info convenience method
- Provide proposal for different data breakpoint sources (discussion)
- Ensure that breakpoint mode is properly updated and passed through
@planger
Copy link

planger commented Oct 14, 2024

I just wanted to check in and see if there's been any progress on reviewing the PR. Let me know if there's anything I can do to help move it forward. Thank you!

@connor4312 connor4312 self-assigned this Oct 18, 2024
Copy link
Member

@connor4312 connor4312 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, and sorry for the slow review, it's been a busy few weeks.

I think there are some (partially existing) structural things we want to correct -- see my last comment. Let me know if you need help with that, I should have time during our next debt week.

}
} else {
// type === DataBreakpointSetType.Scoped
if (this.src.frameId) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any given frame ID is valid only within the context of the current debug stopped state.

Copy link
Author

@martin-fleck-at martin-fleck-at Nov 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood, I am not sure if this is still relevant after re-working the resolution mechanism a bit more.

Comment on lines 1231 to 1234
if (sessionDataId) {
this.sessionDataId.set(session, sessionDataId);
dataId = sessionDataId;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The if (sessionDataId) { is pretty repetitive, you can assign the return value of most calls to dataId and this.sessionDataId.set(session, dataId); down below

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand you now from your comment below, the dataId is actually valid across sessions anyway so I'm not sure if it still makes sense to store the data Id for each individual session anymore, what do you think?

frameId: number;
variablesReference?: never;
}
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These types are more complex than we generally like in our API. We'd probably represent these as classes that extensions could construct and pass to DataBreakpointAccessType

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand, I refactored the types as classes that can be constructed more naturally.

* @param hitCondition Expression that controls how many hits of the breakpoint are ignored.
* @param logMessage Log message to display when breakpoint is hit.
*/
constructor(source: DataBreakpointSource | string, accessType: DataBreakpointAccessType, canPersist?: boolean, label?: string, enabled?: boolean, condition?: string, hitCondition?: string, logMessage?: string);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think canPersist should be part of the type: 'variable' union, since in other cases the value of canPersist should be derived from the data breakpoint info request.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fully agree that canPersist should be part of the type/class that already knows about resolved values. I actually also added other properties such as the accessTypes which may also already be given if known.

sessionDataId = (await session.dataBytesBreakpointInfo(this.src.address, this.src.bytes))?.dataId;
if (!sessionDataId) {
return undefined;
let dataId = this.sessionDataId.get(session);
Copy link
Member

@connor4312 connor4312 Oct 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might need to revisit this pattern for breakpoints with canPersist. It kind of works for address breakpoints (minus ASLR shenanigans) but what should actually happen is that the value of canPersist gets derived from the data DataBreakpointInfoResponse and then we persist the data ID.

This breaks down more when we have sources set using frames and variable IDs. When a new session is started, or when VS Code is restarted and the breakpoints are read from storage, those frames and variable references will no longer have any meaning. We'll request them again here in toDap but that's likely to fail or point to the wrong addresses. Instead, if we've already translated them to a data ID with canPersist, we should just set them with that data ID instead.

I think we still want to keep the src in here for display purposes (and we could potentially present a "refresh" action for certain non-canPersist breakpoints within a VS Code session) but the resolution of the data ID should be done and kept at the instant the breakpoint it gets set.

This is a bit egg on my face since I started the pattern of requesting dataBytesBreakpointInfo in this method which was misleading 🙈

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a deeper look now and if I understand you correctly, you argue that the resolution of the data breakpoints should be done as early as possible (i.e., as soon as they are added. And what we are actually interested in persisting is only the dataId which needs to be valid across multiple sessions anyway. Is that right? I tried to incorporate that in my next commit but skipped the refreshing of breakpoints within a session for now but that would be surely interesting as well.

@martin-fleck-at
Copy link
Author

@connor4312 Thank you very much for your feedback, I'll have a more detailed look this week!

- Translate complex types into separate classes
- Properly resolve data breakpoints as soon as the user adds them
-- Use resolution result to store additional data used later
@martin-fleck-at
Copy link
Author

@connor4312 Thank you very much for your very valuable feedback! I pushed another commit to move the data breakpoint resolution to an earlier point in time and properly propagate the resolution result throughout VS Code. I'm sure there are still areas to improve and I'd be happy to get more feedback but I want to make sure I'm on the right track here.

@thegecko
Copy link
Contributor

@connor4312 @roblourens Could we have an update on this PR, please?

Is there any more which needs changing?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support Data Breakpoints across sessions
5 participants