Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backtrace support, part 3: Actually produce continuation-aware backtraces #119

Merged
merged 9 commits into from
Mar 6, 2024

Conversation

frank-emrich
Copy link

This PR finalizes the implementation of backtrace creation in the presence of wasmfx stack switching. The only piece missing after this PR is testing, which I'm adding in a separate PR.

As of #117, it is already the case that each CallThreadState executing wasm stores a pointer to the underlying Store's StackChainCell.

This PR changes the existing creation of backtraces in backtrace.rs as follows:
Previously, trace_with_trap_state would call
call trace_through_wasm for the faulting program position plus all the entries in the current thread's CallThreadState chain.
Now, instead of calling trace_through_wasm directly, trace_with_trap_state calls a new function trace_through_continuations. The latter takes an optional StackChain and iterates all the stacks (= continuation stacks + main stack) therein, calling trace_through_wasm for each such stack. If no such StackChain is given, it just behaves as before. Note that the invariant established in #117 ensures that only the most recent execution of wasm may actually be inside a continuation, while all previous CallThreadStates must be on the main stack.

As a result, we only ever call trace_through_wasm such that the old invariants (which we had commented out at some point) hold again: Its parameters trampoline_sp and fp describe a continuous (!) sequence of stack memory, either on the main stack or inside a Fiber stack. Further, we have trampoline_sp > fp and following the frame pointer chain starting at fp eventually leads to a pointer past (= greater or equal than) trampoline_sp.

@dhil dhil self-requested a review March 1, 2024 13:32
Copy link
Member

@dhil dhil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. See my comments.

self.0 = StackChain::Absent;
Some(next)
}
StackChain::Continuation(ptr) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be able to make this a NonNull pointer to get rid of the unsafe usage here, right?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, any of the methods that dereference a NonNull are still unsafe.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking again. We know that ptr is non-null, right? As far as I remember, it should never take the value of null. In this case, why don't we just do it all with raw pointers. Make the entire arm unsafe.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would the advantage of that be over the current version?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We wouldn't have any of the conversion into references... May be negligible.

@@ -382,6 +386,9 @@ pub mod baseline {
/// Continues a given continuation.
#[inline(always)]
pub fn resume(instance: &mut Instance, contref: &mut VMContRef) -> Result<u32, TrapReason> {
// Trigger fuse
HAS_EVER_RUN_CONTINUATION.set(true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following may be faster if Cell::set unconditionally writes to the internal memory cell, though, I don't know the specifics of the internals of Cell::set

Suggested change
HAS_EVER_RUN_CONTINUATION.set(true);
if HAS_EVER_RUN_CONTINUATION.get() {
HAS_EVER_RUN_CONTINUATION.set(true);
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about it again, maybe you should use OnceCell.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking yet again, there is a potential problem here (and regression). Consider the scenario: You run a continuation to completion (e.g. the unit continuation), and afterwards on the main stack you perform a wasm-to-host call that reenters wasm, but it would be prohibited from doing so because HAS_EVER_RUN_CONTINUATION is true even though we are not currently running on a continuation... or?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following may be faster if Cell::set unconditionally writes to the internal memory cell, though, I don't know the specifics of the internals of Cell::set

Writing to the Cell here compiles down to a simple move instruction. So there isn't much of a point in making it conditional.

Thinking yet again, there is a potential problem here (and regression). Consider the scenario: You run a continuation to completion (e.g. the unit continuation), and afterwards on the main stack you perform a wasm-to-host call that reenters wasm, but it would be prohibited from doing so because HAS_EVER_RUN_CONTINUATION is true even though we are not currently running on a continuation... or?

Please note that the HAS_EVER_RUN_CONTINUATION is not used by the mechanism to disallow certain forms of re-entering wasm. Instead, it is only used in the logic for generating backtraces: If we are using the baseline implementation and have ever run a continuation, we return an empty backtrace. In particular, the program you describe would still run normally. It's only the case that in your example returning to the main stack, we would still produce empty backtraces in the case of a trap. But I think that's acceptable; I'd rather not make the logic regarding this more complex.

Introducing the HAS_EVER_RUN_CONTINUATION flag is mostly just so that the full test suite passes when compiling with the baseline flag: There are plenty of non-wasmfx test cases that do something and inspect the resulting backtrace. These tests would fail if we unconditionally produced empty backtraces whenever the baseline feature is enabled.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following may be faster if Cell::set unconditionally writes to the internal memory cell, though, I don't know the specifics of the internals of Cell::set

Writing to the Cell here compiles down to a simple move instruction. So there isn't much of a point in making it conditional.

I suppose this move is to an address in main memory. Writing to memory is often slower than reading from memory, hence why a branch can yield better performance -- we could potentially help the speculation engine by providing a cold-branch hint.

Introducing the HAS_EVER_RUN_CONTINUATION flag is mostly just so that the full test suite passes when compiling with the baseline flag: There are plenty of non-wasmfx test cases that do something and inspect the resulting backtrace. These tests would fail if we unconditionally produced empty backtraces whenever the baseline feature is enabled.

I see, thanks.

crates/runtime/src/traphandlers/backtrace.rs Outdated Show resolved Hide resolved
let limits = limits.as_ref().unwrap();
match continuation_opt {
Some(continuation) => {
let cont = unsafe { &*continuation };
Copy link
Member

@dhil dhil Mar 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unsafe block annotation should arguably be around the entire match arm, because the &*continuation could violate the strict aliasing rules. We should add a comment saying that this line is the unsafe bit of the arm.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure that making a piece of code unsafe doesn't change the permitted aliasing. The aliasing here should be fine: We aren't violating the rule that there must be no more than one mutable reference at a time (because there are zero mutable references here).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not what I am saying. The entire block would be unsafe if we were to alias the continuation. The safety of the rest of code relies on this assumption

@@ -387,7 +387,9 @@ pub mod baseline {
#[inline(always)]
pub fn resume(instance: &mut Instance, contref: &mut VMContRef) -> Result<u32, TrapReason> {
// Trigger fuse
HAS_EVER_RUN_CONTINUATION.set(true);
if HAS_EVER_RUN_CONTINUATION.get() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be

Suggested change
if HAS_EVER_RUN_CONTINUATION.get() {
if !HAS_EVER_RUN_CONTINUATION.get() {

right?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you seem to be looking at an outdated commit

Copy link
Member

@dhil dhil Mar 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you fixed it in the same moment as I submitted the review.

crates/runtime/src/traphandlers/backtrace.rs Outdated Show resolved Hide resolved
@frank-emrich frank-emrich merged commit f6f7793 into wasmfx:main Mar 6, 2024
19 checks passed
@frank-emrich frank-emrich deleted the backtrace-creation branch March 6, 2024 14:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants