Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error lifting LDAXR (load-acquire / store-release exclusive monitors) #84

Open
l-kent opened this issue Jun 12, 2024 · 8 comments
Open

Comments

@l-kent
Copy link

l-kent commented Jun 12, 2024

An error occurs with instruction ldaxr x8, [x11] with opcode 0xc85ffd68:

ASLi> :sem A64 0xc85ffd68
Decoding instruction A64 c85ffd68
  Error File "libASL/primops.ml", line 130, characters 4-10: Assertion failed

This is notably different to the error that this opcode causes when lifted via gtirb-semantics: #85

@l-kent
Copy link
Author

l-kent commented Jun 18, 2024

A possibly related issue is that the related instruction stlxr evaluates to assert FALSE ;

ASLi> :sem A64 0xc80afd6c
Decoding instruction A64 c80afd6c
assert FALSE ;

@katrinafyi
Copy link
Member

katrinafyi commented Jun 18, 2024

Possibly.

stlxr fails because it calls IsExclusiveVA() which is stubbed to assert false in memory.asl. I expect ldaxr would fail in a similar way, if it got to that point within SetExclusiveMonitors(). Instead, it gets stuck at translating the virtual address to a physical one to mark the physical addresses.

The assertion which is failing is a non-negativity assertion. It's evaluating to zeros_bits.0 {{ -20 }} ( )... This is obviously wrong. Anyway, we might be able stub out the address translation in a similar way to the non-atomic memory accesses. L-A/S-R semantics are not something I know, though.

The traces for ldaxr and stlxr are: x.txt, y.txt. These can be generated by passing -x 1 to aslp, and larger numbers will produce progressively more output.

@l-kent
Copy link
Author

l-kent commented Jun 18, 2024

Since IsExclusiveVA() is apparently always safe to return true and is an optional implementation-dependent test, shouldn't it be stubbed to true instead of false?

aslp/mra_tools/arch/arch.asl

Lines 11387 to 11395 in abd63cb

// An optional IMPLEMENTATION DEFINED test for an exclusive access to a virtual
// address region of size bytes starting at address.
//
// It is permitted (but not required) for this function to return FALSE and
// cause a store exclusive to fail if the virtual address region is not
// totally included within the region recorded by MarkExclusiveVA().
//
// It is always safe to return TRUE which will check the physical address only.
boolean AArch64.IsExclusiveVA(bits(64) address, integer processorid, integer size);

@l-kent
Copy link
Author

l-kent commented Jun 18, 2024

It seems like this is just something that will fall over elsewhere though, as most of the Exclusives monitor-related functions have no provided pseudocode?

@katrinafyi
Copy link
Member

katrinafyi commented Jun 18, 2024

I don't know. I assume the "always safe to return TRUE" is reliant on a correct implementation of physical exclusivity.

Reading about this, it appears to be a matter of setting/unsetting the monitors, but it is complicated by the fact it must persist between instructions. It might be easier for us to implement the logic entirely in virtual addresses. However, I can't find a "ClearExclusiveVA" to match the MarkExclusiveVA().

The easy fix in ASLp would be to defer this to downstream projects, but I am not sure.

@l-kent
Copy link
Author

l-kent commented Jun 18, 2024

Exclusive accesses for virtual addresses are an entirely optional part of the architecture, without any real spec for how those checks are supposed to behave, so I don't think it would really work for us to just implement that part of it. That seems to be why there isn't a ClearExclusiveVA.

The local monitor doesn't seem too difficult to implement on an abstract level (as long as we ignore address translation as we're already doing everywhere else). It seems to be mostly just a matter of storing an address in some global variable (representing the local monitor) that marks it as exclusive. The trickiest part seems to be that the size of the memory block that is marked as exclusive is implementation-defined and can be anywhere from 16 bytes to 2048 bytes (the least significant 4 to 11 bits of the physical address are ignored to get this).

Unfortunately, the global monitor seems much trickier and is going to be relevant for any real-world use of these instructions. The global monitor only has to be set or checked if a physical address is 'shareable' but that comes from the rather complex address translation process which we would really like to completely ignore. Each core has its own global monitor, and the global monitor for a core is cleared when another core writes to a physical address that overlaps with the address range in the core's global monitor. That part is done by ClearExclusiveByAddress() which is called by most (all?) memory stores.

@katrinafyi
Copy link
Member

It is tricky. I was thinking that ignoring address translation for "physical" monitors is no more correct than (and functionally identical to) just using virtual monitors throughout. However, you're right that the physical monitors are better specified, so we can go that way.

I was reading this page and it implies that the "eagerness" of monitor clearing (e.g. due to interference by another core) is somewhat arbitrary and implementation-specific. In particular, the monitor may be cleared due to other circumstances and the program code is expected to handle that.

If this is the case, we might be able to pessimise all addresses into shareable global addresses. (This is further justified (weakly) by our interest in weak memory effects which only appear in memory shared across cores).

Re the size of the exclusive block, any correct portable code should behave correctly with the minimum size of exclusion (so the reasoning can soundly assume the minimum size). I don't think this is a big problem.

Taking a step back, this is all very complex. I feel like the ideal approach would be Aslp emitting the monitors as intrinsics, then some analysis to locate the regions delineated by load/store pairs. A naive approach explicitly recording the monitors at each program point for each core sounds very expensive...

@katrinafyi katrinafyi changed the title Error lifting LDAXR Error lifting LDAXR (load-acquire / store-release exclusive monitors) Jun 18, 2024
@l-kent
Copy link
Author

l-kent commented Jun 18, 2024

I gave it some thought and using intrinsics similar (or even the same as?) the AtomicStart/End ones makes sense to me. The local monitor is fairly easy to sensibly model, and the global monitor is only really supposed to cause the store to fail if another core interferes with it, so the specifics of that can be ignored by ASLp and BASIL can do something with rely/guarantee conditions to model that?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants