diff --git a/src/a-st-ext.adoc b/src/a-st-ext.adoc
index e5c45e946..c7343db8a 100644
--- a/src/a-st-ext.adoc
+++ b/src/a-st-ext.adoc
@@ -9,13 +9,15 @@ load-reserved/store-conditional instructions and atomic fetch-and-op
 memory instructions. Both types of atomic instruction support various
 memory consistency orderings including unordered, acquire, release, and
 sequentially consistent semantics. These instructions allow RISC-V to
-support the RCsc memory consistency model .
+support the RCsc memory consistency model.
 
+[NOTE]
+====
 After much debate, the language community and architecture community
 appear to have finally settled on release consistency as the standard
 memory consistency model and so the RISC-V atomic support is built
 around this model.
-
+====
 === Specifying Ordering of Atomic Instructions
 
 The base RISC-V ISA has a relaxed memory model, with the FENCE
@@ -24,7 +26,7 @@ space is divided by the execution environment into memory and I/O
 domains, and the FENCE instruction provides options to order accesses to
 one or both of these two address domains.
 
-To provide more efficient support for release consistency , each atomic
+To provide more efficient support for release consistency, each atomic
 instruction has two bits, _aq_ and _rl_, used to specify additional
 memory ordering constraints as viewed by other RISC-V harts. The bits
 order accesses to one of the two address domains, memory or I/O,
@@ -130,7 +132,7 @@ and itself in program order. An SC may succeed only if no write from a
 device other than a hart to the bytes accessed by the LR instruction can
 be observed to have occurred between the LR and SC. Note this LR might
 have had a different effective address and data size, but reserved the
-SC’s address as part of the reservation set.
+SC's address as part of the reservation set.
 
 Following this model, in systems with memory translation, an SC is
 allowed to succeed if the earlier LR reserved the same location using an
@@ -168,7 +170,7 @@ used to forcibly invalidate any existing load reservation:
 * if necessary when changing virtual to physical address mappings, such
 as when migrating pages that might contain an active reservation.
 
-The invalidation of a hart’s reservation when it executes an LR or SC
+The invalidation of a hart's reservation when it executes an LR or SC
 imply that a hart can only hold one reservation at a time, and that an
 SC can only pair with the most recent LR, and LR with the next following
 SC, in program order. This is a restriction to the Atomicity Axiom in
@@ -215,7 +217,7 @@ those with both bits clear, but may result in lower performance.
 
 LR/SC can be used to construct lock-free data structures. An example
 using LR/SC to implement a compare-and-swap function is shown in
-Figure link:#cas[[cas]]. If inlined, compare-and-swap functionality need
+Figure <<cas>>. If inlined, compare-and-swap functionality need
 only take four instructions.
 
 [[sec:lrscseq]]
@@ -256,7 +258,7 @@ on data-cache associativity in simple implementations that track the
 reservation within a private cache. The restrictions on branches and
 jumps limit the time that can be spent in the sequence. Floating-point
 operations and integer multiply/divide were disallowed to simplify the
-operating system’s emulation of these instructions on implementations
+operating system's emulation of these instructions on implementations
 lacking appropriate hardware support.
 
 Software is not forbidden from using unconstrained LR/SC sequences, but
@@ -269,9 +271,9 @@ If a hart _H_ enters a constrained LR/SC loop, the execution environment
 must guarantee that one of the following events eventually occurs:
 
 * _H_ or some other hart executes a successful SC to the reservation set
-of the LR instruction in _H_’s constrained LR/SC loops.
+of the LR instruction in _H_'s constrained LR/SC loops.
 * Some other hart executes an unconditional store or AMO instruction to
-the reservation set of the LR instruction in _H_’s constrained LR/SC
+the reservation set of the LR instruction in _H_'s constrained LR/SC
 loop, or some other device in the system writes to that reservation set.
 * _H_ executes a branch or jump that exits the constrained LR/SC loop.
 * _H_ traps.
@@ -289,7 +291,7 @@ other harts or devices continue to write to that reservation set, it is
 not guaranteed that any hart will exit its LR/SC loop.
 
 Loads and load-reserved instructions do not by themselves impede the
-progress of other harts’ LR/SC sequences. We note this constraint
+progress of other harts' LR/SC sequences. We note this constraint
 implies, among other things, that loads and load-reserved instructions
 executed by other harts (possibly within the same core) cannot impede
 LR/SC progress indefinitely. For example, cache evictions caused by
@@ -396,7 +398,7 @@ relinquishment.
 
 We recommend the use of the AMO Swap idiom shown above for both lock
 acquire and release to simplify the implementation of speculative lock
-elision .
+elision.
 
 The instructions in the ''A'' extension can also be used to provide
 sequentially consistent loads and stores. A sequentially consistent load