The memory consistency model, ormemory model, specifies the possible orderings ofShared Data Block events, arising via accessingTypedArray instances backed by a SharedArrayBuffer and via methods on the Atomics object. When the program has no data races (defined below), the ordering of events appears as sequentially consistent, i.e., as an interleaving of actions from eachagent. When the program has data races, shared memory operations may appear sequentially inconsistent. For example, programs may exhibit causality-violating behaviour and other astonishments. These astonishments arise from compiler transforms and the design of CPUs (e.g., out-of-order execution and speculation). The memory model defines both the precise conditions under which a program exhibits sequentially consistent behaviour as well as the possible values read from data races. To wit, there is no undefined behaviour.
The memory model is defined as relational constraints on events introduced byabstract operations on SharedArrayBuffer or by methods on the Atomics object during an evaluation.
Note
This section provides an axiomatic model on events introduced by theabstract operations on SharedArrayBuffers. It bears stressing that the model is not expressible algorithmically, unlike the rest of this specification. The nondeterministic introduction of events byabstract operations is the interface between the operational semantics of ECMAScript evaluation and the axiomatic semantics of the memory model. The semantics of these events is defined by considering graphs of all events in an evaluation. These are neither Static Semantics nor Runtime Semantics. There is no demonstrated algorithmic implementation, but instead a set of constraints that determine if a particular event graph is allowed or disallowed.
29.1 Memory Model Fundamentals
Shared memory accesses (reads and writes) are divided into two groups, atomic accesses and data accesses, defined below. Atomic accesses are sequentially consistent, i.e., there is a strict total ordering of events agreed upon by allagents in anagent cluster. Non-atomic accesses do not have a strict total ordering agreed upon by allagents, i.e., unordered.
Note 1
No orderings weaker than sequentially consistent and stronger than unordered, such as release-acquire, are supported.
AShared Data Block event is either aReadSharedMemory,WriteSharedMemory, orReadModifyWriteSharedMemoryRecord.
These events are introduced byabstract operations or by methods on the Atomics object.
Some operations may also introduceSynchronize events. ASynchronize event has no fields, and exists purely to directly constrain the permitted orderings of other events.
In addition toShared Data Block and Synchronize events, there arehost-specific events.
Let the range of a ReadSharedMemory, WriteSharedMemory, or ReadModifyWriteSharedMemory event be the Set of contiguousintegers from its[[ByteIndex]] to[[ByteIndex]] +[[ElementSize]] - 1. Two events' ranges are equal when the events have the same[[Block]], and the ranges are element-wise equal. Two events' ranges are overlapping when the events have the same[[Block]], the ranges are not equal and their intersection is non-empty. Two events' ranges are disjoint when the events do not have the same[[Block]] or their ranges are neither equal nor overlapping.
Note 2
Examples ofhost-specific synchronizing events that should be accounted for are: sending a SharedArrayBuffer from oneagent to another (e.g., bypostMessage in a browser), starting and stoppingagents, and communicating within theagent cluster via channels other than shared memory. For a particular executionexecution, those events are provided by thehost via thehost-synchronizes-withstrict partial order. Additionally,hosts can addhost-specific synchronizing events toexecution.[[EventList]] so as to participate in theis-agent-order-beforeRelation.
Anempty candidate execution is a candidate executionRecord whose fields are emptyLists.
29.5 Abstract Operations for the Memory Model
29.5.1 EventSet (execution )
The abstract operation EventSet takes argumentexecution (acandidate execution) and returns a Set of events. It performs the following steps when called:
The abstract operation SharedDataBlockEventSet takes argumentexecution (acandidate execution) and returns a Set of events. It performs the following steps when called:
The abstract operation HostEventSet takes argumentexecution (acandidate execution) and returns a Set of events. It performs the following steps when called:
The read-modify-write modification[[ModifyOp]] is given by the function properties on the Atomics object that introduceReadModifyWriteSharedMemory events.
For eventsE andD,E is-agent-order-beforeD inexecution if there is someAgent Events Recordaer inexecution.[[EventsRecords]] such thataer.[[EventList]] contains bothE andD andE is beforeD inList order ofaer.[[EventList]].
In acandidate executionexecution, not allseq-cst events related byreads-from are related by synchronizes-with. Only events that also have equal ranges are related by synchronizes-with.
Assert: The remainder of dividingR.[[ByteIndex]] byR.[[ElementSize]] is 0.
For each eventW such thatRreads-fromW inexecution andW.[[NoTear]] istrue, do
IfR andW have equal ranges and there exists an eventV such thatV andW have equal ranges,V.[[NoTear]] istrue,W andV are not the sameShared Data Block event, andRreads-fromV inexecution, then
Returnfalse.
Returntrue.
Note
An event's[[NoTear]] field istrue when that event was introduced via accessing anintegerTypedArray, andfalse when introduced via accessing a floating pointTypedArray or DataView.
Intuitively, this requirement says when a memory range is accessed in an aligned fashion via anintegerTypedArray, a single write event on that range must "win" when in adata race with other write events with equal ranges. More precisely, this requirement says an aligned read event cannot read a value composed of bytes from multiple, different write events all with equal ranges. It is possible, however, for an aligned read event to read from multiple write events with overlapping ranges.
For eventsE andD,E is-memory-order-beforeD inexecution ifEhappens-beforeD inexecution.
For eventsR andW such thatRreads-fromW inexecution, there is noWriteSharedMemory orReadModifyWriteSharedMemory eventV inSharedDataBlockEventSet(execution) such thatV.[[Order]] isseq-cst,W is-memory-order-beforeV inexecution,V is-memory-order-beforeR inexecution, and any of the following conditions are true.
This clause together with the forward progress guarantee onagents ensure the liveness condition thatseq-cst writes become visible toseq-cst reads with equal range infinite time.
While is-memory-order-before includes all events inEventSet(execution), those that are not constrained byhappens-before orsynchronizes-with inexecution are allowed to occur anywhere in the order.
29.7.5 Valid Executions
Acandidate executionexecution is a valid execution (or simply an execution) if all of the following are true.
For an executionexecution and eventsE andD that are contained inSharedDataBlockEventSet(execution),E andD are in arace if the following algorithm returnstrue.
For an executionexecution and eventsE andD that are contained inSharedDataBlockEventSet(execution),E andD are in adata race if the following algorithm returnstrue.
A program is data race free if all its executions are data race free.
Thememory model guarantees sequential consistency of all events for data race free programs.
29.11 Shared Memory Guidelines
Note 1
The following are guidelines for ECMAScript programmers working with shared memory.
We recommend programs be keptdata race free, i.e., make it so that it is impossible for there to be concurrent non-atomic operations on the same memory location.Data race free programs have interleaving semantics where each step in the evaluation semantics of eachagent are interleaved with each other. Fordata race free programs, it is not necessary to understand the details of thememory model. The details are unlikely to build intuition that will help one to better write ECMAScript.
More generally, even if a program is notdata race free it may have predictable behaviour, so long as atomic operations are not involved in any data races and the operations that race all have the same access size. The simplest way to arrange for atomics not to be involved in races is to ensure that different memory cells are used by atomic and non-atomic operations and that atomic accesses of different sizes are not used to access the same cells at the same time. Effectively, the program should treat shared memory as strongly typed as much as possible. One still cannot depend on the ordering and timing of non-atomic accesses that race, but if memory is treated as strongly typed the racing accesses will not "tear" (bits of their values will not be mixed).
Note 2
The following are guidelines for ECMAScript implementers writing compiler transformations for programs using shared memory.
It is desirable to allow most program transformations that are valid in a single-agent setting in a multi-agent setting, to ensure that the performance of eachagent in a multi-agent program is as good as it would be in a single-agent setting. Frequently these transformations are hard to judge. We outline some rules about program transformations that are intended to be taken as normative (in that they are implied by thememory model or stronger than what thememory model implies) but which are likely not exhaustive. These rules are intended to apply to program transformations that precede the introductions of the events that make up theis-agent-order-beforeRelation.
Letpossible read values of a read event be the set of all values ofValueOfReadEvent for that event across all valid executions.
Any transformation of an agent-order slice that is valid in the absence of shared memory is valid in the presence of shared memory, with the following exceptions.
Atomics are carved in stone: Program transformations must not cause theseq-cst events in an agent-order slice to be reordered with itsunordered operations, nor itsseq-cst operations to be reordered with each other, nor may a program transformation remove aseq-cst operation from theis-agent-order-beforeRelation.
(In practice, the prohibition on reorderings forces a compiler to assume that everyseq-cst operation is a synchronization and included in the finalis-memory-order-beforeRelation, which it would usually have to assume anyway in the absence of inter-agent program analysis. It also forces the compiler to assume that every call where the callee's effects on the memory-order are unknown may containseq-cst operations.)
Reads must be stable: Any given shared memory read must only observe a single value in an execution.
(For example, if what is semantically a single read in the program is executed multiple times then the program is subsequently allowed to observe only one of the values read. A transformation known as rematerialization can violate this rule.)
Writes must be stable: All observable writes to shared memory must follow from program semantics in an execution.
(For example, a transformation may not introduce certain observable writes, such as by using read-modify-write operations on a larger location to write a smaller datum, writing a value to memory that the program could not have written, or writing a just-read value back to the location it was read from, if that location could have been overwritten by anotheragent after the read.)
Possible read values must be non-empty: Program transformations cannot cause the possible read values of a shared memory read to become empty.
(Counterintuitively, this rule in effect restricts transformations on writes, because writes have force inmemory model insofar as to be read by read events. For example, writes may be moved and coalesced and sometimes reordered between twoseq-cst operations, but the transformation may not remove every write that updates a location; some write must be preserved.)
Examples of transformations that remain valid are: merging multiple non-atomic reads from the same location, reordering non-atomic reads, introducing speculative non-atomic reads, merging multiple non-atomic writes to the same location, reordering non-atomic writes to different locations, and hoisting non-atomic reads out of loops even if that affects termination. Note in general that aliasedTypedArrays make it hard to prove that locations are different.
Note 3
The following are guidelines for ECMAScript implementers generating machine code for shared memory accesses.
For architectures with memory models no weaker than those of ARM or Power, non-atomic stores and loads may be compiled to bare stores and loads on the target architecture. Atomic stores and loads may be compiled down to instructions that guarantee sequential consistency. If no such instructions exist, memory barriers are to be employed, such as placing barriers on both sides of a bare store or load. Read-modify-write operations may be compiled to read-modify-write instructions on the target architecture, such asLOCK-prefixed instructions on x86, load-exclusive/store-exclusive instructions on ARM, and load-link/store-conditional instructions on Power.
Specifically, thememory model is intended to allow code generation as follows.
Every atomic operation in the program is assumed to be necessary.
Atomic operations are never rearranged with each other or with non-atomic operations.
Functions are always assumed to perform atomic operations.
Atomic operations are never implemented as read-modify-write operations on larger data, but as non-lock-free atomics if the platform does not have atomic operations of the appropriate size. (We already assume that every platform has normal memory access operations of every interesting size.)
Naive code generation uses these patterns:
Regular loads and stores compile to single load and store instructions.
Lock-free atomic loads and stores compile to a full (sequentially consistent) fence, a regular load or store, and a full fence.
Lock-free atomic read-modify-write accesses compile to a full fence, an atomic read-modify-write instruction sequence, and a full fence.
Non-lock-free atomics compile to a spinlock acquire, a full fence, a series of non-atomic load and store instructions, a full fence, and a spinlock release.
That mapping is correct so long as an atomic operation on an address range does not race with a non-atomic write or with an atomic operation of different size. However, that is all we need: thememory model effectively demotes the atomic operations involved in a race to non-atomic status. On the other hand, the naive mapping is quite strong: it allows atomic operations to be used as sequentially consistent fences, which thememory model does not actually guarantee.
Local improvements to those basic patterns are also allowed, subject to the constraints of thememory model. For example:
There are obvious platform-dependent improvements that remove redundant fences. For example, on x86 the fences around lock-free atomic loads and stores can always be omitted except for the fence following a store, and no fence is needed for lock-free read-modify-write instructions, as these all useLOCK-prefixed instructions. On many platforms there are fences of several strengths, and weaker fences can be used in certain contexts without destroying sequential consistency.
Most modern platforms support lock-free atomics for all the data sizes required by ECMAScript atomics. Should non-lock-free atomics be needed, the fences surrounding the body of the atomic operation can usually be folded into the lock and unlock steps. The simplest solution for non-lock-free atomics is to have a single lock word per SharedArrayBuffer.
There are also more complicated platform-dependent local improvements, requiring some code analysis. For example, two back-to-back fences often have the same effect as a single fence, so if code is generated for two atomic operations in sequence, only a single fence need separate them. On x86, even a single fence separating atomic stores can be omitted, as the fence following a store is only needed to separate the store from a subsequent load.