Transactional Synchronization Extensions (TSX), also called Transactional Synchronization Extensions New Instructions (TSX-NI), is an extension to the x86 instruction set architecture (ISA) that adds hardware transactional memory support, speeding up execution of multi-threaded software through lock elision. According to different benchmarks, TSX/TSX-NI can provide around 40% faster application execution in specific workloads, and 4–5 times more database transactions per second (TPS).[1][2][3][4]
TSX/TSX-NI was documented by Intel in February 2012, and debuted in June 2013 on selected Intel microprocessors based on the Haswell microarchitecture.[5][6][7] Haswell processors below 45xx as well as R-series and K-series (with unlocked multiplier) SKUs do not support TSX/TSX-NI.[8] In August 2014, Intel announced a bug in the TSX/TSX-NI implementation on current steppings of Haswell, Haswell-E, Haswell-EP and early Broadwell CPUs, which resulted in disabling the TSX/TSX-NI feature on affected CPUs via a microcode update.[9][10]
In 2016, a side-channel timing attack was found that abuses the way TSX/TSX-NI handles transactional faults (i.e. page faults) to break kernel address space layout randomization (KASLR) on all major operating systems.[11] In 2021, Intel released a microcode update that disabled the TSX/TSX-NI feature on CPU generations from Skylake to Coffee Lake, as a mitigation for discovered security issues.[12]
While TSX/TSX-NI is no longer supported in desktop-class processors, it remains supported in the Xeon line of processors (at least on specific models, as of the 6th generation).[13]
Support for TSX/TSX-NI emulation is provided as part of the Intel Software Development Emulator.[14] There is also experimental support for TSX/TSX-NI emulation in a QEMU fork.[15]
TSX/TSX-NI provides two software interfaces for designating code regions for transactional execution. Hardware Lock Elision (HLE) is an instruction prefix-based interface designed to be backward compatible with processors without TSX/TSX-NI support. Restricted Transactional Memory (RTM) is a new instruction set interface that provides greater flexibility for programmers.[16]
TSX/TSX-NI enables optimistic execution of transactional code regions. The hardware monitors multiple threads for conflicting memory accesses, and aborts and rolls back transactions that cannot be completed successfully. Mechanisms are provided for software to detect and handle failed transactions.[16]
Hardware Lock Elision (HLE) adds two new instruction prefixes, XACQUIRE and XRELEASE. These two prefixes reuse the opcodes of the existing REPNE / REPE prefixes (F2H / F3H). On processors that do not support HLE, REPNE / REPE prefixes are ignored on instructions for which XACQUIRE / XRELEASE are valid, thus enabling backward compatibility.[17]
The XACQUIRE prefix hint can only be used with the following instructions with an explicit LOCK prefix: ADD, ADC, AND, BTC, BTR, BTS, CMPXCHG, CMPXCHG8B, DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD, and XCHG. The XCHG instruction can be used without the LOCK prefix as well.
The XRELEASE prefix hint can be used both with the instructions listed above, and with the MOV mem, reg and MOV mem, imm instructions.
HLE allows optimistic execution of a critical section by skipping the write to a lock, so that the lock appears to be free to other threads. A failed transaction results in execution restarting from the XACQUIRE-prefixed instruction, but treating the instruction as if the XACQUIRE prefix were not present.
In other words, lock elision through transactional execution uses memory transactions as a fast path where possible, while the slow (fallback) path is still a normal lock.
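As an illustration of the fast-path/slow-path idea, the following is a minimal sketch of a spinlock whose acquire and release carry the HLE hints. It assumes GCC's `__atomic` builtins and its x86-only `__ATOMIC_HLE_ACQUIRE` / `__ATOMIC_HLE_RELEASE` flags; the `elided_lock` type and the function names are hypothetical, and on compilers that lack the flags the code degrades to an ordinary spinlock.

```c
/* GCC on x86 defines these HLE hint flags; elsewhere use plain atomics. */
#ifdef __ATOMIC_HLE_ACQUIRE
#define HLE_ACQUIRE __ATOMIC_HLE_ACQUIRE
#define HLE_RELEASE __ATOMIC_HLE_RELEASE
#else
#define HLE_ACQUIRE 0
#define HLE_RELEASE 0
#endif

/* Hypothetical lock type: 0 = free, 1 = held. */
typedef struct { int held; } elided_lock;

void elided_lock_acquire(elided_lock *l)
{
    /* With the HLE flag this compiles to an XACQUIRE-prefixed XCHG. */
    while (__atomic_exchange_n(&l->held, 1, __ATOMIC_ACQUIRE | HLE_ACQUIRE)) {
        /* The lock is genuinely held: wait on plain loads until it looks free. */
        while (__atomic_load_n(&l->held, __ATOMIC_RELAXED))
            ;
    }
}

void elided_lock_release(elided_lock *l)
{
    /* With the HLE flag this compiles to an XRELEASE-prefixed MOV. */
    __atomic_store_n(&l->held, 0, __ATOMIC_RELEASE | HLE_RELEASE);
}
```

Because the prefixes are ignored on processors without HLE, the same binary behaves as a normal lock there; no runtime feature check is needed for this interface.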
Restricted Transactional Memory (RTM) is an alternative to HLE that lets the programmer specify a fallback code path to be executed when a transaction cannot complete successfully. Unlike HLE, RTM is not backward compatible with processors that do not support it, so programs must detect RTM support in the CPU before using the new instructions.
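The detection step can be sketched as follows. The bit positions are not given in the text above; they are taken from Intel's CPUID documentation, which reports HLE in bit 4 and RTM in bit 11 of EBX for leaf 07H, subleaf 0. The `tsx_support` type and both function names are hypothetical, and `__get_cpuid_count` is the helper from the GCC/Clang `<cpuid.h>` header.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Hypothetical result type for this sketch. */
typedef struct {
    bool hle;
    bool rtm;
} tsx_support;

/* Decode the EBX value returned by CPUID.(EAX=07H, ECX=0). */
tsx_support tsx_support_from_ebx(unsigned int ebx)
{
    tsx_support s;
    s.hle = (ebx >> 4) & 1u;   /* bit 4: HLE */
    s.rtm = (ebx >> 11) & 1u;  /* bit 11: RTM */
    return s;
}

/* Query the running CPU; reports no support on non-x86 builds. */
tsx_support query_tsx_support(void)
{
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
#if defined(__x86_64__) || defined(__i386__)
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        ebx = 0;  /* leaf 7 not available: treat as unsupported */
#endif
    (void)eax; (void)ecx; (void)edx;
    return tsx_support_from_ebx(ebx);
}
```

A program would call `query_tsx_support()` once at startup and take a lock-based path whenever `rtm` is false.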
RTM adds three new instructions: XBEGIN, XEND and XABORT. The XBEGIN and XEND instructions mark the start and the end of a transactional code region; the XABORT instruction explicitly aborts a transaction. Transaction failure redirects the processor to the fallback code path specified by the XBEGIN instruction, with the abort status returned in the EAX register.
| EAX register bit position | Meaning |
|---|---|
| 0 | Set if abort caused by XABORT instruction. |
| 1 | If set, the transaction may succeed on a retry. This bit is always clear if bit 0 is set. |
| 2 | Set if another logical processor conflicted with a memory address that was part of the transaction that aborted. |
| 3 | Set if an internal buffer overflowed. |
| 4 | Set if debug breakpoint was hit. |
| 5 | Set if an abort occurred during execution of a nested transaction. |
| 23:6 | Reserved. |
| 31:24 | XABORT argument (only valid if bit 0 set, otherwise reserved). |
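A minimal sketch of an RTM region with a lock-based fallback is shown below, using the abort status bits from the table to decide whether to retry. It assumes the GCC/Clang intrinsics `_xbegin`, `_xend` and `_xabort` from `<immintrin.h>` and the `target("rtm")` function attribute; the helper names, the retry limit of three, and the global `fallback_lock` are hypothetical choices made for illustration.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#include <immintrin.h>
#define TSX_X86 1
#else
#define TSX_X86 0
#endif

/* Bit 1 of the abort status: the transaction may succeed on a retry. */
bool rtm_may_retry(unsigned int status) { return (status >> 1) & 1u; }

/* Bits 31:24 hold the XABORT argument, valid only if bit 0 is set. */
unsigned int rtm_xabort_code(unsigned int status)
{
    return (status & 1u) ? (status >> 24) & 0xFFu : 0u;
}

int fallback_lock;  /* hypothetical global lock: 0 = free, 1 = held */

#if TSX_X86
static bool cpu_has_rtm(void)
{
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return false;
    return (ebx >> 11) & 1u;  /* CPUID.(EAX=07H,ECX=0):EBX bit 11 */
}

__attribute__((target("rtm")))
static bool add_in_transaction(long *counter, long delta, unsigned int *status)
{
    unsigned int s = _xbegin();
    if (s != _XBEGIN_STARTED) {
        *status = s;   /* execution resumes here after an abort */
        return false;
    }
    /* Reading the lock adds it to the read set, so a thread that takes
       the fallback lock aborts this transaction. */
    if (__atomic_load_n(&fallback_lock, __ATOMIC_RELAXED))
        _xabort(0xff);
    *counter += delta;
    _xend();
    return true;
}
#endif

void add_with_rtm(long *counter, long delta)
{
#if TSX_X86
    if (cpu_has_rtm()) {
        for (int attempt = 0; attempt < 3; attempt++) {
            unsigned int status = 0;
            if (add_in_transaction(counter, delta, &status))
                return;
            if (!rtm_may_retry(status))
                break;
        }
    }
#endif
    /* Fallback path: an ordinary spinlock. */
    while (__atomic_exchange_n(&fallback_lock, 1, __ATOMIC_ACQUIRE))
        ;
    *counter += delta;
    __atomic_store_n(&fallback_lock, 0, __ATOMIC_RELEASE);
}
```

On processors without RTM, or with microcode that forces every transaction to abort, each call ends up on the spinlock path, so the result is the same either way.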
TSX/TSX-NI provides a new XTEST instruction that returns whether the processor is executing a transactional region. This instruction is supported by the processor if it supports HLE or RTM or both.
TSX/TSX-NI Suspend Load Address Tracking (TSXLDTRK) is an instruction set extension that allows tracking of loads from memory to be temporarily disabled in a section of code within a transactional region. This feature extends HLE and RTM, and processor support for it must be detected separately.
TSXLDTRK introduces two new instructions, XSUSLDTRK and XRESLDTRK, for suspending and resuming load address tracking, respectively. While tracking is suspended, loads from memory are not added to the transaction read set. This means that, unless these memory locations were added to the transaction read or write sets outside the suspend region, writes to these locations by other threads will not cause a transaction abort. Suspending load address tracking for a portion of code within a transactional region reduces the amount of memory that needs to be tracked for read-write conflicts and therefore increases the probability that the transaction commits successfully.
Intel's TSX/TSX-NI specification describes how the transactional memory is exposed to programmers, but withholds details on the actual transactional memory implementation.[18] Intel specifies in its developer's and optimization manuals that Haswell maintains both read-sets and write-sets at the granularity of a cache line, tracking addresses in the L1 data cache of the processor.[19][20][21][22] Intel also states that data conflicts are detected through the cache coherence protocol.[20]
Haswell's L1 data cache has an associativity of eight. This means that in this implementation, a transactional execution that writes to nine distinct locations mapping to the same cache set will abort. However, due to micro-architectural implementation details, this does not mean that fewer accesses to the same set are guaranteed never to abort. Additionally, in CPU configurations with Hyper-Threading Technology, the L1 cache is shared between the two threads on the same core, so operations in a sibling logical processor of the same core can cause evictions.[20]
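The nine-location limit can be worked through with a short sketch. It uses the figures quoted in this article (64-byte lines, 512 lines in total, eight ways, hence 64 sets) and assumes the conventional set-index mapping, in which the set is selected by the line number modulo the number of sets; that mapping and all names below are assumptions for illustration, not something the source states.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum {
    LINE_BYTES = 64,                /* cache line size */
    NUM_LINES  = 512,               /* 512 x 64 B = 32 KB */
    WAYS       = 8,                 /* associativity */
    NUM_SETS   = NUM_LINES / WAYS   /* 64 sets */
};

/* Assumed mapping: set index = line number modulo number of sets. */
unsigned int l1_set_index(uintptr_t addr)
{
    return (unsigned int)((addr / LINE_BYTES) % NUM_SETS);
}

/* True if the distinct cache lines touched by addrs[0..n) put more than
   WAYS lines into one set, which cannot fit in the L1 at the same time. */
bool write_set_must_overflow(const uintptr_t *addrs, size_t n)
{
    unsigned int lines_per_set[NUM_SETS] = {0};
    for (size_t i = 0; i < n; i++) {
        uintptr_t line = addrs[i] / LINE_BYTES;
        bool seen = false;
        for (size_t j = 0; j < i && !seen; j++)
            seen = (addrs[j] / LINE_BYTES == line);
        if (seen)
            continue;  /* same line as an earlier address: count it once */
        if (++lines_per_set[l1_set_index(addrs[i])] > WAYS)
            return true;
    }
    return false;
}
```

Under these assumptions, addresses spaced 4096 bytes apart (64 sets times 64 bytes) all land in the same set, so nine such writes exceed the eight available ways. A `false` result only means that this particular limit is not hit; as noted above, transactions with fewer lines per set can still abort.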
Independent research points to Haswell's transactional memory most likely being a deferred update system that uses the per-core caches for transactional data and register checkpoints.[18] In other words, Haswell is more likely to use a cache-based transactional memory system, as it is a much less risky implementation choice. On the other hand, Intel's Skylake or later may combine this cache-based approach with a memory ordering buffer (MOB) for the same purpose, possibly also providing multi-versioned transactional memory that is more amenable to speculative multithreading.[23]
In August 2014, Intel announced that a bug exists in the TSX/TSX-NI implementation on Haswell, Haswell-E, Haswell-EP and early Broadwell CPUs, which resulted in disabling the TSX/TSX-NI feature on affected CPUs via a microcode update.[9][10][24] The bug was fixed in F-0 steppings of the vPro-enabled Core M-5Y70 Broadwell CPU in November 2014.[25]
The bug was found and then reported during a diploma thesis in the School of Electrical and Computer Engineering of the National Technical University of Athens.[26]
In October 2018, Intel disclosed a TSX/TSX-NI memory ordering issue found in some Skylake processors.[27] As a result of a microcode update, HLE support was disabled in the affected CPUs, and RTM was mitigated by sacrificing one performance counter when used outside of Intel SGX mode or System Management Mode (SMM). System software would have to either effectively disable RTM or update performance monitoring tools not to use the affected performance counter.
In June 2021, Intel published a microcode update that further disables TSX/TSX-NI on various Xeon and Core processor models from Skylake through Coffee Lake and Whiskey Lake as a mitigation for the TSX Asynchronous Abort (TAA) vulnerability. The earlier mitigation for the memory ordering issue was removed.[28] By default, with the updated microcode, the processor would still indicate support for RTM but would always abort the transaction. System software is able to detect this mode of operation and mask support for TSX/TSX-NI from the CPUID instruction, preventing detection of TSX/TSX-NI by applications. System software may also enable the "Unsupported Software Development Mode", where RTM is fully active, but in this case RTM usage may be subject to the issues described earlier, and therefore this mode should not be enabled on production systems. On some systems RTM cannot be re-enabled when SGX is active. HLE is always disabled.
According to the Intel 64 and IA-32 Architectures Software Developer's Manual from May 2020, Volume 1, Chapter 2.5 Intel Instruction Set Architecture And Features Removed,[19] HLE has been removed from Intel products released in 2019 and later. RTM is not documented as removed. However, Intel 10th generation Comet Lake and Ice Lake client processors, which were released in 2020, do not support TSX/TSX-NI,[29][30][31][32][33] including both HLE and RTM. Engineering versions of Comet Lake processors still retained TSX/TSX-NI support.
In the Intel Architecture Instruction Set Extensions Programming Reference revision 41 from October 2020,[34] a new TSXLDTRK instruction set extension was documented. It was first included in Sapphire Rapids processors released in January 2023.
Under a complex set of internal timing conditions and system events, software using the Intel TSX/TSX-NI (Transactional Synchronization Extensions) instructions may observe unpredictable system behavior.
The processor tracks both the read-set addresses and the write-set addresses in the first level data cache (L1 cache) of the processor.
The whole "CPU does the fine grained locks" is based upon tagging the L1 (64 B) cachelines and there are 512 of them to be specific (64 x 512 = 32 KB). There is only one "lock tag" per cacheline.
BDM531 E-0: X, F-0:, Status: Fixed ERRATA: Intel TSX Instructions Not Available. 1. Applies to Intel Core M-5Y70 processor. Intel TSX is supported on Intel Core M-5Y70 processor with Intel vPro Technology. Intel TSX is not supported on other processor SKUs.
The October 2018 microcode update also disabled the HLE instruction prefix of Intel TSX and force all RTM transactions to abort when operating in Intel SGX mode or System Management Mode (SMM).