SSE5 (short for Streaming SIMD Extensions version 5) was a SIMD instruction set extension proposed by AMD on August 30, 2007 as a supplement to the 128-bit SSE core instructions in the AMD64 architecture.
AMD chose not to implement SSE5 as originally proposed. In May 2009, AMD replaced SSE5 with three smaller instruction set extensions named XOP, FMA4, and F16C, which retain the proposed functionality of SSE5 but encode the instructions differently for better compatibility with Intel's proposed AVX instruction set.
The three SSE5-derived instruction sets were introduced in the Bulldozer processor core, released in October 2011 on a 32 nm process.[1]
AMD's SSE5 extension bundle does not include the full set of Intel's SSE4 instructions, making it a competitor to SSE4 rather than a successor.
The proposed SSE5 instruction set consisted of 170 instructions (including 46 base instructions), many of which were designed to improve single-threaded performance. Some SSE5 instructions are 3-operand instructions, the use of which was expected to increase the average number of instructions per cycle achievable by x86 code.[2] Selected new instructions include:[3]
AMD claimed SSE5 would provide dramatic performance improvements, particularly in high-performance computing (HPC), multimedia, and computer security applications, including a 5x performance gain for AES encryption and a 30% performance gain for the discrete cosine transform (DCT), used for example in video processing.[2]
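The benefit of the 3-operand form mentioned above can be sketched with a toy register model. This is an illustrative simplification, not AMD's specification: the register names and helper functions are hypothetical, and the "instruction count" is simply the number of modelled operations. In the traditional 2-operand form the destination is also a source and is overwritten, so keeping both inputs costs an extra register copy; a 3-operand form writes to a separate destination.

```python
# Toy model of x86-style operand forms. All names here are hypothetical
# and exist only for illustration.

def add_two_operand(regs, dst, src):
    """Destructive form: dst = dst + src, so the old value of dst is lost."""
    regs[dst] = regs[dst] + regs[src]


def add_three_operand(regs, dst, src1, src2):
    """Non-destructive form: dst = src1 + src2, both sources survive."""
    regs[dst] = regs[src1] + regs[src2]


def sum_with_two_operand_form(regs):
    """Compute r2 = r0 + r1 while preserving r0 and r1.

    Returns the number of modelled instructions used.
    """
    regs["r2"] = regs["r0"]            # extra register-to-register copy
    add_two_operand(regs, "r2", "r1")  # r2 = r2 + r1
    return 2


def sum_with_three_operand_form(regs):
    """Same computation using a single non-destructive instruction."""
    add_three_operand(regs, "r2", "r0", "r1")
    return 1
```

Both helpers leave the same values in the register dictionary; only the number of modelled operations differs, which is the effect the instructions-per-cycle argument relies on.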
The SSE5 specification included a proposed extension to the general coding scheme of x86 instructions in order to allow instructions to have more than two operands. In 2008, Intel announced their planned AVX instruction set, which proposed a different way of coding instructions with more than two operands. The two proposed coding schemes, SSE5 and AVX, are mutually incompatible, and the AVX scheme has certain advantages over the SSE5 scheme: most importantly, AVX has plenty of space for future extensions, including larger vector sizes.
In May 2009, AMD published a revised specification for the planned future instructions. This revision changes the coding scheme to make it compatible with the AVX scheme, but with a differing prefix byte in order to avoid overlap between instructions introduced by AMD and instructions introduced by Intel.
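The "differing prefix byte" can be illustrated with a deliberately simplified classifier. It assumes the commonly documented escape bytes: 0xC4 and 0xC5 introduce Intel's VEX prefix (3-byte and 2-byte forms), while 0x8F introduces AMD's XOP prefix. The function name and return strings are hypothetical, and a real decoder must inspect further bits, because these byte values also have older meanings in the x86 opcode map.

```python
# Simplified sketch: classify an instruction by its leading escape byte.
# Assumption: only the first byte is examined; a real decoder checks the
# bits that follow before treating the byte as a VEX or XOP prefix.

ESCAPE_BYTES = {
    0xC4: "VEX (3-byte form)",
    0xC5: "VEX (2-byte form)",
    0x8F: "XOP",
}


def classify_escape(first_byte):
    """Return which multi-operand encoding scheme a leading byte suggests."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("first_byte must fit in one byte")
    return ESCAPE_BYTES.get(first_byte, "other")
```

Because the AMD and Intel schemes share the same overall layout and differ in the escape byte, instructions from the two vendors occupy separate encoding spaces.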
The revised instruction set no longer carries the name SSE5, which has been criticized for being misleading, but most of the instructions in the new revision are functionally identical to the original SSE5 specification; only the way the instructions are coded differs. The planned additions to the AMD instruction set consist of three subsets: XOP, FMA4, and F16C.
Both XOP and FMA4 were removed in newer AMD processors based on the Zen microarchitecture.[4]
Because Zen is a clean-sheet design, some instruction set extensions found in Bulldozer processors, including FMA4 and XOP, are no longer present in Zen (znver1).
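Since support for these extensions differs between processor generations, software typically checks for them at run time. The sketch below is one possible approach on Linux, assuming the `flags` line of `/proc/cpuinfo` uses the names `xop`, `fma4`, and `f16c`; the helper names are hypothetical, and the sample strings in the usage note are abbreviated examples, not output from real hardware.

```python
# Minimal sketch: detect the SSE5-derived extensions from /proc/cpuinfo text.
# Assumption: Linux reports them under the flag names "xop", "fma4", "f16c".

SSE5_DERIVED_FLAGS = ("xop", "fma4", "f16c")


def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line found."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def sse5_derived_support(cpuinfo_text):
    """Map each SSE5-derived extension name to True/False for this CPU."""
    flags = parse_cpu_flags(cpuinfo_text)
    return {name: name in flags for name in SSE5_DERIVED_FLAGS}
```

On a Linux system the text could be obtained with `open("/proc/cpuinfo").read()`. For an abbreviated, made-up flags line such as `flags : fpu sse2 avx xop fma4 f16c`, all three entries would be `True`, whereas a line without `xop` and `fma4` would report only `f16c`.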