Carry-less Multiplication (CLMUL) is an extension to the x86 instruction set used by microprocessors from Intel and AMD. It was proposed by Intel in March 2008[1] and made available in the Intel Westmere processors announced in early 2010. Mathematically, the instruction implements multiplication of polynomials over the finite field GF(2), where each bit of the bitstring is one coefficient of the polynomial. The CLMUL instruction also allows a more efficient implementation of the closely related multiplication in larger finite fields GF(2^k) than is possible with the traditional instruction set.[2]
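As an illustration of the mathematics only, the following is a minimal software sketch of carry-less multiplication and of multiplication in GF(2^k) built on top of it. The function names `clmul`, `poly_mod` and `gf_mul` are hypothetical, and the default modulus 0x11B (the AES field polynomial for GF(2^8)) is an assumption chosen for the example; the article does not prescribe a particular field.

```python
def clmul(a: int, b: int) -> int:
    """Multiply two polynomials over GF(2), given as bitstrings.

    Works like schoolbook binary multiplication, except that partial
    products are combined with XOR, so no carries propagate.
    """
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def poly_mod(value: int, modulus: int) -> int:
    """Reduce a polynomial over GF(2) modulo another polynomial."""
    degree = modulus.bit_length() - 1
    while value.bit_length() - 1 >= degree:
        shift = value.bit_length() - 1 - degree
        value ^= modulus << shift  # cancel the current leading term
    return value


def gf_mul(a: int, b: int, modulus: int = 0x11B) -> int:
    """Multiply in GF(2^k): carry-less product, then reduction.

    The default modulus is an assumption for this example (GF(2^8) as
    used by AES); any irreducible polynomial of degree k can be passed.
    """
    return poly_mod(clmul(a, b), modulus)
```

For instance, `clmul(0b11, 0b11)` gives `0b101`, since (x + 1)^2 = x^2 + 1 when coefficients are taken modulo 2, whereas ordinary integer multiplication would give 9.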
One use of these instructions is to improve the speed of applications doing block cipher encryption in Galois/Counter Mode, which depends on finite field GF(2^k) multiplication. Another application is the fast calculation of CRC values,[3] including those used to implement the LZ77 sliding window DEFLATE algorithm in zlib and pngcrush.[4]
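A CRC is the remainder of a polynomial division over GF(2), which is why carry-less multiplication helps to compute it. The sketch below shows only the plain bit-at-a-time definition, not the CLMUL-based method used by optimized implementations. The function name `crc32_bitwise` is hypothetical; the constant 0xEDB88320 is the bit-reflected CRC-32 polynomial used by zlib.

```python
def crc32_bitwise(data: bytes) -> int:
    """Bit-at-a-time CRC-32 (reflected form, as used by zlib)."""
    poly = 0xEDB88320      # reflected CRC-32 generator polynomial
    crc = 0xFFFFFFFF       # conventional initial value
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # One step of polynomial long division over GF(2):
            # shift, and XOR in the generator if a term dropped out.
            if crc & 1:
                crc = (crc >> 1) ^ poly
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF  # conventional final inversion
```

The result should match `zlib.crc32` from the Python standard library for the same input. A CLMUL-based implementation processes many bytes per step instead of one bit at a time.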
ARMv8 also has a version of CLMUL. SPARC calls their version XMULX, for "XOR multiplication".
The instruction computes the 128-bit carry-less product of two 64-bit values. The destination is a 128-bit XMM register. The source may be another XMM register or memory. An immediate operand specifies which halves of the 128-bit operands are multiplied. Mnemonics specifying specific values of the immediate operand are also defined:
| Instruction | Opcode | Description |
|---|---|---|
| PCLMULQDQ xmmreg,xmmrm,imm | [rmi: 66 0f 3a 44 /r ib] | Perform a carry-less multiplication of two 64-bit polynomials over the finite field GF(2)[X]. |
| PCLMULLQLQDQ xmmreg,xmmrm | [rm: 66 0f 3a 44 /r 00] | Multiply the low halves of the two registers. |
| PCLMULHQLQDQ xmmreg,xmmrm | [rm: 66 0f 3a 44 /r 01] | Multiply the high half of the destination register by the low half of the source register. |
| PCLMULLQHQDQ xmmreg,xmmrm | [rm: 66 0f 3a 44 /r 10] | Multiply the low half of the destination register by the high half of the source register. |
| PCLMULHQHQDQ xmmreg,xmmrm | [rm: 66 0f 3a 44 /r 11] | Multiply the high halves of the two registers. |
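The half-selection in the table above can be modelled in software. The following sketch treats each 128-bit register as a Python integer and reads the immediate byte as hexadecimal, as in the table: bit 0 selects the half of the destination and bit 4 selects the half of the source. The names `clmul64` and `pclmulqdq` are hypothetical helpers for illustration, not a real library API.

```python
MASK64 = (1 << 64) - 1


def clmul64(a: int, b: int) -> int:
    """Carry-less product of two 64-bit values (result fits in 128 bits)."""
    a &= MASK64
    b &= MASK64
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def pclmulqdq(dst: int, src: int, imm8: int) -> int:
    """Software model of the instruction's result.

    dst and src are 128-bit values; imm8 bit 0 picks the low (0) or
    high (1) 64-bit half of dst, and imm8 bit 4 does the same for src.
    """
    a = (dst >> 64) & MASK64 if imm8 & 0x01 else dst & MASK64
    b = (src >> 64) & MASK64 if imm8 & 0x10 else src & MASK64
    return clmul64(a, b)
```

With `dst = (3 << 64) | 5` and `src = (7 << 64) | 2`, an immediate of `0x00` multiplies 5 by 2, `0x01` multiplies 3 by 2, `0x10` multiplies 5 by 7, and `0x11` multiplies 3 by 7, all without carries.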
An EVEX-encoded vectorized version (VPCLMULQDQ) is available in AVX-512.
The presence of the CLMUL instruction set can be checked by testing one of the CPU feature bits.
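On x86, the relevant feature bit is bit 1 of the ECX register returned by CPUID leaf 1. Python has no portable way to execute CPUID, so the sketch below only decodes an ECX value that is assumed to have been obtained elsewhere (for example from a small native helper); the function name `has_pclmulqdq` is hypothetical.

```python
PCLMULQDQ_BIT = 1  # CPUID leaf 1, register ECX, bit 1


def has_pclmulqdq(cpuid_leaf1_ecx: int) -> bool:
    """Return True if the given ECX value reports CLMUL support.

    The caller is assumed to supply the ECX value produced by
    executing CPUID with EAX = 1; this function does not query the CPU.
    """
    return bool((cpuid_leaf1_ecx >> PCLMULQDQ_BIT) & 1)
```

On Linux, the same information is usually exposed as the `pclmulqdq` flag in `/proc/cpuinfo`.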