Bit manipulation instructions are instructions that perform bit manipulation operations in hardware, rather than requiring several instructions for those operations as illustrated with examples in software.[1] Several leading as well as historic architectures have bit manipulation instructions, including ARM, WDC 65C02, the TX-2 and the Power ISA.[2]
Bit manipulation is usually divided into subsets, as individual instructions can be costly to implement in hardware when the target application does not justify them. Conversely, if there is a justification then performance may suffer if the instruction is excluded. Carrying out the cost-benefit analysis is a complex task: one of the most comprehensive efforts in bit manipulation was a collaboration headed by Claire Wolf, providing justifications, use-cases, C code, proofs and Verilog for each proposed RISC-V instruction.[3][4]
Particular practical examples include bit banging of GPIO using a low-cost embedded controller such as the WDC 65C02, 8051 and PIC. At the slow clock rates of these CPUs, if bit-set/clear/test instructions were not available, the use of such a low-cost CPU would not be viable for the target application.
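The bit-set, bit-clear and bit-test operations used for bit banging can be sketched in C as follows. This is a minimal illustration rather than vendor code: `gpio_port` is a hypothetical stand-in for a memory-mapped output register, and the helper names and pin assignments are assumptions. On a CPU without single-bit instructions, each helper would typically become a load, a logic operation and a store.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped 8-bit GPIO port register. */
volatile uint8_t gpio_port;

/* Set bit n (0..7): read-modify-write with OR. */
void bit_set(volatile uint8_t *reg, unsigned n)
{
    *reg |= (uint8_t)(1u << n);
}

/* Clear bit n (0..7): read-modify-write with AND of the inverted mask. */
void bit_clear(volatile uint8_t *reg, unsigned n)
{
    *reg &= (uint8_t)~(1u << n);
}

/* Test bit n (0..7) without modifying the register. */
bool bit_test(const volatile uint8_t *reg, unsigned n)
{
    return ((*reg >> n) & 1u) != 0;
}

/* Bit-bang one byte, most significant bit first, on a data pin
   while pulsing a separate clock pin once per bit. */
void bitbang_byte(volatile uint8_t *reg, unsigned data_pin,
                  unsigned clk_pin, uint8_t value)
{
    for (int i = 7; i >= 0; i--) {
        if ((value >> i) & 1u)
            bit_set(reg, data_pin);
        else
            bit_clear(reg, data_pin);
        bit_set(reg, clk_pin);    /* rising clock edge */
        bit_clear(reg, clk_pin);  /* falling clock edge */
    }
}
```

With hardware bit instructions, each `bit_set` or `bit_clear` call in the inner loop can collapse to a single instruction, which is where the saving at low clock rates comes from.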
All the architectures below have instruction subsets and groups where the bit manipulation is provided in hardware. From the list it can be seen that DSPs and embedded microcontrollers have at least test/set/clear bit; however, there are much more comprehensive instructions such as count leading zeros, popcount, Galois field arithmetic, binary-coded decimal, bit-matrix multiply and transpose, byte-permute, bit permute including bit-reversal, specialised cryptographic instructions and many more.
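To illustrate why hardware support matters, here is a minimal sketch of three of the operations named above (count leading zeros, popcount and bit-reversal) written in portable C. The function names are illustrative assumptions; each routine takes several shifts, masks and additions where a dedicated instruction would need one.

```c
#include <stdint.h>

/* Count leading zeros of a 32-bit value by binary search.
   Returns 32 for x == 0. */
unsigned clz32_soft(uint32_t x)
{
    if (x == 0) return 32;
    unsigned n = 0;
    if ((x & 0xFFFF0000u) == 0) { n += 16; x <<= 16; }
    if ((x & 0xFF000000u) == 0) { n += 8;  x <<= 8;  }
    if ((x & 0xF0000000u) == 0) { n += 4;  x <<= 4;  }
    if ((x & 0xC0000000u) == 0) { n += 2;  x <<= 2;  }
    if ((x & 0x80000000u) == 0) { n += 1; }
    return n;
}

/* Population count using parallel (SWAR) partial sums. */
unsigned popcount32_soft(uint32_t x)
{
    x = x - ((x >> 1) & 0x55555555u);                 /* 2-bit sums */
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u); /* 4-bit sums */
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;                 /* 8-bit sums */
    return (x * 0x01010101u) >> 24;                   /* add the four bytes */
}

/* Reverse all 32 bits by swapping progressively larger groups. */
uint32_t bitrev32_soft(uint32_t x)
{
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
    return (x >> 16) | (x << 16);
}
```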
- BSR (Bit Scan Reverse): returns the bit index of the highest set bit in the input, effectively a backwards count leading zeros; not defined for 0.
- BSF (Bit Scan Forward): returns the bit index of the lowest set bit in the input, effectively count trailing zeros; not defined for 0.
- lzcnt
- tzcnt
- popcnt
- pext/pdep
- ptest and vptest: given two inputs, these perform both an AND operation and an ANDN operation between them, and set the ZF and CF EFLAGS bits according to whether the results of the AND and ANDN, respectively, are 0. This can be used to test if all masked bits are zero, all masked bits are set, or a mix.
- vpternlog. Also noteworthy is a conflict detection instruction, VPCONFLICTD.
- GF2P8AFFINEQB is effectively an 8x8 bit-matrix multiply in the Galois field GF(2^8).[5]
- VPSHUFBITQMB, a bit-level shuffle instruction that picks bits from one source using indexes in the second source.

Power ISA has a large range of bit manipulation instructions,[7] largely due to its history and relationship with IBM mainframes and the z/Architecture:
- popcntb is SWAR byte-level 8x8-bit, but there is no 4x16-bit popcnth, yet there is a 2x32-bit popcntw and a 64-bit scalar popcntd. Likewise, prtyw is SWAR half-word 4x16-bit but there is no prtyb.
- Bit-extract pextd and bit-deposit pdepd: these drop and distribute bits in place according to a mask instead of the more usual technique of an offset and a length.[10]
- An unusual centrifuge instruction, cfuged, which moves masked bits to the left and unmasked bits to the right, preserving their relative order in both instances. Most ISAs would have an operand expressing the number of sequential bits to extract, plus the length: cfuged combines these into one general-purpose bitmask.[10]
- vgbbd,[11] which treats a 64-bit quantity as an 8x8 2D matrix and performs a matrix transpose operation. Each bit 0 of each byte therefore becomes the first byte, each bit 1 of each byte becomes the second, and so on.
- bpermd,[12] which allows selection of up to eight individual bits from a 64-bit source, by treating each byte of a second 64-bit register as bit-indices into the first.
- xxeval,[13] similar to AVX-512's vpternlog.

Cray patented BMM (bit matrix multiply) in 1990, which could cope with up to 64x64-bit operands.[15] The closest equivalent today is the 8x8 GF(2) Affine Transform instruction of AVX512.
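The mask-driven bit-extract and bit-deposit operations mentioned above (pext/pdep, pextd/pdepd) can be modelled in software roughly as follows. This is a hedged sketch, not a reference implementation of any ISA: the function names are hypothetical, and it assumes that extracted bits are packed into the low-order end of the result.

```c
#include <stdint.h>

/* Bit extract: gather the bits of src selected by mask and pack them,
   in order, into the low-order bits of the result. */
uint64_t pext64_soft(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    unsigned out = 0;                          /* next free result position */
    for (; mask != 0; mask &= mask - 1) {      /* drop lowest set mask bit */
        uint64_t lowest = mask & (~mask + 1);  /* isolate lowest set bit */
        if (src & lowest)
            result |= (uint64_t)1 << out;
        out++;
    }
    return result;
}

/* Bit deposit: scatter the low-order bits of src, in order, to the
   positions selected by mask. Unselected positions stay zero. */
uint64_t pdep64_soft(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    for (; mask != 0; mask &= mask - 1) {
        uint64_t lowest = mask & (~mask + 1);
        if (src & 1u)
            result |= lowest;
        src >>= 1;
    }
    return result;
}
```

The loop runs once per set mask bit, so a software version costs up to 64 iterations where the hardware instruction is a single operation. Depositing an extracted value back through the same mask yields `src & mask`.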
The IBM System/360 has RR, RX and SI instructions for bit-wise AND, exclusive OR and OR, RS arithmetic and logical shift[a] instructions, an SI test under mask[b] and an atomic RX test and set instruction. These instructions and their extensions remain available through z/Architecture.
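A test-under-mask style operation classifies the bits selected by a mask as all zero, all one, or a mixture. The following C sketch shows that three-way classification; the enum and function names are hypothetical, and the 8-bit operand width is an assumption made for brevity.

```c
#include <stdint.h>

/* Possible outcomes of testing the bits selected by a mask. */
enum mask_result { MASK_ALL_ZERO, MASK_MIXED, MASK_ALL_ONES };

/* Classify the bits of value that are selected by mask. */
enum mask_result test_under_mask(uint8_t value, uint8_t mask)
{
    uint8_t selected = value & mask;
    if (selected == 0)
        return MASK_ALL_ZERO;   /* also covers an empty mask */
    if (selected == mask)
        return MASK_ALL_ONES;
    return MASK_MIXED;
}
```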
Toward the end of the S/370 life cycle, IBM made move characters inverse, previously an RPQ, a standard instruction.
The IBM 3090 introduced an optional vector facility[16] to the System/370-XA and Enterprise Systems Architecture/370 instruction sets. In addition to vector arithmetic and logical operations on multiple integer and floating-point values, it introduced the vector bit manipulation operations count leading zeros (vczvm) and population count (vcovm).[17]
Towards the end of the ESA/390 life cycle, IBM introduced some z/Architecture instructions in ESA/390. These included the rotate left single logical, load reversed and store reversed instructions.
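The effect of a load-reversed or store-reversed style operation, which reverses the byte order of a value, can be sketched in C as below. The function name is an illustrative assumption and the 32-bit width is chosen only for brevity.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value: the lowest byte becomes
   the highest, and so on. */
uint32_t byte_reverse32(uint32_t x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (x << 24);
}
```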
z/Architecture inherited all of the bit manipulation instructions of its predecessors, and added 64-bit (grande) and long (20-bit) displacement versions of some.
z/Architecture does not support the previous vector facility.[32] However, starting with the 11th edition of the z/Architecture Principles of Operation,[33] it supports the following instructions:
- Vector count leading zeros vclz, count trailing zeros vctz[34][35] and vector population count vpopct[36]
- vtm[37]: sets a condition code based on comparing all elements of one register against a second vector as a mask, indicating whether the masked comparisons are all zero, all ones, or a mix of both.
- vgfm,[38] known as carryless multiply

The DEC PDP-6 and PDP-10 had logical operations covering the full suite of 2-operand hardware lookup table (LUT2) Boolean functions[41] (rather than the 3-operand functions that AVX512 and Power ISA have).
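The idea of a lookup-table Boolean function can be sketched in C as follows: a small truth table selects which of the 16 two-operand (or 256 three-operand) functions is applied to every bit position at once. The function names and the truth-table bit ordering used here are assumptions for illustration and are not claimed to match the encoding of any particular instruction set.

```c
#include <stdint.h>

/* Evaluate any of the 16 two-operand Boolean functions bitwise.
   Bit (2*a + b) of the 4-bit table gives the output for input bits a, b. */
uint32_t lut2(uint32_t a, uint32_t b, unsigned table)
{
    uint32_t r = 0;
    if (table & 1u) r |= ~a & ~b;   /* a = 0, b = 0 */
    if (table & 2u) r |= ~a &  b;   /* a = 0, b = 1 */
    if (table & 4u) r |=  a & ~b;   /* a = 1, b = 0 */
    if (table & 8u) r |=  a &  b;   /* a = 1, b = 1 */
    return r;
}

/* Three-operand variant: bit (4*a + 2*b + c) of the 8-bit table
   gives the output for input bits a, b, c. */
uint32_t lut3(uint32_t a, uint32_t b, uint32_t c, unsigned table)
{
    uint32_t r = 0;
    for (unsigned i = 0; i < 8; i++) {
        if (table & (1u << i)) {
            uint32_t ta = (i & 4u) ? a : ~a;
            uint32_t tb = (i & 2u) ? b : ~b;
            uint32_t tc = (i & 1u) ? c : ~c;
            r |= ta & tb & tc;      /* positions matching row i */
        }
    }
    return r;
}
```

Under this ordering, table value 8 gives AND, 14 gives OR and 6 gives XOR for `lut2`, while 0x96 gives three-way XOR and 0xE8 gives bitwise majority for `lut3`.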
Later models of the PDP-10 had instructions to convert between packed BCD and binary.[42]
Also present are unusual (variable-bit-length) byte load and store instructions that use byte pointers for memory operands; in modern terminology these are bit-field insert and extract. In addition to a word address, the bit length (S) and the bit offset (P) of the byte from which to load or into which to store are specified. These instructions can specify a byte size of 0-36, but a byte may not straddle a word boundary.[43] The string manipulation,[44] BCD/binary conversion,[45] and string editing[46] instructions in later models use byte pointers and have the same restrictions.
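The byte-pointer load and store described above amount to bit-field extract and insert on a 36-bit word. A minimal C sketch follows, holding the 36-bit word in the low bits of a `uint64_t`. The function names are hypothetical, and the sketch assumes that the offset counts the bits lying to the right of the field.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_BITS 36
#define WORD_MASK ((UINT64_C(1) << WORD_BITS) - 1)

/* Mask with the low `size` bits set (size 0..36). */
static uint64_t field_mask(unsigned size)
{
    return (UINT64_C(1) << size) - 1;
}

/* Load a byte of `size` bits that has `pos` bits to its right. */
uint64_t load_byte(uint64_t word, unsigned pos, unsigned size)
{
    assert(pos + size <= WORD_BITS);  /* may not straddle a word boundary */
    return (word >> pos) & field_mask(size);
}

/* Deposit the low `size` bits of `value` into the same field,
   leaving every other bit of the word unchanged. */
uint64_t deposit_byte(uint64_t word, unsigned pos, unsigned size,
                      uint64_t value)
{
    assert(pos + size <= WORD_BITS);
    uint64_t mask = field_mask(size) << pos;
    return ((word & ~mask) | ((value << pos) & mask)) & WORD_MASK;
}
```

A byte size of 0 is permitted by this sketch and simply loads zero or deposits nothing.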
The GE-600 series and its successors had Gray-to-binary conversion; without such an instruction, converting from Gray code requires multiple steps. Binary-to-Gray is simply x ^ (x >> 1) and does not justify a dedicated instruction. Gray coding has significant practical applications.
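The asymmetry between the two directions can be seen in a short C sketch. Binary-to-Gray is the single expression quoted above, while Gray-to-binary needs a prefix XOR over all higher bits, done here in five shift-and-XOR steps for a 32-bit value. The function names are illustrative assumptions.

```c
#include <stdint.h>

/* Binary to Gray code: one shift and one XOR. */
uint32_t binary_to_gray(uint32_t x)
{
    return x ^ (x >> 1);
}

/* Gray code to binary: each output bit is the XOR of the Gray bit at
   that position and all Gray bits above it (a prefix XOR). */
uint32_t gray_to_binary(uint32_t g)
{
    g ^= g >> 16;
    g ^= g >> 8;
    g ^= g >> 4;
    g ^= g >> 2;
    g ^= g >> 1;
    return g;
}
```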
In the standard extensions RISC-V has scalar bitwise operations including shift and arithmetic shift, but no rotate. The omissions are compensated for with additional extensions.
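Where no rotate instruction is available, a rotate has to be synthesised from two shifts and an OR, as in the following C sketch. The function names are illustrative, and the masking of the shift count is a choice made here so that a rotate by 0 or by 32 stays well defined in C.

```c
#include <stdint.h>

/* Rotate left: bits shifted out at the top re-enter at the bottom. */
uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}

/* Rotate right: bits shifted out at the bottom re-enter at the top. */
uint32_t rotr32(uint32_t x, unsigned n)
{
    n &= 31u;
    return (x >> n) | (x << ((32u - n) & 31u));
}
```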
- TEST, as well as bitwise operations[54]
- SETB, CLR and CPL (set, clear and invert bit instructions); a considerable percentage of its instructions are bit manipulation.[55] Also included are Or-complement and And-complement, present in RISC-V Zb*.[56]
- BIT, RES, and SET instructions. These test, reset, and set individual bits in registers or memory pointed to by HL, IX, or IY.[57]
- BIT, BIS, BIC and, for bytes, BITB, BISB, BICB. The very similar WD16 supports only the word forms of these instructions plus BISB. The WD16 additionally supports faster byte-addressed flags with its TSTB, SETB, CLRB and COMB (complement) instructions. The PDP-11 is missing the SETB instruction.
- BSET (set to 1), BCLR (clear to 0), BCHG (invert) and BTST (no change). All of these instructions first test the destination bit and set the CCR Z bit if the destination bit is 0.
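The test-then-modify behaviour described for BSET, BCLR, BCHG and BTST can be modelled in C as below. This is a sketch of the described semantics only: the enum, the function name and the 32-bit operand width are assumptions, and the returned flag stands in for the Z bit.

```c
#include <stdbool.h>
#include <stdint.h>

enum bit_op { OP_TEST, OP_SET, OP_CLEAR, OP_CHANGE };

/* Apply op to bit n (0..31) of *dest.
   Returns the Z flag: true if the bit was 0 before the operation. */
bool bit_test_and_modify(uint32_t *dest, unsigned n, enum bit_op op)
{
    uint32_t mask = UINT32_C(1) << (n & 31u);
    bool z = (*dest & mask) == 0;   /* the test precedes the modification */
    switch (op) {
    case OP_SET:    *dest |= mask;  break;
    case OP_CLEAR:  *dest &= ~mask; break;
    case OP_CHANGE: *dest ^= mask;  break;
    case OP_TEST:   break;          /* no change */
    }
    return z;
}
```

Because the flag reflects the bit's value before the modification, one call both reports and updates the bit.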