
Thebinary GCD algorithm, also known asStein's algorithm or thebinary Euclidean algorithm,[1][2] is an algorithm that computes thegreatest common divisor (GCD) of two nonnegative integers. Stein's algorithm uses simpler arithmetic operations than the conventionalEuclidean algorithm; it replaces division witharithmetic shifts, comparisons, and subtraction.
Although the algorithm in its contemporary form was first published by the physicist and programmer Josef Stein in 1967,[3] it was known by the 2nd century BCE, in ancient China.[4]
The algorithm finds the GCD of two nonnegative numbers and by repeatedly applying these identities:
As GCD is commutative (), those identities still apply if the operands are swapped:, if is odd, etc.
While the above description of the algorithm is mathematically correct, the binary GCD algorithm performs more loop iterations that the Euclidean one, so it only offers a performance advantage if those iterations are substantially faster. Performant software implementations typically differ from it in a few notable ways:
v==0:unpredictable branches can have a large, negative impact on performance.[5][6]The following is an implementation of the algorithm inC. After the shift at the top of the main loop, the variablesu andv hold the values(u-1)/2 and(v-1)/2. This elimination of the least-significant bit leaves room for the signed differenceu-v to be computed on the next line without overflow.
// Count trailing zeros. This is a pedagogical example only;// for efficiency use __builtin_ctzll() or similar.staticintctz(uint64_tn){ints=0,t;while((t=n&15)==0){n>>=4;s+=4;}returns+((0x12131210>>2*t)&3);}uint64_tgcd(uint64_tu,uint64_tv){uint64_tuv=u|v;if(!u||!v)returnuv;// Identity #1u>>=ctz(u)+1;inttz=ctz(v);for(;;){v>>=tz+1;// Identity #3v-=u;// Identity #4if(v==0)break;tz=ctz(v);// ctz(v) == ctz(-v), so do this early// if ((int64_t)v < 0) { u += v; v = -v; }uint64_tmask=(int64_t)v>>63;u+=v&mask;// Branch-free conditional addv^=mask;// Branch-free conditional negate}return(2*u+1)<<ctz(uv);// Identity #2}
The following is an implementation of the algorithm inRust exemplifying those differences, adapted fromuutils. The conditional exchange of and (ensuring) compiles down toconditional moves;[7]:
usestd::cmp::min;usestd::mem::swap;pubfngcd(mutu:u64,mutv:u64)->u64{// Base cases: gcd(n, 0) = gcd(0, n) = nifu==0{returnv;}elseifv==0{returnu;}// Using identities 2 and 3:// gcd(2ⁱ u, 2ʲ v) = 2ᵏ gcd(u, v) with u, v odd and k = min(i, j)// 2ᵏ is the greatest power of two that divides both 2ⁱ u and 2ʲ vleti=u.trailing_zeros();u>>=i;letj=v.trailing_zeros();v>>=j;letk=min(i,j);loop{// u and v are odd at the start of the loopdebug_assert!(u%2==1,"u = {} should be odd",u);debug_assert!(v%2==1,"v = {} should be odd",v);// Swap if necessary so u ≤ vifu>v{(u,v)=(v,u);}// Identity 4: gcd(u, v) = gcd(u, v-u) as u ≤ v and u, v are both oddv-=u;// v is now evenifv==0{// Identity 1: gcd(u, 0) = u// The shift by k is necessary to add back the 2ᵏ factor that was removed before the loopreturnu<<k;}// Identity 3: gcd(u, 2ʲ v) = gcd(u, v) as u is oddv>>=v.trailing_zeros();}}
Note: The implementation above acceptsunsigned (non-negative) integers; given that, the signed case can be handled as follows:
/// Computes the GCD of two signed 64-bit integers/// The result is unsigned and not always representable as i64: gcd(i64::MIN, i64::MIN) == 1 << 63pubfnsigned_gcd(u:i64,v:i64)->u64{gcd(u.unsigned_abs(),v.unsigned_abs())}
Asymptotically, the algorithm requires steps, where is the number of bits in the larger of the two numbers, as every two steps reduce at least one of the operands by at least a factor of. Each step involves only a few arithmetic operations ( with a small constant); when working withword-sized numbers, each arithmetic operation translates to a single machine operation, so the number of machine operations is on the order of, i.e..
For arbitrarily large numbers, theasymptotic complexity of this algorithm is,[8] as each arithmetic operation (subtract and shift) involves a linear number of machine operations (one per word in the numbers' binary representation).If the numbers can be represented in the machine's memory,i.e. each number'ssize can be represented by a single machine word, this bound is reduced to:
This is the same as for the Euclidean algorithm, though a more precise analysis by Akhavi and Vallée proved that binary GCD uses about 60% fewer bit operations.[9]
The binary GCD algorithm can be extended in several ways, either to output additional information, deal witharbitrarily large integers more efficiently, or to compute GCDs in domains other than the integers.
Theextended binary GCD algorithm, analogous to theextended Euclidean algorithm, fits in the first kind of extension, as it provides theBézout coefficients in addition to the GCD: integers and such that.[10][11][12]
In the case of large integers, the best asymptotic complexity is, with the cost of-bit multiplication; this is near-linear and vastly smaller than the binary GCD algorithm's, though concrete implementations only outperform older algorithms for numbers larger than about 64 kilobits (i.e. greater than 8×1019265). This is achieved by extending the binary GCD algorithm using ideas from theSchönhage–Strassen algorithm for fast integer multiplication.[13]
The binary GCD algorithm has also been extended to domains other than natural numbers, such asGaussian integers,[14]Eisenstein integers,[15] quadratic rings,[16][17] andinteger rings ofnumber fields.[18]
An algorithm for computing the GCD of two numbers was known in ancient China, under theHan dynasty, as a method to reduce fractions:
If possible halve it; otherwise, take the denominator and the numerator, subtract the lesser from the greater, and do that alternately to make them the same. Reduce by the same number.
— Fangtian – Land surveying,The Nine Chapters on the Mathematical Art
The phrase "if possible halve it" is ambiguous,[4]
Covers the extended binary GCD, and a probabilistic analysis of the algorithm.
Covers a variety of topics, including the extended binary GCD algorithm which outputsBézout coefficients, efficient handling of multi-precision integers using a variant ofLehmer's GCD algorithm, and the relationship between GCD andcontinued fraction expansions of real numbers.
An analysis of the algorithm in the average case, through the lens offunctional analysis: the algorithms' main parameters are cast as adynamical system, and their average value is related to theinvariant measure of the system'stransfer operator.