Given two non-negative integers $a$ and $b$, we have to find their GCD (greatest common divisor), i.e. the largest number which is a divisor of both $a$ and $b$. It’s commonly denoted by $\gcd(a, b)$. Mathematically it is defined as: \[\gcd(a, b) = \max_{k = 1 \dots \infty ~ : ~ k \mid a ~\wedge~ k \mid b} k.\]
(here the symbol “$\mid$” denotes divisibility, i.e. “$k \mid a$” means “$k$ divides $a$”)
When one of the numbers is zero and the other is non-zero, their greatest common divisor is, by definition, the non-zero number. When both numbers are zero, their greatest common divisor is undefined (it can be any arbitrarily large number, since every number divides zero), but we can define it to be zero. This gives us a simple rule: if one of the numbers is zero, the greatest common divisor is the other number.
The Euclidean algorithm, discussed below, allows us to find the greatest common divisor of two numbers $a$ and $b$ in $O(\log \min(a, b))$.
The algorithm was first described in Euclid’s “Elements” (circa 300 BC), but it is possible that the algorithm has even earlier origins.
Algorithm
The algorithm is extremely simple: \[\gcd(a, b) = \begin{cases}a,&\text{if }b = 0 \\\\ \gcd(b, a \bmod b),&\text{otherwise.}\end{cases}\]
Implementation
int gcd (int a, int b) {
    if (b == 0)
        return a;
    else
        return gcd (b, a % b);
}
Using the ternary operator in C++, we can write it as a one-liner.
int gcd (int a, int b) {
    return b ? gcd (b, a % b) : a;
}
And finally, here is a non-recursive implementation:
int gcd (int a, int b) {
    while (b) {
        a %= b;
        swap(a, b);
    }
    return a;
}
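As a quick sanity check, all three versions behave identically. A minimal sketch, assuming one of the gcd definitions above is in scope:

#include <iostream>

int main() {
    std::cout << gcd(24, 36) << "\n"; // 12
    std::cout << gcd(17, 0) << "\n";  // 17: gcd with zero is the other number
    std::cout << gcd(0, 0) << "\n";   // 0 by our convention
}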
Correctness Proof
First, notice that in each iteration of the Euclidean algorithm the second argument strictly decreases; therefore (since the arguments are always non-negative) the algorithm will always terminate.
For the proof of correctness, we need to show that $\gcd(a, b) = \gcd(b, a \bmod b)$ for all $a \geq 0$, $b > 0$.
We will show that the value on the left side of the equation divides the value on the right side and vice versa. Obviously, this would mean that the left and right sides are equal, which will prove the correctness of Euclid’s algorithm.
Let $d = \gcd(a, b)$. Then by definition $d\mid a$ and $d\mid b$.
Now let’s represent the remainder of the division of $a$ by $b$ as follows: \[a \bmod b = a - b \cdot \Bigl\lfloor\dfrac{a}{b}\Bigr\rfloor\]
Since $d$ divides both $a$ and $b \cdot \Bigl\lfloor\dfrac{a}{b}\Bigr\rfloor$, it also divides their difference, so it follows that $d \mid (a \bmod b)$, which means we have the system of divisibilities: \[\begin{cases}d \mid b,\\\\ d \mid (a \bmod b)\end{cases}\]
Now we use the fact that for any three numbers $p$, $q$, $r$, if $p\mid q$ and $p\mid r$ then $p\mid \gcd(q, r)$. In our case, we get: \[d = \gcd(a, b) \mid \gcd(b, a \bmod b)\]
Thus we have shown that the left side of the original equation divides the right. The second half of the proof is similar.
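To see the recurrence in action, take $a = 168$ and $b = 60$: \[\gcd(168, 60) = \gcd(60, 48) = \gcd(48, 12) = \gcd(12, 0) = 12,\] and indeed $12$ divides both $168 = 14 \cdot 12$ and $60 = 5 \cdot 12$.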
Time Complexity
The running time of the algorithm is estimated by Lamé’s theorem, which establishes a surprising connection between the Euclidean algorithm and the Fibonacci sequence:
If $a > b \geq 1$ and $b < F_n$ for some $n$, the Euclidean algorithm performs at most $n-2$ recursive calls.
Moreover, it is possible to show that the upper bound of this theorem is optimal. When $a = F_n$ and $b = F_{n-1}$, $\gcd(a, b)$ will perform exactly $n-2$ recursive calls. In other words, consecutive Fibonacci numbers are the worst case input for Euclid’s algorithm.
Given that Fibonacci numbers grow exponentially, we get that the Euclidean algorithm works in $O(\log \min(a, b))$.
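To illustrate, the following sketch counts the recursive calls for a pair of consecutive Fibonacci numbers; gcd_calls is a hypothetical helper introduced only for this measurement:

#include <iostream>

// Same as gcd above, but counts the number of recursive calls made.
int gcd_calls(int a, int b, int &calls) {
    if (b == 0)
        return a;
    calls++;
    return gcd_calls(b, a % b, calls);
}

int main() {
    int calls = 0;
    gcd_calls(55, 34, calls);   // 55 = F_10, 34 = F_9
    std::cout << calls << "\n"; // 8, i.e. exactly n - 2 for n = 10
}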
Least common multiple
Calculating the least common multiple (commonly denoted LCM) can be reduced to calculating the GCD with the following simple formula: \[\text{lcm}(a, b) = \frac{a \cdot b}{\gcd(a, b)}\]
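For example: \[\text{lcm}(4, 6) = \frac{4 \cdot 6}{\gcd(4, 6)} = \frac{24}{2} = 12.\]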
Thus, LCM can be calculated using the Euclidean algorithm with the same time complexity. A possible implementation, which cleverly avoids integer overflows by first dividing $a$ by the GCD, is given here:
int lcm (int a, int b) {
    return a / gcd(a, b) * b;
}
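For example, with 32-bit integers and $a = 2\,000\,000$, $b = 3\,000\,000$, the product $a \cdot b = 6 \cdot 10^{12}$ would overflow, while $a / \gcd(a, b) \cdot b = 2 \cdot 3\,000\,000 = 6\,000\,000$ fits comfortably.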
Binary GCD
The Binary GCD algorithm is an optimization of the normal Euclidean algorithm.
The slow part of the normal algorithm is the modulo operation. Although modulo operations are usually treated as $O(1)$, they are a lot slower than simpler operations like addition, subtraction, or bitwise operations, so it would be better to avoid them.
It turns out that you can design a fast GCD algorithm that avoids modulo operations. It is based on a few properties:
- If both numbers are even, then we can factor out a two from both and compute the GCD of the remaining numbers: $\gcd(2a, 2b) = 2 \gcd(a, b)$.
- If one of the numbers is even and the other one is odd, then we can remove the factor 2 from the even one: $\gcd(2a, b) = \gcd(a, b)$ if $b$ is odd.
- If both numbers are odd, then subtracting one number from the other will not change the GCD: $\gcd(a, b) = \gcd(b, a-b)$ (this can be proven in the same way as the correctness proof of the normal algorithm).
Using only these properties and some fast bitwise functions from GCC, we can implement a fast version:
int gcd(int a, int b) {
    // If one of the numbers is zero, the GCD is the other one.
    if (!a || !b)
        return a | b;
    // Common power of two of a and b, factored back in at the end.
    unsigned shift = __builtin_ctz(a | b);
    // Remove all factors of 2 from a, making it odd.
    a >>= __builtin_ctz(a);
    do {
        // Remove all factors of 2 from b; they are not common, since a is odd.
        b >>= __builtin_ctz(b);
        // Both numbers are now odd; subtract the smaller from the larger.
        if (a > b)
            swap(a, b);
        b -= a;
    } while (b);
    return a << shift;
}
Notice that such an optimization is usually not necessary, and most programming languages already have a GCD function in their standard libraries. E.g. C++17 has such a function in the numeric header.
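For example, in C++17 (std::lcm lives in the same header):

#include <iostream>
#include <numeric>

int main() {
    std::cout << std::gcd(24, 36) << "\n"; // 12
    std::cout << std::lcm(4, 6) << "\n";   // 12
}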