Quran, Surah Al-Fajr
Programmers often assume that since a data structure or algorithm is in a textbook or has been used for years, it represents the best solution to a problem. Surprisingly, this is often not the case—even if the algorithm has been used for thousands of years and has been worked on by everyone from Euclid to Gauss. In this chapter we’ll look at an example of a novel solution to an old problem—computing the GCD. Then we’ll see how proving a theorem from number theory resulted in an important variation of the algorithm still used today.
12.1 Hardware Constraints and a More Efficient Algorithm
In 1961, an Israeli Ph.D. student, Josef “Yossi” Stein, was working on something called Racah Algebra for his dissertation. He needed to do rational arithmetic, which required reducing fractions, which uses the GCD. But because he had only limited time on a slow computer, he was motivated to find a better way. As he explains:

Using “Racah Algebra” meant doing calculations with numbers of the form (a/b)·√c, where a, b, c were integers. I wrote a program for the only available computer in Israel at that time, the WEIZAC at the Weizmann institute. Addition time was 57 microseconds, division took about 900 microseconds. Shift took less than addition.... We had neither compiler nor assembler, and no floating-point numbers, but used hexadecimal code for the programming, and had only 2 hours of computer-time per week for Racah and his students, and you see that I had the right conditions for finding that algorithm. Fast GCD meant survival. 1
1 J. Stein, personal communication, 2003.
What Stein observed was that there were certain situations where the GCD could be easily computed, or easily expressed in terms of another GCD expression. He looked at special cases like taking the GCD of an even number and an odd number, or a number and itself. Eventually, he came up with the following exhaustive list of cases:
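Stated in the form usually given for the binary GCD, and assuming nonnegative integers n, m, and k, the cases can be written as follows. This is a standard formulation rather than a quotation from Stein.

```latex
\begin{align*}
\gcd(n, 0) &= \gcd(0, n) = n \\
\gcd(n, n) &= n \\
\gcd(2n, 2m) &= 2 \cdot \gcd(n, m) \\
\gcd(2n, 2m + 1) &= \gcd(n, 2m + 1) \\
\gcd(2n + 1, 2m) &= \gcd(2n + 1, m) \\
\gcd(2n + 1, 2(n + k) + 1) &= \gcd(2(n + k) + 1, 2n + 1) = \gcd(2n + 1, k)
\end{align*}
```

The last identity holds because the difference of the two odd arguments is 2k, and a factor of 2 cannot be shared with the odd number 2n + 1.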
Using these observations, Stein wrote the following algorithm:
    template <BinaryInteger N>
    N stein_gcd(N m, N n) {
        if (m < N(0)) m = -m;
        if (n < N(0)) n = -n;
        if (m == N(0)) return n;
        if (n == N(0)) return m;

        // m > 0 && n > 0

        int d_m = 0;
        while (even(m)) { m >>= 1; ++d_m; }

        int d_n = 0;
        while (even(n)) { n >>= 1; ++d_n; }

        // odd(m) && odd(n)

        while (m != n) {
            if (n > m) swap(n, m);
            m -= n;
            do m >>= 1; while (even(m));
        }

        // m == n

        return m << min(d_m, d_n);
    }
Let’s look at what the code is doing. The function takes two BinaryIntegers—that is, an integer representation that supports fast shift and even/odd testing, like typical computer integers. First, it eliminates the easy cases where one of the arguments is zero, and inverts the sign if an argument is negative, so that we are dealing with two positive integers.
Next, it takes advantage of the identities with even arguments, removing factors of 2 (by shifting) while keeping track of how many there were. We can use a simple int for the counts, since what we’re counting is at most the total number of bits in the original arguments. After this part, we are operating on two odd numbers.

Now comes the main loop. Each time through, we subtract the smaller number from the larger (we know the difference of two odd numbers is even), and again use shifts to remove additional powers of 2 from the result. 2 When we’re done, our two numbers will be equal. Since we’re halving at least once each time through the loop, we know we’ll iterate no more than log n times; the algorithm is bounded by the number of 1-bits we encounter.

2 We use do-while rather than while because we don’t need to run the test the first time; we know we’re starting with an even number, so we know we have to do at least one shift.
Finally, we return our result, using a shift to multiply our number by 2 for each of the minimum number of 2s we factored out at the beginning. We don’t need to worry about 2s in the main loop, because by that point we’ve reduced the problem to the GCD of two odd numbers; this GCD does not have 2 as a factor.
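The listing relies on even, swap, min, and the BinaryInteger requirement, none of which this section defines. As a minimal self-contained sketch, the version below fixes the type to std::uint64_t (an assumption made here, which also removes the need for the sign handling) and supplies a hypothetical even helper.

```cpp
#include <algorithm>  // std::min
#include <cassert>
#include <cstdint>
#include <utility>    // std::swap

// Hypothetical helper standing in for the text's even():
// an unsigned integer is even when its lowest bit is 0.
inline bool even(std::uint64_t x) { return (x & 1u) == 0; }

// Sketch of the same algorithm for unsigned 64-bit integers.
std::uint64_t stein_gcd(std::uint64_t m, std::uint64_t n) {
    if (m == 0) return n;
    if (n == 0) return m;

    // m > 0 && n > 0: strip factors of 2 and count them
    int d_m = 0;
    while (even(m)) { m >>= 1; ++d_m; }
    int d_n = 0;
    while (even(n)) { n >>= 1; ++d_n; }

    // odd(m) && odd(n): subtract the smaller from the larger, then shift
    while (m != n) {
        if (n > m) std::swap(n, m);
        m -= n;                        // odd - odd is even and nonzero
        do m >>= 1; while (even(m));   // at least one shift is always needed
    }

    // m == n: restore the factors of 2 common to both arguments
    return m << std::min(d_m, d_n);
}
```

For example, stein_gcd(196, 42) evaluates to 14 and stein_gcd(48, 180) to 12.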
Here’s an example of the algorithm in operation. Suppose we want to compute GCD(196, 42). The computation looks like this:
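Tracing the code by hand on these inputs gives the following sequence of states. The layout is a reconstruction, with each row showing m and n after the named step.

```latex
\begin{array}{rrl}
m & n & \text{step} \\ \hline
196 & 42 & \text{start} \\
49 & 42 & \text{remove 2s from } m:\ d_m = 2 \\
49 & 21 & \text{remove 2s from } n:\ d_n = 1 \\
28 & 21 & m \leftarrow m - n \\
7 & 21 & \text{shift } m \text{ until odd} \\
21 & 7 & \text{swap, since } n > m \\
14 & 7 & m \leftarrow m - n \\
7 & 7 & \text{shift } m;\ \text{now } m = n
\end{array}
\qquad
\gcd(196, 42) = 7 \cdot 2^{\min(2, 1)} = 14
```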
As we saw, Stein took some observations about special cases and turned them into a faster algorithm. The special cases had to do with even and odd numbers, and places where we could factor out 2, which is easy on a computer; that’s why Stein’s algorithm is faster in practice. (Even today, when the remainder function can be computed in hardware, it is still much slower than simple shifts.) But is this just a clever hack, or is there more here than meets the eye? Does it make sense only because computers use binary arithmetic? Does Stein’s algorithm work just for integers, or can we generalize it just as we did with Euclid’s algorithm?
12.2 Generalizing Stein’s Algorithm
To answer these questions, let’s review some of the historical milestones for Euclid’s GCD:
• Positive integers: Greeks (5th century BC)
• Polynomials: Stevin (ca. 1600)
• Gaussian integers: Gauss (ca. 1830)
• Algebraic integers: Dirichlet, Dedekind (ca. 1860)
• Generic version: Noether, van der Waerden (ca. 1930)
It took more than 2000 years to extend Euclid’s algorithm from integers to polynomials. Fortunately, it took much less time for Stein’s algorithm. In fact, just 2 years after its publication, Knuth already knew of a version for single-variable polynomials over a field.
The surprising insight was that we can have x play the role for polynomials that 2 plays for integers. That is, we can factor out powers of x, and so on. Carrying the analogy further, we see that x² + x (or anything else divisible by x) is “even,” x² + x + 1 (or anything else with a nonzero zero-order coefficient) is “odd,” and x² + x “shifts” to x + 1.
Just as division by 2 is easier than general division for binary integers, so division by x is easier than general division for polynomials—in both cases, all we need is a shift. (Remember that a polynomial is really a sequence of coefficients, so division by x is literally a shift of the sequence.)
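As a minimal sketch of this idea, assume a polynomial is stored as a std::vector of coefficients with the constant term first. The names Poly, is_even, and shift are hypothetical, chosen here to mirror the integer case.

```cpp
#include <cassert>
#include <vector>

// Assumed representation: coefficients, lowest order first,
// so {0, 1, 1} stands for 0 + 1*x + 1*x^2.
using Poly = std::vector<double>;

// "Even" means divisible by x: the constant term is zero.
bool is_even(const Poly& p) { return !p.empty() && p.front() == 0.0; }

// "Shift" means divide by x: drop the (zero) constant term.
Poly shift(Poly p) {
    assert(is_even(p));
    p.erase(p.begin());
    return p;
}
```

Applied to x² + x, stored as {0, 1, 1}, shift returns {1, 1}, which is x + 1. Erasing the front of a vector moves the remaining elements; a representation that tracks an offset would avoid that cost.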
Stein’s “special cases” for polynomials look like this:
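For polynomials p and q over a field, with nonzero constants c and d, the cases take the following form. This is a reconstruction consistent with the text's references to Equations 12.6 and 12.7; the degree conditions on the last two rules are an assumption, chosen so that the argument of higher degree is the one that gets reduced.

```latex
\begin{align}
\gcd(p, 0) &= \gcd(0, p) = p \notag \\
\gcd(p, p) &= p \notag \\
\gcd(xp, xq) &= x \cdot \gcd(p, q) \notag \\
\gcd(xp, xq + c) &= \gcd(p, xq + c) \notag \\
\gcd(xp + c, xq) &= \gcd(xp + c, q) \notag \\
\deg(p) \ge \deg(q) \implies \gcd(xp + c, xq + d)
  &= \gcd\!\left(p - \tfrac{c}{d}\, q,\; xq + d\right) \tag{12.6} \\
\deg(p) < \deg(q) \implies \gcd(xp + c, xq + d)
  &= \gcd\!\left(xp + c,\; q - \tfrac{d}{c}\, p\right) \tag{12.7}
\end{align}
```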
Notice how each of the last two rules cancels one of the zero-order coefficients, so we convert the “odd, odd” case to an “even, odd” case.
To get the equivalence expressed by Equation 12.6, we rely on two facts. First, if you have two polynomials u and v, then gcd(u, v) = gcd(u, av), where a is a nonzero coefficient. So, writing the two arguments as xp + c and xq + d, we can multiply the second argument by the coefficient c/d, and we’ll have the same GCD:

    gcd(xp + c, xq + d) = gcd(xp + c, (c/d)(xq + d))

Second, gcd(u, v) = gcd(u, v - u), which we noted when we introduced GCD early in the book (Equation 3.9). So we can subtract our new second argument from the first, and we’ll still have the same GCD:

    gcd(xp + c, xq + d) = gcd(xp + c - (c/d)(xq + d), xq + d) = gcd(xp - (c/d)xq, xq + d)

Finally, we can use the fact that if one of the GCD arguments is divisible by x and the other is not, we can drop x because the GCD will not contain it as a factor. So we “shift” out the x, which gives

    gcd(xp + c, xq + d) = gcd(p - (c/d)q, xq + d)

which is what we wanted.
We also see that in each transformation, the norm (in this case, the degree of the polynomial) gets reduced. Here’s how the algorithm would compute gcd(x³ - 3x - 2, x² - 4):
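Worked by hand, the steps can be laid out as follows, with m and n the two arguments and each row naming the operation applied next. This is a reconstruction; in particular, the point at which m is normalized to a leading coefficient of 1 is an assumption.

```latex
\begin{array}{lll}
m & n & \text{operation} \\ \hline
x^3 - 3x - 2 & x^2 - 4 & m - \tfrac{1}{2} n \\
x^3 - \tfrac{1}{2}x^2 - 3x & x^2 - 4 & \text{shift}(m) \\
x^2 - \tfrac{1}{2}x - 3 & x^2 - 4 & m - \tfrac{3}{4} n \\
\tfrac{1}{4}x^2 - \tfrac{1}{2}x & x^2 - 4 & \text{normalize}(m) \\
x^2 - 2x & x^2 - 4 & \text{shift}(m) \\
x - 2 & x^2 - 4 & n - 2m \\
x - 2 & x^2 - 2x & \text{shift}(n) \\
x - 2 & x - 2 & \gcd = x - 2
\end{array}
```

As a check, x³ - 3x - 2 = (x - 2)(x + 1)² and x² - 4 = (x - 2)(x + 2), so x - 2 is indeed the common factor.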
First we see that the ratio of their free coefficients (the c and d in Equations 12.6 and 12.7) is 1/2, so we will multiply n by 1/2 and subtract it from m (shown in the first line of the preceding table), resulting in the new m shown on the second line. Then we “shift” m by factoring out x, resulting in the third line, and so on.
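The same computation can be sketched in code. The following is an illustration under stated assumptions, not a reference implementation: coefficients are doubles stored lowest order first (exact only for inputs like this example, so a serious version would want exact rationals), each polynomial is made monic after every step rather than once, and the arguments are swapped instead of using a separate rule for the lower-degree case. All names (Poly, trim, remove_x, make_monic) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Assumed representation: coefficients, lowest order first; {} is zero.
using Poly = std::vector<double>;

void trim(Poly& p) { while (!p.empty() && p.back() == 0.0) p.pop_back(); }
bool is_even(const Poly& p) { return !p.empty() && p.front() == 0.0; }
void shift(Poly& p) { p.erase(p.begin()); }  // divide by x
void make_monic(Poly& p) {                   // scale to leading coefficient 1
    const double lead = p.back();
    for (double& c : p) c /= lead;
}

// Strip factors of x from p and return how many were removed.
std::size_t remove_x(Poly& p) {
    std::size_t count = 0;
    while (is_even(p)) { shift(p); ++count; }
    return count;
}

Poly stein_gcd(Poly m, Poly n) {
    trim(m);
    trim(n);
    if (m.empty()) { if (!n.empty()) make_monic(n); return n; }
    if (n.empty()) { make_monic(m); return m; }

    const std::size_t d_m = remove_x(m);
    const std::size_t d_n = remove_x(n);
    make_monic(m);
    make_monic(n);

    // Both constant terms are nonzero here: the "odd, odd" case.
    while (m != n) {
        if (n.size() > m.size()) std::swap(m, n);  // keep deg(m) >= deg(n)
        const double k = m.front() / n.front();    // the ratio c/d
        m.front() = 0.0;                           // c - (c/d)*d cancels
        for (std::size_t i = 1; i < n.size(); ++i) m[i] -= k * n[i];
        trim(m);
        do shift(m); while (is_even(m));           // drop factors of x
        make_monic(m);
    }

    // Restore the factors of x common to both arguments.
    m.insert(m.begin(), d_m < d_n ? d_m : d_n, 0.0);
    return m;
}
```

Called with {-2, -3, 0, 1} and {-4, 0, 1}, which encode x³ - 3x - 2 and x² - 4, the function returns {-2, 1}, that is, x - 2.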
In 2000, Andre Weilert further generalized Stein’s algorithm to Gaussian integers. This time, 1 + i plays the role of 2; the “shift” operation is division by 1 + i. In 2003, Damgård and Frandsen extended the algorithm to Eisenstein integers.
In 2004, Agarwal and Frandsen demonstrated that there is a ring that is not a Euclidean domain, but where the Stein algorithm still works. In other words, there are cases where Stein’s algorithm works but Euclid’s does not. If the domain of the Stein algorithm is not the Euclidean domain, then what is it? As of this writing, this is an unsolved problem.
What we do know is that Stein’s algorithm depends on the notion of even and odd; we generalize even to be divisible by a smallest prime, where p is a smallest prime if any remainder when dividing by it is either zero or an invertible element. (We say “a smallest prime” rather than “the smallest prime” because there could be multiple smallest primes in a ring. For example, for Gaussian integers, 1 + i, 1 - i, -1 + i, and -1 - i are all smallest primes.)
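To make the Gaussian-integer case concrete, here is a small sketch assuming a hypothetical Gaussian struct with integer real and imaginary parts. It uses the identity (a + bi)/(1 + i) = ((a + b) + (b - a)i)/2, so a + bi is divisible by 1 + i exactly when a + b is even; otherwise, subtracting the unit 1 makes it divisible, which is the "zero or an invertible element" property.

```cpp
#include <cassert>

// Hypothetical minimal Gaussian integer re + im*i.
struct Gaussian {
    long long re;
    long long im;
};

// "Even" means divisible by 1 + i, which holds exactly when re + im is even.
bool is_even(Gaussian z) { return (z.re + z.im) % 2 == 0; }

// Remainder on division by 1 + i, chosen here as 0 or the unit 1.
int remainder_mod_one_plus_i(Gaussian z) { return is_even(z) ? 0 : 1; }

// "Shift" means exact division by 1 + i:
// (a + bi)(1 - i) / 2 = ((a + b) + (b - a)i) / 2.
Gaussian shift(Gaussian z) {
    assert(is_even(z));
    return Gaussian{(z.re + z.im) / 2, (z.im - z.re) / 2};
}
```

For instance, shifting 2 gives 1 - i, and shifting 1 + i gives 1.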
Why do we factor out 2 when we’re computing the GCD of integers? Because when we repeatedly divide by 2, we eventually get 1 as a remainder; that is, we have an odd number. Once we have two odd numbers (two numbers whose remainders modulo 2 are both units), we can use subtraction to keep our GCD algorithm going. This ability to cancel remainders works because 2 is the smallest integer prime. Similarly, x is the smallest prime for polynomials, and 1 + i for Gaussian integers. 3 Division by the smallest prime always gives a remainder of zero or a unit, because a unit is the number with the smallest nonzero norm. So 2 works for integers because it’s the smallest prime, not because computers use binary arithmetic. The algorithm is possible because of fundamental properties of integers, not because of the hardware implementation, although the algorithm is efficient because computers use binary arithmetic, making shifts fast.
3 Note that 2 is not prime in the ring of Gaussian integers, since it can be factored into (1+i) (1-i).
Exercise 12.1. Compare the performance of the Stein and Euclid algorithms on random integers from the