Prime Numbers: Checking for a Prime (Part 2)

Last time we saw how to test small or medium-sized numbers to see if they are prime, including details on the elementary Trial Division method, and introduced the most popular test for larger numbers, the Fermat test. Here we’ll review Fermat, and then go beyond. This is not for the faint-hearted! (I myself am learning as I write this.)

Details of the Fermat test

I include this 1997 question because, although the answer starts with the Fermat test, it goes a little deeper than last time, then moves beyond. It comes from Dave, who is working on a computer program:

Large Prime Numbers

I believe that I read somewhere (Ivars Peterson, I think) that it is possible to determine if a very large number is prime using a simple procedure. (But not to determine its factors). It didn't say how this is done. If this is true, what is the algorithm?

Doctor Rob answered:

The situation is a bit more complicated than you remember. The fact is that the simple procedure does not provide a proof of primality, but may provide a proof of compositeness. It is based on Fermat's Little Theorem, which says:

   Theorem: If p is a prime number, a is an integer,
            and p is not a divisor of a,
            then p is a divisor of a^(p-1) - 1.

Given a number N we want to test, pick any old a, and find the greatest common divisor of a and N, GCD(a,N). If this is not 1, then it is a factor of N, and N is composite. If it is 1, then compute the remainder r of a^(N-1) when divided by N. If r is not 1, then by Fermat's Little Theorem, N cannot be prime, so is composite. This gives the proof of compositeness.

By the way, you might as well choose 1 < a < N-1, since a and a-N will give the same value of r, and since a = 1 or N-1 will always give r = 1.

This is a slightly different form of the method than we saw last time, but is equivalent. We are testing whether a given number N is prime, so we put it in the role of p in the theorem. By choosing a base a that is less than N, we know that N does not divide a. The theorem says that if N is a prime, then it must divide \(a^{N-1}-1\), which implies that \(a^{N-1}\equiv1\pmod N\), so the remainder from that division must be 1. If the remainder is not 1, then N can’t be a prime.

How to compute the remainder r?  If N is even moderately large, computing a^(N-1) and then dividing by N will be a bad idea, since the number a^(N-1) will have many, many digits. The trick is to divide by N and keep only the remainder at all intermediate steps. It may not be obvious that this works, but it does. If N = 67, N-1 = 66, you might compute a^66 by doing 65 multiplications. After each one, divide by 67 and keep only the remainder. 

Better than doing a^66 by 65 multiplications (and 65 divisions by N), you can shortcut the computation by the following trick:
a^66 = (a^33)^2, a^33 = a*a^32, a^32 = (a^16)^2, a^16 = (a^8)^2, a^8 = (a^4)^2, a^4 = (a^2)^2, a^2 = a*a. Working from the last equation backwards, you will need only 7 multiplications (and 7 divisions by N).

This second way is equivalent to what I demonstrated last time for \(N=55,409,243\).
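
In Python, that repeated-squaring trick might look like the sketch below. The function name modpow is my own, and the binary (square-and-multiply) version shown is one common way to organize the same multiplications; Python’s built-in three-argument pow(a, e, N) does the same job for you.

```python
def modpow(a, e, n):
    """Compute a**e mod n, reducing mod n after every multiplication."""
    result = 1
    base = a % n
    while e > 0:
        if e & 1:                         # low bit of the exponent is 1:
            result = (result * base) % n  #   fold this power of a into the result
        base = (base * base) % n          # square: a, a^2, a^4, a^8, ...
        e >>= 1                           # move to the next bit of the exponent
    return result

print(modpow(2, 66, 67))   # 1
print(pow(2, 66, 67))      # 1, using the built-in form
```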

Let’s carry out this version of the method in full for \(N=67\) (though of course the Fermat test isn’t needed for such a small number):

Taking \(a=2\), we need to calculate \(a^{N-1}=2^{66}\text{ mod } 67\). This requires calculating, successively, \(2^1\), \(2^2\), \(2^4\), \(2^8\), \(2^{16}\), \(2^{33}=2^{32}\cdot2\), \(2^{66}\), all mod 67 (that is, finding the remainder at each step). These are: $$2^1=2\\2^2=4\\2^4=(2^2)^2=4^2=16\\2^8=(2^4)^2=16^2=256\equiv55\\2^{16}=(2^8)^2\equiv55^2=3025\equiv10\\2^{33}=(2^{16})^2\cdot2\equiv10^2\cdot2=200\equiv66\\2^{66}=(2^{33})^2\equiv66^2=4356\equiv1$$

This tells us that 67 could be prime. If we want more assurance, we could repeat, using 3 as the base: $$3^1=3\\3^2=9\\3^4=(3^2)^2=9^2=81\equiv14\\3^8=(3^4)^2\equiv14^2=196\equiv62\\3^{16}=(3^8)^2\equiv62^2=3844\equiv25\\3^{33}=(3^{16})^2\cdot3\equiv25^2\cdot3=1875\equiv66\\3^{66}=(3^{33})^2\equiv66^2=4356\equiv1$$
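
Those intermediate residues are easy to check with Python’s built-in modular pow (a small verification sketch):

```python
for a in (2, 3):
    for e in (1, 2, 4, 8, 16, 33, 66):
        print(f"{a}^{e} mod 67 = {pow(a, e, 67)}")   # matches the values above
```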

There are several variations of this calculation of modular exponentiation by squaring; a particularly efficient method given in Wikipedia is what I did last time.

If r = 1, then what can we say?  All prime numbers will give r = 1, but there are a few composite numbers which will do so, too. For example, if a = 2, N = 341 = 11*31 is composite, but r = 1. Such N's are called Fermat pseudoprimes with respect to base a. 341 is a Fermat pseudoprime with respect to base 2. 
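
Two quick checks show this pseudoprime in action: base 2 is fooled, but base 3 is not.

```python
print(pow(2, 340, 341))   # 1, even though 341 = 11 * 31 is composite
print(pow(3, 340, 341))   # 56, so base 3 proves 341 composite
```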

It turns out that Fermat pseudoprimes with respect to any fixed base are uncommon.  The chance of picking one at random from some large set of N's is very small. Thus, in some sense, which is again a complicated matter, if you get r = 1, N is "probably" prime.

In our example, we already know by trial division that 67 is certainly a prime; for a larger number, certainty would be much harder to attain, so knowing that a given number is probably a prime would be worthwhile, even though it might (rarely) be a pseudoprime.

A tactic you could use if you get r = 1 is to pick another base a and repeat the calculation.  If you get r = 1 again, try still another a. Continue this until you either find a proof that N is composite, or you decide that you couldn't possibly be unlucky enough to have an N which is a Fermat pseudoprime to all the bases you used.  Then declare the number to be prime, and you will be wrong only rarely.
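
Here is a sketch of that tactic in Python; the function name, the use of random bases, and the default number of trials are my own choices. It returns False when N is proved composite, and True when N merely survived every trial, so it is only “probably” prime.

```python
from math import gcd
from random import randrange

def fermat_probable_prime(N, trials=10):
    """False = proved composite; True = passed every Fermat trial (probably prime)."""
    if N < 4:
        return N in (2, 3)
    for _ in range(trials):
        a = randrange(2, N - 1)          # pick a base with 1 < a < N-1
        if gcd(a, N) != 1:               # shared factor: N is certainly composite
            return False
        if pow(a, N - 1, N) != 1:        # Fermat's Little Theorem fails: composite
            return False
    return True

print(fermat_probable_prime(67))    # True
print(fermat_probable_prime(221))   # False (almost certainly; 221 = 13 * 17)
```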

A flaw with the above tactic is that there exist numbers called Carmichael numbers, for which, for all bases a such that N passes the GCD test, r will equal 1. The smallest one is 561 = 3*11*17. Every base a not divisible by 3, 11, or 17 will give r = 1. These are even rarer than Fermat pseudoprimes with respect to a given base a, but there are known to be infinitely many. If you happened to choose one of these, you might try very many bases, getting r = 1 over and over, then declare the number prime, and be wrong.
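
A brute-force check in Python confirms this for 561 (a sketch that simply tries every base coprime to it):

```python
from math import gcd

N = 561   # = 3 * 11 * 17, the smallest Carmichael number
print(all(pow(a, N - 1, N) == 1
          for a in range(2, N - 1) if gcd(a, N) == 1))   # True
```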

So absolute certainty by this method alone is impossible; but we can come close.

There are variations of the above method which get rid of the last flaw, give you proofs of compositeness, but only probabilistic statements about primality.  If you need to know more about them, write again.

If you want a true proof of primality, there are more complicated methods which will do this, but I am quite sure the above is what you remember reading about. The complicated methods can be proven to run on a computer in a relatively short time, and produce a certificate of primality which can be checked even faster.  They could not, however, be termed "simple" in most senses of the word!

The strong compositeness test (Miller-Rabin)

After some additional discussion about Dave’s programming project, Doctor Rob added a way to be certain for numbers with up to ten digits:

Here is another idea. Since your range is only up to 10^10 (as opposed to 10^1000[!!]), here is an alternative. Use the Strong Compositeness Test (below) with bases 2, 3, 5, and 7. If it fails all of these, and is not equal to 3215031751, then it is prime.

The Strong Compositeness Test with Base a: To test a number N for
compositeness:

1. If GCD(a,N) > 1, stop and declare that N is composite.
2. Write N - 1 = 2^s*d, where d is odd.
3. Compute R, the remainder of a^d when divided by N.  (Do this as
   described in the last message.)
4. If R = 1, stop and declare failure.
5. For i = 0, 1, ..., s-1, do the following:
   a. If R = N-1, stop and declare failure.
   b. Replace R with the remainder of R^2 when divided by N.
   c. If R = 1, stop and declare that N is composite.
6. Stop and declare N composite.

Note that he is calling this a test for compositeness (which makes sense, as it doesn’t prove primality), so that “failure” means you haven’t proved it’s composite – it’s a likely prime. Other sources call that “success”!
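
Here is a sketch of the procedure in Python, keeping Doctor Rob’s terminology: the function name is my own, and it returns True when N is proved composite and False on “failure” (that is, N is a probable prime to base a).

```python
from math import gcd

def strong_composite_test(N, a):
    """True = N proved composite; False = 'failure' (probable prime to base a)."""
    if gcd(a, N) > 1:                     # step 1
        return True
    s, d = 0, N - 1
    while d % 2 == 0:                     # step 2: N - 1 = 2^s * d, with d odd
        s += 1
        d //= 2
    R = pow(a, d, N)                      # step 3
    if R == 1:                            # step 4
        return False
    for _ in range(s):                    # step 5
        if R == N - 1:                    #   5a
            return False
        R = (R * R) % N                   #   5b
        if R == 1:                        #   5c
            return True
    return True                           # step 6

print(strong_composite_test(67, 2))    # False: 67 is a probable prime
print(strong_composite_test(221, 2))   # True:  221 is composite
```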

A number which fails the Strong Compositeness Test with Base a, yet is composite, is called a "strong pseudoprime with respect to base a."  Every strong pseudoprime is a Fermat pseudoprime to the same base, but the reverse is false.  The smallest strong pseudoprime with respect to base 2 is 2047.

   Theorem: The only number less than 2.5*10^10 which is a strong 
            pseudoprime with respect to bases 2, 3, 5, and 7, is 
            3,215,031,751.

So this test can be considered as proof of primality for any number less than 25 billion.
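
Combined with strong_composite_test above, that theorem gives a short deterministic check for this range (again just a sketch, with the one exceptional number handled explicitly):

```python
def is_prime_small(N):
    """Deterministic for 1 < N < 2.5 * 10**10, by the theorem above."""
    if N in (2, 3, 5, 7):
        return True
    if N == 3215031751:        # the lone strong pseudoprime to bases 2, 3, 5, 7
        return False
    return not any(strong_composite_test(N, a) for a in (2, 3, 5, 7))

print(is_prime_small(67), is_prime_small(221))   # True False
```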

Let’s try it for our prime \(N=67\), with \(a=2\):

  1. First, since 67 is not even, we can use \(a=2\).
  2. Now we write \(N-1=66\) as \(2^1\cdot33\), so \(s=1\) and \(d=33\).
  3. Now we compute the remainder of \(2^{33}\) mod 67; from our work above, it is \(R=66\).
  4. This is not 1, so we do step 5, for i from 0 to \(s-1=1-1=0\) (that is, we do it only once).
  5. We find that \(R=66=67-1=N-1\), so we have “failed”: 67 is not proved to be composite. It is (potentially) prime.

Repeating with a different base a would confirm this.

To see what happens when the number is composite, let’s try again, with \(N=221\).

  1. Again, we safely choose \(a=2\).
  2. We write \(N-1=220\) as \(2^2\cdot55\), so \(s=2\) and \(d=55\).
  3. Now we compute the remainder of \(2^{55}\) mod 221; I get \(R=128\).
  4. This is not 1, so we do step 5, for i from 0 to \(s-1=2-1=1\) (that is, twice).
  5. First time: \(R\ne N-1=220\), so we continue. We find that $$R^2=128^2=16,384\equiv30\text{ (mod }221).$$ This is not 1, so we repeat with \(R=30\).
    Second time: Since \(30\ne220\), we continue, finding that $$R^2=30^2=900\equiv16\text{ (mod }221).$$ This is not 1, so we drop off the end of the loop.
  6. We report that 221 is composite. It is, in fact, \(13\times17\).
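
As before, the intermediate values can be checked directly:

```python
print(pow(2, 55, 221), pow(128, 2, 221), pow(30, 2, 221))   # 128 30 16
```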

We’ll close with two actual tests for primality (not just compositeness).

The Lucas primality test

Here is a question from Brian in 1994:

Testing primality of 32-bit numbers

What is the best (fastest) way to test if an arbitrary 32-bit number is prime?

Dividing by all numbers up to the square root is not great since you need to build a table of primes.  I am using a test that checks if the number is a strong pseudoprime to bases 2, 3, and 5 which is pretty good.  But is there a better way?

A 32-bit number can be as large as 4,294,967,295, so it’s within the range of that last method. Brian evidently knows about Miller-Rabin. Doctor Ken answered:

Yes, there is a faster way.  I'll assume that you're familiar with modulo arithmetic, since you wrote about strong pseudoprimes.  If this isn't the case, let us know.

The better way is called the Lucas Test for Primality.  It does rely on factoring, but not of the number in question.  If you're trying to determine whether the number M is prime, you have to factor M-1.  This isn't usually very hard, since it's certainly even if you're testing M for primality, and in general you'll get lots of small factors for a big number.  Anyway, here's how you apply the test:

We’ll demonstrate this test, again, with the prime \(M=67\) and the composite \(M=221\). In each case, we first factor \(M-1\):

$$67-1=66=2\cdot3\cdot11$$ $$221-1=220=2^2\cdot5\cdot11$$

If you can’t do that for the number of interest (see an upcoming post on factoring large numbers), then you can’t use this test.

You find all the prime factors p1, p2, p3, ..., pk that divide M-1.  You don't really need to know how many powers of each prime go into M-1, just which primes.  Then you pick a number (call it A) that's less than M.  Raise A to the power M-1, and find what it's congruent to Mod M.  If it's not congruent to 1, then M isn't even a pseudoprime (which is different than a strong pseudoprime) to the base A, so it's composite.  If A^(M-1) is congruent to 1 Mod M, then continue.

Raise A to the power (M-1)/p1, and find what it's congruent to Mod M.

***Note: do you have an efficient way to raise numbers to big powers when you're doing modulo arithmetic?  The number of steps needed is about the log of the power (roughly proportional to the number of digits in the exponent).  It's done most efficiently by the method of successive squares.***

If it's congruent to 1, pick a new A and try again. If it's not, then find A^((M-1)/p2) (Mod M). Keep going like this with all the different primes that divide M-1. What you want is for there to be a number A for which none of these power-raisings is congruent to 1 Mod M. Stated precisely, here's the theorem:

  If there is an A such that A^(M-1) ≡ 1 (Mod M),
  and, for every prime p dividing M-1,
    A^((M-1)/p) ≢ 1 (Mod M),
  then M is prime.

The proof is beyond what I want to cover here.
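
Here is a sketch of the whole Lucas procedure in Python. The helper names are mine, M - 1 is factored by simple trial division (adequate for small numbers; a serious implementation would need a real factoring method), and the candidate bases A are chosen at random.

```python
from random import randrange

def prime_factors(n):
    """Distinct prime factors of n, by trial division (fine for small n)."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def lucas_test(M, tries=50):
    """Return 'prime' (proved), 'composite' (proved), or 'unknown' (no witness found)."""
    ps = prime_factors(M - 1)
    for _ in range(tries):
        A = randrange(2, M)
        if pow(A, M - 1, M) != 1:                  # Fermat check fails: composite
            return 'composite'
        if all(pow(A, (M - 1) // p, M) != 1 for p in ps):
            return 'prime'                         # A has order M-1, so M is prime
    return 'unknown'                               # unlucky; try more bases

print(lucas_test(67))    # prime
print(lucas_test(221))   # composite (almost always detected on the first base)
```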

I’ll start with \(A=2\) for both of our examples, so we can use previous work.

For \(M=67\), we have three prime factors, 2, 3, and 11.

First, we find that \(A^{M-1}=2^{66}\equiv1\text{ (mod 67)}\), so we proceed.

Taking our first prime factor, \(p_1=2\), we find that $$A^{\frac{M-1}{p_1}}=2^{\frac{66}{2}}=2^{33}=2^{32}\cdot2\equiv66.$$

We repeat for the second prime factor, \(p_2=3\), and find that $$A^{\frac{M-1}{p_2}}=2^{\frac{66}{3}}=2^{22}=2^{16+4+2}\equiv10\cdot16\cdot4=640\equiv37.$$

We repeat for the third prime factor, \(p_3=11\), and find that $$A^{\frac{M-1}{p_3}}=2^{\frac{66}{11}}=2^{6}=64.$$

None of these was 1, so we have proved that 67 is prime.

For \(M=221\), we have three prime factors, 2, 5, and 11.

First, we find that \(A^{M-1}=2^{220}\equiv16\not\equiv1\text{ (mod 221)}\), so we immediately conclude that it is composite.

Note that if we got 1, we would try another A, and another; in principle we might have to try every number, if it weren’t for the built-in compositeness test!

Note that this gives an airtight proof of the primality of a number.  It's a little slower than finding out whether M is a strong pseudoprime to every base, which is what you'd really have to do to show that your number truly is a prime, and not just some imposter.

That’s what makes this “better” than Miller-Rabin. But, as we saw before, the latter is really sufficient:

But you know, it's really a pretty good test to check the first few bases for pseudoprimes, and here's why.

If you find that M is a strong pseudoprime to the base 2 and M is smaller than 2047, then M is prime.  If you test it again with the base 3, and it's again a strong pseudoprime, then M is a prime provided it's less than 1,373,653.  If you test it again to the base 5, and it's again a strong pseudoprime, then M is prime provided it's less than 25,326,001.  That's the smallest number that's a strong pseudoprime to the bases 2, 3, and 5.  So odds are pretty good that if it tests out to all three bases, it's a prime.  Of course, it's a gamble.....

So, experience with the compositeness tests allows us to have great confidence in them, even though they don’t give certainty for extremely large numbers.

The AKS primality test

We’ll close with a 2005 question:

Fastest Primality Test?

Hello!  I would like to ask about "the fastest prime test".  I found an interesting article on the Internet which said, "We present a deterministic polynomial-time algorithm that determines whether an input number n is prime or composite."  It was from the Department of Computer Science and Engineering at the Indian Institute of Technology in Kanpur, India.  I have some questions about it:

1. Is it really faster than the Rabin Miller or Lucas tests?

2. If it is true, why can't we use this method to search for prime numbers with a distributed algorithm (e.g., www.mersenne.org, Mersenne primes)?

3. The algorithm's speed is O(log^12 N).  What do you think about this algorithm?  Could mathematicians develop an algorithm of about O(log N)?

Doctor Vogler answered:

Hi Norbert,

Thanks for writing to Dr. Math.  Those are very good questions.  

First of all, the Rabin-Miller test is a pseudo-primality test, which means that if you give it a composite number, then it will _probably_ tell you that it is composite.  But if you give it a prime number, then it will tell you that it is _probably prime_.  A composite number that passes a pseudo-primality test is called a pseudoprime.  It turns out that no Rabin-Miller pseudoprimes are known, but no one has yet proven that there are none.

(I’m not sure about this, as sources mention strong pseudo-primes, which seem to be what he is referring to.)

There are other primality tests that are more efficient than the algorithm you just mentioned, in the sense that they finish more quickly on numbers that we've tried, but they are _probabilistic_ algorithms, which means that some part of the algorithm is based on choosing a random number and then doing some computations with it.  With a certain probability, it will tell you that your number is either prime or composite.  In practice, after a few tries, you get a result.  

But many mathematicians wanted to find a _deterministic_ algorithm (one that is guaranteed to finish in no more than a known number of steps) which will (provably) finish in a number of steps that is a polynomial in log N.  That is what this new algorithm does, and the creators proved that it would always finish in O((log N)^12) time.  Later, someone else proved that it would, in fact, finish more quickly than that, in almost O((log N)^6) time.  That still doesn't make it faster than other tests, though, just faster than other general-purpose deterministic primality tests.

This “big-O” notation describes, not the actual speed of an algorithm, but how that speed depends on the input – that is, how fast the time it takes grows when you put in bigger and bigger numbers. The “new” algorithm is not faster on average than other methods, but promises not to take as long as they might, and to guarantee a maximum.

The Lucas-Lehmer primality test is a deterministic primality test, but it is not general-purpose because it only works on Mersenne numbers.  It is much faster than other algorithms (such as the one you mentioned), but it only works for those special numbers.  Consider that mersenne.org reports that the 42nd Mersenne prime found was

  N = 2^25,964,951 - 1,

which took 50 days on a 2.4 GHz computer, which is

  50 days * 24 hours/day * 60 minutes/hour * 60 seconds/minute
      * 2,400,000,000 instructions/second

   = 10,368,000,000,000,000 instructions

(or, more correctly, cycles, but the difference is not important).  In this case,

  log N = 25,964,951 * log 2
        = 17,997,532.58...

and

  (log N)^12 = 1,154,929,888,777,235,354,042,315,422,387,627,895,434,
     164,939,766,318,268,467,306,355,354,201,088,395,258,493,601,391

which is a whole heck of a lot more than the number of instructions used in the Lucas-Lehmer test.  In fact,

  (log N)^6 = 
        33,984,259,426,640,965,925,133,186,173,883,788,875,047,677

which is still a heck of a lot more.  How many days would this many instructions take on that 2.4 GHz computer?  How many trillions of years?
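
For the curious, here is the back-of-the-envelope arithmetic behind that rhetorical question (a sketch that just divides the instruction count by the clock rate):

```python
instructions = 33_984_259_426_640_965_925_133_186_173_883_788_875_047_677
per_second = 2_400_000_000                      # 2.4 GHz
years = instructions / per_second / (3600 * 24 * 365.25)
print(f"{years:.2e} years")                     # about 4.5e+26 years
```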

So Lucas-Lehmer, when applicable, is considerably faster than AKS.

For more on these and other tests, see MathWorld.
