# 9.3 Primality Testing

Suppose we have an integer of 300 digits that we want to test for primality. We know by Exercise 7 in Chapter 3 that one way is to divide by all the primes up to its square root. What happens if we try this? There are around $3\times {10}^{147}$ primes less than ${10}^{150}$. This is significantly more than the number of particles in the universe. Moreover, if the computer can handle ${10}^{10}$ primes per second, the calculation would take around $3\times {10}^{130}$ years. (It’s been suggested that you could go sit on the beach for 20 years, then buy a computer that is 1000 times faster, which would cut the runtime down to $3\times {10}^{127}$ years – a very large savings!) Clearly, better methods are needed. Some of these are discussed in this section.

A very basic idea, one that is behind many factorization methods, is the following.

# Basic Principle

Let $n$ be an integer and suppose there exist integers $x$ and $y$ with ${x}^{2}\equiv {y}^{2}\text{\hspace{0.17em}}(\text{mod}\text{}n)$, but $x\text{}\overline{)\equiv}\pm y\text{}\text{}(\text{mod}n)$. Then $n$ is composite. Moreover, $gcd(x-y\text{,}\text{}\phantom{\rule{thinmathspace}{0ex}}n)$ gives a nontrivial factor of $n$.

Proof. Let $d=gcd(x-y\text{,}\text{}\phantom{\rule{thinmathspace}{0ex}}n)$. If $d=n$ then $x\equiv y\text{\hspace{0.17em}}(\text{mod}\text{}n)$, which is assumed not to happen. Suppose $d=1$. The Proposition in Subsection 3.3.1 says that if $ab\equiv ac\text{\hspace{0.17em}}(\text{mod}\text{}n)$ and if $gcd(a\text{,}\text{}n)=1$, then $b\equiv c\text{\hspace{0.17em}}(\text{mod}\text{}n)$. In our case, let $a=x-y$, let $b=x+y$, and let $c=0$. Then $ab={x}^{2}-{y}^{2}\equiv 0\equiv ac$. If $d=gcd(x-y\text{,}\text{}n)=1$, then $x+y=b\equiv c=0$. This says that $x\equiv -y\text{\hspace{0.17em}}(\text{mod}\text{}n)$, which contradicts the assumption that $x\text{}\overline{)\equiv}-y\text{}\text{}(\text{mod}n)$. Therefore, $d\ne 1\text{,}\text{}n$, so $d$ is a nontrivial factor of $n$.

# Example

Since ${12}^{2}\equiv {2}^{2}\text{\hspace{0.17em}}(\text{mod}\text{}35)$, but $12\text{}\overline{)\equiv}\pm 2\text{}\text{}(\text{mod35})\text{,}\text{}$, we know that 35 is composite. Moreover, $gcd(12-2\text{,}\text{}35)=5$ is a nontrivial factor of 35.

It might be surprising, but factorization and primality testing are not the same. It is much easier to prove a number is composite than it is to factor it. There are many large integers that are known to be composite but that have not been factored. How can this be done? We give a simple example. We know by Fermat’s theorem that if $p$ is prime, then ${2}^{p-1}\equiv 1\text{\hspace{0.17em}}(\text{mod}\text{}p)$. Let’s use this to show 35 is not prime. By successive squaring, we find (congruences are mod 35)

Therefore,

Fermat’s theorem says that 35 cannot be prime, so we have proved 35 to be composite without finding a factor.

The same reasoning gives us the following.

# Fermat Primality Test

Let $n\text{}\text{}1$ be an integer. Choose a random integer $a$ with $1<a<n-1$. If ${a}^{n-1}\text{}\overline{)\equiv}\text{}1\text{\hspace{0.17em}}(\text{mod}\text{}n)$, then $n$ is composite. If ${a}^{n-1}\equiv 1\text{\hspace{0.17em}}(\text{mod}\text{}n)$, then $n$ is probably prime.

Although this and similar tests are usually called “primality tests,” they are actually “compositeness tests,” since they give a completely certain answer only in the case when $n$ is composite. The Fermat test is quite accurate for large $n$. If it declares a number to be composite, then this is guaranteed to be true. If it declares a number to be probably prime, then empirical results show that this is very likely true. Moreover, since modular exponentiation is fast, the Fermat test can be carried out quickly.

Recall that modular exponentiation is accomplished by successive squaring. If we are careful about how we do this successive squaring, the Fermat test can be combined with the Basic Principle to yield the following stronger result.

# Miller-Rabin Primality Test

Let $n\text{}\text{}1$ be an odd integer. Write $n-1={2}^{k}m$ with $m$ odd. Choose a random integer $a$ with $1<a<n-1$. Compute ${b}_{0}\equiv {a}^{m}\text{\hspace{0.17em}}(\text{mod}\text{}n)$. If ${b}_{0}\equiv \pm 1\text{\hspace{0.17em}}(\text{mod}\text{}n)$, then stop and declare that $n$ is probably prime. Otherwise, let ${b}_{1}\equiv {b}_{0}^{2}\text{\hspace{0.17em}}(\text{mod}\text{}n)$. If ${b}_{1}\equiv 1\text{\hspace{0.17em}}(\text{mod}\text{}n)$, then $n$ is composite (and $gcd({b}_{0}-1\text{,}\text{}\phantom{\rule{thinmathspace}{0ex}}n)$ gives a nontrivial factor of $n$). If ${b}_{1}\equiv -1\text{\hspace{0.17em}}(\text{mod}\text{}n)$, then stop and declare that $n$ is probably prime. Otherwise, let ${b}_{2}\equiv {b}_{1}^{2}\text{\hspace{0.17em}}(\text{mod}\text{}n)$. If ${b}_{2}\equiv 1\text{\hspace{0.17em}}(\text{mod}\text{}n)$, then $n$ is composite. If ${b}_{2}\equiv -1\text{\hspace{0.17em}}(\text{mod}\text{}n)$, then stop and declare that $n$ is probably prime. Continue in this way until stopping or reaching ${b}_{k-1}$. If ${b}_{k-1}\text{}\overline{)\equiv}\text{}-1\text{\hspace{0.17em}}(\text{mod}\text{}n)$, then $n$ is composite.

# Example

Let $n=561$. Then $n-1=560=16\cdot 35$, so ${2}^{k}={2}^{4}$ and $m=35$. Let $a=2$. Then

Since ${b}_{3}\equiv 1\text{\hspace{0.17em}}(\text{mod}\text{}561)$, we conclude that 561 is composite. Moreover, $gcd({b}_{2}-1\text{,}\text{}\phantom{\rule{thinmathspace}{0ex}}561)=33$, which is a nontrivial factor of 561.

If $n$ is composite and ${a}^{n-1}\equiv 1\text{\hspace{0.17em}}(\text{mod}\text{}n)$, then we say that $n$ is a pseudoprime for the base $a$. If $a$ and $n$ are such that $n$ passes the Miller-Rabin test, we say that $n$ is a strong pseudoprime for the base $a$. We showed in Section 3.6 that ${2}^{560}\equiv 1\text{\hspace{0.17em}}(\text{mod}\text{}561)$, so 561 is a pseudoprime for the base 2. However, the preceding calculation shows that 561 is not a strong pseudoprime for the base 2. For a given base, strong pseudoprimes are much more rare than pseudoprimes.

Up to ${10}^{10}$, there are 455052511 primes. There are 14884 pseudoprimes for the base 2, and 3291 strong pseudoprimes for the base 2. Therefore, calculating ${2}^{n-1}\text{\hspace{0.17em}}(\text{mod}\text{}n)$ will fail to recognize a composite in this range with probability less than 1 out of 30 thousand, and using the Miller-Rabin test with $a=2$ will fail with probability less than 1 out of 100 thousand.

It can be shown that the probability that the Miller-Rabin test fails to recognize a composite for a randomly chosen $a$ is at most $1/4$. In fact, it fails much less frequently than this. See [Damgå rd et al.]. If we repeat the test 10 times, say, with randomly chosen values of $a$, then we expect that the probability of certifying a composite number as prime is at most $(1/4{)}^{10}\simeq {10}^{-6}$. In practice, using the test for a single $a$ is fairly accurate.

Though strong pseudoprimes are rare, it has been proved (see [Alford et al.]) that, for any finite set $B$ of bases, there are infinitely many integers that are strong pseudoprimes for all $b\in B$. The first strong pseudoprime for all the bases $b=2\text{,}\text{}3\text{,}\text{}5\text{,}\text{}7$ is 3215031751. There is a 337-digit number that is a strong pseudoprime for all bases that are primes $<200$.

Suppose we need to find a prime of around 300 digits. The prime number theorem asserts that the density of primes around $x$ is approximately $1/lnx$. When $x={10}^{300}$, this gives a density of around $1/ln({10}^{300})=1/690$. Since we can skip the even numbers, this can be raised to $1/345$. Pick a random starting point, and throw out the even numbers (and multiples of other small primes). Test each remaining number in succession by the Miller-Rabin test. This will tend to eliminate all the composites. On average, it will take less than 400 uses of the Miller-Rabin test to find a likely candidate for a prime, so this can be done fairly quickly. If we need to be completely certain that the number in question is prime, there are more sophisticated primality tests that can test a number of 300 digits in a few seconds.

Why does the test work? Suppose, for example, that ${b}_{3}\equiv 1\text{\hspace{0.17em}}(\text{mod}\text{}n)$. This means that ${b}_{2}^{2}\equiv {1}^{2}\text{\hspace{0.17em}}(\text{mod}\text{}n)$. Apply the Basic Principle from before. Either ${b}_{2}\equiv \pm 1\text{\hspace{0.17em}}(\text{mod}\text{}n)$, or ${b}_{2}\text{}\overline{)\equiv}\text{}\pm 1\text{\hspace{0.17em}}(\text{mod}\text{}n)$ and $n$ is composite. In the latter case, $gcd({b}_{2}-1\text{,}\text{}\phantom{\rule{thinmathspace}{0ex}}n)$ gives a nontrivial factor of $n$. In the former case, the algorithm would have stopped by the previous step. If we reach ${b}_{k-1}$, we have computed ${b}_{k-1}\equiv {a}^{(n-1)/2}\text{\hspace{0.17em}}(\text{mod}\text{}n)$. The square of this is ${a}^{n-1}$, which must be $1\text{\hspace{0.17em}}(\text{mod}\text{}n)$ if $n$ is prime, by Fermat’s theorem. Therefore, if $n$ is prime, ${b}_{k-1}\equiv \pm 1\text{\hspace{0.17em}}(\text{mod}\text{}n)$. All other choices mean that $n$ is composite. Moreover, if ${b}_{k-1}\equiv 1$, then, if we didn’t stop at an earlier step, ${b}_{k-2}^{2}\equiv {1}^{2}\text{\hspace{0.17em}}(\text{mod}\text{}n)$ with ${b}_{k-2}\text{}\overline{)\equiv}\text{}\pm 1\text{\hspace{0.17em}}(\text{mod}\text{}n)$. This means that $n$ is composite (and we can factor $n$).

In practice, if $n$ is composite, usually we reach ${b}_{k-1}$ and it is not $\pm 1\text{\hspace{0.17em}}(\text{mod}\text{}n)$. In fact, usually ${a}^{n-1}\text{}\overline{)\equiv}\text{}1\text{\hspace{0.17em}}(\text{mod}\text{}n)$. This means that Fermat’s test says $n$ is not prime.

For example, let $n=299$ and $a=2$. Since ${2}^{298}\equiv 140\text{\hspace{0.17em}}(\text{mod}\text{}299)$, Fermat’s theorem and also the Miller-Rabin test say that 299 is not prime (without factoring it). The reason this happens is the following. Note that $299=13\times 23$. An easy calculation shows that ${2}^{12}\equiv 1\text{\hspace{0.17em}}(\text{mod}\text{}13)$ and no smaller exponent works. In fact, ${2}^{j}\equiv 1\text{\hspace{0.17em}}(\text{mod}\text{}13)$ if and only if $j$ is a multiple of 12. Since 298 is not a multiple of 12, we have ${2}^{298}\text{}\overline{)\equiv}\text{}1\text{\hspace{0.17em}}(\text{mod}\text{}13)$, and therefore also ${2}^{298}\text{}\overline{)\equiv}\text{}1\text{\hspace{0.17em}}(\text{mod}\text{}299)$. Similarly, ${2}^{j}\equiv 1\text{\hspace{0.17em}}(\text{mod}\text{}23)$ if and only if $j$ is a multiple of 11, from which we can again deduce that ${2}^{298}\text{}\overline{)\equiv}\text{}1\text{\hspace{0.17em}}(\text{mod}\text{}299)$. If Fermat’s theorem (and the Miller-Rabin test) were to give us the wrong answer in this case, we would have needed $13\cdot 23-1$ to be a multiple of $12\cdot 11$.

Consider the general case $n=pq$, a product of two primes. For simplicity, consider the case where $p\text{}\text{}q$ and suppose ${a}^{k}\equiv 1\text{\hspace{0.17em}}(\text{mod}\text{}p)$ if and only if $k\equiv 0\text{\hspace{0.17em}}(\text{mod}\text{}p-1)$. This means that $a$ is a primitive root mod $p$; there are $\varphi (p-1)$ such $a$ mod $p$. Since $0<q-1<p-1$, we have

Therefore, ${a}^{n-1}\text{}\overline{)\equiv}\text{}1\text{\hspace{0.17em}}(\text{mod}\text{}p)$ by our choice of $a$, which implies that ${a}^{n-1}\text{}\overline{)\equiv}\text{}1\text{\hspace{0.17em}}(\text{mod}\text{}n)$. Similar reasoning shows that usually ${a}^{n-1}\text{}\overline{)\equiv}\text{}1\text{\hspace{0.17em}}(\text{mod}\text{}n)$ for many other choices of $a$, too.

But suppose we are in a case where ${a}^{n-1}\equiv 1\text{\hspace{0.17em}}(\text{mod}\text{}n)$. What happens? Let’s look at the example of $n=561$. Since $561=3\times 11\times 17$, we consider what is happening to the sequence ${b}_{0}\text{,}\text{}{b}_{1}\text{,}\text{}{b}_{2}\text{,}\text{}{b}_{3}$ mod 3, mod 11, and mod 17:

Since ${b}_{3}\equiv 1\text{\hspace{0.17em}}(\text{mod}\text{}561)$, we have ${b}_{2}^{2}\equiv {b}_{3}\equiv 1$ mod all three primes. But there is no reason that ${b}_{3}$ is the first time we get ${b}_{i}\equiv 1$ mod a particular prime. We already have ${b}_{1}\equiv 1$ mod 3 and mod 11, but we have to wait for ${b}_{3}$ when working mod 17. Therefore, ${b}_{2}^{2}\equiv {b}_{3}\equiv 1$ mod 3, mod 11, and mod 17, but ${b}_{2}$ is congruent to 1 only mod 3 and mod 11. Therefore, ${b}_{2}-1$ contains the factors 3 and 11, but not 17. This is why $gcd({b}_{2}-1\text{,}\text{}\phantom{\rule{thinmathspace}{0ex}}561)$ finds the factor 33 of 561. The reason we could factor 561 by this method is that the sequence ${b}_{0}\text{,}\text{}{b}_{1}\text{,}\text{}\dots $ reached 1 mod the primes not all at the same time.

More generally, consider the case $n=pq$ (a product of several primes is similar) and suppose ${a}^{n-1}\equiv 1\text{\hspace{0.17em}}(\text{mod}\text{}n)$. As pointed out previously, it is very unlikely that this is the case; but if it does happen, look at what is happening mod $p$ and mod $q$. It is likely that the sequences ${b}_{i}\text{\hspace{0.17em}}(\text{mod}\text{}p)$ and ${b}_{i}\text{\hspace{0.17em}}(\text{mod}\text{}q)$ reach $-1$ and then 1 at different times, just as in the example of 561. In this case, we will be have ${b}_{i}\equiv -1\text{\hspace{0.17em}}(\text{mod}\text{}p)$ but ${b}_{i}\equiv 1\text{\hspace{0.17em}}(\text{mod}\text{}q)$ for some $i$; therefore, ${b}_{i}^{2}\equiv 1\text{\hspace{0.17em}}(\text{mod}\text{}n)$ but ${b}_{i}\text{}\overline{)\equiv}\text{}\pm 1\text{\hspace{0.17em}}(\text{mod}\text{}n)$. Therefore, we’ll be able to factor $n$.

The only way that $n$ can pass the Miller-Rabin test is to have ${a}^{n-1}\equiv 1\text{\hspace{0.17em}}(\text{mod}\text{}n)$ and also to have the sequences ${b}_{i}\text{\hspace{0.17em}}(\text{mod}\text{}p)$ and ${b}_{i}\text{\hspace{0.17em}}(\text{mod}\text{}q)$ reach 1 at the same time. This rarely happens.

Another primality test of a nature similar to the Miller-Rabin test is the following, which uses the Jacobi symbol (see Section 3.10).

# Solovay-Strassen Primality Test

Let $n$ be an odd integer. Choose several random integers $a$ with $1<a<n-1$. If

for some $a$, then $n$ is composite. If

for all of these random $a$, then $n$ is probably prime.

Note that if $n$ is prime, then the test will declare $n$ to be a probable prime. This is because of Part 2 of the second Proposition in Section 3.10.

The Jacobi symbol can be evaluated quickly, as in Section 3.10. The modular exponentiation can also be performed quickly.

For example,

so 15 is not prime. As in the Miller-Rabin tests, we usually do not get $\pm 1$ for ${a}^{(n-1)/2}\text{\hspace{0.17em}}(\text{mod}\text{}n)$. Here is a case where it happens:

However, the Solovay-Strassen test says that 341 is composite.

Both the Miller-Rabin and the Solovay-Strassen tests work quickly in practice, but, when $p$ is prime, they do not give rigorous proofs that $p$ is prime. There are tests that actually prove the primality of $p$, but they are somewhat slower and are used only when it is essential that the number be proved to be prime. Most of these methods are probabilistic, in the sense that they work with very high probability in any given case, but success is not guaranteed. In 2002, Agrawal, Kayal, and Saxena [Agrawal et al.] gave what is known as a deterministic polynomial time algorithm for deciding whether or not a number is prime. This means that the computation time is always, rather than probably, bounded by a constant times a power of $logp$. This was a great theoretical advance, but their algorithm has not yet been improved to the point that it competes with the probabilistic algorithms.

For more on primality testing and its history, see [Williams].