9.1 The RSA Algorithm

Alice wants to send a message to Bob, but they have not had previous contact and they do not want to take the time to send a courier with a key. Therefore, all information that Alice sends to Bob will potentially be obtained by the evil observer Eve. However, it is still possible for a message to be sent in such a way that Bob can read it but Eve cannot.

With all the previously discussed methods, this would be impossible. Alice would have to send a key, which Eve would intercept. She could then decrypt all subsequent messages. The possibility of the present scheme, called a public key cryptosystem, was first publicly suggested by Diffie and Hellman in their classic 1976 paper [Diffie-Hellman]. However, they did not yet have a practical implementation (although they did present an alternative key exchange procedure that works over public channels; see Section 10.4). In the next few years, several methods were proposed. The most successful, based on the idea that factorization of integers into their prime factors is hard, was proposed by Rivest, Shamir, and Adleman in 1977 and is known as the RSA algorithm.

It had long been claimed that government cryptographic agencies had discovered the RSA algorithm several years earlier, but secrecy rules prevented them from releasing any evidence. Finally, in 1997, documents released by CESG, a British cryptographic agency, showed that in 1970, James Ellis had discovered public key cryptography, and in 1973, Clifford Cocks had written an internal document describing a version of the RSA algorithm in which the encryption exponent  (see the discussion that follows) was the same as the modulus .

Here is how the RSA algorithm works. Bob chooses two distinct large primes  and  and multiplies them together to form



He also chooses an encryption exponent  such that



He sends the pair  to Alice but keeps the values of  and  secret. In particular, Alice, who could possibly be an enemy of Bob, never needs to know  and  to send her message to Bob securely. Alice writes her message as a number . If  is larger than , she breaks the message into blocks, each of which is less than . However, for simplicity, let’s assume for the moment that . Alice computes



and sends  to Bob. Since Bob knows  and , he can compute  and therefore can find the decryption exponent  with



As we’ll see later,



so Bob can read the message.

We summarize the algorithm in the following table.

Example

Bob chooses



Then



Let the encryption exponent be



The values of  and  are sent to Alice.

Alice’s message is cat. We will depart from our earlier practice of numbering the letters starting with ; instead, we start the numbering at  and continue through . We do this because, in the previous method, if the letter  appeared at the beginning of a message, it would yield a message number  starting with , so the  would disappear.

The message is therefore



Alice computes



She sends  to Bob.

Since Bob knows  and , he knows . He uses the extended Euclidean algorithm (see Section 3.2) to compute  such that





Bob computes



so he obtains the original message.

For more examples, see Examples 24–30 in the Computer Appendices.

There are several aspects that need to be explained, but perhaps the most important is why . Recall Euler’s theorem (Section 3.6): If , then . In our case, . Suppose . This is very likely the case; since  and  are large,  probably has neither as a factor. Since , we can write  for some integer . Therefore,



We have shown that Bob can recover the message. If , Bob still recovers the message. See Exercise 37.

What does Eve do? She intercepts . She does not know . We assume that Eve has no way of factoring . The obvious way of computing  requires knowing . We show later that this is equivalent to knowing  and . Is there another way? We will show that if Eve can find , then she can probably factor . Therefore, it is unlikely that Eve finds the decryption exponent .

Since Eve knows , why doesn’t she simply take the th root of ? This works well if we are not working mod  but is very difficult in our case. For example, if you know that , you cannot calculate the cube root of 3, namely  on your calculator and then reduce mod 85. Of course, a case-by-case search would eventually yield , but this method is not feasible for large .

How does Bob choose  and ? They should be chosen at random, independently of each other. How large depends on the level of security needed, but it seems that they should have at least 300 digits. For reasons that we discuss later, it is perhaps best if they are of slightly different lengths. When we discuss primality testing, we’ll see that finding such primes can be done fairly quickly (see also Section 3.6). A few other tests should be done on  and  to make sure they are not bad. For example, if  has only small prime factors, then  is easy to factor by the  method (see Section 9.3), so  should be rejected and replaced with another prime.

Why does Bob require ? Recall (see Section 3.3) that  has a solution  if and only if . Therefore, this condition is needed in order for  to exist. The extended Euclidean algorithm can be used to compute  quickly. Since  is even,  cannot be used; one might be tempted to use . However, there are dangers in using small values of  (see Section 9.2, Computer Problem 14, and Section 23.3), so something larger is usually recommended. Also, if  is a moderately large prime, then there is no difficulty ensuring that . It is now generally recommended that . Among the data collected for [Lenstra2012 et al.] is the distribution of RSA encryption exponents that is given in Table 9.1.

In the encryption process, Alice calculates . Recall that this can be done fairly quickly and without large memory, for example, by successive squaring. See Section 3.5. This is definitely an advantage of modular arithmetic: If Alice tried to calculate  first, then reduce mod , it is possible that recording  would overflow her computer’s memory. Similarly, the decryption process of calculating  can be done efficiently. Therefore, all the operations needed for encryption and decryption can be done quickly (i.e., in time a power of ). The security is provided by the assumption that  cannot be factored.

We made two claims. We justify them here. Recall that the point of these two claims was that finding  or finding the decryption exponent  is essentially as hard as factoring . Therefore, if factoring is hard, then there should be no fast, clever way of finding .

Claim 1: Suppose  is the product of two distinct primes. If we know  and , then we can quickly find  and .

Note that



Therefore, we know  and . The roots of the polynomial



are  and , but they can also be calculated by the quadratic formula:



This yields  and .

For example, suppose  and we know that . Consider the quadratic equation



The roots are



For another example, see Example 31 in the Computer Appendices.

Claim 2: If we know  and , then we can probably factor .

In the discussion of factorization methods in Section 9.4, we show that if we have an exponent  such that  for several  with , then we can probably factor . Since  is a multiple of , say , we have



whenever . The  factorization method can now be applied.

For an example, see Example 32 in the Computer Appendices.

Claim 2’: If  is small or medium-sized (for example, if  has several fewer digits than ) and we know , then we can factor .

We use the following procedure:

1. Compute  (that is, round up to the nearest integer).

2. Compute


3. Solve the quadratic equation



The solutions are  and .

Example

Let  and . We discover that . Compute



Therefore, . Then



The roots of the equation



are  and , and we can check that .

Why does this work? We know that , so we write . Since , we know that



so . We have



Usually, both  and  are approximately . In practice,  and therefore  (which is less than ) are much smaller than . Therefore,  is much smaller than , which means that  and it rounds off to .

Once we have , we use  to solve for . As we have already seen, once we know  and , we can find  and  by solving the quadratic equation.

One way the RSA algorithm can be used is when there are several banks, for example, that want to be able to send financial data to each other. If there are several thousand banks, then it is impractical for each pair of banks to have a key for secret communication. A better way is the following. Each bank chooses integers  and  as before. These are then published in a public book. Suppose bank A wants to send data to bank B. Then A looks up B’s  and  and uses them to send the message. In practice, the RSA algorithm is not quite fast enough for sending massive amounts of data. Therefore, the RSA algorithm is often used to send a key for a faster encryption method such as AES.

PGP (= Pretty Good Privacy) used to be a standard method for encrypting email. When Alice sends an email message to Bob, she first signs the message using a digital signature algorithm such as those discussed in Chapter 13. She then encrypts the message using a block cipher such as triple DES or AES (other choices are IDEA or CAST-128) with a randomly chosen 128-bit key (a new random key is chosen for each transmission). She then encrypts this key using Bob’s public RSA key (other public key methods can also be used). When Bob receives the email, he uses his private RSA exponent to decrypt the random key. Then he uses this random key to decrypt the message, and he checks the signature to verify that the message is from Alice. For more discussion of PGP, see Section 15.6.