# 9.2 Attacks on RSA

In practice, the RSA algorithm has proven to be effective, as long as it is implemented correctly. We give a few possible implementation mistakes in the Exercises. Here are a few other potential difficulties. For more about attacks on RSA, see [Boneh].

# Theorem

Let  have  digits. If we know the first , or the last , digits of , we can efficiently factor .

In other words, if  and  have 300 digits, and we know the first 150 digits, or the last 150 digits, of , then we can factor . Therefore, if we choose a random starting point to choose our prime , the method should be such that a large amount of  is not predictable. For example, suppose we take a random 150-digit number  and test numbers of the form , , for primality until we find a prime  (which should happen for ). An attacker who knows that this method is used will know 147 of the last 150 digits (they will all be 0 except for the last three or four digits). Trying the method of the theorem for the various values of  will eventually lead to the factorization of .

For details of the preceding result, see [Coppersmith2]. A related result is the following.

# Theorem

Suppose  is an RSA public key and  has  digits. Let  be the decryption exponent. If we have at least the last  digits of , we can efficiently find  in time that is linear in .

This means that the time to find  is bounded as a function linear in . If  is small, it is therefore quite fast to find  when we know a large part of . If  is large, perhaps around , the theorem is no better than a case-by-case search for . For details, see [Boneh et al.].

# 9.2.1 Low Exponent Attacks

Low encryption or decryption exponents are tempting because they speed up encryption or decryption. However, there are certain dangers that must be avoided. One pitfall of using  is given in Computer Problem 14. Another difficulty is discussed in Chapter 23 (Lattice Methods). These problems can be avoided by using a somewhat higher exponent. One popular choice is . This is prime, so it is likely that it is relatively prime to . Since it is one more than a power of 2, exponentiation to this power can be done quickly: To calculate , square  sixteen times, then multiply the result by .

The decryption exponent  should of course be chosen large enough that brute force will not find it. However, even more care is needed, as the following result shows. One way to obtain desired properties of  is to choose  first, then find  with .

Suppose Bob wants to be able to decrypt messages quickly, so he chooses a small value of . The following theorem of M. Wiener [Wiener] shows that often Eve can then find  easily. In practice, if the inequalities in the hypotheses of the proposition are weakened, then Eve can still use the method to obtain  in many cases. Therefore, it is recommended that  be chosen fairly large.

# Theorem

Suppose  are primes with . Let  and let  satisfy . If , then  can be calculated quickly (that is, in time polynomial in ).

Proof. Since , we have . Therefore, since ,



Write  for some integer . Since , we have



so . Therefore,



Also, since , we have . Dividing by  yields



since  by assumption.

We now need a result about continued fractions. Recall from Section 3.12 that if  is a positive real number and  and  are positive integers with



then  arises from the continued fraction expansion of . Therefore, in our case,  arises from the continued fraction expansion of . Therefore, Eve does the following:

1. Computes the continued fraction of . After each step, she obtains a fraction .

2. Eve uses  and  to compute . (Since , this value of  is a candidate for .)

3. If  is not an integer, she proceeds to the next step of the continued fraction.

4. If  is an integer, then she finds the roots  of . (Note that this is possibly the equation  from earlier.) If  and  are integers, then Eve has factored . If not, then Eve proceeds to the next step of the continued fraction algorithm.

Since the number of steps in the continued fraction expansion of  is at most a constant times , and since the continued fraction algorithm stops when the fraction  is reached, the algorithm terminates quickly. Therefore, Eve finds the factorization of  quickly.

# Remarks

Recall that the rational approximations to a number  arising from the continued fraction algorithm are alternately larger than  and smaller than . Since , we only need to consider every second fraction arising from the continued fraction.

What happens if Eve reaches  without finding the factorization of ? This means that the hypotheses of the proposition are not satisfied. However, it is possible that sometimes the method will yield the factorization of  even when the hypotheses fail.

# Example

Let  and . The continued fraction of  is



The first fraction is , so we try . Since  must be odd, we discard this possibility.



Again, we discard this since  must be odd.

The fifth fraction is . This gives , which is not an integer.

The seventh fraction is  This gives  as the candidate for . The roots of



are  and , to several decimal places of accuracy. Since



we have factored .

# 9.2.2 Short Plaintext

A common use of RSA is to transmit keys for use in DES, AES, or other symmetric cryptosystems. However, a naive implementation could lead to a loss of security. Suppose a 56-bit DES key is written as a number . This is encrypted with RSA to obtain . Although  is small, the ciphertext  is probably a number of the same size as , so perhaps around 200 digits. However, Eve attacks the system as follows. She makes two lists:

1.  for all  with .

2.  for all  with .

She looks for a match between an element on the first list and an element on the second list. If she finds one, then she has  for some . This yields



so . Is this attack likely to succeed? Suppose  is the product of two integers  and , both less than . Then these  will yield a match for Eve. Not every  will have this property, but many values of  are the product of two integers less than . For these, Eve will obtain .

This attack is much more efficient than trying all  possibilities for , which is nearly impossible on one computer, and would take a long time even with several computers working in parallel. In the present attack, Eve needs to compute and store a list of length , then compute the elements on the other list and check each one against the first list. Therefore, Eve performs approximately  computations (and compares with the list up to  times). This is easily possible on a single computer. For more on this attack, see [Boneh-Joux-Nguyen].

It is easy to prevent this attack. Instead of using a small value of , adjoin some random digits to the beginning and end of  so as to form a much longer plaintext. When Bob decrypts the ciphertext, he simply removes these random digits and obtains .

A more sophisticated method of preprocessing the plaintext, namely Optimal Asymmetric Encryption Padding (OAEP), was introduced by Bellare and Rogaway [Bellare-Rogaway2] in 1994. Suppose Alice wants to send a message  to Bob, whose RSA public key is , where  has  bits. Two positive integers  and  are specified in advance, with . Alice’s message is allowed to have  bits. Typical values are , , . Let  be a function that inputs strings of  bits and outputs strings of  bits. Let  be a function that inputs  bits and outputs  bits. The functions  and  are usually constructed from hash functions (see Chapter 11 for a discussion of hash functions). To encrypt , Alice first expands it to length  by adjoining  zero bits. The result is denoted . She then chooses a random string  of  bits and computes



If the concatenation  is a binary number larger than , Alice chooses a new random number  and computes new values for  and . As soon as she obtains  (this has a probability of at least 1/2 of happening for each , as long as  produces fairly random outputs), she forms the ciphertext



To decrypt a ciphertext , Bob uses his private RSA decryption exponent  to compute . The result is written in the form



where  has  bits and  has  bits. Bob then computes



The correctness of this decryption can be justified as follows. If the ciphertext is the encryption of , then



Therefore,



and



Bob removes the  zero bits from the end of  and obtains . Also, Bob has check on the integrity of the ciphertext. If there are not  zeros at the end, then the ciphertext does not correspond to a valid encryption.

This method is sometimes called plaintext-aware encryption. Note that the padding with  depends on the message  and on the random parameter . This makes chosen ciphertext attacks on the system more difficult. It also is used for ciphertext indistinguishability. See Section 4.5.

For discussion of the security of OAEP, see [Shoup].

# 9.2.3 Timing Attacks

Another type of attack on RSA and similar systems was discovered by Paul Kocher in 1995, while he was an undergraduate at Stanford. He showed that it is possible to discover the decryption exponent by carefully timing the computation times for a series of decryptions. Though there are ways to thwart the attack, this development was unsettling. There had been a general feeling of security since the mathematics was well understood. Kocher’s attack demonstrated that a system could still have unexpected weaknesses.

Here is how the timing attack works. Suppose Eve is able to observe Bob decrypt several ciphertexts . She times how long this takes for each . Knowing each  and the time required for it to be decrypted will allow her to find the decryption exponent . But first, how could Eve obtain such information? There are several situations where encrypted messages are sent to Bob and his computer automatically decrypts and responds. Measuring the response times suffices for the present purposes.

We need to assume that we know the hardware being used to calculate . We can use this information to calculate the computation times for various steps that potentially occur in the process.

Let’s assume that  is computed by an algorithm given in Exercise 56 in Chapter 3, which is as follows:

Let  be written in binary (for example, when , we have ). Let  and  be integers. Perform the following procedure:

1. Start with  and .

2. If , let . If , let .

3. Let .

4. If , stop. If , add 1 to  and go to (2).

Then .

Note that the multiplication  occurs only when the bit . In many situations, there is a reasonably large variation in how long this multiplication takes. We assume this is the case here.

Before we continue, we need a few facts from probability. Suppose we have a random process that produces real numbers  as outputs. For us,  will be the time it takes for the computer to complete a calculation, given a random input . The mean is the average value of these outputs. If we record outputs , the mean should be approximately . The variance for the random process is approximated by



The standard deviation is the square root of the variance and gives a measure of how much variation there is in the values of the ’s.

The important fact we need is that when two random processes are independent, the variance for the sum of their outputs is the sum of the variances of the two processes. For example, we will break the computation done by the computer into two independent processes, which will take times  and . The total time  will be . Therefore,  should be approximately .

Now assume Eve knows ciphertexts  and the times that it took to compute each . Suppose she knows bits  of the exponent . Since she knows the hardware being used, she knows how much time was used in calculating  in the preceding algorithm. Therefore, she knows, for each , the time  that it takes to compute .

Eve wants to determine . If , a multiplication  will take place for each ciphertext  that is processed. If , there is no such multiplication.

Let  be the amount of time it takes the computer to perform the multiplication , though Eve does not yet know whether this multiplication actually occurs. Let . Eve computes  and . If , then Eve concludes that . If not, . After determining , she proceeds in the same manner to find all the bits.

Why does this work? If the multiplication occurs,  is the amount of time it takes the computer to complete the calculation after the multiplication. It is reasonable to assume  and  are outputs that are independent of each other. Therefore,



If the multiplication does not occur,  is the amount of time for an operation unrelated to the computation, so it is reasonable to assume  and  are independent. Therefore,



Note that we couldn’t use the mean in place of the variance, since the mean of  would be negative, so the last inequality would not hold. All that can be deduced from the mean is the total number of nonzero bits in the binary expansion of .

The preceding gives a fairly simple version of the method. In practice, various modifications would be needed, depending on the specific situation. But the general strategy remains the same. For more details, see [Kocher]. For more on timing attacks, see [Crosby et al.].

A similar attack on RSA works by measuring the power consumed during the computations. See [Kocher et al.]. Another method, called acoustic cryptanalysis, obtains information from the high-pitched noises emitted by the electronic components of a computer during its computations. See [Genkin et al.]. Attacks such as these and the timing attack can be prevented by appropriate design features in the physical implementation.

Timing attacks, power analysis, and acoustic cryptanalysis are examples of side-channel attacks, where the attack is on the implementation rather than on the basic cryptographic algorithm.