# 5.2 Linear Feedback Shift Register Sequences

Note: In this section, all congruences are mod 2.

In many situations involving encryption, there is a trade-off between speed and security. If one wants a very high level of security, speed is often sacrificed, and vice versa. For example, in cable television, many bits of data are being transmitted, so speed of encryption is important. On the other hand, security is not usually as important since there is rarely an economic advantage to mounting an expensive attack on the system.

In this section, we describe a method that could be used when speed is more important than security. However, the real use is as one building block in more complex systems.

The sequence



can be described by giving the initial values



and the linear recurrence relation



This sequence repeats after 31 terms.

For another example, see Example 18 in the Computer Appendices.

More generally, consider a linear recurrence relation of length :



where the coefficients  are integers. If we specify the initial values



then all subsequent values of  can be computed using the recurrence. The resulting sequence of 0s and 1s can be used as the key for encryption. Namely, write the plaintext as a sequence of 0s and 1s, then add an appropriate number of bits of the key sequence to the plaintext mod 2, bit by bit. For example, if the plaintext is  and the key sequence is the example given previously, we have



Decryption is accomplished by adding the key sequence to the ciphertext in exactly the same way.

One advantage of this method is that a key with large period can be generated using very little information. The long period gives an improvement over the Vigenère method, where a short period allowed us to find the key. In the above example, specifying the initial vector  and the coefficients  yielded a sequence of period 31, so 10 bits were used to produce 31 bits. It can be shown that the recurrence



and any nonzero initial vector will produce a sequence with period . Therefore, 62 bits produce more than two billion bits of key. This is a great advantage over a one-time pad, where the full two billion bits must be sent in advance.

This method can be implemented very easily in hardware using what is known as a linear feedback shift register (LFSR) and is very fast. In Figure 5.2 we depict an example of a linear feedback shift register in a simple case. More complicated recurrences are implemented using more registers and more XORs.

For each increment of a counter, the bit in each box is shifted to other boxes as indicated, with  denoting the addition mod 2 of the incoming bits. The output, which is the bit , is added to the next bit of plaintext to produce the ciphertext. The diagram in Figure 5.2 represents the recurrence . Once the initial values  are specified, the machine produces the subsequent bits very efficiently.

Unfortunately, the preceding encryption method succumbs easily to a known plaintext attack. More precisely, if we know only a few consecutive bits of plaintext, along with the corresponding bits of ciphertext, we can determine the recurrence relation and therefore compute all subsequent bits of the key. By subtracting (or adding; it’s all the same mod 2) the plaintext from the ciphertext mod 2, we obtain the bits of the key. Therefore, for the rest of this discussion, we will ignore the ciphertext and plaintext and assume we have discovered a portion of the key sequence. Our goal is to use this portion of the key to deduce the coefficients of the recurrence and consequently compute the rest of the key.

For example, suppose we know the initial segment  of the sequence , which has period 15, and suppose we know it is generated by a linear recurrence. How do we determine the coefficients of the recurrence? We do not necessarily know even the length, so we start with length 2 (length 1 would produce a constant sequence). Suppose the recurrence is . Let  and  and use the known values . We obtain the equations



In matrix form, this is



The solution is , so we guess that the recurrence is . Unfortunately, this is not correct since  Therefore, we try length 3. The resulting matrix equation is



The determinant of the matrix is 0 mod 2; in fact, the equation has no solution. We can see this because every column in the matrix sums to 0 mod 2, while the vector on the right does not.

Now consider length 4. The matrix equation is



The solution is . The resulting recurrence is now conjectured to be



A quick calculation shows that this generates the remaining bits of the piece of key that we already know, so it is our best guess for the recurrence that generates the key sequence.

What happens if we try length 5? The matrix equation is



The determinant of the matrix is 0 mod 2. Why? Notice that the last row is the sum of the first and second rows. This is a consequence of the recurrence relation: , etc. As in linear algebra with real or complex numbers, if one row of a matrix is a linear combination of other rows, then the determinant is 0.

Similarly, if we look at the  matrix, we see that the 5th row is the sum of the first and second rows, and the 6th row is the sum of the second and third rows, so the determinant is 0 mod 2. In general, when the size of the matrix is larger than the length of the recurrence relation, the relation forces one row to be a linear combination of other rows, hence the determinant is 0 mod 2.

The general situation is as follows. To test for a recurrence of length , we assume we know . The matrix equation is



We show later that the matrix is invertible mod 2 if and only if there is no linear recurrence of length less than  that is satisfied by .

A strategy for finding the coefficients of the recurrence is now clear. Suppose we know the first 100 bits of the key. For , form the  matrix as before and compute its determinant. If several consecutive values of  yield 0 determinants, stop. The last  to yield a nonzero (i.e., 1 mod 2) determinant is probably the length of the recurrence. Solve the matrix equation to get the coefficients . It can then be checked whether the sequence that this recurrence generates matches the sequence of known bits of the key. If not, try larger values of .

Suppose we don’t know the first 100 bits, but rather some other 100 consecutive bits of the key. The same procedure applies, using these bits as the starting point. In fact, once we find the recurrence, we can also work backwards to find the bits preceding the starting point.

Here is an example. Suppose we have the following sequence of 100 bits:



The first 20 determinants, starting with , are



A reasonable guess is that  gives the last nonzero determinant. When we solve the matrix equation for the coefficients we get



so we guess that the recurrence is



This recurrence generates all 100 terms of the original sequence, so we have the correct answer, at least based on the knowledge that we have.

Suppose that the 100 bits were in the middle of some sequence, and we want to know the preceding bits. For example, suppose the sequence starts with , so . Write the recurrence as



(it might appear that we made some sign errors, but recall that we are working mod 2, so  and ). Letting  yields



Continuing in this way, we successively determine .

For more examples, see Examples 19 and 20 in the Computer Appendices.

We now prove the result we promised.

# Proposition

Let  be a sequence of bits produced by a linear recurrence mod 2. For each , let



Let  be the length of the shortest recurrence that generates the sequence . Then  and  for all .

Proof. We first make a few remarks on the length of recurrences. A sequence could satisfy a length 3 relation such as . It would clearly then also satisfy shorter relations such as  (at least for ). However, there are less obvious ways that a sequence could satisfy a recurrence of length less than expected. For example, consider the relation . Suppose the initial values of the sequence are . The recurrence allows us to compute subsequent terms: . It is easy to see that the sequence satisfies .

If there is a recurrence of length  and if , then one row of the matrix  is congruent mod 2 to a linear combination of other rows. For example, if the recurrence is , then the fourth row is the sum of the first and third rows. Therefore,  for all .

Now suppose . Then there is a nonzero row vector  such that . We’ll show that this gives a recurrence relation for the sequence  and that the length of this relation is less than . This contradicts the assumption that  is smallest. This contradiction implies that .

Let the recurrence of length  be



For each , let



Then . The recurrence relation implies that



which is the last column of .

By the choice of , we have . Suppose that we know that  for some . Then



Therefore,  annihilates the last column of . Since the remaining columns of  are columns of , we find that . By induction, we obtain  for all .

Let . The first column of  yields



Since  is not the zero vector,  for at least one . Let  be the largest  such that , which means that . We are working mod 2, so . Therefore, we can rearrange the relation to obtain



This is a recurrence of length . Since , and  is assumed to be the shortest possible length, we have a contradiction. Therefore, the assumption that  must be false, so . This completes the proof.

Finally, we make a few comments about the period of a sequence. Suppose the length of the recurrence is . Any  consecutive terms of the sequence determine all future elements, and, by reversing the recurrence, all previous values, too. Clearly, if we have  consecutive 0s, then all future values are 0. Also, all previous values are 0. Therefore, we exclude this case from consideration. There are  strings of 0s and 1s of length  in which at least one term is nonzero. Therefore, as soon as there are more than  terms, some string of length  must occur twice, so the sequence repeats. The period of the sequence is at most .

Associated to a recurrence , there is a polynomial



If  is irreducible mod 2 (this means that it is not congruent to the product of two lower-degree polynomials), then it can be shown that the period divides . An interesting case is when  is prime (these are called Mersenne primes). If the period isn’t 1, that is, if the sequence is not constant, then the period in this special case must be maximal, namely  (see Section 3.11). The example where the period is  is of this type.

Linear feedback shift register sequences have been studied extensively. For example, see [Golomb] or [van der Lubbe].

One way of thwarting the above attack is to use nonlinear recurrences, for example,



Moreover, a look-up table that takes inputs  and outputs a bit  could be used, or several LFSRs could be combined nonlinearly and some of these LFSRs could have irregular clocking. Generally, these systems are somewhat harder to break. However, we shall not discuss them here.