5.2 Linear Feedback Shift Register Sequences – Introduction to Cryptography with Coding Theory, 3rd Edition

5.2 Linear Feedback Shift Register Sequences

Note: In this section, all congruences are mod 2.

In many situations involving encryption, there is a trade-off between speed and security. If one wants a very high level of security, speed is often sacrificed, and vice versa. For example, in cable television, many bits of data are being transmitted, so speed of encryption is important. On the other hand, security is not usually as important since there is rarely an economic advantage to mounting an expensive attack on the system.

In this section, we describe a method that could be used when speed is more important than security. However, the real use is as one building block in more complex systems.

The sequence

01000010010110011111000110111010100001001011001111

can be described by giving the initial values

x10,  x21,  x30,  x40,  x50

and the linear recurrence relation

xn+5xn+xn+2(mod 2).

This sequence repeats after 31 terms.

For another example, see Example 18 in the Computer Appendices.

More generally, consider a linear recurrence relation of length m:

xn+mc0xn+c1xn+1++cm1xn+m1(mod 2), 

where the coefficients c0, c1,  are integers. If we specify the initial values

x1, x2, , xm, 

then all subsequent values of xn can be computed using the recurrence. The resulting sequence of 0s and 1s can be used as the key for encryption. Namely, write the plaintext as a sequence of 0s and 1s, then add an appropriate number of bits of the key sequence to the plaintext mod 2, bit by bit. For example, if the plaintext is 1011001110001111 and the key sequence is the example given previously, we have

(plaintext)1011001110001111(key)0100001001011001_(ciphertext)1111000111010110

Decryption is accomplished by adding the key sequence to the ciphertext in exactly the same way.

One advantage of this method is that a key with large period can be generated using very little information. The long period gives an improvement over the Vigenère method, where a short period allowed us to find the key. In the above example, specifying the initial vector {0, 1, 0, 0, 0} and the coefficients {1, 0, 1, 0, 0} yielded a sequence of period 31, so 10 bits were used to produce 31 bits. It can be shown that the recurrence

xn+31xn+xn+3

and any nonzero initial vector will produce a sequence with period 2311=2147483647. Therefore, 62 bits produce more than two billion bits of key. This is a great advantage over a one-time pad, where the full two billion bits must be sent in advance.

This method can be implemented very easily in hardware using what is known as a linear feedback shift register (LFSR) and is very fast. In Figure 5.2 we depict an example of a linear feedback shift register in a simple case. More complicated recurrences are implemented using more registers and more XORs.

Figure 5.2 A Linear Feedback Shift Register Satisfying xn+3=xn+1+xn.

For each increment of a counter, the bit in each box is shifted to other boxes as indicated, with denoting the addition mod 2 of the incoming bits. The output, which is the bit xn, is added to the next bit of plaintext to produce the ciphertext. The diagram in Figure 5.2 represents the recurrence xn+3xn+1+xn. Once the initial values x1,  x2,  x3 are specified, the machine produces the subsequent bits very efficiently.

Unfortunately, the preceding encryption method succumbs easily to a known plaintext attack. More precisely, if we know only a few consecutive bits of plaintext, along with the corresponding bits of ciphertext, we can determine the recurrence relation and therefore compute all subsequent bits of the key. By subtracting (or adding; it’s all the same mod 2) the plaintext from the ciphertext mod 2, we obtain the bits of the key. Therefore, for the rest of this discussion, we will ignore the ciphertext and plaintext and assume we have discovered a portion of the key sequence. Our goal is to use this portion of the key to deduce the coefficients of the recurrence and consequently compute the rest of the key.

For example, suppose we know the initial segment 011010111100 of the sequence 0110101111000100110101111, which has period 15, and suppose we know it is generated by a linear recurrence. How do we determine the coefficients of the recurrence? We do not necessarily know even the length, so we start with length 2 (length 1 would produce a constant sequence). Suppose the recurrence is xn+2=c0xn+c1xn+1. Let n=1 and n=2 and use the known values x1=0, x2=1, x3=1, x4=0. We obtain the equations

1c00+c11(n=1).0c01+c11(n=2).

In matrix form, this is

(0111)(c0c1)(10).

The solution is c0=1, c1=1, so we guess that the recurrence is xn+2xn+xn+1. Unfortunately, this is not correct since x6  x4+x5. Therefore, we try length 3. The resulting matrix equation is

(011110101)(c0c1c2)(010).

The determinant of the matrix is 0 mod 2; in fact, the equation has no solution. We can see this because every column in the matrix sums to 0 mod 2, while the vector on the right does not.

Now consider length 4. The matrix equation is

(0110110110100101)(c0c1c2c3)(1011).

The solution is c0=1, c1=1, c2=0, c3=0. The resulting recurrence is now conjectured to be

xn+4xn+xn+1.

A quick calculation shows that this generates the remaining bits of the piece of key that we already know, so it is our best guess for the recurrence that generates the key sequence.

What happens if we try length 5? The matrix equation is

0110111010101010101110111c0c1c2c3c401111.

The determinant of the matrix is 0 mod 2. Why? Notice that the last row is the sum of the first and second rows. This is a consequence of the recurrence relation: x5x1+x2, x6x2+x3, etc. As in linear algebra with real or complex numbers, if one row of a matrix is a linear combination of other rows, then the determinant is 0.

Similarly, if we look at the 6×6 matrix, we see that the 5th row is the sum of the first and second rows, and the 6th row is the sum of the second and third rows, so the determinant is 0 mod 2. In general, when the size of the matrix is larger than the length of the recurrence relation, the relation forces one row to be a linear combination of other rows, hence the determinant is 0 mod 2.

The general situation is as follows. To test for a recurrence of length m, we assume we know x1, x2, , x2m. The matrix equation is

x1x2xmx2x3xm+1xmxm+1x2m1c0c1cm1xm+1xm+2x2m.

We show later that the matrix is invertible mod 2 if and only if there is no linear recurrence of length less than m that is satisfied by x1, x2, , x2m1.

A strategy for finding the coefficients of the recurrence is now clear. Suppose we know the first 100 bits of the key. For m=2, 3, 4, , form the m×m matrix as before and compute its determinant. If several consecutive values of m yield 0 determinants, stop. The last m to yield a nonzero (i.e., 1 mod 2) determinant is probably the length of the recurrence. Solve the matrix equation to get the coefficients c0, , cm1. It can then be checked whether the sequence that this recurrence generates matches the sequence of known bits of the key. If not, try larger values of m.

Suppose we don’t know the first 100 bits, but rather some other 100 consecutive bits of the key. The same procedure applies, using these bits as the starting point. In fact, once we find the recurrence, we can also work backwards to find the bits preceding the starting point.

Here is an example. Suppose we have the following sequence of 100 bits:

1001100100111000110001010001111011001111101010100101101101011000011011100101011110000000100010010000.

The first 20 determinants, starting with m=1, are

1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.

A reasonable guess is that m=8 gives the last nonzero determinant. When we solve the matrix equation for the coefficients we get

{c0, c1, , c7}={1, 1, 0, 0, 1, 0, 0, 0}, 

so we guess that the recurrence is

xn+8xn+xn+1+xn+4.

This recurrence generates all 100 terms of the original sequence, so we have the correct answer, at least based on the knowledge that we have.

Suppose that the 100 bits were in the middle of some sequence, and we want to know the preceding bits. For example, suppose the sequence starts with x17, so x17=1, x18=0, x19=0, . Write the recurrence as

xnxn+1+xn+4+xn+8

(it might appear that we made some sign errors, but recall that we are working mod 2, so xnxn and xn+8xn+8). Letting n=16 yields

x16x17+x20+x241+0+10.

Continuing in this way, we successively determine x15, x14, , x1.

For more examples, see Examples 19 and 20 in the Computer Appendices.

We now prove the result we promised.

Proposition

Let x1, x2, x3,  be a sequence of bits produced by a linear recurrence mod 2. For each n1, let

Mn=x1x2xnx2x3xn+1xnxn+1x2n1.

Let N be the length of the shortest recurrence that generates the sequence x1, x2, x3, . Then det(MN)1 (mod 2) and det(Mn)0 (mod 2) for all n>N.

Proof. We first make a few remarks on the length of recurrences. A sequence could satisfy a length 3 relation such as xn+3xn+2. It would clearly then also satisfy shorter relations such as xn+1=xn (at least for n2). However, there are less obvious ways that a sequence could satisfy a recurrence of length less than expected. For example, consider the relation xn+4xn+3+xn+1+xn. Suppose the initial values of the sequence are 1,  1,  0,  1. The recurrence allows us to compute subsequent terms: 1,  0,  1,  1,  0,  1,  1,  0,  1,  1,  0,  1. It is easy to see that the sequence satisfies xn+2xn+1+xn.

If there is a recurrence of length N and if n>N, then one row of the matrix Mn is congruent mod 2 to a linear combination of other rows. For example, if the recurrence is xn+3=xn+2+xn, then the fourth row is the sum of the first and third rows. Therefore, det(Mn)0 (mod 2) for all n>N.

Now suppose det(MN)0 (mod 2). Then there is a nonzero row vector b=(b0, , bN1) such that bMN0. We’ll show that this gives a recurrence relation for the sequence x1, x2, x3,  and that the length of this relation is less than N. This contradicts the assumption that N is smallest. This contradiction implies that det(MN)1 (mod 2).

Let the recurrence of length N be

xn+Nc0xn++cN1xn+N1.

For each i0, let

M(i)=xi+1xi+2xi+Nxi+2xi+3xi+N+1xi+Nxi+N+1xi+2N1.

Then M(0)=MN. The recurrence relation implies that

M(i)c0c1cN1xi+N+1xi+N+2xi+2N, 

which is the last column of M(i+1).

By the choice of b, we have bM(0)=bMN=0. Suppose that we know that bM(i)=0 for some i. Then

bxi+N+1xi+N+2xi+2NbM(i)c0c1cm10.

Therefore, b annihilates the last column of M(i+1). Since the remaining columns of M(i+1) are columns of M(i), we find that bM(i+1)0. By induction, we obtain bM(i)0 for all i0.

Let n1. The first column of M(n1) yields

b0xn+b1xn+1++bN1xn+N10.

Since b is not the zero vector, bj0 for at least one j. Let m be the largest j such that bj0, which means that bm=1. We are working mod 2, so bmxn+mxn+m. Therefore, we can rearrange the relation to obtain

xn+mb0xn+b1xn+1++bm1xn+m1.

This is a recurrence of length m. Since mN1, and N is assumed to be the shortest possible length, we have a contradiction. Therefore, the assumption that det(MN)0 must be false, so det(MN)1. This completes the proof.

Finally, we make a few comments about the period of a sequence. Suppose the length of the recurrence is m. Any m consecutive terms of the sequence determine all future elements, and, by reversing the recurrence, all previous values, too. Clearly, if we have m consecutive 0s, then all future values are 0. Also, all previous values are 0. Therefore, we exclude this case from consideration. There are 2m1 strings of 0s and 1s of length m in which at least one term is nonzero. Therefore, as soon as there are more than 2m1 terms, some string of length m must occur twice, so the sequence repeats. The period of the sequence is at most 2m1.

Associated to a recurrence xn+mc0xn+c1xn+1++cm1xn+m1 (mod 2), there is a polynomial

f(T)=Tmcm1Tm1c0.

If f(T) is irreducible mod 2 (this means that it is not congruent to the product of two lower-degree polynomials), then it can be shown that the period divides 2m1. An interesting case is when 2m1 is prime (these are called Mersenne primes). If the period isn’t 1, that is, if the sequence is not constant, then the period in this special case must be maximal, namely 2m1 (see Section 3.11). The example where the period is 2311 is of this type.

Linear feedback shift register sequences have been studied extensively. For example, see [Golomb] or [van der Lubbe].

One way of thwarting the above attack is to use nonlinear recurrences, for example,

xn+3xn+2xn+xn+1.

Moreover, a look-up table that takes inputs xn, xn+1, xn+2 and outputs a bit xn+3 could be used, or several LFSRs could be combined nonlinearly and some of these LFSRs could have irregular clocking. Generally, these systems are somewhat harder to break. However, we shall not discuss them here.