# 24.4 Linear Codes

When you are having a conversation with a friend over a cellular phone, your voice is turned into digital data that has an error correcting code applied to it before it is sent. When your friend receives the data, the errors in transmission must be accounted for by decoding the error correcting code. Only after decoding are the data turned into sound that represents your voice.

The amount of delay it takes for a packet of data to be decoded is critical in such an application. If it took several seconds, then the delay would become aggravating and make holding a conversation difficult.

The problem of efficiently decoding a code is therefore of critical importance. In order to decode quickly, it is helpful to have some structure in the code rather than taking the code to be a random subset of . This is one of the primary reasons for studying linear codes. For the remainder of this chapter, we restrict our attention to linear codes.

Henceforth, the alphabet  will be a finite field . For an introduction to finite fields, see Section 3.11. For much of what we do, the reader can assume that  is  = the integers mod 2, in which case we are working with binary vectors. Another concrete example of a finite field is  = the integers mod a prime . For other examples, see Section 3.11. In particular, as is pointed out there,  must be one of the finite fields ; but the present notation is more compact. Since we are working with arbitrary finite fields, we’ll use “=" instead of “" in our equations. If you want to think of  as being , just replace all equalities between elements of  with congruences mod 2.

The set of -dimensional vectors with entries in  is denoted by . They form a vector space over . Recall that a subspace of  is a nonempty subset  that is closed under linear combinations, which means that if  are in  and  are in , then . By taking , for example, we see that .

# Definition

A linear code of dimension  and length  over a field  is a -dimensional subspace of . Such a code is called an  code. If the minimum distance of the code is , then the code is called an  code.

When , the definition can be given more simply. A binary code of length  and dimension  is a set of  binary -tuples (the codewords) such that the sum of any two codewords is always a codeword.

Many of the codes we have met are linear codes. For example, the binary repetition code  is a one-dimensional subspace of . The parity check code from Exercise 2 in Section 24.1 is a linear code of dimension 7 and length 8. It consists of those binary vectors of length 8 such that the sum of the entries is 0 mod 2. It is not hard to show that the set of such vectors forms a subspace. The vectors



form a basis of this subspace. Since there are seven basis vectors, the subspace is seven-dimensional.

The Hamming [7, 4] code from Example 4 of Section 24.1 is a linear code of dimension 4 and length 7. Every codeword is a linear combination of the four rows of the matrix . Since these four rows span the code and are linearly independent, they form a basis.

The ISBN code (Example 5 of Section 24.1) is not linear. It consists of a set of 10-dimensional vectors with entries in . However, it is not closed under linear combinations since  is not allowed as one of the first nine entries.

Let  be a linear code of dimension  over a field . If  has  elements, then  has  elements. This may be seen as follows. There is a basis of  with  elements; call them . Every element of  can be written uniquely in the form , with . There are  choices for each  and there are  numbers . This means there are  elements of , as claimed. Therefore, an  linear code is an  code in the notation of Section 24.2.

For an arbitrary, possibly nonlinear, code, computing the minimum distance could require computing  for every pair of codewords. For a linear code, the computation is much easier. Define the Hamming weight  of a vector  to be the number of nonzero places in . It equals , where 0 denotes the vector .

# Proposition

Let  be a linear code. Then  equals the smallest Hamming weight of all nonzero code vectors: .

Proof. Since  is the distance between two codewords, we have  for all codewords . It remains to show that there is a codeword with weight equal to . Note that  for any two vectors . This is because an entry of  is nonzero, and hence gets counted in , if and only if  and  differ in that entry. Choose  and  to be distinct codewords such that . Then , so the minimum weight of the nonzero codewords equals .

To construct a linear  code, we have to construct a -dimensional subspace of . The easiest way to do this is to choose  linearly independent vectors and take their span. This can be done by choosing a  generating matrix  of rank , with entries in . The set of vectors of the form , where  runs through all row vectors in , then gives the subspace.

For our purposes, we’ll usually take , where  is the  identity matrix and  is a  matrix. The rows of  are the basis for a -dimensional subspace of the space of all vectors of length . This subspace is our linear code . In other words, every codeword is uniquely expressible as a linear combination of rows of . If we use a matrix  to construct a code, the first  columns determine the codewords. The remaining  columns provide the redundancy.

The code in the second half of Example 1, Section 24.1, has



The codewords 101010 and 010101 appear as rows in the matrix and the codeword 111111 is the sum of these two rows. This is a  code.

The code in Example 2 has



For example, the codeword 11001001 is the sum mod 2 of the first, second, and fifth rows, and hence is obtained by multiplying  times . This is an  code.

In Exercise 4, the matrix  is given in the description of the code. As you can guess from its name, it is a  code.

As mentioned previously, we could start with any  matrix of rank . Its rows would generate an  code. However, row and column operations can be used to transform the matrix to the form of  we are using, so we usually do not work with the more general situation. A code described by a matrix  as before is said to be systematic. In this case, the first  bits are the information symbols and the last  symbols are the check symbols.

Suppose we have  as the generating matrix for a code . Let



where  is the transpose of . In Exercise 4 of Section 24.1, this is the matrix that was used to correct errors. For Exercise 2, we have . Note that in this case a binary string  is a codeword if and only if the number of nonzero bits is even, which is the same as saying that its dot product with  is zero. This can be rewritten as , where  is the transpose of .

More generally, suppose we have a linear code . A matrix  is called a parity check matrix for  if  has the property that a vector  is in  if and only if . We have the following useful result.

# Theorem

If  is the generating matrix for a code , then  is a parity check matrix for .

Proof. Consider the th row of , which has the form



where the 1 is in the th position. This is a vector of the code . The th column of  is the vector



where the 1 is in the th position. To obtain the th element of , take the dot product of these two vectors, which yields



Therefore,  annihilates every row  of . Since every element of  is a sum of rows of , we find that  for all .

Recall the following fact from linear algebra: The left null space of an  matrix of rank  has dimension . Since  contains  as a submatrix, it has rank . Therefore, its left null space has dimension . But we have just proved that  is contained in this null space. Since  also has dimension , it must equal the null space, which is what the theorem claims.

We now have a way of detecting errors: If  is received during a transmission and , then there is an error. If , we cannot conclude that there is no error, but we do know that  is a codeword. Since it is more likely that no errors occurred than enough errors occurred to change one codeword into another codeword, the best guess is that an error did not occur.

We can also use a parity check matrix to make the task of decoding easier. First, let’s look at an example.

# Example

Let  be the binary linear code with generator matrix



We are going to make a table of all binary vectors of length 4 according to the following procedure. First, list the four elements of the code in the first row, starting with . Then, among the 12 remaining vectors, choose one of smallest weight (there might be several choices). Add this vector to the first row to obtain the second row. From the remaining eight vectors, again choose one with smallest weight and add it to the first row to obtain the third row. Finally, choose a vector with smallest weight from the remaining four vectors, add it to the first row, and obtain the fourth row. We obtain the following:



This can be used as a decoding table. When we receive a vector, find it in the table. Decode by changing the vector to the one at the top of its column. The error that is removed is first element of its row. For example, suppose we receive . It is the last element of the second row. Decode it to , which means removing the error . In this small example, this is not exactly the same as nearest neighbor decoding, since  decodes as  when it has an equally close neighbor . The problem is that the minimum distance of the code is 2, so general error correction is not possible. However, if we use a code that can correct up to  errors, this procedure correctly decodes all vectors that are distance at most  from a codeword.

In a large example, finding the vector in the table can be tedious. In fact, writing the table can be rather difficult (that’s why we used such a small example). This is where a parity check matrix  comes to the rescue.

The first vector  in a row is called the coset leader. Let  be any vector in the same row as . Then  for some codeword , since this is how the table was constructed. Therefore,



since  by the definition of a parity check matrix. The vector  is called the syndrome of . What we have shown is that two vectors in the same row have the same syndrome. Replace the preceding table with the following much smaller table.

This table may be used for decoding as follows. For a received vector , calculate its syndrome . Find this syndrome on the list and subtract the corresponding coset leader from . This gives the same decoding as above. For example, if , then



This is the syndrome for the second row. Subtract the coset leader  from  to obtain the codeword .

We now consider the general situation. The method of the example leads us to two definitions.

# Definition

Let  be a linear code and let  be an -dimensional vector. The set  given by



is called a coset of .

It is easy to see that if , then the sets  and  are the same (Exercise 9).

# Definition

A vector having minimum Hamming weight in a coset is called a coset leader.

The syndrome of a vector  is defined to be . The following lemma allows us to determine the cosets easily.

# Lemma

Two vectors  and  belong to the same coset if and only if they have the same syndrome.

Proof. Two vectors  and  to belong to the same coset if and only if their difference belongs to the code ; that is, . This happens if and only if , which is equivalent to .

Decoding can be achieved by building a syndrome lookup table, which consists of the coset leaders and their corresponding syndromes. With a syndrome lookup table, we can decode with the following steps:

1. For a received vector , calculate its syndrome .

2. Next, find the coset leader with the same syndrome as . Call the coset leader .

3. Decode  as .

Syndrome decoding requires significantly fewer steps than searching for the nearest codeword to a received vector. However, for large codes it is still too inefficient to be practical. In general, the problem of finding the nearest neighbor in a general linear code is hard; in fact, it is what is known as an NP-complete problem. However, for certain special types of codes, efficient decoding is possible. We treat some examples in the next few sections.

# 24.4.1 Dual Codes

The vector space  has a dot product, defined in the usual way:



For example, if , then



so we find the possibly surprising fact that the dot product of a nonzero vector with itself can sometimes be 0, in contrast to the situation with real numbers. Therefore, the dot product does not tell us the length of a vector. But it is still a useful concept.

If  is a linear  code, define the dual code



# Proposition

If  is a linear  code with generating matrix , then  is a linear  code with generating matrix . Moreover,  is a parity check matrix for .

Proof. Since every element of  is a linear combination of the rows of , a vector  is in  if and only if . This means that  is the left null space of . Also, we see that  is a parity check matrix for . Since  has rank , so does . The left null space of  therefore has dimension , so  has dimension . Because  is a parity check matrix for , and the rows of  are in , we have . Taking the transpose of this relation, and recalling that transpose reverses order (), we find . This means that the rows of  are in the left null space of ; therefore, in . Since  has rank , the span of its rows has dimension , which is the same as the dimension of . It follows that the rows of  span , so  is a generating matrix for .

A code  is called self-dual is . The Golay code  of Section 24.6 is an important example of a self-dual code.

# Example

Let  be the binary repetition code. Since  for every , a vector  is in  if and only if . This means that  is a parity check code:  if and only if .

# Example

Let  be the binary code with generating matrix



The proposition says that  has generating matrix



This is  with the rows switched, so the rows of  and the rows of  generate the same subspace. Therefore, , which says that  is self-dual.