When you are having a conversation with a friend over a cellular phone, your voice is turned into digital data that has an error correcting code applied to it before it is sent. When your friend receives the data, the errors in transmission must be accounted for by decoding the error correcting code. Only after decoding are the data turned into sound that represents your voice.
The amount of delay it takes for a packet of data to be decoded is critical in such an application. If it took several seconds, then the delay would become aggravating and make holding a conversation difficult.
The problem of efficiently decoding a code is therefore of critical importance. In order to decode quickly, it is helpful to have some structure in the code rather than taking the code to be a random subset of . This is one of the primary reasons for studying linear codes. For the remainder of this chapter, we restrict our attention to linear codes.
Henceforth, the alphabet will be a finite field . For an introduction to finite fields, see Section 3.11. For much of what we do, the reader can assume that is = the integers mod 2, in which case we are working with binary vectors. Another concrete example of a finite field is = the integers mod a prime . For other examples, see Section 3.11. In particular, as is pointed out there, must be one of the finite fields ; but the present notation is more compact. Since we are working with arbitrary finite fields, we’ll use “=" instead of “" in our equations. If you want to think of as being , just replace all equalities between elements of with congruences mod 2.
The set of -dimensional vectors with entries in is denoted by . They form a vector space over . Recall that a subspace of is a nonempty subset that is closed under linear combinations, which means that if are in and are in , then . By taking , for example, we see that .
A linear code of dimension and length over a field is a -dimensional subspace of . Such a code is called an code. If the minimum distance of the code is , then the code is called an code.
When , the definition can be given more simply. A binary code of length and dimension is a set of binary -tuples (the codewords) such that the sum of any two codewords is always a codeword.
Many of the codes we have met are linear codes. For example, the binary repetition code is a one-dimensional subspace of . The parity check code from Exercise 2 in Section 24.1 is a linear code of dimension 7 and length 8. It consists of those binary vectors of length 8 such that the sum of the entries is 0 mod 2. It is not hard to show that the set of such vectors forms a subspace. The vectors
form a basis of this subspace. Since there are seven basis vectors, the subspace is seven-dimensional.
The Hamming [7, 4] code from Example 4 of Section 24.1 is a linear code of dimension 4 and length 7. Every codeword is a linear combination of the four rows of the matrix . Since these four rows span the code and are linearly independent, they form a basis.
The ISBN code (Example 5 of Section 24.1) is not linear. It consists of a set of 10-dimensional vectors with entries in . However, it is not closed under linear combinations since is not allowed as one of the first nine entries.
Let be a linear code of dimension over a field . If has elements, then has elements. This may be seen as follows. There is a basis of with elements; call them . Every element of can be written uniquely in the form , with . There are choices for each and there are numbers . This means there are elements of , as claimed. Therefore, an linear code is an code in the notation of Section 24.2.
For an arbitrary, possibly nonlinear, code, computing the minimum distance could require computing for every pair of codewords. For a linear code, the computation is much easier. Define the Hamming weight of a vector to be the number of nonzero places in . It equals , where 0 denotes the vector .
Let be a linear code. Then equals the smallest Hamming weight of all nonzero code vectors: .
Proof. Since is the distance between two codewords, we have for all codewords . It remains to show that there is a codeword with weight equal to . Note that for any two vectors . This is because an entry of is nonzero, and hence gets counted in , if and only if and differ in that entry. Choose and to be distinct codewords such that . Then , so the minimum weight of the nonzero codewords equals .
To construct a linear code, we have to construct a -dimensional subspace of . The easiest way to do this is to choose linearly independent vectors and take their span. This can be done by choosing a generating matrix of rank , with entries in . The set of vectors of the form , where runs through all row vectors in , then gives the subspace.
For our purposes, we’ll usually take , where is the identity matrix and is a matrix. The rows of are the basis for a -dimensional subspace of the space of all vectors of length . This subspace is our linear code . In other words, every codeword is uniquely expressible as a linear combination of rows of . If we use a matrix to construct a code, the first columns determine the codewords. The remaining columns provide the redundancy.
The codewords 101010 and 010101 appear as rows in the matrix and the codeword 111111 is the sum of these two rows. This is a code.
The code in Example 2 has
For example, the codeword 11001001 is the sum mod 2 of the first, second, and fifth rows, and hence is obtained by multiplying times . This is an code.
In Exercise 4, the matrix is given in the description of the code. As you can guess from its name, it is a code.
As mentioned previously, we could start with any matrix of rank . Its rows would generate an code. However, row and column operations can be used to transform the matrix to the form of we are using, so we usually do not work with the more general situation. A code described by a matrix as before is said to be systematic. In this case, the first bits are the information symbols and the last symbols are the check symbols.
Suppose we have as the generating matrix for a code . Let
where is the transpose of . In Exercise 4 of Section 24.1, this is the matrix that was used to correct errors. For Exercise 2, we have . Note that in this case a binary string is a codeword if and only if the number of nonzero bits is even, which is the same as saying that its dot product with is zero. This can be rewritten as , where is the transpose of .
More generally, suppose we have a linear code . A matrix is called a parity check matrix for if has the property that a vector is in if and only if . We have the following useful result.
If is the generating matrix for a code , then is a parity check matrix for .
Proof. Consider the th row of , which has the form
where the 1 is in the th position. This is a vector of the code . The th column of is the vector
where the 1 is in the th position. To obtain the th element of , take the dot product of these two vectors, which yields
Therefore, annihilates every row of . Since every element of is a sum of rows of , we find that for all .
Recall the following fact from linear algebra: The left null space of an matrix of rank has dimension . Since contains as a submatrix, it has rank . Therefore, its left null space has dimension . But we have just proved that is contained in this null space. Since also has dimension , it must equal the null space, which is what the theorem claims.
We now have a way of detecting errors: If is received during a transmission and , then there is an error. If , we cannot conclude that there is no error, but we do know that is a codeword. Since it is more likely that no errors occurred than enough errors occurred to change one codeword into another codeword, the best guess is that an error did not occur.
We can also use a parity check matrix to make the task of decoding easier. First, let’s look at an example.
Let be the binary linear code with generator matrix
We are going to make a table of all binary vectors of length 4 according to the following procedure. First, list the four elements of the code in the first row, starting with . Then, among the 12 remaining vectors, choose one of smallest weight (there might be several choices). Add this vector to the first row to obtain the second row. From the remaining eight vectors, again choose one with smallest weight and add it to the first row to obtain the third row. Finally, choose a vector with smallest weight from the remaining four vectors, add it to the first row, and obtain the fourth row. We obtain the following:
This can be used as a decoding table. When we receive a vector, find it in the table. Decode by changing the vector to the one at the top of its column. The error that is removed is first element of its row. For example, suppose we receive . It is the last element of the second row. Decode it to , which means removing the error . In this small example, this is not exactly the same as nearest neighbor decoding, since decodes as when it has an equally close neighbor . The problem is that the minimum distance of the code is 2, so general error correction is not possible. However, if we use a code that can correct up to errors, this procedure correctly decodes all vectors that are distance at most from a codeword.
In a large example, finding the vector in the table can be tedious. In fact, writing the table can be rather difficult (that’s why we used such a small example). This is where a parity check matrix comes to the rescue.
The first vector in a row is called the coset leader. Let be any vector in the same row as . Then for some codeword , since this is how the table was constructed. Therefore,
since by the definition of a parity check matrix. The vector is called the syndrome of . What we have shown is that two vectors in the same row have the same syndrome. Replace the preceding table with the following much smaller table.
This table may be used for decoding as follows. For a received vector , calculate its syndrome . Find this syndrome on the list and subtract the corresponding coset leader from . This gives the same decoding as above. For example, if , then
This is the syndrome for the second row. Subtract the coset leader from to obtain the codeword .
We now consider the general situation. The method of the example leads us to two definitions.
Let be a linear code and let be an -dimensional vector. The set given by
is called a coset of .
It is easy to see that if , then the sets and are the same (Exercise 9).
A vector having minimum Hamming weight in a coset is called a coset leader.
The syndrome of a vector is defined to be . The following lemma allows us to determine the cosets easily.
Two vectors and belong to the same coset if and only if they have the same syndrome.
Proof. Two vectors and to belong to the same coset if and only if their difference belongs to the code ; that is, . This happens if and only if , which is equivalent to .
Decoding can be achieved by building a syndrome lookup table, which consists of the coset leaders and their corresponding syndromes. With a syndrome lookup table, we can decode with the following steps:
For a received vector , calculate its syndrome .
Next, find the coset leader with the same syndrome as . Call the coset leader .
Decode as .
Syndrome decoding requires significantly fewer steps than searching for the nearest codeword to a received vector. However, for large codes it is still too inefficient to be practical. In general, the problem of finding the nearest neighbor in a general linear code is hard; in fact, it is what is known as an NP-complete problem. However, for certain special types of codes, efficient decoding is possible. We treat some examples in the next few sections.
The vector space has a dot product, defined in the usual way:
For example, if , then
so we find the possibly surprising fact that the dot product of a nonzero vector with itself can sometimes be 0, in contrast to the situation with real numbers. Therefore, the dot product does not tell us the length of a vector. But it is still a useful concept.
If is a linear code, define the dual code
If is a linear code with generating matrix , then is a linear code with generating matrix . Moreover, is a parity check matrix for .
Proof. Since every element of is a linear combination of the rows of , a vector is in if and only if . This means that is the left null space of . Also, we see that is a parity check matrix for . Since has rank , so does . The left null space of therefore has dimension , so has dimension . Because is a parity check matrix for , and the rows of are in , we have . Taking the transpose of this relation, and recalling that transpose reverses order (), we find . This means that the rows of are in the left null space of ; therefore, in . Since has rank , the span of its rows has dimension , which is the same as the dimension of . It follows that the rows of span , so is a generating matrix for .
A code is called self-dual is . The Golay code of Section 24.6 is an important example of a self-dual code.
Let be the binary repetition code. Since for every , a vector is in if and only if . This means that is a parity check code: if and only if .
Let be the binary code with generating matrix
The proposition says that has generating matrix
This is with the rows switched, so the rows of and the rows of generate the same subspace. Therefore, , which says that is self-dual.