# 5.4 Invariant Subspaces and the Cayley-Hamilton Theorem

In Section 5.1, we observed that if v is an eigenvector of a linear operator T, then T maps the span of $\{v\}$ into itself. Subspaces that are mapped into themselves are of great importance in the study of linear operators (see, e.g., Exercises 29-33 of Section 2.1).

# Definition.

Let T be a linear operator on a vector space V. A subspace W of V is called a T-invariant subspace of V if $T(W) \subseteq W$, that is, if $T(v) \in W$ for all $v \in W$.
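The definition can be tested mechanically when V is a coordinate space and W is given by a spanning list: W is T-invariant exactly when the image of each spanning vector lies back in W. The following is a minimal sketch under that assumption, using exact rational arithmetic. The helper names `rank`, `apply`, and `is_invariant`, and the sample operator, are illustrative choices, not notation from the text.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of row vectors, by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(rk, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][c] != 0:
                factor = m[i][c] / m[rk][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

def apply(A, v):
    """Matrix-vector product A v, with A given as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def is_invariant(A, spanning):
    """True if W = span(spanning) satisfies A w in W for every w in W."""
    r = rank(spanning)
    # Adding A w to the spanning list must not raise the rank.
    return all(rank(spanning + [apply(A, w)]) == r for w in spanning)

# Illustrative operator on R^3 (an assumption): T(a, b, c) = (a + b, b + c, 0).
A = [[1, 1, 0],
     [0, 1, 1],
     [0, 0, 0]]
xy_plane = [[1, 0, 0], [0, 1, 0]]
z_axis = [[0, 0, 1]]
print(is_invariant(A, xy_plane))  # True
print(is_invariant(A, z_axis))    # False
```

It suffices to test the spanning vectors because T is linear: if each T(w) lies in W, so does T of any linear combination.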

# Example 1

Suppose that T is a linear operator on a vector space V. Then the following subspaces of V are T-invariant:

1. (1) $\{0\}$

2. (2) V

3. (3) R(T)

4. (4) N(T)

5. (5) $E_\lambda$, for any eigenvalue $\lambda$ of T.

The proofs that these subspaces are T-invariant are left as exercises. (See Exercise 3.)

# Example 2

Let T be the linear operator on  defined by



Then the  and the  are T-invariant subspaces of .

Let T be a linear operator on a vector space V, and let x be a nonzero vector in V. The subspace

$$W = \operatorname{span}(\{x, T(x), T^2(x), \ldots\})$$

is called the T-cyclic subspace of V generated by x. It is a simple matter to show that W is T-invariant. In fact, W is the “smallest” T-invariant subspace of V containing x. That is, any T-invariant subspace of V containing x must also contain W (see Exercise 11). Cyclic subspaces have various uses. We apply them in this section to establish the Cayley-Hamilton theorem. In Exercise 31, we outline a method for using cyclic subspaces to compute the characteristic polynomial of a linear operator without resorting to determinants. Cyclic subspaces also play an important role in Chapter 7, where we study matrix representations of nondiagonalizable linear operators.
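For an operator on a coordinate space, a basis of the T-cyclic subspace generated by x can be found by applying the matrix repeatedly and stopping as soon as the new vector depends on the earlier ones (this stopping rule is justified by Theorem 5.21(a) later in the section). The sketch below assumes exact rational entries. The names `rank`, `apply`, and `cyclic_basis`, and the sample matrix, are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of row vectors, by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(rk, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][c] != 0:
                factor = m[i][c] / m[rk][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

def apply(A, v):
    """Matrix-vector product A v, with A given as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def cyclic_basis(A, x):
    """Return [x, Ax, ..., A^(k-1) x], the longest linearly independent initial run."""
    basis = [list(x)]
    if rank(basis) == 0:
        raise ValueError("the generating vector must be nonzero")
    while True:
        nxt = apply(A, basis[-1])
        if rank(basis + [nxt]) == len(basis):  # nxt already lies in the span
            return basis
        basis.append(nxt)

# Illustrative operator on R^3 (an assumption): T(a, b, c) = (-b + c, a + c, 3c).
A = [[0, -1, 1],
     [1,  0, 1],
     [0,  0, 3]]
print(cyclic_basis(A, [1, 0, 0]))       # [[1, 0, 0], [0, 1, 0]]
print(len(cyclic_basis(A, [0, 0, 1])))  # 3
```

The loop always terminates because the space is finite-dimensional, so the list of independent vectors cannot grow past the dimension.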

# Example 3

Let T be the linear operator on  defined by



We determine the T-cyclic subspace generated by . Since



and



it follows that



# Example 4

Let T be the linear operator on P(R) defined by . Then the T-cyclic subspace generated by  is .

The existence of a T-invariant subspace provides the opportunity to define a new linear operator whose domain is this subspace. If T is a linear operator on V and W is a T-invariant subspace of V, then the restriction $T_W$ of T to W (see Appendix B) is a mapping from W to W, and it follows that $T_W$ is a linear operator on W (see Exercise 7). As a linear operator, $T_W$ inherits certain properties from its parent operator T. The following result illustrates one way in which the two operators are linked.

# Theorem 5.20.

Let T be a linear operator on a finite-dimensional vector space V, and let W be a T-invariant subspace of V. Then the characteristic polynomial of $T_W$ divides the characteristic polynomial of T.

# Proof.

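Choose an ordered basis $\gamma = \{v_1, v_2, \ldots, v_k\}$ for W, and extend it to an ordered basis $\beta = \{v_1, v_2, \ldots, v_k, v_{k+1}, \ldots, v_n\}$ for V. Let $A = [T]_\beta$ and $B_1 = [T_W]_\gamma$. Then, by Exercise 12, A can be written in the form

$$A = \begin{pmatrix} B_1 & B_2 \\ O & B_3 \end{pmatrix}.$$

Let f(t) be the characteristic polynomial of T and g(t) the characteristic polynomial of $T_W$. Then

$$f(t) = \det(A - tI_n) = \det\begin{pmatrix} B_1 - tI_k & B_2 \\ O & B_3 - tI_{n-k} \end{pmatrix} = g(t) \cdot \det(B_3 - tI_{n-k})$$

by Exercise 21 of Section 4.3. Thus g(t) divides f(t).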
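The factorization in this proof can be checked on a small block upper-triangular matrix. The sketch below computes characteristic polynomials with the Faddeev-LeVerrier recursion, a technique swapped in here for convenience and not the one used in the text, and confirms that the characteristic polynomial of A is the product of those of the diagonal blocks. The names `char_poly` and `poly_mul` and the sample blocks are assumptions made for illustration.

```python
from fractions import Fraction

def char_poly(A):
    """Coefficients [c_0, ..., c_n] (low to high) of det(A - tI), via Faddeev-LeVerrier."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    coeffs = [Fraction(0)] * n + [Fraction(1)]   # monic det(tI - A), filled from the top
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(1, n + 1):
        AM = [[sum(A[i][l] * M[l][j] for l in range(n)) for j in range(n)] for i in range(n)]
        c = -sum(AM[i][i] for i in range(n)) / k
        coeffs[n - k] = c
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
    sign = (-1) ** n                             # det(A - tI) = (-1)^n det(tI - A)
    return [sign * c for c in coeffs]

def poly_mul(p, q):
    """Product of two polynomials given as low-to-high coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# Illustrative blocks (assumptions): A = [[B1, B2], [O, B3]] with B2 = (1, 1)^t.
B1 = [[0, -1], [1, 0]]
B3 = [[3]]
A = [[0, -1, 1],
     [1,  0, 1],
     [0,  0, 3]]

f = char_poly(A)    # characteristic polynomial of the full matrix
g = char_poly(B1)   # characteristic polynomial of the upper-left block
h = char_poly(B3)   # det(B3 - tI)
print(f == poly_mul(g, h))  # True
```

Here g(t) = t^2 + 1 and det(B3 - tI) = 3 - t, so f(t) = (t^2 + 1)(3 - t), in line with the conclusion that g(t) divides f(t).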

# Example 5

Let T be the linear operator on  defined by



and let . Observe that W is a T-invariant subspace of  because, for any vector ,



Let , which is an ordered basis for W. Extend  to the standard ordered basis  for . Then



in the notation of Theorem 5.20. Let f(t) be the characteristic polynomial of T and g(t) be the characteristic polynomial of $T_W$. Then



In view of Theorem 5.20, we may use the characteristic polynomial of $T_W$ to gain information about the characteristic polynomial of T itself. In this regard, cyclic subspaces are useful because the characteristic polynomial of the restriction of a linear operator T to a cyclic subspace is readily computable.

# Theorem 5.21.

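Let T be a linear operator on a finite-dimensional vector space V, and let W denote the T-cyclic subspace of V generated by a nonzero vector $v \in V$. Let $k = \dim(W)$. Then

1. (a) $\{v, T(v), T^2(v), \ldots, T^{k-1}(v)\}$ is a basis for W.

2. (b) If $a_0 v + a_1 T(v) + \cdots + a_{k-1} T^{k-1}(v) + T^k(v) = 0$, then the characteristic polynomial of $T_W$ is $f(t) = (-1)^k (a_0 + a_1 t + \cdots + a_{k-1} t^{k-1} + t^k)$.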

# Proof.

(a) Since $v \neq 0$, the set $\{v\}$ is linearly independent. Let j be the largest positive integer for which

$$\beta = \{v, T(v), \ldots, T^{j-1}(v)\}$$

is linearly independent. Such a j must exist because V is finite-dimensional. Let $Z = \operatorname{span}(\beta)$. Then $\beta$ is a basis for Z. Furthermore, $T^j(v) \in Z$ by Theorem 1.7 (p. 40). We use this information to show that Z is a T-invariant subspace of V. Let $w \in Z$. Since w is a linear combination of the vectors of $\beta$, there exist scalars $b_0, b_1, \ldots, b_{j-1}$ such that

$$w = b_0 v + b_1 T(v) + \cdots + b_{j-1} T^{j-1}(v),$$

and hence

$$T(w) = b_0 T(v) + b_1 T^2(v) + \cdots + b_{j-1} T^j(v).$$

Thus T(w) is a linear combination of vectors in Z, and hence belongs to Z. So Z is T-invariant. Furthermore, $v \in Z$. By Exercise 11, W is the smallest T-invariant subspace of V that contains v, so that $W \subseteq Z$. Clearly, $Z \subseteq W$, and so we conclude that $Z = W$. It follows that $\beta$ is a basis for W, and therefore $\dim(W) = j$. Thus $j = k$. This proves (a).

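(b) Now view $\beta$ (from (a)) as an ordered basis for W. Let $a_0, a_1, \ldots, a_{k-1}$ be the scalars such that

$$a_0 v + a_1 T(v) + \cdots + a_{k-1} T^{k-1}(v) + T^k(v) = 0.$$

Observe that

$$[T_W]_\beta = \begin{pmatrix} 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & \cdots & 0 & -a_1 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -a_{k-1} \end{pmatrix},$$

which has the characteristic polynomial

$$f(t) = (-1)^k (a_0 + a_1 t + \cdots + a_{k-1} t^{k-1} + t^k)$$

by Exercise 19. Thus f(t) is the characteristic polynomial of $T_W$, proving (b).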
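Theorem 5.21 suggests a determinant-free procedure: generate v, T(v), T^2(v), ... until the next vector is a linear combination of the earlier ones, read off the coefficients of that dependence, and assemble the characteristic polynomial of the restriction. The following is a minimal sketch of that procedure for a matrix with rational entries. The function names `solve`, `cyclic_relation`, and `cyclic_char_poly`, and the sample matrix, are hypothetical.

```python
from fractions import Fraction

def apply(A, v):
    """Matrix-vector product A v, with A given as a list of rows."""
    return [sum(Fraction(a) * x for a, x in zip(row, v)) for row in A]

def solve(vectors, target):
    """Coefficients c with sum(c[i] * vectors[i]) == target, or None if there are none.

    Assumes the given vectors are linearly independent.
    """
    n, k = len(target), len(vectors)
    # Column j of the augmented matrix is vectors[j]; the last column is target.
    m = [[Fraction(vectors[j][i]) for j in range(k)] + [Fraction(target[i])]
         for i in range(n)]
    row = 0
    for c in range(k):
        p = next((i for i in range(row, n) if m[i][c] != 0), None)
        if p is None:
            return None
        m[row], m[p] = m[p], m[row]
        pv = m[row][c]
        m[row] = [x / pv for x in m[row]]
        for i in range(n):
            if i != row and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[row])]
        row += 1
    if any(m[i][k] != 0 for i in range(row, n)):   # inconsistent system
        return None
    return [m[j][k] for j in range(k)]

def cyclic_relation(A, v):
    """Return [a_0, ..., a_{k-1}] with a_0 v + ... + a_{k-1} T^{k-1}(v) + T^k(v) = 0."""
    if not any(v):
        raise ValueError("the generating vector must be nonzero")
    krylov = [[Fraction(x) for x in v]]
    while True:
        nxt = apply(A, krylov[-1])
        c = solve(krylov, nxt)            # nxt = sum c_i T^i(v) once dependence occurs
        if c is not None:
            return [-ci for ci in c]
        krylov.append(nxt)

def cyclic_char_poly(A, v):
    """Low-to-high coefficients of (-1)^k (a_0 + a_1 t + ... + a_{k-1} t^{k-1} + t^k)."""
    a = cyclic_relation(A, v)
    sign = (-1) ** len(a)
    return [sign * c for c in a] + [Fraction(sign)]

# Illustrative operator on R^3 (an assumption): T(a, b, c) = (-b + c, a + c, 3c).
A = [[0, -1, 1],
     [1,  0, 1],
     [0,  0, 3]]
print(cyclic_char_poly(A, [1, 0, 0]) == [1, 0, 1])          # True: t^2 + 1
print(cyclic_char_poly(A, [0, 0, 1]) == [3, -1, 3, -1])     # True: -t^3 + 3t^2 - t + 3
```

In the second call the cyclic subspace is all of R^3, so the result is the characteristic polynomial of the whole operator.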

# Example 6

Let T be the linear operator of Example 3, and let , the T-cyclic subspace generated by . We compute the characteristic polynomial f(t) of  in two ways: by means of Theorem 5.21 and by means of determinants.

(a) By means of Theorem 5.21. From Example 3, we have that  is a cycle that generates W, and that . Hence



Therefore, by Theorem 5.21(b),



(b) By means of determinants. Let , which is an ordered basis for W. Since  and , we have



and therefore,



# The Cayley-Hamilton Theorem

As an illustration of the importance of Theorem 5.21, we prove a well-known result that is used in Chapter 7. The reader should refer to Appendix E for the definition of f(T), where T is a linear operator and f(x) is a polynomial.

# Theorem 5.22. (Cayley-Hamilton)

Let T be a linear operator on a finite-dimensional vector space V, and let f(t) be the characteristic polynomial of T. Then $f(T) = T_0$, the zero transformation. That is, T “satisfies” its characteristic equation.

# Proof.

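We show that $f(T)(v) = 0$ for all $v \in V$. This is obvious if $v = 0$ because f(T) is linear; so suppose that $v \neq 0$. Let W be the T-cyclic subspace generated by v, and suppose that $\dim(W) = k$. By Theorem 5.21(a), there exist scalars $a_0, a_1, \ldots, a_{k-1}$ such that

$$a_0 v + a_1 T(v) + \cdots + a_{k-1} T^{k-1}(v) + T^k(v) = 0.$$

Hence Theorem 5.21(b) implies that

$$g(t) = (-1)^k (a_0 + a_1 t + \cdots + a_{k-1} t^{k-1} + t^k)$$

is the characteristic polynomial of $T_W$. Combining these two equations yields

$$g(T)(v) = (-1)^k (a_0 I + a_1 T + \cdots + a_{k-1} T^{k-1} + T^k)(v) = 0.$$

By Theorem 5.20, g(t) divides f(t); hence there exists a polynomial q(t) such that $f(t) = q(t) g(t)$. So

$$f(T)(v) = q(T) g(T)(v) = q(T)(g(T)(v)) = q(T)(0) = 0.$$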

# Example 7

Let T be the linear operator on  defined by , and let . Then



where . The characteristic polynomial of T is, therefore,



It is easily verified that . Similarly,



Example 7 suggests the following result.

# Corollary (Cayley-Hamilton Theorem for Matrices).

Let A be an $n \times n$ matrix, and let f(t) be the characteristic polynomial of A. Then $f(A) = O$, the $n \times n$ zero matrix.

See Exercise 15.
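The matrix form of the Cayley-Hamilton theorem is easy to confirm on a specific matrix: compute the coefficients of f(t) = det(A - tI) and evaluate the polynomial at A. The sketch below obtains the coefficients with the Faddeev-LeVerrier recursion (a technique chosen here for convenience, not one used in the text) and evaluates f(A) by Horner's rule in exact arithmetic. The function names and the sample matrix are assumptions.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def char_poly(A):
    """Coefficients [c_0, ..., c_n] (low to high) of det(A - tI), via Faddeev-LeVerrier."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    coeffs = [Fraction(0)] * n + [Fraction(1)]   # monic det(tI - A), filled from the top
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(1, n + 1):
        AM = mat_mul(A, M)
        c = -sum(AM[i][i] for i in range(n)) / k
        coeffs[n - k] = c
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
    sign = (-1) ** n                             # det(A - tI) = (-1)^n det(tI - A)
    return [sign * c for c in coeffs]

def poly_at_matrix(coeffs, A):
    """Evaluate c_0 I + c_1 A + ... + c_n A^n by Horner's rule."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    result = [[Fraction(0)] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += c
    return result

# Illustrative matrix (an assumption).
A = [[1, 2],
     [-2, 1]]
f = char_poly(A)
print(f == [5, -2, 1])                         # True: f(t) = t^2 - 2t + 5
print(poly_at_matrix(f, A) == [[0, 0], [0, 0]])  # True: f(A) is the zero matrix
```

Note that the constant term c_0 is multiplied by the identity matrix, not added entrywise; substituting A for t inside the determinant is not what f(A) means.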

# Invariant Subspaces and Direct Sums

It is useful to decompose a finite-dimensional vector space V into a direct sum of as many T-invariant subspaces as possible because the behavior of T on V can be inferred from its behavior on the direct summands. For example, T is diagonalizable if and only if V can be decomposed into a direct sum of one-dimensional T-invariant subspaces (see Exercise 35). In Chapter 7, we consider alternate ways of decomposing V into direct sums of T-invariant subspaces if T is not diagonalizable. We proceed to gather a few facts about direct sums of T-invariant subspaces that are used in Section 7.4. The first of these facts is about characteristic polynomials.

# Theorem 5.23.

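Let T be a linear operator on a finite-dimensional vector space V, and suppose that $V = W_1 \oplus W_2 \oplus \cdots \oplus W_k$, where $W_i$ is a T-invariant subspace of V for each i $(1 \le i \le k)$. Suppose that $f_i(t)$ is the characteristic polynomial of $T_{W_i}$ $(1 \le i \le k)$. Then $f_1(t) \cdot f_2(t) \cdots f_k(t)$ is the characteristic polynomial of T.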

# Proof.

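The proof is by mathematical induction on k. In what follows, f(t) denotes the characteristic polynomial of T. Suppose first that $k = 2$. Let $\beta_1$ be an ordered basis for $W_1$, $\beta_2$ an ordered basis for $W_2$, and $\beta = \beta_1 \cup \beta_2$. Then $\beta$ is an ordered basis for V by Theorem 5.9(d) (p. 275). Let $A = [T]_\beta$, $B_1 = [T_{W_1}]_{\beta_1}$, and $B_2 = [T_{W_2}]_{\beta_2}$. By Exercise 33, it follows that

$$A = \begin{pmatrix} B_1 & O \\ O' & B_2 \end{pmatrix},$$

where O and $O'$ are zero matrices of the appropriate sizes. Then

$$f(t) = \det(A - tI) = \det(B_1 - tI) \cdot \det(B_2 - tI) = f_1(t) \cdot f_2(t)$$

as in the proof of Theorem 5.20, proving the result for $k = 2$.

Now assume that the theorem is valid for $k - 1$ summands, where $k - 1 \ge 2$, and suppose that V is a direct sum of k subspaces, say,

$$V = W_1 \oplus W_2 \oplus \cdots \oplus W_k.$$

Let $W = W_1 + W_2 + \cdots + W_{k-1}$. It is easily verified that W is T-invariant and that $V = W \oplus W_k$. So by the case for $k = 2$, $f(t) = g(t) \cdot f_k(t)$, where g(t) is the characteristic polynomial of $T_W$. Clearly $W = W_1 \oplus W_2 \oplus \cdots \oplus W_{k-1}$, and therefore $g(t) = f_1(t) \cdot f_2(t) \cdots f_{k-1}(t)$ by the induction hypothesis. We conclude that $f(t) = g(t) \cdot f_k(t) = f_1(t) \cdot f_2(t) \cdots f_k(t)$.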

As an illustration of this result, suppose that T is a diagonalizable linear operator on a finite-dimensional vector space V with distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_k$. By Theorem 5.10 (p. 277), V is a direct sum of the eigenspaces of T. Since each eigenspace is T-invariant, we may view this situation in the context of Theorem 5.23. For each eigenvalue $\lambda_i$, the restriction of T to $E_{\lambda_i}$ has characteristic polynomial $(\lambda_i - t)^{m_i}$, where $m_i$ is the dimension of $E_{\lambda_i}$. By Theorem 5.23, the characteristic polynomial f(t) of T is the product

$$f(t) = (\lambda_1 - t)^{m_1} (\lambda_2 - t)^{m_2} \cdots (\lambda_k - t)^{m_k}.$$

It follows that the multiplicity of each eigenvalue is equal to the dimension of the corresponding eigenspace, as expected.

# Example 8

Let T be the linear operator on  defined by



and let  and . Notice that  and  are each T-invariant and that . Let , and . Then  is an ordered basis for  is an ordered basis for , and  is an ordered basis for . Let  and . Then



and



Let , and  denote the characteristic polynomials of T, , and , respectively. Then



The matrix A in Example 8 can be obtained by joining the matrices  and  in the manner explained in the next definition.

# Definition.

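Let $B_1 \in M_{m \times m}(F)$, and let $B_2 \in M_{n \times n}(F)$. We define the direct sum of $B_1$ and $B_2$, denoted $B_1 \oplus B_2$, as the $(m + n) \times (m + n)$ matrix A such that

$$A_{ij} = \begin{cases} (B_1)_{ij} & \text{for } 1 \le i, j \le m \\ (B_2)_{(i-m),(j-m)} & \text{for } m + 1 \le i, j \le n + m \\ 0 & \text{otherwise.} \end{cases}$$

If $B_1, B_2, \ldots, B_k$ are square matrices with entries from F, then we define the direct sum of $B_1, B_2, \ldots, B_k$ recursively by

$$B_1 \oplus B_2 \oplus \cdots \oplus B_k = (B_1 \oplus B_2 \oplus \cdots \oplus B_{k-1}) \oplus B_k.$$

If $A = B_1 \oplus B_2 \oplus \cdots \oplus B_k$, then we often write

$$A = \begin{pmatrix} B_1 & O & \cdots & O \\ O & B_2 & \cdots & O \\ \vdots & \vdots & & \vdots \\ O & O & \cdots & B_k \end{pmatrix}.$$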

# Example 9

Let



Then

The final result of this section relates direct sums of matrices to direct sums of invariant subspaces. It is an extension of Exercise 33 to the case $k \ge 2$.

# Theorem 5.24.

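Let T be a linear operator on a finite-dimensional vector space V, and let $W_1, W_2, \ldots, W_k$ be T-invariant subspaces of V such that $V = W_1 \oplus W_2 \oplus \cdots \oplus W_k$. For each i, let $\beta_i$ be an ordered basis for $W_i$, and let $\beta = \beta_1 \cup \beta_2 \cup \cdots \cup \beta_k$. Let $A = [T]_\beta$ and $B_i = [T_{W_i}]_{\beta_i}$ for $i = 1, 2, \ldots, k$. Then $A = B_1 \oplus B_2 \oplus \cdots \oplus B_k$.

See Exercise 34.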
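The block-diagonal matrix appearing in Theorem 5.24 can be assembled directly from its diagonal blocks. The following is a minimal sketch of the direct sum of square matrices stored as lists of rows. The function name `direct_sum` and the sample blocks are illustrative assumptions.

```python
def direct_sum(*blocks):
    """Block-diagonal matrix formed from square matrices given as lists of rows."""
    n = sum(len(B) for B in blocks)
    A = [[0] * n for _ in range(n)]
    offset = 0
    for B in blocks:
        size = len(B)
        if any(len(row) != size for row in B):
            raise ValueError("every block must be square")
        # Copy B into the diagonal position that starts at (offset, offset).
        for i in range(size):
            for j in range(size):
                A[offset + i][offset + j] = B[i][j]
        offset += size
    return A

# Illustrative blocks (assumptions).
B1 = [[1, 2],
      [3, 4]]
B2 = [[5]]
print(direct_sum(B1, B2))  # [[1, 2, 0], [3, 4, 0], [0, 0, 5]]
```

Passing the blocks in a different order gives a different matrix in general, which corresponds to listing the bases of the invariant subspaces in a different order.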

# Exercises

1. Label the following statements as true or false.

1. (a) There exists a linear operator T with no T-invariant subspace.

2. (b) If T is a linear operator on a finite-dimensional vector space V and W is a T-invariant subspace of V, then the characteristic polynomial of $T_W$ divides the characteristic polynomial of T.

3. (c) Let T be a linear operator on a finite-dimensional vector space V, and let v and w be in V. If W is the T-cyclic subspace generated by v, $W'$ is the T-cyclic subspace generated by w, and $W = W'$, then $v = w$.

4. (d) If T is a linear operator on a finite-dimensional vector space V, then for any $v \in V$ the T-cyclic subspace generated by v is the same as the T-cyclic subspace generated by T(v).

5. (e) Let T be a linear operator on an n-dimensional vector space. Then there exists a polynomial g(t) of degree n such that $g(T) = T_0$.

6. (f) Any polynomial of degree n with leading coefficient $(-1)^n$ is the characteristic polynomial of some linear operator.

7. (g) If T is a linear operator on a finite-dimensional vector space V, and if V is the direct sum of k T-invariant subspaces, then there is an ordered basis $\beta$ for V such that $[T]_\beta$ is a direct sum of k matrices.

2. For each of the following linear operators T on the vector space V, determine whether the given subspace W is a T-invariant subspace of V.

1. (a) , and 

2. (b) , and 

3. (c) , and 

4. (d) , and 

5. (f) , and 

3. Let T be a linear operator on a finite-dimensional vector space V. Prove that the following subspaces are T-invariant.

1. (a) $\{0\}$ and V

2. (b) N(T) and R(T)

3. (c) $E_\lambda$, for any eigenvalue $\lambda$ of T

4. Let T be a linear operator on a vector space V, and let W be a T-invariant subspace of V. Prove that W is g(T)-invariant for any polynomial g(t).

5. Let T be a linear operator on a vector space V. Prove that the intersection of any collection of T-invariant subspaces of V is a T-invariant subspace of V.

6. For each linear operator T on the vector space V, find an ordered basis for the T-cyclic subspace generated by the vector z.

1. (a) , and .

2. (b) , and .

3. (c) , and .

4. (d) , and .

7. Prove that the restriction of a linear operator T to a T-invariant subspace is a linear operator on that subspace.

8. Let T be a linear operator on a vector space with a T-invariant subspace W. Prove that if v is an eigenvector of $T_W$ with corresponding eigenvalue $\lambda$, then v is also an eigenvector of T with corresponding eigenvalue $\lambda$.

9. For each linear operator T and cyclic subspace W in Exercise 6, compute the characteristic polynomial of $T_W$ in two ways, as in Example 6.

10. For each linear operator in Exercise 6, find the characteristic polynomial f(t) of T, and verify that the characteristic polynomial of $T_W$ (computed in Exercise 9) divides f(t).

11. Let T be a linear operator on a vector space V, let v be a nonzero vector in V, and let W be the T-cyclic subspace of V generated by v. Prove that

1. (a) W is T-invariant.

2. (b) Any T-invariant subspace of V containing v also contains W.

12. Prove that $A = \begin{pmatrix} B_1 & B_2 \\ O & B_3 \end{pmatrix}$ in the proof of Theorem 5.20.

13. Let T be a linear operator on a vector space V, let v be a nonzero vector in V, and let W be the T-cyclic subspace of V generated by v. For any $w \in V$, prove that $w \in W$ if and only if there exists a polynomial g(t) such that $w = g(T)(v)$.

14. Prove that the polynomial g(t) of Exercise 13 can always be chosen so that its degree is less than $\dim(W)$.

15. Use the Cayley-Hamilton theorem (Theorem 5.22) to prove its corollary for matrices. Warning: If $f(t) = \det(A - tI)$ is the characteristic polynomial of A, it is tempting to “prove” that $f(A) = O$ by saying “$f(A) = \det(A - AI) = \det(O) = 0$.” Why is this argument incorrect? Visit goo.gl/ZMVn9i for a solution.

16. Let T be a linear operator on a finite-dimensional vector space V.

1. (a) Prove that if the characteristic polynomial of T splits, then so does the characteristic polynomial of the restriction of T to any T-invariant subspace of V.

2. (b) Deduce that if the characteristic polynomial of T splits, then any nontrivial T-invariant subspace of V contains an eigenvector of T.

17. Let A be an $n \times n$ matrix. Prove that


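18. Let A be an $n \times n$ matrix with characteristic polynomial

$$f(t) = (-1)^n t^n + a_{n-1} t^{n-1} + \cdots + a_1 t + a_0.$$

1. (a) Prove that A is invertible if and only if $a_0 \neq 0$.

2. (b) Prove that if A is invertible, then

$$A^{-1} = \left(-\frac{1}{a_0}\right) \left[ (-1)^n A^{n-1} + a_{n-1} A^{n-2} + \cdots + a_1 I_n \right].$$

3. (c) Use (b) to compute $A^{-1}$ for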


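19. Let A denote the $k \times k$ matrix

$$\begin{pmatrix} 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & \cdots & 0 & -a_1 \\ 0 & 1 & \cdots & 0 & -a_2 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & 0 & -a_{k-2} \\ 0 & 0 & \cdots & 1 & -a_{k-1} \end{pmatrix},$$

where $a_0, a_1, \ldots, a_{k-1}$ are arbitrary scalars. Prove that the characteristic polynomial of A is

$$(-1)^k (a_0 + a_1 t + \cdots + a_{k-1} t^{k-1} + t^k).$$

Hint: Use mathematical induction on k, computing the determinant by cofactor expansion along the first row.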

20. Let T be a linear operator on a vector space V, and suppose that V is a T-cyclic subspace of itself. Prove that if U is a linear operator on V, then $UT = TU$ if and only if $U = g(T)$ for some polynomial g(t). Hint: Suppose that V is generated by v. Choose g(t) according to Exercise 13 so that $g(T)(v) = U(v)$.

21. Let T be a linear operator on a two-dimensional vector space V. Prove that either V is a T-cyclic subspace of itself or $T = cI$ for some scalar c.

22. Let T be a linear operator on a two-dimensional vector space V and suppose that $T \neq cI$ for any scalar c. Show that if U is any linear operator on V such that $UT = TU$, then $U = g(T)$ for some polynomial g(t).

23. Let T be a linear operator on a finite-dimensional vector space V, and let W be a T-invariant subspace of V. Suppose that $v_1, v_2, \ldots, v_k$ are eigenvectors of T corresponding to distinct eigenvalues. Prove that if $v_1 + v_2 + \cdots + v_k$ is in W, then $v_i \in W$ for all i. Hint: Use mathematical induction on k.

24. Prove that the restriction of a diagonalizable linear operator T to any nontrivial T-invariant subspace is also diagonalizable. Hint: Use the result of Exercise 23.

25. (a) Prove the converse to Exercise 19(a) of Section 5.2: If T and U are diagonalizable linear operators on a finite-dimensional vector space V such that $UT = TU$, then T and U are simultaneously diagonalizable. (See the definitions in the exercises of Section 5.2.) Hint: For any eigenvalue $\lambda$ of T, show that $E_\lambda$ is U-invariant, and apply Exercise 24 to obtain a basis for $E_\lambda$ of eigenvectors of U.

(b) State and prove a matrix version of (a).

26. Let T be a linear operator on an n-dimensional vector space V such that T has n distinct eigenvalues. Prove that V is a T-cyclic subspace of itself. Hint: Use Exercise 23 to find a vector v such that $\{v, T(v), \ldots, T^{n-1}(v)\}$ is linearly independent.

Exercises 27 through 31 require familiarity with quotient spaces as defined in Exercise 31 of Section 1.3. Before attempting these exercises, the reader should first review the other exercises treating quotient spaces: Exercise 35 of Section 1.6, Exercise 42 of Section 2.1, and Exercise 24 of Section 2.4.

For the purposes of Exercises 27 through 31, T is a fixed linear operator on a finite-dimensional vector space V, and W is a nonzero T-invariant subspace of V. We require the following definition.

# Definition.

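Let T be a linear operator on a vector space V, and let W be a T-invariant subspace of V. Define $\overline{T}\colon V/W \to V/W$ by

$$\overline{T}(v + W) = T(v) + W \quad \text{for any } v + W \in V/W.$$

27. (a) Prove that $\overline{T}$ is well defined. That is, show that $\overline{T}(v + W) = \overline{T}(v' + W)$ whenever $v + W = v' + W$.

(b) Prove that $\overline{T}$ is a linear operator on V/W.

(c) Let $\eta\colon V \to V/W$ be the linear transformation defined in Exercise 42 of Section 2.1 by $\eta(v) = v + W$. Show that the diagram of Figure 5.6 commutes; that is, prove that $\eta T = \overline{T} \eta$. (This exercise does not require the assumption that V is finite-dimensional.)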

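28. Let f(t), g(t), and h(t) be the characteristic polynomials of T, $T_W$, and $\overline{T}$, respectively. Prove that $f(t) = g(t) h(t)$. Hint: Extend an ordered basis $\gamma = \{v_1, v_2, \ldots, v_k\}$ for W to an ordered basis $\beta = \{v_1, v_2, \ldots, v_k, v_{k+1}, \ldots, v_n\}$ for V. Then show that the collection of cosets $\alpha = \{v_{k+1} + W, v_{k+2} + W, \ldots, v_n + W\}$ is an ordered basis for V/W, and prove that

$$[T]_\beta = \begin{pmatrix} B_1 & B_2 \\ O & B_3 \end{pmatrix},$$

where $B_1 = [T_W]_\gamma$ and $B_3 = [\overline{T}]_\alpha$.

29. Use the hint in Exercise 28 to prove that if T is diagonalizable, then so is $\overline{T}$.

30. Prove that if both $T_W$ and $\overline{T}$ are diagonalizable and have no common eigenvalues, then T is diagonalizable.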

The results of Theorem 5.21 and Exercise 28 are useful in devising methods for computing characteristic polynomials without the use of determinants. This is illustrated in the next exercise.

31. Let , let , and let W be the cyclic subspace of  generated by .

1. (a) Use Theorem 5.21 to compute the characteristic polynomial of $T_W$.

2. (b) Show that  is a basis for , and use this fact to compute the characteristic polynomial of .

3. (c) Use the results of (a) and (b) to find the characteristic polynomial of A.

Exercises 32 through 39 are concerned with direct sums.

32. Let T be a linear operator on a vector space V, and let $W_1, W_2, \ldots, W_k$ be T-invariant subspaces of V. Prove that $W_1 + W_2 + \cdots + W_k$ is also a T-invariant subspace of V.

33. Give a direct proof of Theorem 5.24 for the case $k = 2$. (This result is used in the proof of Theorem 5.23.)

34. Prove Theorem 5.24. Hint: Begin with Exercise 33 and extend it using mathematical induction on k, the number of subspaces.

35. Let T be a linear operator on a finite-dimensional vector space V. Prove that T is diagonalizable if and only if V is the direct sum of one-dimensional T-invariant subspaces.

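36. Let T be a linear operator on a finite-dimensional vector space V, and let $W_1, W_2, \ldots, W_k$ be T-invariant subspaces of V such that $V = W_1 \oplus W_2 \oplus \cdots \oplus W_k$. Prove that

$$\det(T) = \det(T_{W_1}) \det(T_{W_2}) \cdots \det(T_{W_k}).$$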

37. Let T be a linear operator on a finite-dimensional vector space V, and let $W_1, W_2, \ldots, W_k$ be T-invariant subspaces of V such that $V = W_1 \oplus W_2 \oplus \cdots \oplus W_k$. Prove that T is diagonalizable if and only if $T_{W_i}$ is diagonalizable for all i.

38. Let C be a collection of diagonalizable linear operators on a finite-dimensional vector space V. Prove that there is an ordered basis $\beta$ such that $[T]_\beta$ is a diagonal matrix for all $T \in C$ if and only if the operators of C commute under composition. (This is an extension of Exercise 25.) Hints for the case that the operators commute: The result is trivial if each operator has only one eigenvalue. Otherwise, establish the general result by mathematical induction on dim(V), using the fact that V is the direct sum of the eigenspaces of some operator in C that has more than one eigenvalue.

39. Let $B_1, B_2, \ldots, B_k$ be square matrices with entries in the same field, and let $A = B_1 \oplus B_2 \oplus \cdots \oplus B_k$. Prove that the characteristic polynomial of A is the product of the characteristic polynomials of the $B_i$.

40. Let



Find the characteristic polynomial of A. Hint: First prove that A has rank 2 and that  is -invariant.

41. Let  be the matrix defined by  for all i and j. Find the characteristic polynomial of A.