
5.2 Diagonalizability

In Section 5.1, we presented the diagonalization problem and observed that not all linear operators or matrices are diagonalizable. Although we are able to diagonalize operators and matrices and even obtain a necessary and sufficient condition for diagonalizability (Theorem 5.1, p. 247), we have not yet solved the diagonalization problem. What is still needed is a simple test to determine whether an operator or a matrix can be diagonalized, as well as a method for actually finding a basis of eigenvectors. In this section, we develop such a test and method.

In Example 6 of Section 5.1, we obtained a basis of eigenvectors by choosing one eigenvector corresponding to each eigenvalue. In general, such a procedure does not yield a basis, but the following theorem shows that any set constructed in this manner is linearly independent.

Theorem 5.5.

Let T be a linear operator on a vector space, and let $\lambda_1, \lambda_2, \dots, \lambda_k$ be distinct eigenvalues of T. For each $i = 1, 2, \dots, k$, let $S_i$ be a finite set of eigenvectors of T corresponding to $\lambda_i$. If each $S_i$ $(i = 1, 2, \dots, k)$ is linearly independent, then $S_1 \cup S_2 \cup \dots \cup S_k$ is linearly independent.

Proof.

The proof is by mathematical induction on k. If $k = 1$, there is nothing to prove. So assume that the theorem holds for $k - 1$ distinct eigenvalues, where $k - 1 \ge 1$, and that we have k distinct eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_k$ of T. For each $i = 1, 2, \dots, k$, let $S_i = \{v_{i1}, v_{i2}, \dots, v_{in_i}\}$ be a linearly independent set of eigenvectors of T corresponding to $\lambda_i$. We wish to show that $S = S_1 \cup S_2 \cup \dots \cup S_k$ is linearly independent.

Consider any scalars $\{a_{ij}\}$, where $i = 1, 2, \dots, k$ and $j = 1, 2, \dots, n_i$, such that

$$\sum_{i=1}^{k}\sum_{j=1}^{n_i} a_{ij}v_{ij} = 0. \qquad (1)$$

Because $v_{ij}$ is an eigenvector of T corresponding to $\lambda_i$, we have $(T - \lambda_k I)v_{ij} = (\lambda_i - \lambda_k)v_{ij}$; in particular, the terms with $i = k$ vanish. Hence applying $T - \lambda_k I$ to both sides of (1) yields

$$\sum_{i=1}^{k-1}\sum_{j=1}^{n_i} a_{ij}(\lambda_i - \lambda_k)v_{ij} = 0. \qquad (2)$$

But $S_1 \cup S_2 \cup \dots \cup S_{k-1}$ is linearly independent by the induction hypothesis, so that (2) implies $a_{ij}(\lambda_i - \lambda_k) = 0$ for $i = 1, 2, \dots, k-1$ and $j = 1, 2, \dots, n_i$. Since $\lambda_1, \lambda_2, \dots, \lambda_k$ are distinct, it follows that $\lambda_i - \lambda_k \ne 0$ for $1 \le i \le k-1$. Hence $a_{ij} = 0$ for $i = 1, 2, \dots, k-1$ and $j = 1, 2, \dots, n_i$, and therefore (1) reduces to $\sum_{j=1}^{n_k} a_{kj}v_{kj} = 0$. But $S_k$ is also linearly independent, and so $a_{kj} = 0$ for $j = 1, 2, \dots, n_k$. Consequently $a_{ij} = 0$ for $i = 1, 2, \dots, k$ and $j = 1, 2, \dots, n_i$, proving that S is linearly independent.

Corollary.

Let T be a linear operator on an n-dimensional vector space V. If T has n distinct eigenvalues, then T is diagonalizable.

Proof.

Suppose that T has n distinct eigenvalues $\lambda_1, \dots, \lambda_n$. For each i choose an eigenvector $v_i$ corresponding to $\lambda_i$. By Theorem 5.5, $\{v_1, \dots, v_n\}$ is linearly independent, and since $\dim(V) = n$, this set is a basis for V. Thus, by Theorem 5.1 (p. 247), T is diagonalizable.

Example 1

Let

$$A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \in M_{2\times 2}(R).$$

The characteristic polynomial of A (and hence of $L_A$) is

$$\det(A - tI) = \det\begin{pmatrix} 1-t & 1 \\ 1 & 1-t \end{pmatrix} = t(t-2),$$

and thus the eigenvalues of $L_A$ are 0 and 2. Since $L_A$ is a linear operator on the two-dimensional vector space $R^2$, we conclude from the preceding corollary that $L_A$ (and hence A) is diagonalizable.
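Readers who want to verify such computations numerically can do so in a few lines. Here is a minimal sketch, assuming Python with NumPy (our choice of tool, not part of the text):

```python
import numpy as np

# Example 1: the eigenvalues of A should be 0 and 2.
A = np.array([[1.0, 1.0],
              [1.0, 1.0]])
print(np.linalg.eigvals(A))  # [2. 0.], up to floating-point rounding
```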

The converse of the corollary to Theorem 5.5 is false. That is, it is not true that if T is diagonalizable, then it has n distinct eigenvalues. For example, the identity operator is diagonalizable even though it has only one eigenvalue, namely, $\lambda = 1$.

We have seen that diagonalizability requires the existence of eigenvalues. Actually, diagonalizability imposes a stronger condition on the characteristic polynomial.

Definition.

A polynomial $f(t)$ in $P(F)$ splits over F if there are scalars $c, a_1, \dots, a_n$ (not necessarily distinct) in F such that

$$f(t) = c(t - a_1)(t - a_2)\cdots(t - a_n).$$

For example, $t^2 - 1 = (t+1)(t-1)$ splits over R, but $(t^2+1)(t-2)$ does not split over R because $t^2+1$ cannot be factored into a product of linear factors. However, $(t^2+1)(t-2)$ does split over C because it factors into the product $(t+i)(t-i)(t-2)$. If $f(t)$ is the characteristic polynomial of a linear operator or a matrix over a field F, then the statement that $f(t)$ splits is understood to mean that it splits over F.
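Whether a given polynomial splits over R or over C can also be checked with a computer algebra system. The following sketch assumes SymPy (again our choice, not the text's):

```python
import sympy as sp

t = sp.symbols('t')
f = (t**2 + 1) * (t - 2)

print(sp.roots(f, t, filter='R'))    # {2: 1}: only one real root, so f does not split over R
print(sp.roots(f, t))                # {2: 1, -I: 1, I: 1}: three roots over C, so f splits over C
print(sp.factor(f, extension=sp.I))  # (t - 2)*(t - I)*(t + I)
```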

Theorem 5.6.

The characteristic polynomial of any diagonalizable linear operator on a vector space V over a field F splits over F.

Proof.

Let T be a diagonalizable linear operator on the n-dimensional vector space V, and let β be an ordered basis for V such that $[T]_\beta = D$ is a diagonal matrix. Suppose that

$$D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix},$$

and let $f(t)$ be the characteristic polynomial of T. Then

$$f(t) = \det(D - tI) = \det\begin{pmatrix} \lambda_1 - t & 0 & \cdots & 0 \\ 0 & \lambda_2 - t & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n - t \end{pmatrix} = (\lambda_1 - t)(\lambda_2 - t)\cdots(\lambda_n - t) = (-1)^n(t - \lambda_1)(t - \lambda_2)\cdots(t - \lambda_n).$$

From this theorem, it is clear that if T is a diagonalizable linear operator on an n-dimensional vector space that fails to have n distinct eigenvalues, then the characteristic polynomial of T must have repeated zeros.

The converse of Theorem 5.6 is false; that is, the characteristic polynomial of T may split, but T need not be diagonalizable. (See Example 3, which follows.) The following concept helps us determine when an operator whose characteristic polynomial splits is diagonalizable.

Definition.

Let $\lambda$ be an eigenvalue of a linear operator or matrix with characteristic polynomial $f(t)$. The multiplicity (sometimes called the algebraic multiplicity) of $\lambda$ is the largest positive integer k for which $(t - \lambda)^k$ is a factor of $f(t)$.

Example 2

Let

$$A = \begin{pmatrix} 3 & 1 & 0 \\ 0 & 3 & 4 \\ 0 & 0 & 4 \end{pmatrix},$$

which has characteristic polynomial $f(t) = -(t-3)^2(t-4)$. Hence $\lambda = 3$ is an eigenvalue of A with multiplicity 2, and $\lambda = 4$ is an eigenvalue of A with multiplicity 1.
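The multiplicities in Example 2 can be read off directly with SymPy (a sketch; eigenvals returns each eigenvalue together with its algebraic multiplicity):

```python
import sympy as sp

A = sp.Matrix([[3, 1, 0],
               [0, 3, 4],
               [0, 0, 4]])
print(A.charpoly().as_expr())  # the characteristic polynomial of A
print(A.eigenvals())           # {3: 2, 4: 1}: multiplicities 2 and 1
```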

If T is a diagonalizable linear operator on a finite-dimensional vector space V, then there is an ordered basis β for V consisting of eigenvectors of T. We know from Theorem 5.1 (p. 247) that $[T]_\beta$ is a diagonal matrix in which the diagonal entries are the eigenvalues of T. Since the characteristic polynomial of T is $\det([T]_\beta - tI)$, it is easily seen that each eigenvalue of T must occur as a diagonal entry of $[T]_\beta$ exactly as many times as its multiplicity. Hence β contains as many (linearly independent) eigenvectors corresponding to an eigenvalue as the multiplicity of that eigenvalue. So the number of linearly independent eigenvectors corresponding to a given eigenvalue is of interest in determining whether an operator can be diagonalized. Recalling from Theorem 5.4 (p. 250) that the eigenvectors of T corresponding to the eigenvalue $\lambda$ are the nonzero vectors in the null space of $T - \lambda I$, we are led naturally to the study of this set.

Definition.

Let T be a linear operator on a vector space V, and let $\lambda$ be an eigenvalue of T. Define $E_\lambda = \{x \in V : T(x) = \lambda x\} = N(T - \lambda I_V)$. The set $E_\lambda$ is called the eigenspace of T corresponding to the eigenvalue $\lambda$. Analogously, we define the eigenspace of a square matrix A corresponding to the eigenvalue $\lambda$ to be the eigenspace of $L_A$ corresponding to $\lambda$.

Clearly, $E_\lambda$ is a subspace of V consisting of the zero vector and the eigenvectors of T corresponding to the eigenvalue $\lambda$. The maximum number of linearly independent eigenvectors of T corresponding to the eigenvalue $\lambda$ is therefore the dimension of $E_\lambda$. Our next result relates this dimension to the multiplicity of $\lambda$.

Theorem 5.7.

Let T be a linear operator on a finite-dimensional vector space V, and let $\lambda$ be an eigenvalue of T having multiplicity m. Then $1 \le \dim(E_\lambda) \le m$.

Proof.

Choose an ordered basis $\{v_1, v_2, \dots, v_p\}$ for $E_\lambda$, extend it to an ordered basis $\beta = \{v_1, v_2, \dots, v_p, v_{p+1}, \dots, v_n\}$ for V, and let $A = [T]_\beta$. Observe that $v_i$ $(1 \le i \le p)$ is an eigenvector of T corresponding to $\lambda$, and therefore

$$A = \begin{pmatrix} \lambda I_p & B \\ O & C \end{pmatrix}.$$

By Exercise 21 of Section 4.3, the characteristic polynomial of T is

$$f(t) = \det(A - tI_n) = \det\begin{pmatrix} (\lambda - t)I_p & B \\ O & C - tI_{n-p} \end{pmatrix} = \det((\lambda - t)I_p) \cdot \det(C - tI_{n-p}) = (\lambda - t)^p g(t),$$

where $g(t)$ is a polynomial. Thus $(\lambda - t)^p$ is a factor of $f(t)$, and hence the multiplicity of $\lambda$ is at least p. But $\dim(E_\lambda) = p$, and so $\dim(E_\lambda) \le m$. Moreover, since $\lambda$ is an eigenvalue of T, the eigenspace $E_\lambda$ contains a nonzero vector, so $\dim(E_\lambda) \ge 1$.

Example 3

Let T be the linear operator on $P_2(R)$ defined by $T(f(x)) = f'(x)$. The matrix representation of T with respect to the standard ordered basis β for $P_2(R)$ is

$$[T]_\beta = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{pmatrix}.$$

Consequently, the characteristic polynomial of T is

$$\det([T]_\beta - tI) = \det\begin{pmatrix} -t & 1 & 0 \\ 0 & -t & 2 \\ 0 & 0 & -t \end{pmatrix} = -t^3.$$

Thus T has only one eigenvalue ($\lambda = 0$) with multiplicity 3. Solving $T(f(x)) = f'(x) = 0$ shows that $E_\lambda = N(T - \lambda I) = N(T)$ is the subspace of $P_2(R)$ consisting of the constant polynomials. So $\{1\}$ is a basis for $E_\lambda$, and therefore $\dim(E_\lambda) = 1$. Consequently, there is no basis for $P_2(R)$ consisting of eigenvectors of T, and therefore T is not diagonalizable. Even though T is not diagonalizable, we will see in Chapter 7 that its eigenvalue and eigenvectors are still useful for describing the behavior of T.
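A short computation confirms the conclusion of Example 3 (a SymPy sketch, with B the matrix representation of T above):

```python
import sympy as sp

B = sp.Matrix([[0, 1, 0],
               [0, 0, 2],
               [0, 0, 0]])
print(B.eigenvals())          # {0: 3}: one eigenvalue, multiplicity 3
print(3 - B.rank())           # dim E_0 = nullity(B) = 1 < 3
print(B.is_diagonalizable())  # False
```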

Example 4

Let T be the linear operator on $R^3$ defined by

$$T\begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} = \begin{pmatrix} 4a_1 + a_3 \\ 2a_1 + 3a_2 + 2a_3 \\ a_1 + 4a_3 \end{pmatrix}.$$

We determine the eigenspace of T corresponding to each eigenvalue. Let β be the standard ordered basis for $R^3$. Then

$$[T]_\beta = \begin{pmatrix} 4 & 0 & 1 \\ 2 & 3 & 2 \\ 1 & 0 & 4 \end{pmatrix},$$

and hence the characteristic polynomial of T is

$$\det([T]_\beta - tI) = \det\begin{pmatrix} 4-t & 0 & 1 \\ 2 & 3-t & 2 \\ 1 & 0 & 4-t \end{pmatrix} = -(t-5)(t-3)^2.$$

So the eigenvalues of T are $\lambda_1 = 5$ and $\lambda_2 = 3$ with multiplicities 1 and 2, respectively.

Since

$$E_{\lambda_1} = N(T - \lambda_1 I) = \left\{ \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \in R^3 : \begin{pmatrix} -1 & 0 & 1 \\ 2 & -2 & 2 \\ 1 & 0 & -1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \right\},$$

$E_{\lambda_1}$ is the solution space of the system of linear equations

$$\begin{aligned} -x_1 + x_3 &= 0 \\ 2x_1 - 2x_2 + 2x_3 &= 0 \\ x_1 - x_3 &= 0. \end{aligned}$$

It is easily seen (using the techniques of Chapter 3) that

$$\left\{ \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} \right\}$$

is a basis for $E_{\lambda_1}$. Hence $\dim(E_{\lambda_1}) = 1$.

Similarly, $E_{\lambda_2} = N(T - \lambda_2 I)$ is the solution space of the system

$$\begin{aligned} x_1 + x_3 &= 0 \\ 2x_1 + 2x_3 &= 0 \\ x_1 + x_3 &= 0. \end{aligned}$$

Since the unknown $x_2$ does not appear in this system, we assign it a parametric value, say, $x_2 = s$, and solve the system for $x_1$ and $x_3$, introducing another parameter t. The result is the general solution to the system

$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = s\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + t\begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}, \quad\text{for } s, t \in R.$$

It follows that

$$\left\{ \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix},\ \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} \right\}$$

is a basis for $E_{\lambda_2}$, and $\dim(E_{\lambda_2}) = 2$.

In this case, the multiplicity of each eigenvalue $\lambda_i$ is equal to the dimension of the corresponding eigenspace $E_{\lambda_i}$. Observe that the union of the two bases just derived, namely,

$$\left\{ \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix},\ \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix},\ \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} \right\},$$

is linearly independent by Theorem 5.5 and hence is a basis for $R^3$ consisting of eigenvectors of T. Consequently, T is diagonalizable.
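The eigenspace computations of Example 4 can be reproduced as follows (a SymPy sketch; the library may list the eigenvalues in a different order than the text):

```python
import sympy as sp

M = sp.Matrix([[4, 0, 1],
               [2, 3, 2],
               [1, 0, 4]])  # [T]_beta from Example 4
# eigenvects() returns triples (eigenvalue, multiplicity, eigenspace basis).
for lam, mult, basis in M.eigenvects():
    print(lam, mult, [list(v) for v in basis])
# Since each multiplicity equals the eigenspace dimension, M is diagonalizable:
P, D = M.diagonalize()
print(D)  # diag(3, 3, 5), up to the ordering of the eigenvalues
```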

Examples 3 and 4 suggest that an operator on V whose characteristic polynomial splits is diagonalizable if and only if the dimension of each eigenspace is equal to the multiplicity of the corresponding eigenvalue. This is indeed true, as our next theorem shows. Moreover, when the operator is diagonalizable, we can use Theorem 5.5 to construct a basis for V consisting of eigenvectors of the operator by collecting bases for the individual eigenspaces.

Theorem 5.8.

Let T be a linear operator on a finite-dimensional vector space V such that the characteristic polynomial of T splits. Let $\lambda_1, \lambda_2, \dots, \lambda_k$ be the distinct eigenvalues of T. Then

  (a) T is diagonalizable if and only if the multiplicity of $\lambda_i$ is equal to $\dim(E_{\lambda_i})$ for all i.

  (b) If T is diagonalizable and $\beta_i$ is an ordered basis for $E_{\lambda_i}$ for each i, then $\beta = \beta_1 \cup \beta_2 \cup \dots \cup \beta_k$ is an ordered basis for V consisting of eigenvectors of T.

Proof.

For each i, let $m_i$ denote the multiplicity of $\lambda_i$, $d_i = \dim(E_{\lambda_i})$, and $n = \dim(V)$.

First, suppose that T is diagonalizable. Let β be a basis for V consisting of eigenvectors of T. For each i, let $\beta_i = \beta \cap E_{\lambda_i}$, the set of vectors in β that are eigenvectors corresponding to $\lambda_i$, and let $n_i$ denote the number of vectors in $\beta_i$. Then $n_i \le d_i$ for each i because $\beta_i$ is a linearly independent subset of a subspace of dimension $d_i$, and $d_i \le m_i$ by Theorem 5.7. The $n_i$'s sum to n because β contains n vectors. The $m_i$'s also sum to n because the degree of the characteristic polynomial of T is equal to the sum of the multiplicities of the eigenvalues. Thus

$$n = \sum_{i=1}^k n_i \le \sum_{i=1}^k d_i \le \sum_{i=1}^k m_i = n.$$

It follows that

$$\sum_{i=1}^k (m_i - d_i) = 0.$$

Since $m_i - d_i \ge 0$ for all i, we conclude that $m_i = d_i$ for all i.

Conversely, suppose that $m_i = d_i$ for all i. We simultaneously show that T is diagonalizable and prove (b). For each i, let $\beta_i$ be an ordered basis for $E_{\lambda_i}$, and let $\beta = \beta_1 \cup \beta_2 \cup \dots \cup \beta_k$. By Theorem 5.5, β is linearly independent. Furthermore, since $d_i = m_i$ for all i, β contains

$$\sum_{i=1}^k d_i = \sum_{i=1}^k m_i = n$$

vectors. Therefore β is an ordered basis for V consisting of eigenvectors of T, and we conclude that T is diagonalizable.

This theorem completes our study of the diagonalization problem. We summarize our results.

Test for Diagonalizability

Let T be a linear operator on an n-dimensional vector space V. Then T is diagonalizable if and only if both of the following conditions hold.

  1. The characteristic polynomial of T splits.

  2. For each eigenvalue $\lambda$ of T, the multiplicity of $\lambda$ equals $\mathrm{nullity}(T - \lambda I)$; that is, the multiplicity of $\lambda$ equals $n - \mathrm{rank}(T - \lambda I)$.

These same conditions can be used to test if a square matrix A is diagonalizable because diagonalizability of A is equivalent to diagonalizability of the operator $L_A$.

If T is a diagonalizable operator and $\beta_1, \beta_2, \dots, \beta_k$ are ordered bases for the eigenspaces of T, then the union $\beta = \beta_1 \cup \beta_2 \cup \dots \cup \beta_k$ is an ordered basis for V consisting of eigenvectors of T, and hence $[T]_\beta$ is a diagonal matrix.

When testing T for diagonalizability, it is usually easiest to choose a convenient basis α for V and work with $B = [T]_\alpha$. If the characteristic polynomial of B splits, then use condition 2 above to check whether the multiplicity of each repeated eigenvalue of B equals $n - \mathrm{rank}(B - \lambda I)$. (By Theorem 5.7, condition 2 is automatically satisfied for eigenvalues of multiplicity 1.) If so, then B, and hence T, is diagonalizable.

If T is diagonalizable and a basis β for V consisting of eigenvectors of T is desired, then we first find a basis for each eigenspace of B. The union of these bases is a basis γ for $F^n$ consisting of eigenvectors of B. Each vector in γ is the coordinate vector relative to α of an eigenvector of T. The set consisting of these n eigenvectors of T is the desired basis β.

Furthermore, if A is an $n \times n$ diagonalizable matrix, we can use the corollary to Theorem 2.23 (p. 115) to find an invertible $n \times n$ matrix Q and a diagonal $n \times n$ matrix D such that $Q^{-1}AQ = D$. The matrix Q has as its columns the vectors in a basis of eigenvectors of A, and D has as its jth diagonal entry the eigenvalue of A corresponding to the jth column of Q.
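The test and method just summarized translate into a short computation. The following sketch uses SymPy for exact arithmetic; the function name is ours, and for simplicity condition 1 is checked over the rational numbers rather than over an arbitrary field F:

```python
import sympy as sp

def diagonalize_by_test(A):
    """Apply the test for diagonalizability to a square matrix A.

    Returns (Q, D) with Q**-1 * A * Q == D if the test succeeds,
    and None otherwise.  (A sketch over Q, not a general field.)
    """
    n = A.rows
    eigs = A.eigenvals()  # {eigenvalue: algebraic multiplicity}
    # Condition 1: the characteristic polynomial splits (here: over Q).
    if sum(eigs.values()) != n or not all(lam.is_rational for lam in eigs):
        return None
    # Condition 2: multiplicity of lambda equals n - rank(A - lambda*I).
    for lam, mult in eigs.items():
        if n - (A - lam * sp.eye(n)).rank() != mult:
            return None
    # Collect a basis of each eigenspace; the union gives the columns of Q.
    cols, entries = [], []
    for lam, mult, basis in A.eigenvects():
        cols.extend(basis)
        entries.extend([lam] * len(basis))
    return sp.Matrix.hstack(*cols), sp.diag(*entries)

A = sp.Matrix([[1, 1], [1, 1]])  # the matrix of Example 1
Q, D = diagonalize_by_test(A)
print(Q.inv() * A * Q == D)      # True
```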

We now consider some examples illustrating the preceding ideas.

Example 5

We test the matrix

$$A = \begin{pmatrix} 3 & 1 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{pmatrix} \in M_{3\times 3}(R)$$

for diagonalizability.

The characteristic polynomial of A is $\det(A - tI) = -(t-4)(t-3)^2$, which splits, and so condition 1 of the test for diagonalization is satisfied. Also A has eigenvalues $\lambda_1 = 4$ and $\lambda_2 = 3$ with multiplicities 1 and 2, respectively. Since $\lambda_1$ has multiplicity 1, condition 2 is satisfied for $\lambda_1$. Thus we need only test condition 2 for $\lambda_2$. Because

$$A - \lambda_2 I = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

has rank 2, we see that $3 - \mathrm{rank}(A - \lambda_2 I) = 1$, which is not the multiplicity of $\lambda_2$. Thus condition 2 fails for $\lambda_2$, and A is therefore not diagonalizable.
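In code, the failure of condition 2 appears as a rank computation (a SymPy sketch):

```python
import sympy as sp

A = sp.Matrix([[3, 1, 0],
               [0, 3, 0],
               [0, 0, 4]])
print((A - 3 * sp.eye(3)).rank())  # 2, so nullity = 3 - 2 = 1 < multiplicity 2
print(A.is_diagonalizable())       # False
```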

Example 6

Let T be the linear operator on P2(R) defined by

$$T(f(x)) = f(1) + f'(0)x + (f'(0) + f''(0))x^2.$$

We first test T for diagonalizability. Let α denote the standard ordered basis for $P_2(R)$ and $B = [T]_\alpha$. Then

$$B = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ 0 & 1 & 2 \end{pmatrix}.$$

The characteristic polynomial of B, and hence of T, is $-(t-1)^2(t-2)$, which splits. Hence condition 1 of the test for diagonalization is satisfied. Also B has the eigenvalues $\lambda_1 = 1$ and $\lambda_2 = 2$ with multiplicities 2 and 1, respectively. Condition 2 is satisfied for $\lambda_2$ because it has multiplicity 1. So we need only verify condition 2 for $\lambda_1 = 1$. For this case,

$$3 - \mathrm{rank}(B - \lambda_1 I) = 3 - \mathrm{rank}\begin{pmatrix} 0 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 1 & 1 \end{pmatrix} = 3 - 1 = 2,$$

which is equal to the multiplicity of $\lambda_1$. Therefore T is diagonalizable.

We now find an ordered basis γ for $R^3$ consisting of eigenvectors of B. We consider each eigenvalue separately.

The eigenspace corresponding to $\lambda_1 = 1$ is

$$E_{\lambda_1} = \left\{ \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \in R^3 : \begin{pmatrix} 0 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = 0 \right\},$$

which is the solution space for the system

$$x_2 + x_3 = 0,$$

and has

$$\gamma_1 = \left\{ \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix},\ \begin{pmatrix} 0 \\ -1 \\ 1 \end{pmatrix} \right\}$$

as a basis.

The eigenspace corresponding to $\lambda_2 = 2$ is

$$E_{\lambda_2} = \left\{ \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \in R^3 : \begin{pmatrix} -1 & 1 & 1 \\ 0 & -1 & 0 \\ 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = 0 \right\},$$

which is the solution space for the system

$$\begin{aligned} -x_1 + x_2 + x_3 &= 0 \\ -x_2 &= 0, \end{aligned}$$

and has

$$\gamma_2 = \left\{ \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} \right\}$$

as a basis.

Let

$$\gamma = \gamma_1 \cup \gamma_2 = \left\{ \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix},\ \begin{pmatrix} 0 \\ -1 \\ 1 \end{pmatrix},\ \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} \right\}.$$

Then γ is an ordered basis for $R^3$ consisting of eigenvectors of B.

Finally, observe that the vectors in γ are the coordinate vectors relative to α of the vectors in the set

$$\beta = \{1,\ -x + x^2,\ 1 + x^2\},$$

which is an ordered basis for $P_2(R)$ consisting of eigenvectors of T. Thus

$$[T]_\beta = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}.$$
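This diagonalization can be verified directly (a SymPy sketch, with Q the matrix whose columns are the vectors of γ):

```python
import sympy as sp

B = sp.Matrix([[1, 1, 1],
               [0, 1, 0],
               [0, 1, 2]])
Q = sp.Matrix([[1, 0, 1],
               [0, -1, 0],
               [0, 1, 1]])  # columns: the vectors of gamma
print(Q.inv() * B * Q)       # Matrix([[1, 0, 0], [0, 1, 0], [0, 0, 2]])
```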

Our next example is an application of diagonalization that is of interest in Section 5.3.

Example 7

Let

$$A = \begin{pmatrix} 0 & -2 \\ 1 & 3 \end{pmatrix}.$$

We show that A is diagonalizable and find a $2 \times 2$ matrix Q such that $Q^{-1}AQ$ is a diagonal matrix. We then show how to use this result to compute $A^n$ for any positive integer n.

First observe that the characteristic polynomial of A is $(t-1)(t-2)$, and hence A has two distinct eigenvalues, $\lambda_1 = 1$ and $\lambda_2 = 2$. By applying the corollary to Theorem 5.5 to the operator $L_A$, we see that A is diagonalizable. Moreover,

$$\gamma_1 = \left\{ \begin{pmatrix} 2 \\ -1 \end{pmatrix} \right\} \quad\text{and}\quad \gamma_2 = \left\{ \begin{pmatrix} -1 \\ 1 \end{pmatrix} \right\}$$

are bases for the eigenspaces $E_{\lambda_1}$ and $E_{\lambda_2}$, respectively. Therefore

$$\gamma = \gamma_1 \cup \gamma_2 = \left\{ \begin{pmatrix} 2 \\ -1 \end{pmatrix},\ \begin{pmatrix} -1 \\ 1 \end{pmatrix} \right\}$$

is an ordered basis for $R^2$ consisting of eigenvectors of A. Let

$$Q = \begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix},$$

the matrix whose columns are the vectors in γ. Then, by the corollary to Theorem 2.23 (p. 115),

$$D = Q^{-1}AQ = [L_A]_\gamma = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}.$$

To find $A^n$ for any positive integer n, observe that $A = QDQ^{-1}$. Therefore

$$A^n = (QDQ^{-1})^n = (QDQ^{-1})(QDQ^{-1})\cdots(QDQ^{-1}) = QD^nQ^{-1} = Q\begin{pmatrix} 1^n & 0 \\ 0 & 2^n \end{pmatrix}Q^{-1} = \begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 2^n \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} = \begin{pmatrix} 2 - 2^n & 2 - 2^{n+1} \\ -1 + 2^n & -1 + 2^{n+1} \end{pmatrix}.$$
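The closed form for $A^n$ can be reproduced symbolically (a SymPy sketch with a symbolic exponent n):

```python
import sympy as sp

n = sp.symbols('n', integer=True, positive=True)
Q = sp.Matrix([[2, -1],
               [-1, 1]])
Dn = sp.diag(1, 2**n)  # D**n for D = diag(1, 2), since 1**n = 1
print(Q * Dn * Q.inv())
# equals [[2 - 2**n, 2 - 2**(n+1)], [-1 + 2**n, -1 + 2**(n+1)]] after simplification
```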

We now consider an application that uses diagonalization to solve a system of differential equations.

Systems of Differential Equations

Consider the system of differential equations

$$\begin{aligned} x_1' &= 3x_1 + x_2 + x_3 \\ x_2' &= 2x_1 + 4x_2 + 2x_3 \\ x_3' &= -x_1 - x_2 + x_3, \end{aligned}$$

where, for each i, $x_i = x_i(t)$ is a differentiable real-valued function of the real variable t. Clearly, this system has a solution, namely, the solution in which each $x_i(t)$ is the zero function. We determine all of the solutions to this system.

Let $x\colon R \to R^3$ be the function defined by

$$x(t) = \begin{pmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \end{pmatrix}.$$

The derivative of x, denoted $x'$, is defined by

$$x'(t) = \begin{pmatrix} x_1'(t) \\ x_2'(t) \\ x_3'(t) \end{pmatrix}.$$

Let

$$A = \begin{pmatrix} 3 & 1 & 1 \\ 2 & 4 & 2 \\ -1 & -1 & 1 \end{pmatrix}$$

be the coefficient matrix of the given system, so that we can rewrite the system as the matrix equation $x' = Ax$.

It can be verified that for

$$Q = \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & -2 \\ -1 & -1 & 1 \end{pmatrix} \quad\text{and}\quad D = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 4 \end{pmatrix},$$

we have $Q^{-1}AQ = D$. Substitute $A = QDQ^{-1}$ into $x' = Ax$ to obtain $x' = QDQ^{-1}x$ or, equivalently, $Q^{-1}x' = DQ^{-1}x$. The function $y\colon R \to R^3$ defined by $y(t) = Q^{-1}x(t)$ can be shown to be differentiable, and $y' = Q^{-1}x'$ (see Exercise 17). Hence the original system can be written as $y' = Dy$.

Since D is a diagonal matrix, the system $y' = Dy$ is easy to solve. Setting

$$y(t) = \begin{pmatrix} y_1(t) \\ y_2(t) \\ y_3(t) \end{pmatrix},$$

we can rewrite $y' = Dy$ as

$$\begin{pmatrix} y_1'(t) \\ y_2'(t) \\ y_3'(t) \end{pmatrix} = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 4 \end{pmatrix} \begin{pmatrix} y_1(t) \\ y_2(t) \\ y_3(t) \end{pmatrix} = \begin{pmatrix} 2y_1(t) \\ 2y_2(t) \\ 4y_3(t) \end{pmatrix}.$$

The three equations

$$y_1' = 2y_1, \qquad y_2' = 2y_2, \qquad y_3' = 4y_3$$

are independent of each other, and thus can be solved individually. It is easily seen (as in Example 3 of Section 5.1) that the general solution to these equations is $y_1(t) = c_1e^{2t}$, $y_2(t) = c_2e^{2t}$, and $y_3(t) = c_3e^{4t}$, where $c_1$, $c_2$, and $c_3$ are arbitrary constants. Finally,

$$\begin{pmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \end{pmatrix} = x(t) = Qy(t) = \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & -2 \\ -1 & -1 & 1 \end{pmatrix} \begin{pmatrix} c_1e^{2t} \\ c_2e^{2t} \\ c_3e^{4t} \end{pmatrix} = \begin{pmatrix} c_1e^{2t} - c_3e^{4t} \\ c_2e^{2t} - 2c_3e^{4t} \\ -c_1e^{2t} - c_2e^{2t} + c_3e^{4t} \end{pmatrix}$$

yields the general solution of the original system. Note that this solution can be written as

$$x(t) = e^{2t}\left[ c_1\begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} + c_2\begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix} \right] + e^{4t}\left[ c_3\begin{pmatrix} -1 \\ -2 \\ 1 \end{pmatrix} \right].$$

The expressions in brackets are arbitrary vectors in $E_{\lambda_1}$ and $E_{\lambda_2}$, respectively, where $\lambda_1 = 2$ and $\lambda_2 = 4$. Thus the general solution of the original system is $x(t) = e^{2t}z_1 + e^{4t}z_2$, where $z_1 \in E_{\lambda_1}$ and $z_2 \in E_{\lambda_2}$. This result is generalized in Exercise 16.
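Both the diagonalization and the general solution can be checked symbolically (a SymPy sketch mirroring the computation above):

```python
import sympy as sp

A = sp.Matrix([[3, 1, 1],
               [2, 4, 2],
               [-1, -1, 1]])
Q = sp.Matrix([[1, 0, -1],
               [0, 1, -2],
               [-1, -1, 1]])
D = sp.diag(2, 2, 4)
print(Q.inv() * A * Q == D)            # True

# General solution x(t) = Q y(t), where y_i(t) = c_i e^{lambda_i t}.
t, c1, c2, c3 = sp.symbols('t c1 c2 c3')
y = sp.Matrix([c1 * sp.exp(2 * t), c2 * sp.exp(2 * t), c3 * sp.exp(4 * t)])
x = Q * y
print(sp.simplify(x.diff(t) - A * x))  # the zero vector, so x' = Ax
```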

Direct Sums*

Let T be a linear operator on a finite-dimensional vector space V. There is a way of decomposing V into simpler subspaces that offers insight into the behavior of T. This approach is especially useful in Chapter 7, where we study nondiagonalizable linear operators. In the case of diagonalizable operators, the simpler subspaces are the eigenspaces of the operator.

Definition.

Let $W_1, W_2, \dots, W_k$ be subspaces of a vector space V. We define the sum of these subspaces to be the set

$$\{v_1 + v_2 + \dots + v_k : v_i \in W_i \text{ for } 1 \le i \le k\},$$

which we denote by $W_1 + W_2 + \dots + W_k$ or $\sum_{i=1}^k W_i$.

It is a simple exercise to show that the sum of subspaces of a vector space is also a subspace.

Example 8

Let $V = R^3$, let $W_1$ denote the xy-plane, and let $W_2$ denote the yz-plane. Then $R^3 = W_1 + W_2$ because, for any vector $(a, b, c) \in R^3$, we have

$$(a, b, c) = (a, 0, 0) + (0, b, c),$$

where $(a, 0, 0) \in W_1$ and $(0, b, c) \in W_2$.

Notice that in Example 8 the representation of (a, b, c) as a sum of vectors in $W_1$ and $W_2$ is not unique. For example, $(a, b, c) = (a, b, 0) + (0, 0, c)$ is another representation. Because we are often interested in sums for which representations are unique, we introduce a condition that ensures this outcome. The definition of direct sum that follows is a generalization of the definition given in the exercises of Section 1.3.

Definition.

Let $W, W_1, W_2, \dots, W_k$ be subspaces of a vector space V such that $W_i \subseteq W$ for $i = 1, 2, \dots, k$. We call W the direct sum of the subspaces $W_1, W_2, \dots, W_k$, and write $W = W_1 \oplus W_2 \oplus \dots \oplus W_k$, if

$$W = \sum_{i=1}^k W_i \quad\text{and}\quad W_j \cap \sum_{i \ne j} W_i = \{0\} \quad\text{for each } j\ (1 \le j \le k).$$

Example 9

In $R^5$, let $W = \{(x_1, x_2, x_3, x_4, x_5) \in R^5 : x_5 = 0\}$, $W_1 = \{(a, b, 0, 0, 0) : a, b \in R\}$, $W_2 = \{(0, 0, c, 0, 0) : c \in R\}$, and $W_3 = \{(0, 0, 0, d, 0) : d \in R\}$. For any $(a, b, c, d, 0) \in W$,

$$(a, b, c, d, 0) = (a, b, 0, 0, 0) + (0, 0, c, 0, 0) + (0, 0, 0, d, 0) \in W_1 + W_2 + W_3.$$

Thus

$$W = \sum_{i=1}^3 W_i.$$

To show that W is the direct sum of $W_1$, $W_2$, and $W_3$, we must prove that $W_1 \cap (W_2 + W_3) = W_2 \cap (W_1 + W_3) = W_3 \cap (W_1 + W_2) = \{0\}$. But these equalities are obvious, and so $W = W_1 \oplus W_2 \oplus W_3$.

Our next result contains several conditions that are equivalent to the definition of a direct sum.

Theorem 5.9.

Let $W_1, W_2, \dots, W_k$ be subspaces of a finite-dimensional vector space V. The following conditions are equivalent.

  (a) $V = W_1 \oplus W_2 \oplus \dots \oplus W_k$.

  (b) $V = \sum_{i=1}^k W_i$ and, for any vectors $v_1, v_2, \dots, v_k$ such that $v_i \in W_i$ $(1 \le i \le k)$, if $v_1 + v_2 + \dots + v_k = 0$, then $v_i = 0$ for all i.

  (c) Each vector $v \in V$ can be uniquely written as $v = v_1 + v_2 + \dots + v_k$, where $v_i \in W_i$.

  (d) If $\gamma_i$ is an ordered basis for $W_i$ $(1 \le i \le k)$, then $\gamma_1 \cup \gamma_2 \cup \dots \cup \gamma_k$ is an ordered basis for V.

  (e) For each $i = 1, 2, \dots, k$, there exists an ordered basis $\gamma_i$ for $W_i$ such that $\gamma_1 \cup \gamma_2 \cup \dots \cup \gamma_k$ is an ordered basis for V.

Proof.

Assume (a). We prove (b). Clearly

$$V = \sum_{i=1}^k W_i.$$

Now suppose that $v_1, v_2, \dots, v_k$ are vectors such that $v_i \in W_i$ for all i and $v_1 + v_2 + \dots + v_k = 0$. Then for any j

$$v_j = -\sum_{i \ne j} v_i \in \sum_{i \ne j} W_i.$$

But $v_j \in W_j$, and hence

$$v_j \in W_j \cap \sum_{i \ne j} W_i = \{0\}.$$

So $v_j = 0$, proving (b).

Now assume (b). We prove (c). Let $v \in V$. By (b), there exist vectors $v_1, v_2, \dots, v_k$ such that $v_i \in W_i$ and $v = v_1 + v_2 + \dots + v_k$. We must show that this representation is unique. Suppose also that $v = w_1 + w_2 + \dots + w_k$, where $w_i \in W_i$ for all i. Then

$$(v_1 - w_1) + (v_2 - w_2) + \dots + (v_k - w_k) = 0.$$

But $v_i - w_i \in W_i$ for all i, and therefore $v_i - w_i = 0$ for all i by (b). Thus $v_i = w_i$ for all i, proving the uniqueness of the representation.

Now assume (c). We prove (d). For each i, let $\gamma_i$ be an ordered basis for $W_i$. Since

$$V = \sum_{i=1}^k W_i$$

by (c), it follows that $\gamma_1 \cup \gamma_2 \cup \dots \cup \gamma_k$ generates V. To show that this set is linearly independent, consider vectors $v_{ij} \in \gamma_i$ ($j = 1, 2, \dots, m_i$ and $i = 1, 2, \dots, k$) and scalars $a_{ij}$ such that

$$\sum_{i,j} a_{ij}v_{ij} = 0.$$

For each i, set

$$w_i = \sum_{j=1}^{m_i} a_{ij}v_{ij}.$$

Then for each i, $w_i \in \operatorname{span}(\gamma_i) = W_i$ and

$$w_1 + w_2 + \dots + w_k = \sum_{i,j} a_{ij}v_{ij} = 0.$$

Since $0 \in W_i$ for each i and $0 + 0 + \dots + 0 = w_1 + w_2 + \dots + w_k$, (c) implies that $w_i = 0$ for all i. Thus

$$0 = w_i = \sum_{j=1}^{m_i} a_{ij}v_{ij}$$

for each i. But each $\gamma_i$ is linearly independent, and hence $a_{ij} = 0$ for all i and j. Consequently $\gamma_1 \cup \gamma_2 \cup \dots \cup \gamma_k$ is linearly independent and therefore is a basis for V.

Clearly (e) follows immediately from (d).

Finally, we assume (e) and prove (a). For each i, let $\gamma_i$ be an ordered basis for $W_i$ such that $\gamma_1 \cup \gamma_2 \cup \dots \cup \gamma_k$ is an ordered basis for V. Then

$$V = \operatorname{span}(\gamma_1 \cup \gamma_2 \cup \dots \cup \gamma_k) = \operatorname{span}(\gamma_1) + \operatorname{span}(\gamma_2) + \dots + \operatorname{span}(\gamma_k) = \sum_{i=1}^k W_i$$

by repeated applications of Exercise 14 of Section 1.4. Fix j ($1 \le j \le k$), and suppose that, for some nonzero vector $v \in V$,

$$v \in W_j \cap \sum_{i \ne j} W_i.$$

Then

$$v \in W_j = \operatorname{span}(\gamma_j) \quad\text{and}\quad v \in \sum_{i \ne j} W_i = \operatorname{span}\Bigl(\,\bigcup_{i \ne j} \gamma_i\Bigr).$$

Hence v is a nontrivial linear combination of both $\gamma_j$ and $\bigcup_{i \ne j} \gamma_i$, so that v can be expressed as a linear combination of $\gamma_1 \cup \gamma_2 \cup \dots \cup \gamma_k$ in more than one way. But these representations contradict Theorem 1.8 (p. 44), and so we conclude that

$$W_j \cap \sum_{i \ne j} W_i = \{0\},$$

proving (a).

With the aid of Theorem 5.9, we are able to characterize diagonalizability in terms of direct sums.

Theorem 5.10.

A linear operator T on a finite-dimensional vector space V is diagonalizable if and only if V is the direct sum of the eigenspaces of T.

Proof.

Let $\lambda_1, \lambda_2, \dots, \lambda_k$ be the distinct eigenvalues of T.

First suppose that T is diagonalizable, and for each i choose an ordered basis $\gamma_i$ for the eigenspace $E_{\lambda_i}$. By Theorem 5.8, $\gamma_1 \cup \gamma_2 \cup \dots \cup \gamma_k$ is a basis for V, and hence V is a direct sum of the $E_{\lambda_i}$'s by Theorem 5.9.

Conversely, suppose that V is a direct sum of the eigenspaces of T. For each i, choose an ordered basis $\gamma_i$ of $E_{\lambda_i}$. By Theorem 5.9, the union $\gamma_1 \cup \gamma_2 \cup \dots \cup \gamma_k$ is a basis for V. Since this basis consists of eigenvectors of T, we conclude that T is diagonalizable.

Example 10

Let T be the linear operator on $R^4$ defined by

$$T(a, b, c, d) = (a, b, 2c, 3d).$$

It is easily seen that T is diagonalizable with eigenvalues $\lambda_1 = 1$, $\lambda_2 = 2$, and $\lambda_3 = 3$. Furthermore, the corresponding eigenspaces are $E_{\lambda_1} = \{(a, b, 0, 0) : a, b \in R\}$, $E_{\lambda_2} = \{(0, 0, c, 0) : c \in R\}$, and $E_{\lambda_3} = \{(0, 0, 0, d) : d \in R\}$, the analogues in $R^4$ of the subspaces $W_1$, $W_2$, and $W_3$ of Example 9. Thus Theorem 5.10 provides us with another proof that $R^4$ is the direct sum of these three subspaces.

Exercises

  1. Label the following statements as true or false.

    (a) Any linear operator on an n-dimensional vector space that has fewer than n distinct eigenvalues is not diagonalizable.

    (b) Two distinct eigenvectors corresponding to the same eigenvalue are always linearly dependent.

    (c) If $\lambda$ is an eigenvalue of a linear operator T, then each vector in $E_\lambda$ is an eigenvector of T.

    (d) If $\lambda_1$ and $\lambda_2$ are distinct eigenvalues of a linear operator T, then $E_{\lambda_1} \cap E_{\lambda_2} = \{0\}$.

    (e) Let $A \in M_{n\times n}(F)$ and $\beta = \{v_1, v_2, \dots, v_n\}$ be an ordered basis for $F^n$ consisting of eigenvectors of A. If Q is the $n \times n$ matrix whose jth column is $v_j$ $(1 \le j \le n)$, then $Q^{-1}AQ$ is a diagonal matrix.

    (f) A linear operator T on a finite-dimensional vector space is diagonalizable if and only if the multiplicity of each eigenvalue $\lambda$ equals the dimension of $E_\lambda$.

    (g) Every diagonalizable linear operator on a nonzero vector space has at least one eigenvalue.

      The following two items relate to the optional subsection on direct sums.

    (h) If a vector space is the direct sum of subspaces $W_1, W_2, \dots, W_k$, then $W_i \cap W_j = \{0\}$ for $i \ne j$.

    (i) If

      $$V = \sum_{i=1}^k W_i \quad\text{and}\quad W_i \cap W_j = \{0\} \quad\text{for } i \ne j,$$

      then $V = W_1 \oplus W_2 \oplus \dots \oplus W_k$.

  2. For each of the following matrices $A \in M_{n\times n}(R)$, test A for diagonalizability, and if A is diagonalizable, find an invertible matrix Q and a diagonal matrix D such that $Q^{-1}AQ = D$.

    (a) $\begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$

    (b) $\begin{pmatrix} 1 & 3 \\ 3 & 1 \end{pmatrix}$

    (c) $\begin{pmatrix} 1 & 4 \\ 3 & 2 \end{pmatrix}$

    (d) $\begin{pmatrix} 7 & -4 & 0 \\ 8 & -5 & 0 \\ 6 & -6 & 3 \end{pmatrix}$

    (e) $\begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & -1 \\ 0 & 1 & 1 \end{pmatrix}$

    (f) $\begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 2 \\ 0 & 0 & 3 \end{pmatrix}$

    (g) $\begin{pmatrix} 3 & 1 & 1 \\ 2 & 4 & 2 \\ -1 & -1 & 1 \end{pmatrix}$

  3. For each of the following linear operators T on a vector space V, test T for diagonalizability, and if T is diagonalizable, find a basis β for V such that $[T]_\beta$ is a diagonal matrix.

    (a) $V = P_3(R)$ and T is defined by $T(f(x)) = f'(x) + f''(x)$.

    (b) $V = P_2(R)$ and T is defined by $T(ax^2 + bx + c) = cx^2 + bx + a$.

    (c) $V = R^3$ and T is defined by

      $$T\begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} = \begin{pmatrix} a_2 \\ -a_1 \\ 2a_3 \end{pmatrix}.$$

    (d) $V = P_2(R)$ and T is defined by $T(f(x)) = f(0) + f(1)(x + x^2)$.

    (e) $V = C^2$ and T is defined by $T(z, w) = (z + iw, iz + w)$.

    (f) $V = M_{2\times 2}(R)$ and T is defined by $T(A) = A^t$.

  4. Prove the matrix version of the corollary to Theorem 5.5: If $A \in M_{n\times n}(F)$ has n distinct eigenvalues, then A is diagonalizable.

  5. State and prove the matrix version of Theorem 5.6.

  6. (a) Justify the test for diagonalizability and the method for diagonalization stated in this section.

    (b) Formulate the results in (a) for matrices.

  7. For

    $$A = \begin{pmatrix} 1 & 4 \\ 2 & 3 \end{pmatrix} \in M_{2\times 2}(R),$$

    find an expression for $A^n$, where n is an arbitrary positive integer.

  8. Suppose that $A \in M_{n\times n}(F)$ has two distinct eigenvalues, $\lambda_1$ and $\lambda_2$, and that $\dim(E_{\lambda_1}) = n - 1$. Prove that A is diagonalizable.

  9. Let T be a linear operator on a finite-dimensional vector space V, and suppose there exists an ordered basis β for V such that $[T]_\beta$ is an upper triangular matrix.

    (a) Prove that the characteristic polynomial of T splits.

    (b) State and prove an analogous result for matrices.

    The converse of (a) is treated in Exercise 12(b).

  10. Let T be a linear operator on a finite-dimensional vector space V with the distinct eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_k$ and corresponding multiplicities $m_1, m_2, \dots, m_k$. Suppose that β is a basis for V such that $[T]_\beta$ is an upper triangular matrix. Prove that the diagonal entries of $[T]_\beta$ are $\lambda_1, \lambda_2, \dots, \lambda_k$ and that each $\lambda_i$ occurs $m_i$ times $(1 \le i \le k)$.

  11. Let A be an $n \times n$ matrix that is similar to an upper triangular matrix and has the distinct eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_k$ with corresponding multiplicities $m_1, m_2, \dots, m_k$. Prove the following statements.

    (a) $\operatorname{tr}(A) = \sum_{i=1}^k m_i\lambda_i$

    (b) $\det(A) = (\lambda_1)^{m_1}(\lambda_2)^{m_2}\cdots(\lambda_k)^{m_k}$.

  12. (a) Prove that if $A \in M_{n\times n}(F)$ and the characteristic polynomial of A splits, then A is similar to an upper triangular matrix. (This proves the converse of Exercise 9(b).) Hint: Use mathematical induction on n. For the general case, let $v_1$ be an eigenvector of A, and extend $\{v_1\}$ to a basis $\{v_1, v_2, \dots, v_n\}$ for $F^n$. Let P be the $n \times n$ matrix whose jth column is $v_j$, and consider $P^{-1}AP$. Exercise 13(a) in Section 5.1 and Exercise 21 in Section 4.3 can be helpful.

    (b) Prove the converse of Exercise 9(a).

      Visit goo.gl/gJSjRU for a solution.

  13. Let T be an invertible linear operator on a finite-dimensional vector space V.

    (a) Recall that for any eigenvalue $\lambda$ of T, $\lambda^{-1}$ is an eigenvalue of $T^{-1}$ (Exercise 9 of Section 5.1). Prove that the eigenspace of T corresponding to $\lambda$ is the same as the eigenspace of $T^{-1}$ corresponding to $\lambda^{-1}$.

    (b) Prove that if T is diagonalizable, then $T^{-1}$ is diagonalizable.

  14. Let $A \in M_{n\times n}(F)$. Recall from Exercise 15 of Section 5.1 that A and $A^t$ have the same characteristic polynomial and hence share the same eigenvalues with the same multiplicities. For any eigenvalue $\lambda$ of A and $A^t$, let $E_\lambda$ and $E'_\lambda$ denote the corresponding eigenspaces for A and $A^t$, respectively.

    (a) Show by way of example that for a given common eigenvalue, these two eigenspaces need not be the same.

    (b) Prove that for any eigenvalue $\lambda$, $\dim(E_\lambda) = \dim(E'_\lambda)$.

    (c) Prove that if A is diagonalizable, then $A^t$ is also diagonalizable.

  15. Find the general solution to each system of differential equations.

    (a) $x' = x + y$, $y' = 3x - y$

    (b) $x_1' = 8x_1 + 10x_2$, $x_2' = -5x_1 - 7x_2$

    (c) $x_1' = x_1 + x_3$, $x_2' = x_2 + x_3$, $x_3' = 2x_3$

  16. Let

    $$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}$$

    be the coefficient matrix of the system of differential equations

    $$\begin{aligned} x_1' &= a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n \\ x_2' &= a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n \\ &\ \,\vdots \\ x_n' &= a_{n1}x_1 + a_{n2}x_2 + \dots + a_{nn}x_n. \end{aligned}$$

    Suppose that A is diagonalizable and that the distinct eigenvalues of A are $\lambda_1, \lambda_2, \dots, \lambda_k$. Prove that a differentiable function $x\colon R \to R^n$ is a solution to the system if and only if x is of the form

    $$x(t) = e^{\lambda_1 t}z_1 + e^{\lambda_2 t}z_2 + \dots + e^{\lambda_k t}z_k,$$

    where $z_i \in E_{\lambda_i}$ for $i = 1, 2, \dots, k$. Use this result to prove that the set of solutions to the system is an n-dimensional real vector space.

  17. Let $C \in M_{m\times n}(R)$, and let Y be an $n \times p$ matrix of differentiable functions. Prove $(CY)' = CY'$, where $(Y')_{ij} = Y_{ij}'$ for all i, j.

Exercises 18 through 20 are concerned with simultaneous diagonalization.

Definitions.

Two linear operators T and U on a finite-dimensional vector space V are called simultaneously diagonalizable if there exists an ordered basis β for V such that both $[T]_\beta$ and $[U]_\beta$ are diagonal matrices. Similarly, $A, B \in M_{n\times n}(F)$ are called simultaneously diagonalizable if there exists an invertible matrix $Q \in M_{n\times n}(F)$ such that both $Q^{-1}AQ$ and $Q^{-1}BQ$ are diagonal matrices.

  18. (a) Prove that if T and U are simultaneously diagonalizable linear operators on a finite-dimensional vector space V, then the matrices $[T]_\beta$ and $[U]_\beta$ are simultaneously diagonalizable for any ordered basis β.

    (b) Prove that if A and B are simultaneously diagonalizable matrices, then $L_A$ and $L_B$ are simultaneously diagonalizable linear operators.

  19. (a) Prove that if T and U are simultaneously diagonalizable operators, then T and U commute (i.e., $TU = UT$).

    (b) Show that if A and B are simultaneously diagonalizable matrices, then A and B commute.

    The converses of (a) and (b) are established in Exercise 25 of Section 5.4.

  20. Let T be a diagonalizable linear operator on a finite-dimensional vector space, and let m be any positive integer. Prove that T and $T^m$ are simultaneously diagonalizable.

Exercises 21 through 24 are concerned with direct sums.

  21. Let $W_1, W_2, \dots, W_k$ be subspaces of a finite-dimensional vector space V such that

    $$\sum_{i=1}^k W_i = V.$$

    Prove that V is the direct sum of $W_1, W_2, \dots, W_k$ if and only if

    $$\dim(V) = \sum_{i=1}^k \dim(W_i).$$
  22. Let V be a finite-dimensional vector space with a basis β, and let $\beta_1, \beta_2, \dots, \beta_k$ be a partition of β (i.e., $\beta_1, \beta_2, \dots, \beta_k$ are subsets of β such that $\beta = \beta_1 \cup \beta_2 \cup \dots \cup \beta_k$ and $\beta_i \cap \beta_j = \emptyset$ if $i \ne j$). Prove that $V = \operatorname{span}(\beta_1) \oplus \operatorname{span}(\beta_2) \oplus \dots \oplus \operatorname{span}(\beta_k)$.

  23. Let T be a linear operator on a finite-dimensional vector space V, and suppose that the distinct eigenvalues of T are $\lambda_1, \lambda_2, \dots, \lambda_k$. Prove that

    $$\operatorname{span}(\{x \in V : x \text{ is an eigenvector of } T\}) = E_{\lambda_1} \oplus E_{\lambda_2} \oplus \dots \oplus E_{\lambda_k}.$$
  24. Let $W_1, W_2, K_1, K_2, \dots, K_p, M_1, M_2, \dots, M_q$ be subspaces of a vector space V such that $W_1 = K_1 \oplus K_2 \oplus \dots \oplus K_p$ and $W_2 = M_1 \oplus M_2 \oplus \dots \oplus M_q$. Prove that if $W_1 \cap W_2 = \{0\}$, then

    $$W_1 + W_2 = W_1 \oplus W_2 = K_1 \oplus K_2 \oplus \dots \oplus K_p \oplus M_1 \oplus M_2 \oplus \dots \oplus M_q.$$