6.5 Unitary and Orthogonal Operators and Their Matrices – Linear Algebra, 5th Edition

6.5 Unitary and Orthogonal Operators and Their Matrices

In this section, we continue our analogy between complex numbers and linear operators. Recall that the adjoint of a linear operator acts similarly to the conjugate of a complex number (see, for example, Theorem 6.11, p. 357). A complex number z has length 1 if zz̄ = 1. In this section, we study those linear operators T on an inner product space V such that TT* = T*T = I. We will see that these are precisely the linear operators that “preserve length” in the sense that ||T(x)|| = ||x|| for all x ∈ V. As another characterization, we prove that, on a finite-dimensional complex inner product space, these are the normal operators whose eigenvalues all have absolute value 1.

In past chapters, we were interested in studying those functions that preserve the structure of the underlying space. In particular, linear operators preserve the operations of vector addition and scalar multiplication, and isomorphisms preserve all the vector space structure. It is now natural to consider those linear operators T on an inner product space that preserve length. We will see that this condition guarantees, in fact, that T preserves the inner product.

Definitions.

Let T be a linear operator on a finite-dimensional inner product space V (over F). If ||T(x)|| = ||x|| for all x ∈ V, we call T a unitary operator if F = C and an orthogonal operator if F = R.

It should be noted that in the infinite-dimensional case, an operator that preserves the norm is one-to-one, but not necessarily onto. If it is also onto, then we call it a unitary or orthogonal operator.

Clearly, any rotation or reflection in R2 preserves length and hence is an orthogonal operator. We study these operators in much more detail in Section 6.11.

Example 1

Recall the inner product space H defined on page 330. Let h ∈ H satisfy |h(x)| = 1 for all x. Define the linear operator T on H by T(f) = hf. Then

\[
\|T(f)\|^2 = \|hf\|^2 = \frac{1}{2\pi}\int_0^{2\pi} h(t)f(t)\,\overline{h(t)f(t)}\,dt = \|f\|^2
\]

since |h(t)|2=1 for all t. So T is a unitary operator.
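The computation in Example 1 can be checked numerically. The following is a minimal sketch (not from the text, and assuming NumPy is available): we discretize the inner product ⟨f, g⟩ = (1/2π)∫₀^{2π} f(t) conj(g(t)) dt by a Riemann sum and verify that multiplication by a unimodular h leaves the norm unchanged. The sample functions f and h are arbitrary choices of ours.

```python
import numpy as np

# Discretize [0, 2pi) and approximate the inner product on H by a Riemann sum.
n = 1000
t = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
dt = 2 * np.pi / n

f = np.exp(1j * t) + 3.0 * np.cos(2 * t)   # an arbitrary element of H (our choice)
h = np.exp(1j * np.sin(t))                 # |h(t)| = 1 for all t

def norm_sq(g):
    # Approximation of ||g||^2 = (1/2pi) * integral of |g(t)|^2 dt
    return (np.abs(g) ** 2).sum() * dt / (2 * np.pi)

Tf = h * f                                 # T(f) = hf
# Since |h(t)|^2 = 1 pointwise, the two norms agree (up to floating-point error).
print(norm_sq(Tf), norm_sq(f))
```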

Theorem 6.18.

Let T be a linear operator on a finite-dimensional inner product space V. Then the following statements are equivalent.

  1. (a) T*T=I.

  2. (b) TT*=I.

  3. (c) ⟨T(x), T(y)⟩ = ⟨x, y⟩ for all x, y ∈ V.

  4. (d) If β is an orthonormal basis for V, then T(β) is an orthonormal basis for V.

  5. (e) There exists an orthonormal basis β for V such that T(β) is an orthonormal basis for V.

  6. (f) ||T(x)|| = ||x|| for all x ∈ V.

Thus all the conditions above are equivalent to the definition of a unitary or orthogonal operator. From (a) and (b), it follows that unitary or orthogonal operators are normal.
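The equivalences in Theorem 6.18 can be spot-checked numerically. The sketch below (our illustration, assuming NumPy) builds a unitary matrix Q as the Q-factor of a random complex matrix and verifies conditions (a), (c), and (f) for the operator x ↦ Qx on C⁴.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Q, _ = np.linalg.qr(M)          # the Q-factor of an invertible matrix is unitary

x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
y = rng.standard_normal(4) + 1j * rng.standard_normal(4)

# (a) Q*Q = I
assert np.allclose(Q.conj().T @ Q, np.eye(4))
# (c) <Qx, Qy> = <x, y>  (np.vdot conjugates its first argument)
assert np.isclose(np.vdot(Q @ y, Q @ x), np.vdot(y, x))
# (f) ||Qx|| = ||x||
assert np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x))
```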

Before proving the theorem, we first prove a lemma. Compare this lemma to Exercise 11(b) of Section 6.4.

Lemma. Let U be a self-adjoint operator on an inner product space V, and suppose that ⟨x, U(x)⟩ = 0 for all x ∈ V. Then U = T0.

Proof.

For any xV,

\[
\begin{aligned}
0 &= \langle x + U(x),\ U(x + U(x))\rangle = \langle x + U(x),\ U(x) + U^2(x)\rangle \\
&= \langle x,\ U(x)\rangle + \langle x,\ U^2(x)\rangle + \langle U(x),\ U(x)\rangle + \langle U(x),\ U^2(x)\rangle \\
&= 0 + \langle x,\ U^2(x)\rangle + \langle U(x),\ U(x)\rangle + 0 \\
&= \langle x,\ U^*U(x)\rangle + \|U(x)\|^2 = 2\|U(x)\|^2.
\end{aligned}
\]

So for any x ∈ V, ||U(x)|| = 0. It follows that U = T0.

Proof of Theorem 6.18.

Part (a) implies (b) by Theorem 6.10 and Exercise 10(c) of Section 2.4.

To prove that (b) implies (c), let x, y ∈ V. Then ⟨x, y⟩ = ⟨(T*T)(x), y⟩ = ⟨T(x), T(y)⟩.

Next, we prove that (c) implies (d). Let β = {v1, v2, …, vn} be an orthonormal basis for V; so T(β) = {T(v1), T(v2), …, T(vn)}. It follows that ⟨T(vi), T(vj)⟩ = ⟨vi, vj⟩ = δij. Therefore T(β) is an orthonormal basis for V. That (d) implies (e) is obvious.

Next we prove that (e) implies (f). Let x ∈ V, and let β = {v1, v2, …, vn} be an orthonormal basis for V such that T(β) is an orthonormal basis for V. Now

\[
x = \sum_{i=1}^n a_i v_i
\]

for some scalars ai, and so

\[
\|x\|^2 = \Big\langle \sum_{i=1}^n a_i v_i,\ \sum_{j=1}^n a_j v_j \Big\rangle
= \sum_{i=1}^n \sum_{j=1}^n a_i \bar{a}_j \langle v_i,\ v_j\rangle
= \sum_{i=1}^n \sum_{j=1}^n a_i \bar{a}_j \delta_{ij}
= \sum_{i=1}^n |a_i|^2
\]

since β is orthonormal.

Applying the same manipulations to

\[
T(x) = \sum_{i=1}^n a_i T(v_i)
\]

and using the fact that T(β) is also orthonormal, we obtain

\[
\|T(x)\|^2 = \sum_{i=1}^n |a_i|^2.
\]

Hence ||T(x)||=||x||.

Finally, we prove that (f) implies (a). For any xV, we have

\[
\langle x,\ x\rangle = \|x\|^2 = \|T(x)\|^2 = \langle T(x),\ T(x)\rangle = \langle x,\ T^*T(x)\rangle.
\]

So ⟨x, (I − T*T)(x)⟩ = 0 for all x ∈ V. Let U = I − T*T; then U is self-adjoint, and ⟨x, U(x)⟩ = 0 for all x ∈ V. Hence, by the lemma, we have T0 = U = I − T*T, and therefore T*T = I.

In the case that T satisfies (c), we say that T preserves inner products. In the case that T satisfies (f), we say that T preserves norms.

It follows immediately from the definition that every eigenvalue of a unitary or orthogonal operator has absolute value 1. In fact, even more is true.

Corollary 1.

Let T be a linear operator on a finite-dimensional real inner product space V. Then V has an orthonormal basis of eigenvectors of T with corresponding eigenvalues of absolute value 1 if and only if T is both self-adjoint and orthogonal.

Proof.

Suppose that V has an orthonormal basis {v1, v2, …, vn} such that T(vi) = λivi and |λi| = 1 for all i. By Theorem 6.17 (p. 371), T is self-adjoint. Thus (TT*)(vi) = T(λivi) = λi²vi = vi for each i, because each λi is real and |λi| = 1. So TT* = I, and again by Exercise 10 of Section 2.4, T is orthogonal by Theorem 6.18(b).

If T is self-adjoint, then, by Theorem 6.17, we have that V possesses an orthonormal basis {v1, v2, , vn} such that T(vi)=λivi for all i. If T is also orthogonal, we have

\[
|\lambda_i|\,\|v_i\| = \|\lambda_i v_i\| = \|T(v_i)\| = \|v_i\|;
\]

so |λi|=1 for every i.

Corollary 2.

Let T be a linear operator on a finite-dimensional complex inner product space V. Then V has an orthonormal basis of eigenvectors of T with corresponding eigenvalues of absolute value 1 if and only if T is unitary.

Proof.

The proof is similar to the proof of Corollary 1.

Example 2

Let T: R² → R² be a rotation by θ, where 0 < θ < π. It is clear geometrically that T “preserves length,” that is, ||T(x)|| = ||x|| for all x ∈ R². The fact that rotations by a fixed angle preserve perpendicularity not only can be seen geometrically but now follows from (c) of Theorem 6.18. Perhaps the fact that such a transformation preserves the inner product is not so obvious; however, we obtain this fact from (c) also. Finally, an inspection of the matrix representation of T with respect to the standard ordered basis, which is

\[
\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix},
\]

reveals that T is not self-adjoint for the given restriction on θ. As we mentioned earlier, this fact also follows from the geometric observation that T has no eigenvectors and from Theorem 6.15 (p. 368). It is seen easily from the preceding matrix that T* is the rotation by −θ.
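Example 2 is easy to verify numerically. The sketch below (our illustration, assuming NumPy) builds the rotation matrix, checks that it is orthogonal and length-preserving, and confirms that its transpose is the rotation by −θ.

```python
import numpy as np

def rot(theta):
    # Matrix of the rotation by theta with respect to the standard basis
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

theta = 0.7                      # an arbitrary angle in (0, pi)
A = rot(theta)
assert np.allclose(A.T @ A, np.eye(2))    # A is orthogonal
assert np.allclose(A.T, rot(-theta))      # T* (= A transpose) is rotation by -theta
x = np.array([3.0, -1.0])
assert np.isclose(np.linalg.norm(A @ x), np.linalg.norm(x))  # length preserved
```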

Definition.

Let L be a one-dimensional subspace of R². We may view L as a line in the plane through the origin. A linear operator T on R² is called a reflection of R² about L if T(x) = x for all x ∈ L and T(x) = −x for all x ∈ L⊥.

As an example of a reflection, consider the operator defined in Example 3 of Section 2.5.

Example 3

Let T be a reflection of R² about a line L through the origin. We show that T is an orthogonal operator. Select vectors v1 ∈ L and v2 ∈ L⊥ such that ||v1|| = ||v2|| = 1. Then T(v1) = v1 and T(v2) = −v2. Thus v1 and v2 are eigenvectors of T with corresponding eigenvalues 1 and −1, respectively. Furthermore, {v1, v2} is an orthonormal basis for R². It follows that T is an orthogonal operator by Corollary 1 to Theorem 6.18.

We now examine the matrices that represent unitary and orthogonal transformations.

Definitions.

A square matrix A is called an orthogonal matrix if AᵗA = AAᵗ = I and unitary if A*A = AA* = I.

Since for a real matrix A we have A*=At, a real unitary matrix is also orthogonal. In this case, we call A orthogonal rather than unitary.

Note that the condition AA*=I is equivalent to the statement that the rows of A form an orthonormal basis for Fn because

\[
\delta_{ij} = I_{ij} = (AA^*)_{ij} = \sum_{k=1}^n A_{ik}(A^*)_{kj} = \sum_{k=1}^n A_{ik}\overline{A_{jk}},
\]

and the last term represents the inner product of the ith and jth rows of A.

A similar remark can be made about the columns of A and the condition A*A=I.
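The row condition can be checked directly: AA* is exactly the Gram matrix of the rows of A. A minimal sketch (our example matrix, assuming NumPy):

```python
import numpy as np

# An example unitary matrix (our choice); its rows should be orthonormal.
A = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)

# (i, j) entry of A A* is sum_k A_ik conj(A_jk): the inner product of rows i and j.
gram_rows = A @ A.conj().T
print(np.allclose(gram_rows, np.eye(2)))   # True: the rows form an orthonormal set
```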

It also follows from the definition above and from Theorem 6.10 (p. 356) that a linear operator T on an inner product space V is unitary [orthogonal] if and only if [T]β is unitary [orthogonal] for some orthonormal basis β for V.

Example 4

From Example 2, the matrix

\[
\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
\]

is clearly orthogonal. One can easily see that the rows of the matrix form an orthonormal basis for R2 . Similarly, the columns of the matrix form an orthonormal basis for R2 .

Example 5

Let T be a reflection of R2 about a line L through the origin, let β be the standard ordered basis for R2 , and let A=[T]β. Then T=LA. Since T is an orthogonal operator and β is an orthonormal basis, A is an orthogonal matrix. We describe A.

Suppose that α is the angle from the positive x-axis to L. Let v1 = (cos α, sin α) and v2 = (−sin α, cos α). Then ||v1|| = ||v2|| = 1, v1 ∈ L, and v2 ∈ L⊥. Hence γ = {v1, v2} is an orthonormal basis for R². Because T(v1) = v1 and T(v2) = −v2, we have

\[
[T]_\gamma = [L_A]_\gamma = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.
\]

Let

\[
Q = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}.
\]

By the corollary to Theorem 2.23 (p. 115),

\[
\begin{aligned}
A = Q[L_A]_\gamma Q^{-1}
&= \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix} \\
&= \begin{pmatrix} \cos^2\alpha - \sin^2\alpha & 2\sin\alpha\cos\alpha \\ 2\sin\alpha\cos\alpha & -(\cos^2\alpha - \sin^2\alpha) \end{pmatrix}
= \begin{pmatrix} \cos 2\alpha & \sin 2\alpha \\ \sin 2\alpha & -\cos 2\alpha \end{pmatrix}.
\end{aligned}
\]
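The reflection matrix derived in Example 5 can be sanity-checked numerically (our sketch, assuming NumPy): it fixes the line L, negates L⊥, squares to the identity, and has determinant −1.

```python
import numpy as np

def refl(alpha):
    # Matrix of the reflection about the line at angle alpha, as in Example 5
    c2, s2 = np.cos(2 * alpha), np.sin(2 * alpha)
    return np.array([[c2, s2], [s2, -c2]])

alpha = 0.4                                      # an arbitrary angle
A = refl(alpha)
v1 = np.array([np.cos(alpha), np.sin(alpha)])    # spans L
v2 = np.array([-np.sin(alpha), np.cos(alpha)])   # spans L-perp
assert np.allclose(A @ v1, v1)        # T(v1) = v1
assert np.allclose(A @ v2, -v2)       # T(v2) = -v2
assert np.allclose(A @ A, np.eye(2))  # a reflection is its own inverse
assert np.isclose(np.linalg.det(A), -1.0)
```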

We know that, for a complex normal [real symmetric] matrix A, there exists an orthonormal basis β for Fⁿ consisting of eigenvectors of A. Hence A is similar to a diagonal matrix D. By the corollary to Theorem 2.23 (p. 115), the matrix Q whose columns are the vectors in β is such that D = Q⁻¹AQ. But since the columns of Q are an orthonormal basis for Fⁿ, it follows that Q is unitary [orthogonal]. In this case, we say that A is unitarily equivalent [orthogonally equivalent] to D. It is easily seen (see Exercise 18) that this relation is an equivalence relation on Mn×n(C) [Mn×n(R)]. More generally, A and B are unitarily equivalent [orthogonally equivalent] if and only if there exists a unitary [orthogonal] matrix P such that A = P*BP.

The preceding paragraph has proved half of each of the next two theorems.

Theorem 6.19.

Let A be a complex n×n matrix. Then A is normal if and only if A is unitarily equivalent to a diagonal matrix.

Proof.

By the preceding remarks, we need only prove that if A is unitarily equivalent to a diagonal matrix, then A is normal.

Suppose that A=P*DP, where P is a unitary matrix and D is a diagonal matrix. Then

\[
AA^* = (P^*DP)(P^*DP)^* = (P^*DP)(P^*D^*P) = P^*DID^*P = P^*DD^*P.
\]

Similarly, A*A=P*D*DP. Since D is a diagonal matrix, however, we have DD*=D*D. Thus AA*=A*A.

Theorem 6.20.

Let A be a real n×n matrix. Then A is symmetric if and only if A is orthogonally equivalent to a real diagonal matrix.

Proof.

The proof is similar to the proof of Theorem 6.19 and is left as an exercise.

Theorem 6.20 is used extensively in many areas of mathematics and statistics. See, for example, goo.gl/cbqApK.

Example 6

Let

\[
A = \begin{pmatrix} 4 & 2 & 2 \\ 2 & 4 & 2 \\ 2 & 2 & 4 \end{pmatrix}.
\]

Since A is symmetric, Theorem 6.20 tells us that A is orthogonally equivalent to a diagonal matrix. We find an orthogonal matrix P and a diagonal matrix D such that PtAP=D.

To find P, we obtain an orthonormal basis of eigenvectors. It is easy to show that the eigenvalues of A are 2 and 8. The set {(−1, 1, 0), (−1, 0, 1)} is a basis for the eigenspace corresponding to 2. Because this set is not orthogonal, we apply the Gram-Schmidt process to obtain the orthogonal set {(−1, 1, 0), ½(−1, −1, 2)}. The set {(1, 1, 1)} is a basis for the eigenspace corresponding to 8. Notice that (1, 1, 1) is orthogonal to the preceding two vectors, as predicted by Theorem 6.15(d) (p. 368). Taking the union of these two bases and normalizing the vectors, we obtain the following orthonormal basis for R³ consisting of eigenvectors of A:

\[
\left\{ \frac{1}{\sqrt{2}}(-1,\ 1,\ 0),\ \frac{1}{\sqrt{6}}(-1,\ -1,\ 2),\ \frac{1}{\sqrt{3}}(1,\ 1,\ 1) \right\}.
\]

Thus one possible choice for P is

\[
P = \begin{pmatrix}
-\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{6}} & \frac{1}{\sqrt{3}} \\
\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{6}} & \frac{1}{\sqrt{3}} \\
0 & \frac{2}{\sqrt{6}} & \frac{1}{\sqrt{3}}
\end{pmatrix}
\quad\text{and}\quad
D = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 8 \end{pmatrix}.
\]
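Example 6 can be reproduced numerically. A minimal sketch (assuming NumPy): `eigh` returns an orthonormal eigenbasis for a symmetric matrix, i.e. an orthogonal P with PᵗAP = D. The P it returns need not match the hand-computed one, but the eigenvalues and the diagonalization do.

```python
import numpy as np

A = np.array([[4.0, 2.0, 2.0],
              [2.0, 4.0, 2.0],
              [2.0, 2.0, 4.0]])
w, P = np.linalg.eigh(A)                     # eigenvalues in ascending order
assert np.allclose(w, [2.0, 2.0, 8.0])       # eigenvalues 2, 2, 8
assert np.allclose(P.T @ P, np.eye(3))       # P is orthogonal
assert np.allclose(P.T @ A @ P, np.diag(w))  # P^t A P = D
```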

Because of Schur’s theorem (Theorem 6.14 p. 367), the next result is immediate. As it is the matrix form of Schur’s theorem, we also refer to it as Schur’s theorem.

Theorem 6.21. (Schur).

Let AMn×n(F) be a matrix whose characteristic polynomial splits over F.

  1. (a) If F=C, then A is unitarily equivalent to a complex upper triangular matrix.

  2. (b) If F=R, then A is orthogonally equivalent to a real upper triangular matrix.

Rigid Motions*

The purpose of this application is to characterize the so-called rigid motions of a finite-dimensional real inner product space. One may think intuitively of such a motion as a transformation that does not affect the shape of a figure under its action, hence the term rigid. The key requirement for such a transformation is that it preserves distances.

Definition.

Let V be a real inner product space. A function f: VV is called a rigid motion if

\[
\|f(x) - f(y)\| = \|x - y\|
\]

for all x, y ∈ V.

For example, any orthogonal operator on a finite-dimensional real inner product space is a rigid motion.

Another class of rigid motions consists of the translations. A function g: V → V, where V is a real inner product space, is called a translation if there exists a vector v0 ∈ V such that g(x) = x + v0 for all x ∈ V. We say that g is the translation by v0. It is a simple exercise to show that translations, as well as composites of rigid motions on a real inner product space, are also rigid motions. (See Exercise 22.) Thus an orthogonal operator on a finite-dimensional real inner product space V followed by a translation on V is a rigid motion on V. Remarkably, every rigid motion on V may be characterized in this way.

Theorem 6.22.

Let f: V → V be a rigid motion on a finite-dimensional real inner product space V. Then there exist a unique orthogonal operator T on V and a unique translation g on V such that f = gT.

Any orthogonal operator is a special case of this composite, in which the translation is by 0. Any translation is also a special case, in which the orthogonal operator is the identity operator.

Proof.

Let T: V → V be defined by

\[
T(x) = f(x) - f(0)
\]

for all x ∈ V. Note that f = gT, where g is the translation by f(0). Moreover, T is the composite of f and the translation by −f(0); hence T is a rigid motion. We begin by showing that T preserves norms and inner products. For any x ∈ V, we have

\[
\|T(x)\|^2 = \|f(x) - f(0)\|^2 = \|x - 0\|^2 = \|x\|^2,
\]

and consequently ||T(x)||=||x|| for any xV. Thus for any x, yV,

\[
\|T(x) - T(y)\|^2 = \|T(x)\|^2 - 2\langle T(x),\ T(y)\rangle + \|T(y)\|^2 = \|x\|^2 - 2\langle T(x),\ T(y)\rangle + \|y\|^2
\]

and

\[
\|x - y\|^2 = \|x\|^2 - 2\langle x,\ y\rangle + \|y\|^2.
\]

But ||T(x) − T(y)||² = ||x − y||²; so ⟨T(x), T(y)⟩ = ⟨x, y⟩ for all x, y ∈ V.

We are now in a position to show that T is a linear transformation. Let x, y ∈ V, and let a ∈ R. Then

\[
\begin{aligned}
\|T(x+ay) - T(x) - aT(y)\|^2
&= \|[T(x+ay) - T(x)] - aT(y)\|^2 \\
&= \|T(x+ay) - T(x)\|^2 + a^2\|T(y)\|^2 - 2a\langle T(x+ay) - T(x),\ T(y)\rangle \\
&= \|(x+ay) - x\|^2 + a^2\|y\|^2 - 2a\big[\langle x+ay,\ y\rangle - \langle x,\ y\rangle\big] \\
&= a^2\|y\|^2 + a^2\|y\|^2 - 2a\big[\langle x,\ y\rangle + a\|y\|^2 - \langle x,\ y\rangle\big] \\
&= 2a^2\|y\|^2 - 2a^2\|y\|^2 = 0.
\end{aligned}
\]

Thus T(x+ay)=T(x)+aT(y), and hence T is linear. Since we have already shown that T preserves inner products, T is an orthogonal operator.

To prove uniqueness, suppose that u0 and v0 are in V and T and U are orthogonal operators on V such that

\[
f(x) = T(x) + u_0 = U(x) + v_0
\]

for all x ∈ V. Substituting x = 0 in the preceding equation yields u0 = v0, and hence the translation is unique. This equation, therefore, reduces to T(x) = U(x) for all x ∈ V, and hence T = U.
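The construction in the proof is completely explicit: the orthogonal part of a rigid motion f is recovered as T(x) = f(x) − f(0), and the translation is by f(0). A minimal numerical sketch (our example rigid motion, assuming NumPy):

```python
import numpy as np

# An assumed example of a rigid motion on R^2: a rotation followed by a translation.
theta = 1.1
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
v0 = np.array([2.0, -5.0])

def f(x):
    return Q @ x + v0                       # rigid motion f = (translation) o (rotation)

def T(x):
    return f(x) - f(np.zeros(2))            # the orthogonal operator from the proof

x, y = np.array([1.0, 2.0]), np.array([-3.0, 0.5])
assert np.isclose(np.linalg.norm(f(x) - f(y)), np.linalg.norm(x - y))  # f is rigid
assert np.isclose(np.linalg.norm(T(x)), np.linalg.norm(x))             # T preserves norms
assert np.allclose(T(x), Q @ x)             # T recovers the orthogonal part exactly
```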

Orthogonal Operators on R2

Because of Theorem 6.22, an understanding of rigid motions requires a characterization of orthogonal operators. The next result characterizes orthogonal operators on R2 . We postpone the case of orthogonal operators on more general spaces to Section 6.11.

Theorem 6.23.

Let T be an orthogonal operator on R2 , and let A=[T]β, where β is the standard ordered basis for R2 . Then exactly one of the following conditions is satisfied:

  1. (a) T is a rotation, and det(A)=1.

  2. (b) T is a reflection about a line through the origin, and det(A) = −1.

Proof.

Because T is an orthogonal operator, T(β) = {T(e1), T(e2)} is an orthonormal basis for R² by Theorem 6.18(d). Since T(e1) is a unit vector, there is a unique angle θ, 0 ≤ θ < 2π, such that T(e1) = (cos θ, sin θ). Since T(e2) is a unit vector and is orthogonal to T(e1), there are only two possible choices for T(e2). Either

\[
T(e_2) = (-\sin\theta,\ \cos\theta) \quad\text{or}\quad T(e_2) = (\sin\theta,\ -\cos\theta).
\]

First, suppose that T(e2) = (−sin θ, cos θ). Then
\[
A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.
\]

It follows from Example 1 of Section 6.4 that T is a rotation by the angle θ. Also

\[
\det(A) = \cos^2\theta + \sin^2\theta = 1.
\]

Now suppose that T(e2) = (sin θ, −cos θ). Then
\[
A = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}.
\]

Comparing this matrix to the matrix A of Example 5, we see that T is the reflection of R2 about a line L such that α=θ/2 is the angle from the positive x-axis to L. Furthermore,

\[
\det(A) = -\cos^2\theta - \sin^2\theta = -1.
\]
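The proof gives an algorithm for classifying a 2×2 orthogonal matrix: read off θ from the first column, then branch on the determinant. A sketch (our helper `classify`, assuming NumPy):

```python
import numpy as np

def classify(A):
    # Classify a 2x2 orthogonal matrix per Theorem 6.23.
    assert np.allclose(A.T @ A, np.eye(2)), "A must be orthogonal"
    # First column of A is T(e1) = (cos theta, sin theta).
    theta = np.arctan2(A[1, 0], A[0, 0]) % (2 * np.pi)
    if np.isclose(np.linalg.det(A), 1.0):
        return ("rotation", theta)           # rotation by theta
    return ("reflection", theta / 2)         # reflection about the line at angle theta/2

t = 2 * np.pi / 3
R = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
kind, angle = classify(R)
assert kind == "rotation" and np.isclose(angle, t)
```

For instance, `classify(np.diag([1.0, -1.0]))` reports a reflection about the line at angle 0, i.e. the x-axis.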

Combining Theorems 6.22 and 6.23, we obtain the following characterization of rigid motions on R2 .

Corollary.

Any rigid motion on R2 is either a rotation followed by a translation or a reflection about a line through the origin followed by a translation.

Example 7

Let

\[
A = \begin{pmatrix} \frac{1}{\sqrt{5}} & \frac{2}{\sqrt{5}} \\[2pt] \frac{2}{\sqrt{5}} & -\frac{1}{\sqrt{5}} \end{pmatrix}.
\]

We show that LA is the reflection of R2 about a line L through the origin, and then describe L.

Clearly AA*=A*A=I, and therefore A is an orthogonal matrix. Hence LA is an orthogonal operator. Furthermore,

\[
\det(A) = -\frac{1}{5} - \frac{4}{5} = -1,
\]

and thus LA is a reflection of R² about a line L through the origin by Theorem 6.23. Since L is the one-dimensional eigenspace corresponding to the eigenvalue 1 of LA, it suffices to find an eigenvector of LA corresponding to 1. One such vector is v = (2, √5 − 1). Thus L is the span of {v}. Alternatively, L is the line through the origin with slope (√5 − 1)/2, and hence is the line with the equation

\[
y = \frac{\sqrt{5} - 1}{2}\,x.
\]
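The claims of Example 7 check out numerically (our sketch, assuming NumPy): A is orthogonal with determinant −1, and v = (2, √5 − 1) is fixed by A, so it spans the reflecting line L.

```python
import numpy as np

s5 = np.sqrt(5.0)
A = np.array([[1.0, 2.0], [2.0, -1.0]]) / s5
assert np.allclose(A.T @ A, np.eye(2))       # A is orthogonal
assert np.isclose(np.linalg.det(A), -1.0)    # so L_A is a reflection

v = np.array([2.0, s5 - 1.0])
assert np.allclose(A @ v, v)                 # v is an eigenvector for eigenvalue 1
assert np.isclose(v[1] / v[0], (s5 - 1.0) / 2.0)   # slope of L
```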

Conic Sections

As an application of Theorem 6.20, we consider the quadratic equation

\[
ax^2 + 2bxy + cy^2 + dx + ey + f = 0. \tag{2}
\]

For special choices of the coefficients in (2), we obtain the various conic sections. For example, if a = c = 1, b = d = e = 0, and f = −1, we obtain the circle x² + y² = 1 with center at the origin. The remaining conic sections, namely, the ellipse, parabola, and hyperbola, are obtained by other choices of the coefficients. If b = 0, then it is easy to graph the equation by the method of completing the square because the xy-term is absent. For example, the equation x² + 2x + y² + 4y + 2 = 0 may be rewritten as (x + 1)² + (y + 2)² = 3, which describes a circle with radius √3 and center at (−1, −2) in the xy-coordinate system. If we consider the transformation of coordinates (x, y) → (x′, y′), where x′ = x + 1 and y′ = y + 2, then our equation simplifies to (x′)² + (y′)² = 3. This change of variable allows us to eliminate the x- and y-terms.

We now concentrate solely on the elimination of the xy-term. To accomplish this, we consider the expression

\[
ax^2 + 2bxy + cy^2, \tag{3}
\]

which is called the associated quadratic form of (2). Quadratic forms are studied in more generality in Section 6.8.

If we let

\[
A = \begin{pmatrix} a & b \\ b & c \end{pmatrix} \quad\text{and}\quad X = \begin{pmatrix} x \\ y \end{pmatrix},
\]

then (3) may be written as XᵗAX = ⟨AX, X⟩. For example, the quadratic form 3x² + 4xy + 6y² may be written as

\[
X^t \begin{pmatrix} 3 & 2 \\ 2 & 6 \end{pmatrix} X.
\]

The fact that A is symmetric is crucial in our discussion. For, by Theorem 6.20, we may choose an orthogonal matrix P and a diagonal matrix D with real diagonal entries λ1 and λ2 such that PtAP=D. Now define

\[
X' = \begin{pmatrix} x' \\ y' \end{pmatrix}
\]

by X′ = PᵗX or, equivalently, by PX′ = PPᵗX = X. Then

\[
X^tAX = (PX')^tA(PX') = X'^t(P^tAP)X' = X'^tDX' = \lambda_1(x')^2 + \lambda_2(y')^2.
\]

Thus the transformation (x, y) → (x′, y′) allows us to eliminate the xy-term in (3), and hence in (2).

Furthermore, since P is orthogonal, we have by Theorem 6.23 (with T = LP) that det(P) = ±1. If det(P) = −1, we may interchange the columns of P to obtain a matrix Q. Because the columns of P form an orthonormal basis of eigenvectors of A, the same is true of the columns of Q. Therefore

\[
Q^tAQ = \begin{pmatrix} \lambda_2 & 0 \\ 0 & \lambda_1 \end{pmatrix}.
\]

Notice that det(Q) = −det(P) = 1. So, if det(P) = −1, we can take Q for our new P; consequently, we may always choose P so that det(P) = 1. By Theorem 6.23 (with T = LP), it follows that the matrix P represents a rotation.

In summary, the xy-term in (2) may be eliminated by a rotation of the x-axis and y-axis to new axes x′ and y′ given by X = PX′, where P is an orthogonal matrix and det(P) = 1. Furthermore, the coefficients of (x′)² and (y′)² are the eigenvalues of

\[
A = \begin{pmatrix} a & b \\ b & c \end{pmatrix}.
\]

This result is a restatement of a result known as the principal axis theorem for R2. The arguments above, of course, are easily extended to quadratic equations in n variables. For example, in the case n=3, by special choices of the coefficients, we obtain the quadratic surfaces—the elliptic cone, the ellipsoid, the hyperbolic paraboloid, etc.
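The procedure just described can be sketched as a small routine (our helper `principal_axes`, assuming NumPy): diagonalize the symmetric coefficient matrix, then swap columns if needed to force det(P) = 1 so that P is a rotation.

```python
import numpy as np

def principal_axes(a, b, c):
    # Eliminate the xy-term of ax^2 + 2bxy + cy^2 by a rotation X = P X'.
    A = np.array([[a, b], [b, c]], dtype=float)
    w, P = np.linalg.eigh(A)        # orthonormal eigenvectors as columns of P
    if np.linalg.det(P) < 0:        # interchange columns to force det(P) = +1
        P = P[:, ::-1]
        w = w[::-1]
    return w, P                     # coefficients lambda_1, lambda_2 and rotation P

w, P = principal_axes(2.0, -2.0, 5.0)     # the form 2x^2 - 4xy + 5y^2
A = np.array([[2.0, -2.0], [-2.0, 5.0]])
assert np.isclose(np.linalg.det(P), 1.0)          # P is a rotation
assert np.allclose(P.T @ A @ P, np.diag(w))       # xy-term eliminated
assert np.allclose(sorted(w), [1.0, 6.0])         # eigenvalues 1 and 6
```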

As an illustration of the preceding transformation, consider the quadratic equation

\[
2x^2 - 4xy + 5y^2 - 36 = 0,
\]

for which the associated quadratic form is 2x² − 4xy + 5y². In the notation we have been using,

\[
A = \begin{pmatrix} 2 & -2 \\ -2 & 5 \end{pmatrix},
\]

so that the eigenvalues of A are 1 and 6 with associated eigenvectors

\[
\begin{pmatrix} 2 \\ 1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} -1 \\ 2 \end{pmatrix}.
\]

As expected (from Theorem 6.15(d) p. 368), these vectors are orthogonal. The corresponding orthonormal basis of eigenvectors

\[
\beta = \left\{ \begin{pmatrix} \frac{2}{\sqrt{5}} \\[2pt] \frac{1}{\sqrt{5}} \end{pmatrix},\ \begin{pmatrix} -\frac{1}{\sqrt{5}} \\[2pt] \frac{2}{\sqrt{5}} \end{pmatrix} \right\}
\]

determines new axes x and y as in Figure 6.4. Hence if

Figure 6.4

\[
P = \begin{pmatrix} \frac{2}{\sqrt{5}} & -\frac{1}{\sqrt{5}} \\[2pt] \frac{1}{\sqrt{5}} & \frac{2}{\sqrt{5}} \end{pmatrix}
= \frac{1}{\sqrt{5}}\begin{pmatrix} 2 & -1 \\ 1 & 2 \end{pmatrix},
\]

then

\[
P^tAP = \begin{pmatrix} 1 & 0 \\ 0 & 6 \end{pmatrix}.
\]

Under the transformation X=PX or

\[
x = \frac{2}{\sqrt{5}}x' - \frac{1}{\sqrt{5}}y', \qquad
y = \frac{1}{\sqrt{5}}x' + \frac{2}{\sqrt{5}}y',
\]

we have the new quadratic form (x′)² + 6(y′)². Thus the original equation 2x² − 4xy + 5y² = 36 may be written in the form (x′)² + 6(y′)² = 36 relative to a new coordinate system with the x′- and y′-axes in the directions of the first and second vectors of β, respectively. It is clear that this equation represents an ellipse. (See Figure 6.4.) Note that the preceding matrix P has the form

\[
\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix},
\]

where θ = cos⁻¹(2/√5) ≈ 26.6°. So P is the matrix representation of a rotation of R² through the angle θ. Thus the change of variable X = PX′ can be accomplished by this rotation of the x- and y-axes. There is another possibility for P, however. If the eigenvector of A corresponding to the eigenvalue 6 is taken to be (1, −2) instead of (−1, 2), and the eigenvalues are interchanged, then we obtain the matrix

\[
\begin{pmatrix} \frac{1}{\sqrt{5}} & \frac{2}{\sqrt{5}} \\[2pt] -\frac{2}{\sqrt{5}} & \frac{1}{\sqrt{5}} \end{pmatrix},
\]

which is the matrix representation of a rotation through the angle θ = sin⁻¹(−2/√5) ≈ −63.4°. This possibility produces the same ellipse as the one in Figure 6.4, but interchanges the names of the x′- and y′-axes.
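The worked example can be verified directly (our sketch, assuming NumPy): with the chosen P, PᵗAP = diag(1, 6), det(P) = 1, and the rotation angle is cos⁻¹(2/√5) ≈ 26.6°.

```python
import numpy as np

s5 = np.sqrt(5.0)
A = np.array([[2.0, -2.0], [-2.0, 5.0]])      # coefficient matrix of 2x^2 - 4xy + 5y^2
P = np.array([[2.0, -1.0], [1.0, 2.0]]) / s5  # columns: eigenvectors for 1 and 6

assert np.allclose(P.T @ A @ P, np.diag([1.0, 6.0]))  # xy-term eliminated
assert np.isclose(np.linalg.det(P), 1.0)              # P is a rotation

theta = np.arccos(2.0 / s5)                   # rotation angle of P
assert np.isclose(np.degrees(theta), 26.565, atol=1e-3)
```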

Exercises

  1. Label the following statements as true or false. Assume that the underlying inner product spaces are finite-dimensional.

    1. (a) Every unitary operator is normal.

    2. (b) Every orthogonal operator is diagonalizable.

    3. (c) A matrix is unitary if and only if it is invertible.

    4. (d) If two matrices are unitarily equivalent, then they are also similar.

    5. (e) The sum of unitary matrices is unitary.

    6. (f) The adjoint of a unitary operator is unitary.

    7. (g) If T is an orthogonal operator on V, then [T]β is an orthogonal matrix for any ordered basis β for V.

    8. (h) If all the eigenvalues of a linear operator are 1, then the operator must be unitary or orthogonal.

    9. (i) A linear operator may preserve norms without preserving inner products.

  2. For each of the following matrices A, find an orthogonal or unitary matrix P and a diagonal matrix D such that P*AP=D.

    1. (a) \(\begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}\)

    2. (b) \(\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\)

    3. (c) \(\begin{pmatrix} 2 & 3-3i \\ 3+3i & 5 \end{pmatrix}\)

    4. (d) \(\begin{pmatrix} 0 & 2 & 2 \\ 2 & 0 & 2 \\ 2 & 2 & 0 \end{pmatrix}\)

    5. (e) \(\begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix}\)

  3. Prove that the composite of unitary [orthogonal] operators is unitary [orthogonal].

  4. For z ∈ C, define Tz: C → C by Tz(u) = zu. Characterize those z for which Tz is normal, self-adjoint, or unitary.

  5. Which of the following pairs of matrices are unitarily equivalent?

    1. (a) \(\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\) and \(\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\)

    2. (b) \(\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\) and \(\begin{pmatrix} 0 & \frac{1}{2} \\ \frac{1}{2} & 0 \end{pmatrix}\)

    3. (c) \(\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}\) and \(\begin{pmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}\)

    4. (d) \(\begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}\) and \(\begin{pmatrix} 1 & 0 & 0 \\ 0 & i & 0 \\ 0 & 0 & -i \end{pmatrix}\)

    5. (e) \(\begin{pmatrix} 1 & 1 & 0 \\ 0 & 2 & 2 \\ 0 & 0 & 3 \end{pmatrix}\) and \(\begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}\)

  6. Let V be the inner product space of complex-valued continuous functions on [0, 1] with the inner product

    \[
    \langle f,\ g\rangle = \int_0^1 f(t)\overline{g(t)}\,dt.
    \]

    Let h ∈ V, and define T: V → V by T(f) = hf. Prove that T is a unitary operator if and only if |h(t)| = 1 for 0 ≤ t ≤ 1.

    Hint for the “only if” part: Suppose that T is unitary. Set f(t) = 1 − |h(t)|² and g(t) = 1. Show that

    \[
    \int_0^1 \left(1 - |h(t)|^2\right)^2 dt = 0,
    \]

    and use the fact that if the integral of a nonnegative continuous function is zero, then the function is identically zero.

  7. Prove that if T is a unitary operator on a finite-dimensional inner product space V, then T has a unitary square root; that is, there exists a unitary operator U such that T=U2. Visit goo.gl/jADTaS for a solution.

  8. Let T be a self-adjoint linear operator on a finite-dimensional inner product space. Prove that (T + iI)(T − iI)⁻¹ is unitary using Exercise 10 of Section 6.4.

  9. Let U be a linear operator on a finite-dimensional inner product space V. If ||U(x)||=||x|| for all x in some orthonormal basis for V, must U be unitary? Justify your answer with a proof or a counterexample.

  10. Let A be an n×n real symmetric or complex normal matrix. Prove that

    \[
    \operatorname{tr}(A) = \sum_{i=1}^n \lambda_i \quad\text{and}\quad \operatorname{tr}(A^*A) = \sum_{i=1}^n |\lambda_i|^2,
    \]

    where the λi’s are the (not necessarily distinct) eigenvalues of A.

  11. Find an orthogonal matrix whose first row is (1/3, 2/3, 2/3).

  12. Let A be an n×n real symmetric or complex normal matrix. Prove that

    \[
    \det(A) = \prod_{i=1}^n \lambda_i,
    \]

    where the λi’s are the (not necessarily distinct) eigenvalues of A.

  13. Suppose that A and B are diagonalizable matrices. Prove or disprove that A is similar to B if and only if A and B are unitarily equivalent.

  14. Prove that if A and B are unitarily equivalent matrices, then A is positive definite [semidefinite] if and only if B is positive definite [semidefinite]. (See the definitions in the exercises in Section 6.4.)

  15. Let U be a unitary operator on an inner product space V, and let W be a finite-dimensional U-invariant subspace of V. Prove that

    1. (a) U(W)=W;

    2. (b) W⊥ is U-invariant.

    Contrast (b) with Exercise 16.

  16. Find an example of a unitary operator U on an inner product space and a U-invariant subspace W such that W⊥ is not U-invariant.

  17. Prove that a matrix that is both unitary and upper triangular must be a diagonal matrix.

  18. Show that “is unitarily equivalent to” is an equivalence relation on Mn×n(C).

  19. Let W be a finite-dimensional subspace of an inner product space V. By Theorem 6.7 (p. 349) and the exercises of Section 1.3, V = W ⊕ W⊥. Define U: V → V by U(v1 + v2) = v1 − v2, where v1 ∈ W and v2 ∈ W⊥. Prove that U is a self-adjoint unitary operator.

  20. Let V be a finite-dimensional inner product space. A linear operator U on V is called a partial isometry if there exists a subspace W of V such that ||U(x)|| = ||x|| for all x ∈ W and U(x) = 0 for all x ∈ W⊥. Observe that W need not be U-invariant. Suppose that U is such an operator and {v1, v2, …, vk} is an orthonormal basis for W. Prove the following results.

    1. (a) ⟨U(x), U(y)⟩ = ⟨x, y⟩ for all x, y ∈ W. Hint: Use Exercise 20 of Section 6.1.

    2. (b) {U(v1), U(v2), , U(vk)} is an orthonormal basis for R(U).

    3. (c) There exists an orthonormal basis γ for V such that the first k columns of [U]γ form an orthonormal set and the remaining columns are zero.

    4. (d) Let {w1, w2, …, wj} be an orthonormal basis for R(U)⊥ and β = {U(v1), U(v2), …, U(vk), w1, …, wj}. Then β is an orthonormal basis for V.

    5. (e) Let T be the linear operator on V that satisfies T(U(vi)) = vi (1 ≤ i ≤ k) and T(wi) = 0 (1 ≤ i ≤ j). Then T is well defined, and T = U*. Hint: Show that ⟨U(x), y⟩ = ⟨x, T(y)⟩ for all x, y ∈ β. There are four cases.

    6. (f) U* is a partial isometry.

    This exercise is continued in Exercise 9 of Section 6.6.

  21. Let A and B be n×n matrices that are unitarily equivalent.

    1. (a) Prove that tr(A*A)=tr(B*B).

    2. (b) Use (a) to prove that

      \[
      \sum_{i,j=1}^n |A_{ij}|^2 = \sum_{i,j=1}^n |B_{ij}|^2.
      \]
    3. (c) Use (b) to show that the matrices

      \[
      \begin{pmatrix} 1 & 2 \\ 2 & i \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} i & 4 \\ 1 & 1 \end{pmatrix}
      \]

      are not unitarily equivalent.

  22. Let V be a real inner product space.

    1. (a) Prove that any translation on V is a rigid motion.

    2. (b) Prove that the composite of any two rigid motions on V is a rigid motion on V.

  23. Prove the following variant of Theorem 6.22: If f: V → V is a rigid motion on a finite-dimensional real inner product space V, then there exists a unique orthogonal operator T on V and a unique translation g on V such that f = Tg. (Note that the conclusion of Theorem 6.22 has f = gT.)

  24. Let T and U be orthogonal operators on R2 . Use Theorem 6.23 to prove the following results.

    1. (a) If T and U are both reflections about lines through the origin, then UT is a rotation.

    2. (b) If T is a rotation and U is a reflection about a line through the origin, then both UT and TU are reflections about lines through the origin.

  25. Suppose that T and U are reflections of R² about the respective lines L and L′ through the origin and that ϕ and ψ are the angles from the positive x-axis to L and L′, respectively. By Exercise 24, UT is a rotation. Find its angle of rotation.

  26. Suppose that T and U are orthogonal operators on R2 such that T is the rotation by the angle ϕ and U is the reflection about the line L through the origin. Let ψ be the angle from the positive x-axis to L. By Exercise 24, both UT and TU are reflections about lines L1 and L2, respectively, through the origin.

    1. (a) Find the angle θ from the positive x-axis to L1.

    2. (b) Find the angle θ from the positive x-axis to L2.

  27. Find new coordinates x′, y′ so that the following quadratic forms can be written as λ1(x′)² + λ2(y′)².

    1. (a) x2+4xy+y2

    2. (b) 2x2+2xy+2y2

    3. (c) x² − 12xy − 4y²

    4. (d) 3x2+2xy+3y2

    5. (e) x² − 2xy + y²

  28. Consider the expression XᵗAX, where Xᵗ = (x, y, z) and A is as defined in Exercise 2(e). Find a change of coordinates x′, y′, z′ so that the preceding expression is of the form λ1(x′)² + λ2(y′)² + λ3(z′)².

  29. QR-Factorization. Let w1, w2, , wn be linearly independent vectors in Fn, and let v1, v2, , vn be the orthogonal vectors obtained from w1, w2, , wn by the Gram-Schmidt process. Let u1, u2, , un be the orthonormal basis obtained by normalizing the vi’s.

    1. (a) Solving (1) in Section 6.2 for wk in terms of uk, show that

      \[
      w_k = \|v_k\| u_k + \sum_{j=1}^{k-1} \langle w_k,\ u_j\rangle u_j \qquad (1 \le k \le n).
      \]
    2. (b) Let A and Q denote the n×n matrices in which the kth columns are wk and uk, respectively. Define RMn×n(F) by

      \[
      R_{jk} =
      \begin{cases}
      \|v_j\| & \text{if } j = k \\
      \langle w_k,\ u_j\rangle & \text{if } j < k \\
      0 & \text{if } j > k.
      \end{cases}
      \]

      Prove A=QR.

    3. (c) Compute Q and R as in (b) for the 3×3 matrix whose columns are the vectors (1, 1, 0), (2, 0, 1), and (2, 2, 1).

    4. (d) Since Q is unitary [orthogonal] and R is upper triangular in (b), we have shown that every invertible matrix is the product of a unitary [orthogonal] matrix and an upper triangular matrix. Suppose that A ∈ Mn×n(F) is invertible and A = Q1R1 = Q2R2, where Q1, Q2 ∈ Mn×n(F) are unitary and R1, R2 ∈ Mn×n(F) are upper triangular. Prove that D = R2R1⁻¹ is a unitary diagonal matrix. Hint: Use Exercise 17.

    5. (e) The QR factorization described in (b) provides an orthogonalization method for solving a linear system Ax=b when A is invertible. Decompose A to QR, by the Gram-Schmidt process or other means, where Q is unitary and R is upper triangular. Then QRx=b, and hence Rx=Q*b. This last system can be easily solved since R is upper triangular.1

      Use the orthogonalization method and (c) to solve the system

      \[
      \begin{aligned}
      x_1 + 2x_2 + 2x_3 &= 1 \\
      x_1 \hphantom{{}+ 2x_2} + 2x_3 &= 11 \\
      x_2 + x_3 &= 1.
      \end{aligned}
      \]
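The QR factorization of Exercise 29(b) can be sketched in code (our helper `qr_gram_schmidt`, assuming NumPy), using classical Gram-Schmidt and the matrix from part (c) whose columns are (1, 1, 0), (2, 0, 1), and (2, 2, 1):

```python
import numpy as np

def qr_gram_schmidt(A):
    # QR factorization by classical Gram-Schmidt, following part (b):
    # R[j, k] = <w_k, u_j> for j < k and R[k, k] = ||v_k||.
    n = A.shape[1]
    Q = np.zeros_like(A, dtype=float)
    R = np.zeros((n, n))
    for k in range(n):
        v = A[:, k].astype(float).copy()
        for j in range(k):
            R[j, k] = Q[:, j] @ A[:, k]    # <w_k, u_j>
            v -= R[j, k] * Q[:, j]         # subtract projections (Gram-Schmidt)
        R[k, k] = np.linalg.norm(v)        # ||v_k||
        Q[:, k] = v / R[k, k]
    return Q, R

A = np.array([[1.0, 2.0, 2.0],
              [1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0]])
Q, R = qr_gram_schmidt(A)
assert np.allclose(Q.T @ Q, np.eye(3))     # Q is orthogonal
assert np.allclose(np.triu(R), R)          # R is upper triangular
assert np.allclose(Q @ R, A)               # A = QR
```

With Q and R in hand, a system Ax = b is solved from Rx = Qᵗb by back substitution, as part (e) describes.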
  30. Suppose that β and γ are ordered bases for an n-dimensional real [complex] inner product space V. Prove that if Q is an orthogonal [unitary] n×n matrix that changes γ-coordinates into β-coordinates, then β is orthonormal if and only if γ is orthonormal.

The following definition is used in Exercises 31 and 32.

Definition.

Let V be a finite-dimensional complex [real] inner product space, and let u be a unit vector in V. Define the Householder operator Hu: V → V by Hu(x) = x − 2⟨x, u⟩u for all x ∈ V.

  31. Let Hu be a Householder operator on a finite-dimensional inner product space V. Prove the following results.

    1. (a) Hu is linear.

    2. (b) Hu(x)=x if and only if x is orthogonal to u.

    3. (c) Hu(u) = −u.

    4. (d) Hu* = Hu and Hu² = I, and hence Hu is a unitary [orthogonal] operator on V.

    (Note: If V is a real inner product space, then in the language of Section 6.11, Hu is a reflection.)

  32. Let V be a finite-dimensional inner product space over F. Let x and y be linearly independent vectors in V such that ||x|| = ||y||.

    1. (a) If F = C, prove that there exists a unit vector u in V and a complex number θ with |θ| = 1 such that Hu(x) = θy. Hint: Choose θ so that ⟨x, θy⟩ is real, and set u = (1/||x − θy||)(x − θy).

    2. (b) If F=R, prove that there exists a unit vector u in V such that Hu(x)=y.
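The Householder operator defined above is easy to experiment with numerically. A minimal sketch (our illustration, assuming NumPy) checks the properties of Exercise 31 for a random unit vector u in C³:

```python
import numpy as np

rng = np.random.default_rng(1)
u = rng.standard_normal(3) + 1j * rng.standard_normal(3)
u = u / np.linalg.norm(u)                  # u is a unit vector

def H(x):
    # H_u(x) = x - 2 <x, u> u, with <x, u> = sum_i x_i conj(u_i)
    return x - 2 * np.vdot(u, x) * u       # np.vdot conjugates its first argument

x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
assert np.allclose(H(u), -u)               # (c): H_u(u) = -u
assert np.allclose(H(H(x)), x)             # (d): H_u^2 = I
assert np.isclose(np.linalg.norm(H(x)), np.linalg.norm(x))  # H_u preserves norms
```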