6.5 Unitary and Orthogonal Operators and Their Matrices

In this section, we continue our analogy between complex numbers and linear operators. Recall that the adjoint of a linear operator acts similarly to the conjugate of a complex number (see, for example, Theorem 6.11, p. 357). A complex number $z$ has length 1 if $z\bar{z} = 1$. In this section, we study those linear operators T on an inner product space V such that $TT^* = T^*T = I$. We will see that these are precisely the linear operators that “preserve length” in the sense that $\|T(x)\| = \|x\|$ for all $x \in V$. As another characterization, we prove that, on a finite-dimensional complex inner product space, these are the normal operators whose eigenvalues all have absolute value 1.

In past chapters, we were interested in studying those functions that preserve the structure of the underlying space. In particular, linear operators preserve the operations of vector addition and scalar multiplication, and isomorphisms preserve all the vector space structure. It is now natural to consider those linear operators T on an inner product space that preserve length. We will see that this condition guarantees, in fact, that T preserves the inner product.

Definitions.

Let T be a linear operator on a finite-dimensional inner product space V (over $F$). If $\|T(x)\| = \|x\|$ for all $x \in V$, we call T a unitary operator if $F = C$ and an orthogonal operator if $F = R$.

It should be noted that in the infinite-dimensional case, an operator that preserves the norm is one-to-one, but not necessarily onto. If it is also onto, then we call it a unitary or orthogonal operator.

Clearly, any rotation or reflection in $R^2$ preserves length and hence is an orthogonal operator. We study these operators in much more detail in Section 6.11.

Example 1

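Recall the inner product space H defined on page 330. Let $h \in H$ satisfy $|h(x)| = 1$ for all x. Define the linear operator T on H by $T(f) = hf$. Then

$$\|T(f)\|^2 = \|hf\|^2 = \frac{1}{2\pi}\int_0^{2\pi} h(t)f(t)\overline{h(t)f(t)}\,dt = \|f\|^2$$

since $|h(t)|^2 = 1$ for all t. So T is a unitary operator.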

Theorem 6.18.

Let T be a linear operator on a finite-dimensional inner product space V. Then the following statements are equivalent.

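1. (a) $T^*T = I$.

2. (b) $TT^* = I$.

3. (c) $\langle T(x), T(y)\rangle = \langle x, y\rangle$ for all $x, y \in V$.

4. (d) If $\beta$ is an orthonormal basis for V, then $T(\beta)$ is an orthonormal basis for V.

5. (e) There exists an orthonormal basis $\beta$ for V such that $T(\beta)$ is an orthonormal basis for V.

6. (f) $\|T(x)\| = \|x\|$ for all $x \in V$.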

Thus all the conditions above are equivalent to the definition of a unitary or orthogonal operator. From (a) and (b), it follows that unitary or orthogonal operators are normal.
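The equivalences can be spot-checked numerically. The following is a minimal sketch, assuming the standard inner product on $C^2$ and a sample unitary matrix `U` chosen only for illustration (neither the matrix nor the test vectors come from the text); it checks conditions (a), (b), (c), and (f) for that one matrix.

```python
import math

def adjoint(A):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

def inner(x, y):
    """Standard inner product on C^n, conjugate-linear in the second slot."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def is_identity(M, tol=1e-12):
    n = len(M)
    return all(abs(M[i][j] - (1 if i == j else 0)) <= tol
               for i in range(n) for j in range(n))

s = 1 / math.sqrt(2)
U = [[s, s * 1j], [s * 1j, s]]   # assumed sample matrix, not from the text
x = [1 + 2j, 3 - 1j]             # arbitrary test vectors
y = [0.5j, -2]

cond_a = is_identity(matmul(adjoint(U), U))                           # T*T = I
cond_b = is_identity(matmul(U, adjoint(U)))                           # TT* = I
cond_c = abs(inner(apply(U, x), apply(U, y)) - inner(x, y)) < 1e-12   # inner products
norm = lambda v: math.sqrt(inner(v, v).real)
cond_f = abs(norm(apply(U, x)) - norm(x)) < 1e-12                     # norms
```

A check on two vectors is of course only evidence, not a proof; the theorem is what guarantees that the four flags always agree.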

Before proving the theorem, we first prove a lemma. Compare this lemma to Exercise 11(b) of Section 6.4.

Lemma. Let U be a self-adjoint operator on an inner product space V, and suppose that $\langle x, U(x)\rangle = 0$ for all $x \in V$. Then $U = T_0$.

Proof.

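For any $x \in V$,

$$\begin{aligned}
0 &= \langle x + U(x), U(x + U(x))\rangle \\
  &= \langle x, U(x)\rangle + \langle x, U^2(x)\rangle + \langle U(x), U(x)\rangle + \langle U(x), U(U(x))\rangle \\
  &= 0 + \langle x, U^*U(x)\rangle + \|U(x)\|^2 + 0 \\
  &= 2\|U(x)\|^2.
\end{aligned}$$

So for any $x \in V$, $\|U(x)\| = 0$. It follows that $U = T_0$.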

Proof of Theorem 6.18.

Part (a) implies (b) by Theorem 6.10 and Exercise 10(c) of Section 2.4.

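To prove that (b) implies (c), let $x, y \in V$. Since $TT^* = I$ also gives $T^*T = I$ (again by Exercise 10(c) of Section 2.4), we have $\langle x, y\rangle = \langle T^*T(x), y\rangle = \langle T(x), T(y)\rangle$.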

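Next, we prove that (c) implies (d). Let $\beta = \{v_1, v_2, \ldots, v_n\}$ be an orthonormal basis for V; so $T(\beta) = \{T(v_1), T(v_2), \ldots, T(v_n)\}$. It follows that $\langle T(v_i), T(v_j)\rangle = \langle v_i, v_j\rangle = \delta_{ij}$. Therefore $T(\beta)$ is an orthonormal basis for V. That (d) implies (e) is obvious.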

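Next we prove that (e) implies (f). Let $x \in V$, and let $\beta = \{v_1, v_2, \ldots, v_n\}$. Now

$$x = \sum_{i=1}^{n} a_i v_i$$

for some scalars $a_i$, and so

$$\|x\|^2 = \left\langle \sum_{i=1}^{n} a_i v_i, \sum_{j=1}^{n} a_j v_j \right\rangle = \sum_{i=1}^{n}\sum_{j=1}^{n} a_i \overline{a_j}\,\langle v_i, v_j\rangle = \sum_{i=1}^{n}\sum_{j=1}^{n} a_i \overline{a_j}\,\delta_{ij} = \sum_{i=1}^{n} |a_i|^2$$

since $\beta$ is orthonormal.

Applying the same manipulations to

$$T(x) = \sum_{i=1}^{n} a_i T(v_i)$$

and using the fact that $T(\beta)$ is also orthonormal, we obtain

$$\|T(x)\|^2 = \sum_{i=1}^{n} |a_i|^2.$$

Hence $\|T(x)\| = \|x\|$.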

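Finally, we prove that (f) implies (a). For any $x \in V$, we have

$$\langle x, x\rangle = \|x\|^2 = \|T(x)\|^2 = \langle T(x), T(x)\rangle = \langle x, T^*T(x)\rangle.$$

So $\langle x, (I - T^*T)(x)\rangle = 0$ for all $x \in V$. Let $U = I - T^*T$; then U is self-adjoint, and $\langle x, U(x)\rangle = 0$ for all $x \in V$. Hence, by the lemma, we have $T_0 = U = I - T^*T$, and therefore $T^*T = I$.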

In the case that T satisfies (c), we say that T preserves inner products. In the case that T satisfies (f), we say that T preserves norms.

It follows immediately from the definition that every eigenvalue of a unitary or orthogonal operator has absolute value 1. In fact, even more is true.

Corollary 1.

Let T be a linear operator on a finite-dimensional real inner product space V. Then V has an orthonormal basis of eigenvectors of T with corresponding eigenvalues of absolute value 1 if and only if T is both self-adjoint and orthogonal.

Proof.

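Suppose that V has an orthonormal basis $\{v_1, v_2, \ldots, v_n\}$ such that $T(v_i) = \lambda_i v_i$ and $|\lambda_i| = 1$ for all i. By Theorem 6.17 (p. 371), T is self-adjoint. Thus $(TT^*)(v_i) = T(\lambda_i v_i) = \lambda_i\lambda_i v_i = \lambda_i^2 v_i = v_i$ for each i. So $TT^* = I$, and again by Exercise 10 of Section 2.4, T is orthogonal by Theorem 6.18(a).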

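If T is self-adjoint, then, by Theorem 6.17, we have that V possesses an orthonormal basis $\{v_1, v_2, \ldots, v_n\}$ such that $T(v_i) = \lambda_i v_i$ for all i. If T is also orthogonal, we have

$$|\lambda_i| \cdot \|v_i\| = \|\lambda_i v_i\| = \|T(v_i)\| = \|v_i\|;$$

so $|\lambda_i| = 1$ for every i.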

Corollary 2.

Let T be a linear operator on a finite-dimensional complex inner product space V. Then V has an orthonormal basis of eigenvectors of T with corresponding eigenvalues of absolute value 1 if and only if T is unitary.

Proof.

The proof is similar to the proof of Corollary 1.

Example 2

Let $T: R^2 \to R^2$ be a rotation by $\theta$, where $0 < \theta < \pi$. It is clear geometrically that T “preserves length,” that is, that $\|T(x)\| = \|x\|$ for all $x \in R^2$. The fact that rotations by a fixed angle preserve perpendicularity not only can be seen geometrically but now follows from (c) of Theorem 6.18. Perhaps the fact that such a transformation preserves the inner product is not so obvious; however, we obtain this fact from (c) also. Finally, an inspection of the matrix representation of T with respect to the standard ordered basis, which is

$$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix},$$

reveals that T is not self-adjoint for the given restriction on $\theta$. As we mentioned earlier, this fact also follows from the geometric observation that T has no eigenvectors and from Theorem 6.15 (p. 368). It is seen easily from the preceding matrix that T* is the rotation by $-\theta$.
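The last two observations of this example can be checked directly. The sketch below assumes the usual rotation matrix with respect to the standard basis and an illustrative angle $\theta = \pi/3$ (any angle strictly between 0 and $\pi$ would do); for a real matrix the adjoint is simply the transpose.

```python
import math

def rotation(theta):
    """Matrix of the rotation by theta relative to the standard basis of R^2."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def transpose(A):
    return [list(col) for col in zip(*A)]

def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices of the same shape."""
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

theta = math.pi / 3            # illustrative angle with 0 < theta < pi
A = rotation(theta)
adjoint_A = transpose(A)       # real matrix: adjoint = transpose

adjoint_is_rotation_by_minus_theta = close(adjoint_A, rotation(-theta))
self_adjoint = close(A, adjoint_A)   # False, because sin(theta) != 0
```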

Definition.

Let L be a one-dimensional subspace of $R^2$. We may view L as a line in the plane through the origin. A linear operator T on $R^2$ is called a reflection of $R^2$ about L if $T(x) = x$ for all $x \in L$ and $T(x) = -x$ for all $x \in L^\perp$.

As an example of a reflection, consider the operator defined in Example 3 of Section 2.5.

Example 3

Let T be a reflection of $R^2$ about a line L through the origin. We show that T is an orthogonal operator. Select vectors $v_1 \in L$ and $v_2 \in L^\perp$ such that $\|v_1\| = \|v_2\| = 1$. Then $T(v_1) = v_1$ and $T(v_2) = -v_2$. Thus $v_1$ and $v_2$ are eigenvectors of T with corresponding eigenvalues 1 and $-1$, respectively. Furthermore, $\{v_1, v_2\}$ is an orthonormal basis for $R^2$. It follows that T is an orthogonal operator by Corollary 1 to Theorem 6.18.
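A reflection is easy to experiment with. The sketch below uses the closed form $T(x) = 2\langle x, v_1\rangle v_1 - x$, which is not stated in the text but agrees with the definition: it fixes multiples of $v_1$ and negates vectors orthogonal to $v_1$. The angle `alpha` and the test vector are illustrative assumptions.

```python
import math

def reflect(x, alpha):
    """Reflect x in R^2 about the line through the origin at angle alpha."""
    v1 = (math.cos(alpha), math.sin(alpha))   # unit vector spanning L
    c = x[0] * v1[0] + x[1] * v1[1]           # <x, v1>
    return (2 * c * v1[0] - x[0], 2 * c * v1[1] - x[1])

alpha = math.pi / 6                           # illustrative angle of the line L
v1 = (math.cos(alpha), math.sin(alpha))       # lies in L
v2 = (-math.sin(alpha), math.cos(alpha))      # lies in the orthogonal complement of L
x = (3.0, -4.0)                               # arbitrary test vector of length 5

image_v1 = reflect(v1, alpha)                 # should equal v1
image_v2 = reflect(v2, alpha)                 # should equal -v2
norm_before = math.hypot(*x)
norm_after = math.hypot(*reflect(x, alpha))   # length is preserved
```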

We now examine the matrices that represent unitary and orthogonal transformations.

Definitions.

A square matrix A is called an orthogonal matrix if $A^tA = AA^t = I$ and unitary if $A^*A = AA^* = I$.

Since for a real matrix A we have $A^* = A^t$, a real unitary matrix is also orthogonal. In this case, we call A orthogonal rather than unitary.

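Note that the condition $AA^* = I$ is equivalent to the statement that the rows of A form an orthonormal basis for $F^n$ because

$$\delta_{ij} = I_{ij} = (AA^*)_{ij} = \sum_{k=1}^{n} A_{ik}(A^*)_{kj} = \sum_{k=1}^{n} A_{ik}\overline{A_{jk}},$$

and the last term represents the inner product of the ith and jth rows of A.

A similar remark can be made about the columns of A and the condition $A^*A = I$.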
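The row criterion translates directly into a test for unitarity. In this minimal sketch the helper names `row_gram` and `is_unitary` and the three sample matrices are illustrative assumptions; the function computes the inner products of all pairs of rows, which are exactly the entries of $AA^*$.

```python
def row_gram(A):
    """Matrix of inner products <row_i, row_j>, i.e. the entries of AA*."""
    return [[sum(a * complex(b).conjugate() for a, b in zip(ri, rj))
             for rj in A] for ri in A]

def is_unitary(A, tol=1e-12):
    """True when the rows of the square matrix A are orthonormal."""
    G = row_gram(A)
    n = len(A)
    return all(abs(G[i][j] - (1 if i == j else 0)) <= tol
               for i in range(n) for j in range(n))

orthogonal_example = [[0.6, 0.8], [-0.8, 0.6]]      # real, rows orthonormal
unitary_example = [[0.6, 0.8j], [0.8j, 0.6]]        # complex, rows orthonormal
invertible_not_unitary = [[1, 1], [0, 1]]           # invertible, rows not orthonormal
```

The third matrix is a reminder that invertibility alone does not make a matrix unitary.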

It also follows from the definition above and from Theorem 6.10 (p. 356) that a linear operator T on an inner product space V is unitary [orthogonal] if and only if $[T]_\beta$ is unitary [orthogonal] for some orthonormal basis $\beta$ for V.

Example 4

From Example 2, the matrix

$$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$

is clearly orthogonal. One can easily see that the rows of the matrix form an orthonormal basis for $R^2$. Similarly, the columns of the matrix form an orthonormal basis for $R^2$.

Example 5

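Let T be a reflection of $R^2$ about a line L through the origin, let $\beta$ be the standard ordered basis for $R^2$, and let $A = [T]_\beta$. Then $T = L_A$. Since T is an orthogonal operator and $\beta$ is an orthonormal basis, A is an orthogonal matrix. We describe A.

Suppose that $\alpha$ is the angle from the positive x-axis to L. Let $v_1 = (\cos\alpha, \sin\alpha)$ and $v_2 = (-\sin\alpha, \cos\alpha)$. Then $\|v_1\| = \|v_2\| = 1$, $v_1 \in L$, and $v_2 \in L^\perp$. Hence $\gamma = \{v_1, v_2\}$ is an orthonormal basis for $R^2$. Because $T(v_1) = v_1$ and $T(v_2) = -v_2$, we have

$$[T]_\gamma = [L_A]_\gamma = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

Let

$$Q = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}.$$

By the corollary to Theorem 2.23 (p. 115),

$$A = Q[L_A]_\gamma Q^{-1} = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix} = \begin{pmatrix} \cos^2\alpha - \sin^2\alpha & 2\sin\alpha\cos\alpha \\ 2\sin\alpha\cos\alpha & -(\cos^2\alpha - \sin^2\alpha) \end{pmatrix} = \begin{pmatrix} \cos 2\alpha & \sin 2\alpha \\ \sin 2\alpha & -\cos 2\alpha \end{pmatrix}.$$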



We know that, for a complex normal [real symmetric] matrix A, there exists an orthonormal basis $\beta$ for $F^n$ consisting of eigenvectors of A. Hence A is similar to a diagonal matrix D. By the corollary to Theorem 2.23 (p. 115), the matrix Q whose columns are the vectors in $\beta$ is such that $D = Q^{-1}AQ$. But since the columns of Q are an orthonormal basis for $F^n$, it follows that Q is unitary [orthogonal]. In this case, we say that A is unitarily equivalent [orthogonally equivalent] to D. It is easily seen (see Exercise 18) that this relation is an equivalence relation on $M_{n \times n}(C)$ [$M_{n \times n}(R)$]. More generally, A and B are unitarily equivalent [orthogonally equivalent] if and only if there exists a unitary [orthogonal] matrix P such that $A = P^*BP$.

The preceding paragraph has proved half of each of the next two theorems.

Theorem 6.19.

Let A be a complex $n \times n$ matrix. Then A is normal if and only if A is unitarily equivalent to a diagonal matrix.

Proof.

By the preceding remarks, we need only prove that if A is unitarily equivalent to a diagonal matrix, then A is normal.

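Suppose that $A = P^*DP$, where P is a unitary matrix and D is a diagonal matrix. Then

$$AA^* = (P^*DP)(P^*DP)^* = (P^*DP)(P^*D^*P) = P^*DID^*P = P^*DD^*P.$$

Similarly, $A^*A = P^*D^*DP$. Since D is a diagonal matrix, however, we have $DD^* = D^*D$. Thus $AA^* = A^*A$.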
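The computation in this proof can be replayed on a concrete matrix. The following sketch assumes a sample unitary matrix `P` and a sample complex diagonal matrix `D` (both chosen only for illustration), forms the product P\*DP, and confirms that the result commutes with its adjoint even though it is not self-adjoint.

```python
import math

def adjoint(A):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(A, B, tol=1e-12):
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

s = 1 / math.sqrt(2)
P = [[s, s * 1j], [s * 1j, s]]     # assumed sample unitary matrix
D = [[1 + 2j, 0], [0, 3j]]         # assumed sample diagonal matrix (not real)

A = matmul(matmul(adjoint(P), D), P)                          # A = P* D P
is_normal = close(matmul(A, adjoint(A)), matmul(adjoint(A), A))
is_self_adjoint = close(A, adjoint(A))                        # False: D is not real
```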

Theorem 6.20.

Let A be a real $n \times n$ matrix. Then A is symmetric if and only if A is orthogonally equivalent to a real diagonal matrix.

Proof.

The proof is similar to the proof of Theorem 6.19 and is left as an exercise.

Theorem 6.20 is used extensively in many areas of mathematics and statistics. See, for example, goo.gl/cbqApK.

Example 6

Let

$$A = \begin{pmatrix} 4 & 2 & 2 \\ 2 & 4 & 2 \\ 2 & 2 & 4 \end{pmatrix}.$$

Since A is symmetric, Theorem 6.20 tells us that A is orthogonally equivalent to a diagonal matrix. We find an orthogonal matrix P and a diagonal matrix D such that $P^tAP = D$.

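To find P, we obtain an orthonormal basis of eigenvectors. It is easy to show that the eigenvalues of A are 2 and 8. The set $\{(-1, 1, 0), (-1, 0, 1)\}$ is a basis for the eigenspace corresponding to 2. Because this set is not orthogonal, we apply the Gram-Schmidt process to obtain the orthogonal set $\{(-1, 1, 0), -\tfrac{1}{2}(1, 1, -2)\}$. The set $\{(1, 1, 1)\}$ is a basis for the eigenspace corresponding to 8. Notice that (1, 1, 1) is orthogonal to the preceding two vectors, as predicted by Theorem 6.15(d) (p. 368). Taking the union of these two bases and normalizing the vectors, we obtain the following orthonormal basis for $R^3$ consisting of eigenvectors of A:

$$\left\{ \frac{1}{\sqrt{2}}(-1, 1, 0),\ \frac{1}{\sqrt{6}}(1, 1, -2),\ \frac{1}{\sqrt{3}}(1, 1, 1) \right\}.$$

Thus one possible choice for P is

$$P = \begin{pmatrix} -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{3}} \\ 0 & -\frac{2}{\sqrt{6}} & \frac{1}{\sqrt{3}} \end{pmatrix}, \quad \text{and} \quad D = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 8 \end{pmatrix}.$$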
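The procedure of this example can be sketched in code. The block below assumes that A is the symmetric matrix with rows (4, 2, 2), (2, 4, 2), (2, 2, 4), which is the matrix forced by the stated eigen-data (eigenvalue 8 on the span of (1, 1, 1) and eigenvalue 2 on its orthogonal complement), and that (−1, 1, 0) and (−1, 0, 1) are taken as the starting basis of the eigenspace for 2. It orthogonalizes, normalizes, assembles P column by column, and computes the product of the transpose of P, A, and P.

```python
import math

A = [[4, 2, 2], [2, 4, 2], [2, 2, 4]]   # assumed matrix consistent with the eigen-data

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthogonalize a linearly independent list of real vectors."""
    basis = []
    for w in vectors:
        v = list(w)
        for b in basis:
            coeff = dot(w, b) / dot(b, b)
            v = [vi - coeff * bi for vi, bi in zip(v, b)]
        basis.append(v)
    return basis

def normalize(v):
    length = math.sqrt(dot(v, v))
    return [vi / length for vi in v]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

eig2 = gram_schmidt([[-1, 1, 0], [-1, 0, 1]])   # eigenspace for eigenvalue 2
eig8 = [[1, 1, 1]]                              # eigenspace for eigenvalue 8
columns = [normalize(v) for v in eig2 + eig8]
P = transpose(columns)                          # eigenvectors become the columns of P
D = matmul(matmul(transpose(P), A), P)          # should be diag(2, 2, 8) up to rounding
```

Gram-Schmidt may return the second vector with the opposite sign from a hand computation; either sign gives a valid orthogonal P.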



Because of Schur’s theorem (Theorem 6.14 p. 367), the next result is immediate. As it is the matrix form of Schur’s theorem, we also refer to it as Schur’s theorem.

Theorem 6.21. (Schur).

Let $A \in M_{n \times n}(F)$ be a matrix whose characteristic polynomial splits over $F$.

1. (a) If $F = C$, then A is unitarily equivalent to a complex upper triangular matrix.

2. (b) If $F = R$, then A is orthogonally equivalent to a real upper triangular matrix.

Rigid Motions*

The purpose of this application is to characterize the so-called rigid motions of a finite-dimensional real inner product space. One may think intuitively of such a motion as a transformation that does not affect the shape of a figure under its action, hence the term rigid. The key requirement for such a transformation is that it preserves distances.

Definition.

Let V be a real inner product space. A function $f: V \to V$ is called a rigid motion if

$$\|f(x) - f(y)\| = \|x - y\|$$

for all $x, y \in V$.

For example, any orthogonal operator on a finite-dimensional real inner product space is a rigid motion.

Another class of rigid motions are the translations. A function $g: V \to V$, where V is a real inner product space, is called a translation if there exists a vector $v_0 \in V$ such that $g(x) = x + v_0$ for all $x \in V$. We say that g is the translation by $v_0$. It is a simple exercise to show that translations, as well as composites of rigid motions on a real inner product space, are also rigid motions. (See Exercise 22.) Thus an orthogonal operator on a finite-dimensional real inner product space V followed by a translation on V is a rigid motion on V. Remarkably, every rigid motion on V may be characterized in this way.

Theorem 6.22.

Let $f: V \to V$ be a rigid motion on a finite-dimensional real inner product space V. Then there exists a unique orthogonal operator T on V and a unique translation g on V such that $f = g \circ T$.

Any orthogonal operator is a special case of this composite, in which the translation is by 0. Any translation is also a special case, in which the orthogonal operator is the identity operator.

Proof.

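Let $T: V \to V$ be defined by

$$T(x) = f(x) - f(0)$$

for all $x \in V$. Note that $f = g \circ T$, where g is the translation by f(0). Moreover, T is the composite of f and the translation by $-f(0)$; hence T is a rigid motion. We begin by showing that T preserves norms and inner products. For any $x \in V$, we have

$$\|T(x)\|^2 = \|f(x) - f(0)\|^2 = \|x - 0\|^2 = \|x\|^2,$$

and consequently $\|T(x)\| = \|x\|$ for any $x \in V$. Thus for any $x, y \in V$,

$$\|T(x) - T(y)\|^2 = \|T(x)\|^2 - 2\langle T(x), T(y)\rangle + \|T(y)\|^2 = \|x\|^2 - 2\langle T(x), T(y)\rangle + \|y\|^2$$

and

$$\|x - y\|^2 = \|x\|^2 - 2\langle x, y\rangle + \|y\|^2.$$

But $\|T(x) - T(y)\|^2 = \|x - y\|^2$; so $\langle T(x), T(y)\rangle = \langle x, y\rangle$ for all $x, y \in V$.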

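We are now in a position to show that T is a linear transformation. Let $x, y \in V$, and let $a \in R$. Then

$$\begin{aligned}
\|T(x + ay) - T(x) - aT(y)\|^2 &= \|[T(x + ay) - T(x)] - aT(y)\|^2 \\
&= \|T(x + ay) - T(x)\|^2 + a^2\|T(y)\|^2 - 2a\langle T(x + ay) - T(x), T(y)\rangle \\
&= \|(x + ay) - x\|^2 + a^2\|y\|^2 - 2a[\langle T(x + ay), T(y)\rangle - \langle T(x), T(y)\rangle] \\
&= a^2\|y\|^2 + a^2\|y\|^2 - 2a[\langle x + ay, y\rangle - \langle x, y\rangle] \\
&= 2a^2\|y\|^2 - 2a[\langle x, y\rangle + a\|y\|^2 - \langle x, y\rangle] \\
&= 0.
\end{aligned}$$

Thus $T(x + ay) = T(x) + aT(y)$, and hence T is linear. Since we have already shown that T preserves inner products, T is an orthogonal operator.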

To prove uniqueness, suppose that $u_0$ and $v_0$ are in V and T and U are orthogonal operators on V such that

$$f(x) = T(x) + u_0 = U(x) + v_0$$

for all $x \in V$. Substituting $x = 0$ in the preceding equation yields $u_0 = v_0$, and hence the translation is unique. This equation, therefore, reduces to $T(x) = U(x)$ for all $x \in V$, and hence $T = U$.
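The construction in the proof, T(x) = f(x) − f(0), is directly computable. The sketch below works in the plane with a hypothetical rigid motion `f` (a rotation by 40 degrees followed by a translation by (2, −1); both choices are assumptions made for illustration). It recovers the translation vector as f(0) and the matrix of T from the images of the standard basis vectors.

```python
import math

def f(x):
    """Hypothetical rigid motion: rotate by 40 degrees, then translate by (2, -1)."""
    c, s = math.cos(math.radians(40)), math.sin(math.radians(40))
    return (c * x[0] - s * x[1] + 2.0, s * x[0] + c * x[1] - 1.0)

def decompose(f):
    """Return (A, v0) with f(x) = A x + v0, following T(x) = f(x) - f(0)."""
    v0 = f((0.0, 0.0))                                   # translation vector f(0)
    T = lambda x: tuple(fi - vi for fi, vi in zip(f(x), v0))
    col1, col2 = T((1.0, 0.0)), T((0.0, 1.0))            # images of e1 and e2
    A = [[col1[0], col2[0]], [col1[1], col2[1]]]
    return A, v0

A, v0 = decompose(f)

# A^t A should be the identity, confirming that T is orthogonal.
gram = [[sum(A[k][i] * A[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)]

# f should agree with "apply A, then translate by v0" at an arbitrary point.
p = (1.5, -0.25)
rebuilt = (A[0][0] * p[0] + A[0][1] * p[1] + v0[0],
           A[1][0] * p[0] + A[1][1] * p[1] + v0[1])
```

Reading off the matrix from two basis vectors relies on the linearity of T, which is exactly what the proof establishes.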

Orthogonal Operators on $R^2$

Because of Theorem 6.22, an understanding of rigid motions requires a characterization of orthogonal operators. The next result characterizes orthogonal operators on $R^2$. We postpone the case of orthogonal operators on more general spaces to Section 6.11.

Theorem 6.23.

Let T be an orthogonal operator on $R^2$, and let $A = [T]_\beta$, where $\beta$ is the standard ordered basis for $R^2$. Then exactly one of the following conditions is satisfied:

1. (a) T is a rotation, and $\det(A) = 1$.

2. (b) T is a reflection about a line through the origin, and $\det(A) = -1$.

Proof.

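Because T is an orthogonal operator, $T(\beta) = \{T(e_1), T(e_2)\}$ is an orthonormal basis for $R^2$ by Theorem 6.18(d). Since $T(e_1)$ is a unit vector, there is a unique angle $\theta$, $0 \le \theta < 2\pi$, such that $T(e_1) = (\cos\theta, \sin\theta)$. Since $T(e_2)$ is a unit vector and is orthogonal to $T(e_1)$, there are only two possible choices for $T(e_2)$. Either

$$T(e_2) = (-\sin\theta, \cos\theta) \quad \text{or} \quad T(e_2) = (\sin\theta, -\cos\theta).$$

First, suppose that $T(e_2) = (-\sin\theta, \cos\theta)$. Then $A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$.

It follows from Example 1 of Section 6.4 that T is a rotation by the angle $\theta$. Also

$$\det(A) = \cos^2\theta + \sin^2\theta = 1.$$

Now suppose that $T(e_2) = (\sin\theta, -\cos\theta)$. Then $A = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}$.

Comparing this matrix to the matrix A of Example 5, we see that T is the reflection of $R^2$ about a line L such that $\alpha = \theta/2$ is the angle from the positive x-axis to L. Furthermore,

$$\det(A) = -\cos^2\theta - \sin^2\theta = -1.$$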



Combining Theorems 6.22 and 6.23, we obtain the following characterization of rigid motions on $R^2$.

Corollary.

Any rigid motion on $R^2$ is either a rotation followed by a translation or a reflection about a line through the origin followed by a translation.
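The dichotomy of Theorem 6.23 can be turned into a small classifier. In this sketch the function name `classify` and the two sample matrices are illustrative assumptions. Following the proof, the angle $\theta$ is read from the first column, which is the image of $e_1$, and the sign of the determinant decides between a rotation by $\theta$ and a reflection about the line at angle $\theta/2$.

```python
import math

def classify(A, tol=1e-9):
    """Classify a 2x2 orthogonal matrix as a rotation or a reflection.

    Returns ("rotation", theta) or ("reflection", alpha), where alpha is
    the angle from the positive x-axis to the line of reflection.
    """
    (a, b), (c, d) = A
    # The columns (a, c) and (b, d) are the images of e1 and e2;
    # they must form an orthonormal set.
    if (abs(a * a + c * c - 1) > tol or abs(b * b + d * d - 1) > tol
            or abs(a * b + c * d) > tol):
        raise ValueError("matrix is not orthogonal")
    theta = math.atan2(c, a) % (2 * math.pi)   # T(e1) = (cos theta, sin theta)
    det = a * d - b * c
    if det > 0:
        return "rotation", theta
    return "reflection", theta / 2

t = math.pi / 3
rotation_matrix = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]

q = math.pi / 4                                # q = 2 * alpha with alpha = pi / 8
reflection_matrix = [[math.cos(q), math.sin(q)], [math.sin(q), -math.cos(q)]]

kind_rot, angle_rot = classify(rotation_matrix)
kind_ref, angle_ref = classify(reflection_matrix)
```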

Example 7

Let

$$A = \frac{1}{\sqrt{5}}\begin{pmatrix} 1 & 2 \\ 2 & -1 \end{pmatrix}.$$

We show that $L_A$ is the reflection of $R^2$ about a line L through the origin, and then describe L.

Clearly $AA^* = A^*A = I$, and therefore A is an orthogonal matrix. Hence $L_A$ is an orthogonal operator. Furthermore,

$$\det(A) = -\frac{1}{5} - \frac{4}{5} = -1,$$

and thus $L_A$ is a reflection of $R^2$ about a line L through the origin by Theorem 6.23. Since L is the one-dimensional eigenspace corresponding to the eigenvalue 1 of $L_A$, it suffices to find an eigenvector of $L_A$ corresponding to 1. One such vector is $v = (2, \sqrt{5} - 1)$. Thus L is the span of $\{v\}$. Alternatively, L is the line through the origin with slope $(\sqrt{5} - 1)/2$, and hence is the line with the equation

$$y = \frac{\sqrt{5} - 1}{2}\,x.$$



Conic Sections

As an application of Theorem 6.20, we consider the quadratic equation

$$ax^2 + 2bxy + cy^2 + dx + ey + f = 0. \tag{2}$$

For special choices of the coefficients in (2), we obtain the various conic sections. For example, if $a = c = 1$, $b = d = e = 0$, and $f = -1$, we obtain the circle $x^2 + y^2 = 1$ with center at the origin. The remaining conic sections, namely, the ellipse, parabola, and hyperbola, are obtained by other choices of the coefficients. If $b = 0$, then it is easy to graph the equation by the method of completing the square because the xy-term is absent. For example, the equation $x^2 + 2x + y^2 + 4y + 2 = 0$ may be rewritten as $(x + 1)^2 + (y + 2)^2 = 3$, which describes a circle with radius $\sqrt{3}$ and center at $(-1, -2)$ in the xy-coordinate system. If we consider the transformation of coordinates $(x, y) \to (x', y')$, where $x' = x + 1$ and $y' = y + 2$, then our equation simplifies to $(x')^2 + (y')^2 = 3$. This change of variable allows us to eliminate the x- and y-terms.

We now concentrate solely on the elimination of the xy-term. To accomplish this, we consider the expression

$$ax^2 + 2bxy + cy^2, \tag{3}$$

which is called the associated quadratic form of (2). Quadratic forms are studied in more generality in Section 6.8.

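If we let

$$A = \begin{pmatrix} a & b \\ b & c \end{pmatrix} \quad \text{and} \quad X = \begin{pmatrix} x \\ y \end{pmatrix},$$

then (3) may be written as $X^tAX = \langle AX, X\rangle$. For example, the quadratic form $3x^2 + 4xy + 6y^2$ may be written as

$$X^t\begin{pmatrix} 3 & 2 \\ 2 & 6 \end{pmatrix}X.$$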



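The fact that A is symmetric is crucial in our discussion. For, by Theorem 6.20, we may choose an orthogonal matrix P and a diagonal matrix D with real diagonal entries $\lambda_1$ and $\lambda_2$ such that $P^tAP = D$. Now define

$$X' = \begin{pmatrix} x' \\ y' \end{pmatrix}$$

by $X' = P^tX$ or, equivalently, by $PX' = PP^tX = X$. Then

$$X^tAX = (PX')^tA(PX') = X'^t(P^tAP)X' = X'^tDX' = \lambda_1(x')^2 + \lambda_2(y')^2.$$

Thus the transformation $(x, y) \to (x', y')$ allows us to eliminate the xy-term in (3), and hence in (2).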

Furthermore, since P is orthogonal, we have by Theorem 6.23 (with $T = L_P$) that $\det(P) = \pm 1$. If $\det(P) = -1$, we may interchange the columns of P to obtain a matrix Q. Because the columns of P form an orthonormal basis of eigenvectors of A, the same is true of the columns of Q. Therefore

$$Q^tAQ = \begin{pmatrix} \lambda_2 & 0 \\ 0 & \lambda_1 \end{pmatrix}.$$

Notice that $\det(Q) = -\det(P) = 1$. So, if $\det(P) = -1$, we can take Q for our new P; consequently, we may always choose P so that $\det(P) = 1$. By Theorem 6.23 (with $T = L_P$), it follows that matrix P represents a rotation.

In summary, the xy-term in (2) may be eliminated by a rotation of the x-axis and y-axis to new axes $x'$ and $y'$ given by $X = PX'$, where P is an orthogonal matrix and $\det(P) = 1$. Furthermore, the coefficients of $(x')^2$ and $(y')^2$ are the eigenvalues of

$$A = \begin{pmatrix} a & b \\ b & c \end{pmatrix}.$$

This result is a restatement of a result known as the principal axis theorem for $R^2$. The arguments above, of course, are easily extended to quadratic equations in n variables. For example, in the case $n = 3$, by special choices of the coefficients, we obtain the quadratic surfaces: the elliptic cone, the ellipsoid, the hyperbolic paraboloid, etc.

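As an illustration of the preceding transformation, consider the quadratic equation

$$2x^2 - 4xy + 5y^2 - 36 = 0,$$

for which the associated quadratic form is $2x^2 - 4xy + 5y^2$. In the notation we have been using,

$$A = \begin{pmatrix} 2 & -2 \\ -2 & 5 \end{pmatrix},$$

so that the eigenvalues of A are 1 and 6 with associated eigenvectors

$$\begin{pmatrix} 2 \\ 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} -1 \\ 2 \end{pmatrix}.$$

As expected (from Theorem 6.15(d) p. 368), these vectors are orthogonal. The corresponding orthonormal basis of eigenvectors

$$\beta = \left\{ \begin{pmatrix} \frac{2}{\sqrt{5}} \\ \frac{1}{\sqrt{5}} \end{pmatrix}, \begin{pmatrix} -\frac{1}{\sqrt{5}} \\ \frac{2}{\sqrt{5}} \end{pmatrix} \right\}$$

determines new axes $x'$ and $y'$ as in Figure 6.4. Hence if

$$P = \frac{1}{\sqrt{5}}\begin{pmatrix} 2 & -1 \\ 1 & 2 \end{pmatrix},$$

then

$$P^tAP = \begin{pmatrix} 1 & 0 \\ 0 & 6 \end{pmatrix}.$$

Under the transformation $X = PX'$ or

$$x = \frac{2}{\sqrt{5}}x' - \frac{1}{\sqrt{5}}y', \qquad y = \frac{1}{\sqrt{5}}x' + \frac{2}{\sqrt{5}}y',$$

we have the new quadratic form $(x')^2 + 6(y')^2$. Thus the original equation $2x^2 - 4xy + 5y^2 = 36$ may be written in the form $(x')^2 + 6(y')^2 = 36$ relative to a new coordinate system with the $x'$- and $y'$-axes in the directions of the first and second vectors of $\beta$, respectively. It is clear that this equation represents an ellipse. (See Figure 6.4.) Note that the preceding matrix P has the form

$$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix},$$

where $\theta = \cos^{-1}(2/\sqrt{5}) \approx 26.6^\circ$. So P is the matrix representation of a rotation of $R^2$ through the angle $\theta$. Thus the change of variable $X = PX'$ can be accomplished by this rotation of the x- and y-axes. There is another possibility for P, however. If the eigenvector of A corresponding to the eigenvalue 6 is taken to be $(1, -2)$ instead of $(-1, 2)$, and the eigenvalues are interchanged, then we obtain the matrix

$$\frac{1}{\sqrt{5}}\begin{pmatrix} 1 & 2 \\ -2 & 1 \end{pmatrix},$$

which is the matrix representation of a rotation through the angle $\theta = \sin^{-1}(-2/\sqrt{5}) \approx -63.4^\circ$. This possibility produces the same ellipse as the one in Figure 6.4, but interchanges the names of the $x'$- and $y'$-axes.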
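The elimination of the xy-term can be sketched as a short routine. The function name `principal_axes`, the closed-form eigenvalue formula for a symmetric 2 × 2 matrix, and the sign convention for the first eigenvector are assumptions made for this illustration, not taken from the text. The sample coefficients a = 2, b = −2, c = 5 are chosen so that the eigenvalues come out as 1 and 6, the values quoted in the illustration above.

```python
import math

def principal_axes(a, b, c):
    """For the form a x^2 + 2 b x y + c y^2, return (lam1, lam2, theta) so that,
    with P the rotation by theta, P^t [[a, b], [b, c]] P = diag(lam1, lam2)."""
    if b == 0:
        return a, c, 0.0                       # no xy-term: nothing to rotate
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    lam1, lam2 = mean - radius, mean + radius  # eigenvalues of [[a, b], [b, c]]
    vx, vy = b, lam1 - a                       # an eigenvector for lam1
    if vx < 0:                                 # sign convention: first entry positive
        vx, vy = -vx, -vy
    return lam1, lam2, math.atan2(vy, vx)

a, b, c = 2, -2, 5                             # illustrative coefficients
lam1, lam2, theta = principal_axes(a, b, c)

# Verify the diagonalization with the rotation matrix P.
co, si = math.cos(theta), math.sin(theta)
P = [[co, -si], [si, co]]
A = [[a, b], [b, c]]
AP = [[sum(A[i][k] * P[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
D = [[sum(P[k][i] * AP[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
angle_degrees = math.degrees(theta)
```

Because P is built as a rotation matrix, its determinant is 1 by construction, so no column interchange is needed in this sketch.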

Exercises

1. Label the following statements as true or false. Assume that the underlying inner product spaces are finite-dimensional.

1. (a) Every unitary operator is normal.

2. (b) Every orthogonal operator is diagonalizable.

3. (c) A matrix is unitary if and only if it is invertible.

4. (d) If two matrices are unitarily equivalent, then they are also similar.

5. (e) The sum of unitary matrices is unitary.

6. (f) The adjoint of a unitary operator is unitary.

7. (g) If T is an orthogonal operator on V, then $[T]_\beta$ is an orthogonal matrix for any ordered basis $\beta$ for V.

8. (h) If all the eigenvalues of a linear operator are 1, then the operator must be unitary or orthogonal.

9. (i) A linear operator may preserve norms without preserving inner products.

2. For each of the following matrices A, find an orthogonal or unitary matrix P and a diagonal matrix D such that $P^*AP = D$.

1. (a) 

2. (b) 

3. (c) 

4. (d) 

5. (e) 

3. Prove that the composite of unitary [orthogonal] operators is unitary [orthogonal].

4. For $z \in C$, define $T_z: C \to C$ by $T_z(u) = zu$. Characterize those z for which $T_z$ is normal, self-adjoint, or unitary.

5. Which of the following pairs of matrices are unitarily equivalent?

1. (a)  and 

2. (b)  and 

3. (c)  and 

4. (d)  and 

5. (e)  and 

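6. Let V be the inner product space of complex-valued continuous functions on [0, 1] with the inner product

$$\langle f, g\rangle = \int_0^1 f(t)\overline{g(t)}\,dt.$$

Let $h \in V$, and define $T: V \to V$ by $T(f) = hf$. Prove that T is a unitary operator if and only if $|h(t)| = 1$ for $0 \le t \le 1$.

Hint for the “only if” part: Suppose that T is unitary. Set $f(t) = 1 - |h(t)|^2$ and $g(t) = 1$. Show that

$$\int_0^1 \left(1 - |h(t)|^2\right)^2 dt = 0,$$

and use the fact that if the integral of a nonnegative continuous function is zero, then the function is identically zero.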

7. Prove that if T is a unitary operator on a finite-dimensional inner product space V, then T has a unitary square root; that is, there exists a unitary operator U such that $T = U^2$. Visit goo.gl/jADTaS for a solution.

8. Let T be a self-adjoint linear operator on a finite-dimensional inner product space. Prove that $(T + iI)(T - iI)^{-1}$ is unitary using Exercise 10 of Section 6.4.

9. Let U be a linear operator on a finite-dimensional inner product space V. If $\|U(x)\| = \|x\|$ for all x in some orthonormal basis for V, must U be unitary? Justify your answer with a proof or a counterexample.

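10. Let A be an $n \times n$ real symmetric or complex normal matrix. Prove that

$$\operatorname{tr}(A) = \sum_{i=1}^{n} \lambda_i \quad \text{and} \quad \operatorname{tr}(A^*A) = \sum_{i=1}^{n} |\lambda_i|^2,$$

where the $\lambda_i$’s are the (not necessarily distinct) eigenvalues of A.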

11. Find an orthogonal matrix whose first row is .

12. Let A be an $n \times n$ real symmetric or complex normal matrix. Prove that

$$\det(A) = \prod_{i=1}^{n} \lambda_i,$$

where the $\lambda_i$’s are the (not necessarily distinct) eigenvalues of A.

13. Suppose that A and B are diagonalizable matrices. Prove or disprove that A is similar to B if and only if A and B are unitarily equivalent.

14. Prove that if A and B are unitarily equivalent matrices, then A is positive definite [semidefinite] if and only if B is positive definite [semidefinite]. (See the definitions in the exercises in Section 6.4.)

15. Let U be a unitary operator on an inner product space V, and let W be a finite-dimensional U-invariant subspace of V. Prove that

1. (a) $U(W) = W$;

2. (b) $W^\perp$ is U-invariant.

Contrast (b) with Exercise 16.

16. Find an example of a unitary operator U on an inner product space and a U-invariant subspace W such that $W^\perp$ is not U-invariant.

17. Prove that a matrix that is both unitary and upper triangular must be a diagonal matrix.

18. Show that “is unitarily equivalent to” is an equivalence relation on $M_{n \times n}(C)$.

19. Let W be a finite-dimensional subspace of an inner product space V. By Theorem 6.7 (p. 349) and the exercises of Section 1.3, $V = W \oplus W^\perp$. Define $U: V \to V$ by $U(v_1 + v_2) = v_1 - v_2$, where $v_1 \in W$ and $v_2 \in W^\perp$. Prove that U is a self-adjoint unitary operator.

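20. Let V be a finite-dimensional inner product space. A linear operator U on V is called a partial isometry if there exists a subspace W of V such that $\|U(x)\| = \|x\|$ for all $x \in W$ and $U(x) = 0$ for all $x \in W^\perp$. Observe that W need not be U-invariant. Suppose that U is such an operator and $\{v_1, v_2, \ldots, v_k\}$ is an orthonormal basis for W. Prove the following results.

1. (a) $\langle U(x), U(y)\rangle = \langle x, y\rangle$ for all $x, y \in W$. Hint: Use Exercise 20 of Section 6.1.

2. (b) $\{U(v_1), U(v_2), \ldots, U(v_k)\}$ is an orthonormal basis for R(U).

3. (c) There exists an orthonormal basis $\gamma$ for V such that the first k columns of $[U]_\gamma$ form an orthonormal set and the remaining columns are zero.

4. (d) Let $\{w_1, w_2, \ldots, w_j\}$ be an orthonormal basis for $R(U)^\perp$ and $\beta = \{U(v_1), U(v_2), \ldots, U(v_k), w_1, \ldots, w_j\}$. Then $\beta$ is an orthonormal basis for V.

5. (e) Let T be the linear operator on V that satisfies $T(U(v_i)) = v_i$ $(1 \le i \le k)$ and $T(w_i) = 0$ $(1 \le i \le j)$. Then T is well defined, and $T = U^*$. Hint: Show that $\langle U(x), y\rangle = \langle x, T(y)\rangle$ for all $x, y \in \beta$. There are four cases.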

6. (f) U* is a partial isometry.

This exercise is continued in Exercise 9 of Section 6.6.

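21. Let A and B be $n \times n$ matrices that are unitarily equivalent.

1. (a) Prove that $\operatorname{tr}(A^*A) = \operatorname{tr}(B^*B)$.

2. (b) Use (a) to prove that

$$\sum_{i,j=1}^{n} |A_{ij}|^2 = \sum_{i,j=1}^{n} |B_{ij}|^2.$$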


3. (c) Use (b) to show that the matrices



are not unitarily equivalent.

22. Let V be a real inner product space.

1. (a) Prove that any translation on V is a rigid motion.

2. (b) Prove that the composite of any two rigid motions on V is a rigid motion on V.

23. Prove the following variant of Theorem 6.22: If $f: V \to V$ is a rigid motion on a finite-dimensional real inner product space V, then there exists a unique orthogonal operator T on V and a unique translation g on V such that $f = T \circ g$. (Note that the conclusion of Theorem 6.22 has $f = g \circ T$.)

24. Let T and U be orthogonal operators on $R^2$. Use Theorem 6.23 to prove the following results.

1. (a) If T and U are both reflections about lines through the origin, then UT is a rotation.

2. (b) If T is a rotation and U is a reflection about a line through the origin, then both UT and TU are reflections about lines through the origin.

25. Suppose that T and U are reflections of $R^2$ about the respective lines L and $L'$ through the origin and that $\phi$ and $\psi$ are the angles from the positive x-axis to L and $L'$, respectively. By Exercise 24, UT is a rotation. Find its angle of rotation.

26. Suppose that T and U are orthogonal operators on $R^2$ such that T is the rotation by the angle $\phi$ and U is the reflection about the line L through the origin. Let $\psi$ be the angle from the positive x-axis to L. By Exercise 24, both UT and TU are reflections about lines $L_1$ and $L_2$, respectively, through the origin.

1. (a) Find the angle $\theta$ from the positive x-axis to $L_1$.

2. (b) Find the angle $\theta$ from the positive x-axis to $L_2$.

27. Find new coordinates $x', y'$ so that the following quadratic forms can be written as $\lambda_1(x')^2 + \lambda_2(y')^2$.

1. (a) 

2. (b) 

3. (c) 

4. (d) 

5. (e) 

28. Consider the expression $X^tAX$, where $X^t = (x, y, z)$ and A is as defined in Exercise 2(e). Find a change of coordinates $x', y', z'$ so that the preceding expression is of the form $\lambda_1(x')^2 + \lambda_2(y')^2 + \lambda_3(z')^2$.

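29. QR-Factorization. Let $w_1, w_2, \ldots, w_n$ be linearly independent vectors in $F^n$, and let $v_1, v_2, \ldots, v_n$ be the orthogonal vectors obtained from $w_1, w_2, \ldots, w_n$ by the Gram-Schmidt process. Let $u_1, u_2, \ldots, u_n$ be the orthonormal basis obtained by normalizing the $v_i$’s.

1. (a) Solving (1) in Section 6.2 for $w_k$ in terms of $u_k$, show that

$$w_k = \|v_k\|u_k + \sum_{j=1}^{k-1} \langle w_k, u_j\rangle u_j \quad (1 \le k \le n).$$

2. (b) Let A and Q denote the $n \times n$ matrices in which the kth columns are $w_k$ and $u_k$, respectively. Define $R \in M_{n \times n}(F)$ by

$$R_{jk} = \begin{cases} \|v_j\| & \text{if } j = k \\ \langle w_k, u_j\rangle & \text{if } j < k \\ 0 & \text{if } j > k. \end{cases}$$

Prove $A = QR$.

3. (c) Compute Q and R as in (b) for the $3 \times 3$ matrix whose columns are the vectors (1, 1, 0), (2, 0, 1), and (2, 2, 1).

4. (d) Since Q is unitary [orthogonal] and R is upper triangular in (b), we have shown that every invertible matrix is the product of a unitary [orthogonal] matrix and an upper triangular matrix. Suppose that $A \in M_{n \times n}(F)$ is invertible and $A = Q_1R_1 = Q_2R_2$, where $Q_1, Q_2 \in M_{n \times n}(F)$ are unitary and $R_1, R_2 \in M_{n \times n}(F)$ are upper triangular. Prove that $D = R_2R_1^{-1}$ is a unitary diagonal matrix. Hint: Use Exercise 17.

5. (e) The QR factorization described in (b) provides an orthogonalization method for solving a linear system $Ax = b$ when A is invertible. Decompose A to QR, by the Gram-Schmidt process or other means, where Q is unitary and R is upper triangular. Then $QRx = b$, and hence $Rx = Q^*b$. This last system can be easily solved since R is upper triangular.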

Use the orthogonalization method and (c) to solve the system


30. Suppose that $\beta$ and $\gamma$ are ordered bases for an n-dimensional real [complex] inner product space V. Prove that if Q is an orthogonal [unitary] $n \times n$ matrix that changes $\gamma$-coordinates into $\beta$-coordinates, then $\beta$ is orthonormal if and only if $\gamma$ is orthonormal.

The following definition is used in Exercises 31 and 32.

Definition.

Let V be a finite-dimensional complex [real] inner product space, and let u be a unit vector in V. Define the Householder operator $H_u: V \to V$ by $H_u(x) = x - 2\langle x, u\rangle u$ for all $x \in V$.

31. Let $H_u$ be a Householder operator on a finite-dimensional inner product space V. Prove the following results.

1. (a) $H_u$ is linear.

2. (b) $H_u(x) = x$ if and only if x is orthogonal to u.

3. (c) $H_u(u) = -u$.

4. (d) $H_u^* = H_u$ and $H_u^2 = I$, and hence $H_u$ is a unitary [orthogonal] operator on V.

(Note: If V is a real inner product space, then in the language of Section 6.11, $H_u$ is a reflection.)

32. Let V be a finite-dimensional inner product space over $F$. Let x and y be linearly independent vectors in V such that $\|x\| = \|y\|$.

1. (a) If $F = C$, prove that there exists a unit vector u in V and a complex number $\theta$ with $|\theta| = 1$ such that $H_u(x) = \theta y$. Hint: Choose $\theta$ so that $\langle x, \theta y\rangle$ is real, and set $u = \frac{1}{\|x - \theta y\|}(x - \theta y)$.

2. (b) If $F = R$, prove that there exists a unit vector u in V such that $H_u(x) = y$.