6.5 Unitary and Orthogonal Operators and Their Matrices
In this section, we continue our analogy between complex numbers and linear operators. Recall that the adjoint of a linear operator acts similarly to the conjugate of a complex number (see, for example, Theorem 6.11, p. 357). A complex number z has length 1 if $z\overline{z}=1$. In this section, we study those linear operators T on an inner product space V such that $\text{TT*}=\text{T*T}=\text{I}$. We will see that these are precisely the linear operators that “preserve length” in the sense that $\|\text{T}(x)\|=\|x\|$ for all $x\in \text{V}$. As another characterization, we prove that, on a finite-dimensional complex inner product space, these are the normal operators whose eigenvalues all have absolute value 1.
In past chapters, we were interested in studying those functions that preserve the structure of the underlying space. In particular, linear operators preserve the operations of vector addition and scalar multiplication, and isomorphisms preserve all the vector space structure. It is now natural to consider those linear operators T on an inner product space that preserve length. We will see that this condition guarantees, in fact, that T preserves the inner product.
Definitions.
Let T be a linear operator on a finite-dimensional inner product space V (over F). If $\|\text{T}(x)\|=\|x\|$ for all $x\in \text{V}$, we call T a unitary operator if $F=C$ and an orthogonal operator if $F=R$.
It should be noted that in the infinite-dimensional case, an operator that preserves the norm is one-to-one, but not necessarily onto. If it is also onto, then we call it a unitary or orthogonal operator.
Clearly, any rotation or reflection in ${\text{R}}^{2}$ preserves length and hence is an orthogonal operator. We study these operators in much more detail in Section 6.11.
Example 1
Recall the inner product space H defined on page 330. Let $h\in \text{H}$ satisfy $|h(x)|=1$ for all x. Define the linear operator T on H by $\text{T}(f)=hf$. Then
$${\|\text{T}(f)\|}^{2}=\langle hf,\text{}hf\rangle =\langle f,\text{}f\rangle ={\|f\|}^{2},$$
since $|h(t)|^{2}=1$ for all t. So T is a unitary operator.
Theorem 6.18.
Let T be a linear operator on a finite-dimensional inner product space V. Then the following statements are equivalent.

(a) $\text{T*T}=\text{I.}$

(b) $\text{TT*}=\text{I.}$

(c) $\langle \text{T}(x),\text{T}(y)\rangle =\langle x,\text{}y\rangle $ for all $x,\text{}y\in \text{V}$.

(d) If $\beta $ is an orthonormal basis for V, then $\text{T}(\beta )$ is an orthonormal basis for V.

(e) There exists an orthonormal basis $\beta $ for V such that $\text{T}(\beta )$ is an orthonormal basis for V.

(f) $\|\text{T}(x)\|=\|x\|$ for all $x\in \text{V}$.
Thus all the conditions above are equivalent to the definition of a unitary or orthogonal operator. From (a) and (b), it follows that unitary or orthogonal operators are normal.
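The equivalences of Theorem 6.18 are easy to check numerically. The following sketch (an illustrative example, not part of the text; the particular matrix is an arbitrary choice) verifies conditions (a), (b), (c), and (f) for one unitary matrix using numpy.

```python
import numpy as np

# A particular unitary matrix (its rows are orthonormal in C^2).
U = np.array([[1, 1j],
              [1j, 1]]) / np.sqrt(2)

I = np.eye(2)
cond_a = np.allclose(U.conj().T @ U, I)        # (a) U*U = I
cond_b = np.allclose(U @ U.conj().T, I)        # (b) UU* = I

# (c) <Ux, Uy> = <x, y> and (f) ||Ux|| = ||x|| for sample vectors.
x = np.array([1.0 + 2.0j, -0.5j])
y = np.array([0.3, 1.0 - 1.0j])
inner = lambda u, v: np.vdot(v, u)             # <u, v> = sum u_i * conj(v_i)
cond_c = np.isclose(inner(U @ x, U @ y), inner(x, y))
cond_f = np.isclose(np.linalg.norm(U @ x), np.linalg.norm(x))
```

Any one of the four conditions could serve as the definition; the check above simply confirms their consistency on an example.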
Before proving the theorem, we first prove a lemma. Compare this lemma to Exercise 11(b) of Section 6.4.
Lemma. Let U be a self-adjoint operator on an inner product space V, and suppose that $\langle x,\text{U}(x)\rangle =0$ for all $x\in \text{V}$. Then $\text{U}={\text{T}}_{0}$.
Proof.
For any $x\in \text{V}$,
$$0=\langle x+\text{U}(x),\text{U}(x+\text{U}(x))\rangle =\langle x,\text{U}(x)\rangle +\langle x,{\text{U}}^{2}(x)\rangle +\langle \text{U}(x),\text{U}(x)\rangle +\langle \text{U}(x),{\text{U}}^{2}(x)\rangle =2{\|\text{U}(x)\|}^{2},$$
because the first and last terms vanish by hypothesis and $\langle x,{\text{U}}^{2}(x)\rangle =\langle \text{U}(x),\text{U}(x)\rangle $ since U is self-adjoint. So for any $x\in \text{V},\text{}\text{U}(x)=0$. It follows that $\text{U}={\text{T}}_{0}$.
Proof of Theorem 6.18.
Part (a) implies (b) by Theorem 6.10 and Exercise 10(c) of Section 2.4.
To prove that (b) implies (c), let $x,\text{}y\in \text{V}$. Then $\langle x,\text{}y\rangle =\langle (\text{T*T})(x),\text{}y\rangle =\langle \text{T}(x),\text{T}(y)\rangle $.
Next, we prove that (c) implies (d). Let $\beta =\{{v}_{1},\text{}{v}_{2},\text{}\dots ,\text{}{v}_{n}\}$ be an orthonormal basis for V; so $\text{T}(\beta )=\{\text{T}({v}_{1}),\text{T}({v}_{2}),\text{}\dots ,\text{T}({v}_{n})\}$. It follows that $\langle \text{T}({v}_{i}),\text{T}({v}_{j})\rangle =\langle {v}_{i},\text{}{v}_{j}\rangle ={\delta}_{ij}$. Therefore $\text{T}(\beta )$ is an orthonormal basis for V. That (d) implies (e) is obvious.
Next we prove that (e) implies (f). Let $x\in \text{V}$, and let $\beta =\{{v}_{1},\text{}{v}_{2},\text{}\dots ,\text{}{v}_{n}\}$. Now
$$x={\displaystyle \sum _{i=1}^{n}{a}_{i}{v}_{i}}$$
for some scalars ${a}_{i}$, and so
$${\|x\|}^{2}={\displaystyle \sum _{i=1}^{n}{|{a}_{i}|}^{2}}$$
since $\beta $ is orthonormal.
Applying the same manipulations to
$$\text{T}(x)={\displaystyle \sum _{i=1}^{n}{a}_{i}\text{T}({v}_{i})}$$
and using the fact that $\text{T}(\beta )$ is also orthonormal, we obtain
$${\|\text{T}(x)\|}^{2}={\displaystyle \sum _{i=1}^{n}{|{a}_{i}|}^{2}.}$$
Hence $\|\text{T}(x)\|=\|x\|$.
Finally, we prove that (f) implies (a). For any $x\in \text{V}$, we have
$$\langle x,\text{}x\rangle ={\|x\|}^{2}={\|\text{T}(x)\|}^{2}=\langle \text{T}(x),\text{T}(x)\rangle =\langle x,(\text{T*T})(x)\rangle .$$
So $\langle x,\text{}(\text{I}-\text{T*T})(x)\rangle =0$ for all $x\in \text{V}$. Let $\text{U}=\text{I}-\text{T*T}$; then U is self-adjoint, and $\langle x,\text{U}(x)\rangle =0$ for all $x\in \text{V}$. Hence, by the lemma, we have ${\text{T}}_{0}=\text{U}=\text{I}-\text{T*T}$, and therefore $\text{T*T}=\text{I}$.
In the case that T satisfies (c), we say that T preserves inner products. In the case that T satisfies (f), we say that T preserves norms.
It follows immediately from the definition that every eigenvalue of a unitary or orthogonal operator has absolute value 1. In fact, even more is true.
Corollary 1.
Let T be a linear operator on a finite-dimensional real inner product space V. Then V has an orthonormal basis of eigenvectors of T with corresponding eigenvalues of absolute value 1 if and only if T is both self-adjoint and orthogonal.
Proof.
Suppose that V has an orthonormal basis $\{{v}_{1},\text{}{v}_{2},\dots ,\text{}{v}_{n}\}$ such that $\text{T}({v}_{i})={\lambda}_{i}{v}_{i}$ and $|{\lambda}_{i}|=1$ for all i. By Theorem 6.17 (p. 371), T is self-adjoint. Thus $(\text{TT*})({v}_{i})=\text{T}({\lambda}_{i}{v}_{i})={\lambda}_{i}{\lambda}_{i}{v}_{i}={\lambda}_{i}^{2}{v}_{i}={v}_{i}$ for each i, because each ${\lambda}_{i}$ is real with $|{\lambda}_{i}|=1$, and hence ${\lambda}_{i}^{2}=1$. So $\text{TT*}=\text{I}$, and again by Exercise 10 of Section 2.4, T is orthogonal by Theorem 6.18(b).
If T is self-adjoint, then, by Theorem 6.17, we have that V possesses an orthonormal basis $\{{v}_{1},\text{}{v}_{2},\text{}\dots ,\text{}{v}_{n}\}$ such that $\text{T}({v}_{i})={\lambda}_{i}{v}_{i}$ for all i. If T is also orthogonal, we have
$$|{\lambda}_{i}|\cdot \|{v}_{i}\|=\|{\lambda}_{i}{v}_{i}\|=\|\text{T}({v}_{i})\|=\|{v}_{i}\|,$$
so $|{\lambda}_{i}|=1$ for every i.
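As a small numerical illustration of Corollary 1 (a sketch; the matrix is an arbitrary choice), the reflection about the line y = x is both symmetric (self-adjoint) and orthogonal, and it admits an orthonormal eigenvector basis with eigenvalues of absolute value 1:

```python
import numpy as np

# Reflection of R^2 about the line y = x: self-adjoint and orthogonal.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])

is_self_adjoint = np.allclose(A, A.T)
is_orthogonal = np.allclose(A @ A.T, np.eye(2))

# eigh returns real eigenvalues and an orthonormal eigenvector basis.
eigvals, P = np.linalg.eigh(A)
basis_orthonormal = np.allclose(P.T @ P, np.eye(2))
unit_eigenvalues = np.allclose(np.abs(eigvals), 1.0)
```

The eigenvalues here are $-1$ and $1$, matching the dichotomy $\lambda_i = \pm 1$ used in the proof.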
Corollary 2.
Let T be a linear operator on a finite-dimensional complex inner product space V. Then V has an orthonormal basis of eigenvectors of T with corresponding eigenvalues of absolute value 1 if and only if T is unitary.
Proof.
The proof is similar to the proof of Corollary 1.
Example 2
Let $\text{T}:{\text{R}}^{2}\to {\text{R}}^{2}$ be a rotation by $\theta $, where $0<\theta <\pi $. It is clear geometrically that T “preserves length,” that is, that $\|\text{T}(x)\|=\|x\|$ for all $x\in {\text{R}}^{2}$. The fact that rotations by a fixed angle preserve perpendicularity not only can be seen geometrically but now follows from (c) of Theorem 6.18. Perhaps the fact that such a transformation preserves the inner product is not so obvious; however, we obtain this fact from (c) also. Finally, an inspection of the matrix representation of T with respect to the standard ordered basis, which is
$$\left(\begin{array}{rr}\mathrm{cos}\text{}\theta & -\mathrm{sin}\text{}\theta \\ \mathrm{sin}\text{}\theta & \mathrm{cos}\text{}\theta \end{array}\right),$$
reveals that T is not self-adjoint for the given restriction on $\theta $. As we mentioned earlier, this fact also follows from the geometric observation that T has no eigenvectors and from Theorem 6.15 (p. 368). It is seen easily from the preceding matrix that T* is the rotation by $-\theta $.
Definition.
Let L be a one-dimensional subspace of ${\text{R}}^{2}$. We may view L as a line in the plane through the origin. A linear operator T on ${\text{R}}^{2}$ is called a reflection of ${\text{R}}^{2}$ about L if $\text{T}(x)=x$ for all $x\in \text{L}$ and $\text{T}(x)=-x$ for all $x\in {\text{L}}^{\perp}$.
As an example of a reflection, consider the operator defined in Example 3 of Section 2.5.
Example 3
Let T be a reflection of ${\text{R}}^{2}$ about a line L through the origin. We show that T is an orthogonal operator. Select vectors ${v}_{1}\in \text{L}$ and ${v}_{2}\in {\text{L}}^{\perp}$ such that $\|{v}_{1}\|=\|{v}_{2}\|=1$. Then $\text{T}({v}_{1})={v}_{1}$ and $\text{T}({v}_{2})=-{v}_{2}$. Thus ${v}_{1}$ and ${v}_{2}$ are eigenvectors of T with corresponding eigenvalues 1 and $-1$, respectively. Furthermore, $\{{v}_{1},\text{}{v}_{2}\}$ is an orthonormal basis for ${\text{R}}^{2}$. It follows that T is an orthogonal operator by Corollary 1 to Theorem 6.18.
We now examine the matrices that represent unitary and orthogonal transformations.
Definitions.
A square matrix A is called an orthogonal matrix if ${A}^{t}A=A{A}^{t}=I$ and unitary if $A\text{*}A=AA\text{*}=I$.
Since for a real matrix A we have $A\text{*}={A}^{t}$, a real unitary matrix is also orthogonal. In this case, we call A orthogonal rather than unitary.
Note that the condition $AA\text{*}=I$ is equivalent to the statement that the rows of A form an orthonormal basis for ${\text{F}}^{n}$ because
$${\delta}_{ij}={I}_{ij}={(AA\text{*})}_{ij}={\displaystyle \sum _{k=1}^{n}{A}_{ik}{(A\text{*})}_{kj}}={\displaystyle \sum _{k=1}^{n}{A}_{ik}\overline{{A}_{jk}},}$$
and the last term represents the inner product of the ith and jth rows of A.
A similar remark can be made about the columns of A and the condition $A\text{*}A=I$.
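This row/column criterion is easy to test in code. The sketch below (an illustrative example) computes the two Gram matrices for a unitary matrix: the (i, j) entry of $AA^{*}$ is the inner product of rows i and j, and the (i, j) entry of $A^{*}A$ involves the columns.

```python
import numpy as np

# A unitary matrix whose rows (and columns) are orthonormal in C^2.
A = np.array([[1, 1j],
              [1j, 1]]) / np.sqrt(2)

# AA* has (i, j) entry equal to the inner product of row i with row j;
# A*A plays the same role for the columns.
rows_orthonormal = np.allclose(A @ A.conj().T, np.eye(2))
cols_orthonormal = np.allclose(A.conj().T @ A, np.eye(2))
```

Both Gram matrices equal the identity exactly when the rows (respectively, columns) form an orthonormal basis.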
It also follows from the definition above and from Theorem 6.10 (p. 356) that a linear operator T on an inner product space V is unitary [orthogonal] if and only if ${[\text{T}]}_{\beta}$ is unitary [orthogonal] for some orthonormal basis $\beta $ for V.
Example 4
From Example 2, the matrix
$$\left(\begin{array}{rr}\mathrm{cos}\text{}\theta & -\mathrm{sin}\text{}\theta \\ \mathrm{sin}\text{}\theta & \mathrm{cos}\text{}\theta \end{array}\right)$$
is clearly orthogonal. One can easily see that the rows of the matrix form an orthonormal basis for ${\text{R}}^{2}$. Similarly, the columns of the matrix form an orthonormal basis for ${\text{R}}^{2}$.
Example 5
Let T be a reflection of ${\text{R}}^{2}$ about a line L through the origin, let $\beta $ be the standard ordered basis for ${\text{R}}^{2}$ , and let $A={[\text{T}]}_{\beta}$. Then $\text{T}={\text{L}}_{A}$. Since T is an orthogonal operator and $\beta $ is an orthonormal basis, A is an orthogonal matrix. We describe A.
Suppose that $\alpha $ is the angle from the positive x-axis to L. Let ${v}_{1}=(\mathrm{cos}\text{}\alpha ,\text{}\mathrm{sin}\text{}\alpha )$ and ${v}_{2}=(-\mathrm{sin}\text{}\alpha ,\text{}\mathrm{cos}\text{}\alpha )$. Then $\|{v}_{1}\|=\|{v}_{2}\|=1$, ${v}_{1}\in \text{L}$, and ${v}_{2}\in {\text{L}}^{\perp}$. Hence $\gamma =\{{v}_{1},\text{}{v}_{2}\}$ is an orthonormal basis for ${\text{R}}^{2}$. Because $\text{T}({v}_{1})={v}_{1}$ and $\text{T}({v}_{2})=-{v}_{2}$, we have
$${[{\text{L}}_{A}]}_{\gamma}={[\text{T}]}_{\gamma}=\left(\begin{array}{rr}1& 0\\ 0& -1\end{array}\right).$$
Let
$$Q=\left(\begin{array}{rr}\mathrm{cos}\text{}\alpha & -\mathrm{sin}\text{}\alpha \\ \mathrm{sin}\text{}\alpha & \mathrm{cos}\text{}\alpha \end{array}\right).$$
By the corollary to Theorem 2.23 (p. 115),
$$\begin{array}{rcl}A& =& Q{[{\text{L}}_{A}]}_{\gamma}{Q}^{-1}\\ & =& \left(\begin{array}{rr}\mathrm{cos}\text{}\alpha & -\mathrm{sin}\text{}\alpha \\ \mathrm{sin}\text{}\alpha & \mathrm{cos}\text{}\alpha \end{array}\right)\left(\begin{array}{rr}1& 0\\ 0& -1\end{array}\right)\left(\begin{array}{rr}\mathrm{cos}\text{}\alpha & \mathrm{sin}\text{}\alpha \\ -\mathrm{sin}\text{}\alpha & \mathrm{cos}\text{}\alpha \end{array}\right)\\ & =& \left(\begin{array}{cc}{\mathrm{cos}}^{2}\text{}\alpha -{\mathrm{sin}}^{2}\text{}\alpha & 2\text{}\mathrm{sin}\text{}\alpha \text{}\mathrm{cos}\text{}\alpha \\ 2\text{}\mathrm{sin}\text{}\alpha \text{}\mathrm{cos}\text{}\alpha & -({\mathrm{cos}}^{2}\text{}\alpha -{\mathrm{sin}}^{2}\text{}\alpha )\end{array}\right)\\ & =& \left(\begin{array}{rr}\mathrm{cos}\text{}2\alpha & \mathrm{sin}\text{}2\alpha \\ \mathrm{sin}\text{}2\alpha & -\mathrm{cos}\text{}2\alpha \end{array}\right).\end{array}$$
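The reflection matrix derived in Example 5 can be checked directly. The sketch below (a numerical illustration; the angle is an arbitrary choice) builds the matrix for an angle α and confirms that it fixes the line L and negates ${\text{L}}^{\perp}$.

```python
import numpy as np

def reflection_matrix(alpha):
    """Matrix of the reflection of R^2 about the line at angle alpha
    from the positive x-axis (the matrix A of Example 5)."""
    return np.array([[np.cos(2 * alpha),  np.sin(2 * alpha)],
                     [np.sin(2 * alpha), -np.cos(2 * alpha)]])

alpha = 0.7                                      # arbitrary test angle
A = reflection_matrix(alpha)
v1 = np.array([np.cos(alpha), np.sin(alpha)])    # unit vector spanning L
v2 = np.array([-np.sin(alpha), np.cos(alpha)])   # unit vector spanning L-perp

fixes_line = np.allclose(A @ v1, v1)             # T(v1) = v1
negates_perp = np.allclose(A @ v2, -v2)          # T(v2) = -v2
is_orthogonal = np.allclose(A.T @ A, np.eye(2))
```

The check uses the double-angle identities $\cos(2\alpha-\alpha)=\cos\alpha$ and $\sin(2\alpha-\alpha)=\sin\alpha$ implicitly, exactly as in the derivation above.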
We know that, for a complex normal [real symmetric] matrix A, there exists an orthonormal basis $\beta $ for ${\text{F}}^{n}$ consisting of eigenvectors of A. Hence A is similar to a diagonal matrix D. By the corollary to Theorem 2.23 (p. 115), the matrix Q whose columns are the vectors in $\beta $ is such that $D={Q}^{-1}AQ$. But since the columns of Q are an orthonormal basis for ${\text{F}}^{n}$, it follows that Q is unitary [orthogonal]. In this case, we say that A is unitarily equivalent [orthogonally equivalent] to D. It is easily seen (see Exercise 18) that this relation is an equivalence relation on ${\text{M}}_{n\times n}(C)\text{}[{\text{M}}_{n\times n}(R)]$. More generally, A and B are unitarily equivalent [orthogonally equivalent] if and only if there exists a unitary [orthogonal] matrix P such that $A=P\text{*}BP$.
The preceding paragraph has proved half of each of the next two theorems.
Theorem 6.19.
Let A be a complex $n\times n$ matrix. Then A is normal if and only if A is unitarily equivalent to a diagonal matrix.
Proof.
By the preceding remarks, we need only prove that if A is unitarily equivalent to a diagonal matrix, then A is normal.
Suppose that $A=P\text{*}DP$, where P is a unitary matrix and D is a diagonal matrix. Then
$$AA\text{*}=(P\text{*}DP)(P\text{*}DP)\text{*}=(P\text{*}DP)(P\text{*}D\text{*}P)=P\text{*}DD\text{*}P.$$
Similarly, $A\text{*}A=P\text{*}D\text{*}DP$. Since D is a diagonal matrix, however, we have $DD\text{*}=D\text{*}D$. Thus $AA\text{*}=A\text{*}A$.
Theorem 6.20.
Let A be a real $n\times n$ matrix. Then A is symmetric if and only if A is orthogonally equivalent to a real diagonal matrix.
Proof.
The proof is similar to the proof of Theorem 6.19 and is left as an exercise.
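Theorem 6.19 can be illustrated numerically with a normal matrix that is not symmetric, such as a 90° rotation. The sketch below (the matrices are illustrative choices) exhibits an explicit unitary P of eigenvectors and checks that $P\text{*}AP$ is diagonal with eigenvalues of absolute value 1.

```python
import numpy as np

# Rotation by 90 degrees: normal, but neither symmetric nor Hermitian.
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])
is_normal = np.allclose(A @ A.T, A.T @ A)

# Columns of P are unit eigenvectors of A (eigenvalues i and -i).
P = np.array([[1, 1],
              [-1j, 1j]]) / np.sqrt(2)
is_unitary = np.allclose(P.conj().T @ P, np.eye(2))

D = P.conj().T @ A @ P            # unitary diagonalization of A
is_diagonal = np.allclose(D, np.diag(np.diag(D)))
unit_modulus = np.allclose(np.abs(np.diag(D)), 1.0)
```

Note that A has no real eigenvalues, so the diagonalization is necessarily complex: this is why Theorem 6.19 concerns unitary, not orthogonal, equivalence.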
Theorem 6.20 is used extensively in many areas of mathematics and statistics. See, for example, goo.gl/
Example 6
Let
$$A=\left(\begin{array}{rrr}4& 2& 2\\ 2& 4& 2\\ 2& 2& 4\end{array}\right).$$
Since A is symmetric, Theorem 6.20 tells us that A is orthogonally equivalent to a diagonal matrix. We find an orthogonal matrix P and a diagonal matrix D such that ${P}^{t}AP=D$.
To find P, we obtain an orthonormal basis of eigenvectors. It is easy to show that the eigenvalues of A are 2 and 8. The set $\{(-1,\text{}1,\text{}0),\text{}(-1,\text{}0,\text{}1)\}$ is a basis for the eigenspace corresponding to 2. Because this set is not orthogonal, we apply the Gram-Schmidt process to obtain the orthogonal set $\{(-1,\text{}1,\text{}0),\text{}{\displaystyle \frac{1}{2}}(-1,\text{}-1,\text{}2)\}$. The set $\{(1,\text{}1,\text{}1)\}$ is a basis for the eigenspace corresponding to 8. Notice that (1, 1, 1) is orthogonal to the preceding two vectors, as predicted by Theorem 6.15(d) (p. 368). Taking the union of these two bases and normalizing the vectors, we obtain the following orthonormal basis for ${\text{R}}^{3}$ consisting of eigenvectors of A:
$$\left\{{\displaystyle \frac{1}{\sqrt{2}}}(-1,\text{}1,\text{}0),\text{}{\displaystyle \frac{1}{\sqrt{6}}}(-1,\text{}-1,\text{}2),\text{}{\displaystyle \frac{1}{\sqrt{3}}}(1,\text{}1,\text{}1)\right\}.$$
Thus one possible choice for P is
$$P=\left(\begin{array}{rrr}-1/\sqrt{2}& -1/\sqrt{6}& 1/\sqrt{3}\\ 1/\sqrt{2}& -1/\sqrt{6}& 1/\sqrt{3}\\ 0& 2/\sqrt{6}& 1/\sqrt{3}\end{array}\right),\phantom{\rule{1em}{0ex}}\text{with}\phantom{\rule{1em}{0ex}}D=\left(\begin{array}{rrr}2& 0& 0\\ 0& 2& 0\\ 0& 0& 8\end{array}\right).$$
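The diagonalization of Example 6 can be verified from the eigen-data alone (eigenvalues 2, 2, 8 and an orthonormal eigenbasis consistent with them; signs of the eigenvectors are reconstructed). Building A as $PDP^{t}$ recovers a symmetric matrix and confirms $P^{t}AP=D$:

```python
import numpy as np

# Orthonormal eigenvectors (eigenvalues 2, 2, 8) as in Example 6.
v1 = np.array([-1.0, 1.0, 0.0]) / np.sqrt(2)
v2 = np.array([-1.0, -1.0, 2.0]) / np.sqrt(6)
v3 = np.array([1.0, 1.0, 1.0]) / np.sqrt(3)
P = np.column_stack([v1, v2, v3])
D = np.diag([2.0, 2.0, 8.0])

P_is_orthogonal = np.allclose(P.T @ P, np.eye(3))
A = P @ D @ P.T                 # spectral reconstruction of A
A_is_symmetric = np.allclose(A, A.T)
diagonalizes = np.allclose(P.T @ A @ P, D)
```

Since the eigenspace for 2 is the plane orthogonal to (1, 1, 1), the reconstruction $A = 2\text{I} + 6\,{v}_{3}{v}_{3}^{t}$ works out to the matrix with 4's on the diagonal and 2's elsewhere.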
Because of Schur’s theorem (Theorem 6.14 p. 367), the next result is immediate. As it is the matrix form of Schur’s theorem, we also refer to it as Schur’s theorem.
Theorem 6.21. (Schur).
Let $A\in {\text{M}}_{n\times n}(F)$ be a matrix whose characteristic polynomial splits over F.

(a) If $F=C$, then A is unitarily equivalent to a complex upper triangular matrix.

(b) If $F=R$, then A is orthogonally equivalent to a real upper triangular matrix.
Rigid Motions*
The purpose of this application is to characterize the so-called rigid motions of a finite-dimensional real inner product space. One may think intuitively of such a motion as a transformation that does not affect the shape of a figure under its action, hence the term rigid. The key requirement for such a transformation is that it preserves distances.
Definition.
Let V be a real inner product space. A function $f:\text{V}\to \text{V}$ is called a rigid motion if
$$\|f(x)-f(y)\|=\|x-y\|$$
for all $x,\text{}y\in \text{V}$.
For example, any orthogonal operator on a finite-dimensional real inner product space is a rigid motion.
Another class of rigid motions are the translations. A function $g:\text{V}\to \text{V}$, where V is a real inner product space, is called a translation if there exists a vector ${v}_{0}\in \text{V}$ such that $g(x)=x+{v}_{0}$ for all $x\in \text{V}$. We say that g is the translation by ${v}_{0}$. It is a simple exercise to show that translations, as well as composites of rigid motions on a real inner product space, are also rigid motions. (See Exercise 22.) Thus an orthogonal operator on a finite-dimensional real inner product space V followed by a translation on V is a rigid motion on V. Remarkably, every rigid motion on V may be characterized in this way.
Theorem 6.22.
Let $f:\text{V}\to \text{V}$ be a rigid motion on a finite-dimensional real inner product space V. Then there exists a unique orthogonal operator T on V and a unique translation g on V such that $f=g\circ \text{T}$.
Any orthogonal operator is a special case of this composite, in which the translation is by 0. Any translation is also a special case, in which the orthogonal operator is the identity operator.
Proof.
Let $\text{T}:\text{V}\to \text{V}$ be defined by
$$\text{T}(x)=f(x)-f(0)$$
for all $x\in \text{V}$. Note that $f=g\circ \text{T}$, where g is the translation by f(0). Moreover, T is the composite of f and the translation by $-f(0)$; hence T is a rigid motion. We begin by showing that T preserves inner products. For any $x\in \text{V}$, we have
$$\|\text{T}(x)\|=\|f(x)-f(0)\|=\|x-0\|=\|x\|,$$
and consequently $\|\text{T}(x)\|=\|x\|$ for any $x\in \text{V}$. Thus for any $x,\text{}y\in \text{V}$,
$${\|\text{T}(x)-\text{T}(y)\|}^{2}={\|\text{T}(x)\|}^{2}-2\langle \text{T}(x),\text{T}(y)\rangle +{\|\text{T}(y)\|}^{2}={\|x\|}^{2}-2\langle \text{T}(x),\text{T}(y)\rangle +{\|y\|}^{2}$$
and
$${\|x-y\|}^{2}={\|x\|}^{2}-2\langle x,\text{}y\rangle +{\|y\|}^{2}.$$
But ${\|\text{T}(x)-\text{T}(y)\|}^{2}=\|f(x)-f(y)\|^{2}={\|x-y\|}^{2}$; so $\langle \text{T}(x),\text{T}(y)\rangle =\langle x,\text{}y\rangle $ for all $x,\text{}y\in \text{V}$.
We are now in a position to show that T is a linear transformation. Let $x,\text{}y\in \text{V}$, and let $a\in \text{R}$. Then
$${\|\text{T}(x+ay)-\text{T}(x)-a\text{T}(y)\|}^{2}={\|(x+ay)-x-ay\|}^{2}=0,$$
where the first equality follows by expanding both squared norms and using the fact that T preserves inner products. Thus $\text{T}(x+ay)=\text{T}(x)+a\text{T}(y)$, and hence T is linear. Since we have already shown that T preserves inner products, T is an orthogonal operator.
To prove uniqueness, suppose that ${u}_{0}$ and ${v}_{0}$ are in V and T and U are orthogonal operators on V such that
$$f(x)=\text{T}(x)+{u}_{0}=\text{U}(x)+{v}_{0}$$
for all $x\in \text{V}$. Substituting $x=0$ in the preceding equation yields ${u}_{0}={v}_{0}$, and hence the translation is unique. This equation, therefore, reduces to $\text{T}(x)=\text{U}(x)$ for all $x\in \text{V}$, and hence $\text{T}=\text{U}$.
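The decomposition in Theorem 6.22 is constructive: the orthogonal part of a rigid motion f is $\text{T}(x)=f(x)-f(0)$, and the translation is by f(0). A small numerical sketch (the rotation angle and translation vector are arbitrary choices):

```python
import numpy as np

theta = 0.9
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # orthogonal part
v0 = np.array([3.0, -1.0])                        # translation vector
f = lambda z: Q @ z + v0                          # a rigid motion on R^2

# f preserves distances:
x, y = np.array([1.5, 2.5]), np.array([-0.4, 0.2])
is_rigid = np.isclose(np.linalg.norm(f(x) - f(y)), np.linalg.norm(x - y))

# Recover the decomposition as in the proof: T(x) = f(x) - f(0).
T = lambda z: f(z) - f(np.zeros(2))
recovers_Q = np.allclose(T(x), Q @ x)             # T agrees with the orthogonal part
recovers_v0 = np.allclose(f(np.zeros(2)), v0)     # f(0) is the translation vector
```

Evaluating f at 0 isolates the translation, which is why the uniqueness argument in the proof substitutes $x=0$.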
Orthogonal Operators on R^{2}
Because of Theorem 6.22, an understanding of rigid motions requires a characterization of orthogonal operators. The next result characterizes orthogonal operators on ${\text{R}}^{2}$ . We postpone the case of orthogonal operators on more general spaces to Section 6.11.
Theorem 6.23.
Let T be an orthogonal operator on ${\text{R}}^{2}$ , and let $A={[\text{T}]}_{\beta}$, where $\beta $ is the standard ordered basis for ${\text{R}}^{2}$ . Then exactly one of the following conditions is satisfied:

(a) T is a rotation, and $\mathrm{det}(A)=1$.

(b) T is a reflection about a line through the origin, and $\mathrm{det}(A)=-1$.
Proof.
Because T is an orthogonal operator, $\text{T}(\beta )=\{\text{T}({e}_{1}),\text{T}({e}_{2})\}$ is an orthonormal basis for ${\text{R}}^{2}$ by Theorem 6.18(d). Since $\text{T}({e}_{1})$ is a unit vector, there is a unique angle $\theta $, $0\le \theta <2\pi $, such that $\text{T}({e}_{1})=(\mathrm{cos}\text{}\theta ,\text{}\mathrm{sin}\text{}\theta )$. Since $\text{T}({e}_{2})$ is a unit vector and is orthogonal to $\text{T}({e}_{1})$, there are only two possible choices for $\text{T}({e}_{2})$. Either
$$\text{T}({e}_{2})=(-\mathrm{sin}\text{}\theta ,\text{}\mathrm{cos}\text{}\theta )\phantom{\rule{1em}{0ex}}\text{or}\phantom{\rule{1em}{0ex}}\text{T}({e}_{2})=(\mathrm{sin}\text{}\theta ,\text{}-\mathrm{cos}\text{}\theta ).$$
First, suppose that $\text{T}({e}_{2})=(-\mathrm{sin}\text{}\theta ,\text{}\mathrm{cos}\text{}\theta )$. Then $A=\left(\begin{array}{rr}\mathrm{cos}\text{}\theta & -\mathrm{sin}\text{}\theta \\ \mathrm{sin}\text{}\theta & \mathrm{cos}\text{}\theta \end{array}\right)$.
It follows from Example 1 of Section 6.4 that T is a rotation by the angle $\theta $. Also
$$\mathrm{det}(A)={\mathrm{cos}}^{2}\text{}\theta +{\mathrm{sin}}^{2}\text{}\theta =1.$$
Now suppose that $\text{T}({e}_{2})=(\mathrm{sin}\text{}\theta ,\text{}-\mathrm{cos}\text{}\theta )$. Then $A=\left(\begin{array}{rr}\mathrm{cos}\text{}\theta & \mathrm{sin}\text{}\theta \\ \mathrm{sin}\text{}\theta & -\mathrm{cos}\text{}\theta \end{array}\right)$.
Comparing this matrix to the matrix A of Example 5, we see that T is the reflection of ${\text{R}}^{2}$ about a line L such that $\alpha =\theta /2$ is the angle from the positive x-axis to L. Furthermore,
$$\mathrm{det}(A)=-{\mathrm{cos}}^{2}\text{}\theta -{\mathrm{sin}}^{2}\text{}\theta =-1.$$
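Theorem 6.23 gives a determinant test that is easy to apply in code. The helper below is an illustrative sketch (not from the text) that classifies a 2×2 orthogonal matrix as a rotation or a reflection.

```python
import numpy as np

def classify(A):
    """Classify a 2x2 orthogonal matrix via the determinant test
    of Theorem 6.23: det = 1 means rotation, det = -1 means reflection."""
    assert np.allclose(A.T @ A, np.eye(2)), "matrix is not orthogonal"
    return "rotation" if np.linalg.det(A) > 0 else "reflection"

theta = 1.1   # arbitrary angle
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
reflection = np.array([[np.cos(theta),  np.sin(theta)],
                       [np.sin(theta), -np.cos(theta)]])
```

The two sample matrices are exactly the two cases in the proof: the first sends $e_2$ to $(-\sin\theta, \cos\theta)$, the second to $(\sin\theta, -\cos\theta)$.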
Combining Theorems 6.22 and 6.23, we obtain the following characterization of rigid motions on ${\text{R}}^{2}$ .
Corollary.
Any rigid motion on ${\text{R}}^{2}$ is either a rotation followed by a translation or a reflection about a line through the origin followed by a translation.
Example 7
Let
$$A={\displaystyle \frac{1}{\sqrt{5}}}\left(\begin{array}{rr}1& 2\\ 2& -1\end{array}\right).$$
We show that ${\text{L}}_{A}$ is the reflection of ${\text{R}}^{2}$ about a line L through the origin, and then describe L.
Clearly $AA\text{*}=A\text{*}A=I$, and therefore A is an orthogonal matrix. Hence ${\text{L}}_{A}$ is an orthogonal operator. Furthermore,
$$\mathrm{det}(A)=-1,$$
and thus ${\text{L}}_{A}$ is a reflection of ${\text{R}}^{2}$ about a line L through the origin by Theorem 6.23. Since L is the one-dimensional eigenspace corresponding to the eigenvalue 1 of ${\text{L}}_{A}$, it suffices to find an eigenvector of ${\text{L}}_{A}$ corresponding to 1. One such vector is $v=(2,\text{}\sqrt{5}-1)$. Thus L is the span of $\left\{v\right\}$. Alternatively, L is the line through the origin with slope $(\sqrt{5}-1)/2$, and hence is the line with the equation
$$y={\displaystyle \frac{\sqrt{5}-1}{2}}x.$$
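The claims of Example 7 can be re-derived numerically from the stated eigenvector $v=(2,\sqrt{5}-1)$ alone: the reflection about the line spanned by a unit vector u is $2uu^{t}-\text{I}$, and this reconstruction recovers the matrix $\frac{1}{\sqrt{5}}\begin{pmatrix}1&2\\2&-1\end{pmatrix}$ (a numerical check; the closed form is reconstructed from the eigen-data).

```python
import numpy as np

s5 = np.sqrt(5.0)
v = np.array([2.0, s5 - 1.0])      # eigenvector for eigenvalue 1; spans L
u = v / np.linalg.norm(v)

A = 2.0 * np.outer(u, u) - np.eye(2)   # reflection about the line spanned by u
matches_closed_form = np.allclose(A, np.array([[1.0, 2.0],
                                               [2.0, -1.0]]) / s5)

is_orthogonal = np.allclose(A.T @ A, np.eye(2))
det_minus_one = np.isclose(np.linalg.det(A), -1.0)
fixes_v = np.allclose(A @ v, v)
```

The formula $2uu^{t}-\text{I}$ acts as the identity on the span of u and as $-\text{I}$ on its orthogonal complement, which is exactly the definition of a reflection given earlier.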
Conic Sections
As an application of Theorem 6.20, we consider the quadratic equation
$$a{x}^{2}+2bxy+c{y}^{2}+dx+ey+f=0.\phantom{\rule{2em}{0ex}}(2)$$
For special choices of the coefficients in (2), we obtain the various conic sections. For example, if $a=c=1$, $b=d=e=0$, and $f=-1$, we obtain the circle ${x}^{2}+{y}^{2}=1$ with center at the origin. The remaining conic sections, namely, the ellipse, parabola, and hyperbola, are obtained by other choices of the coefficients. If $b=0$, then it is easy to graph the equation by the method of completing the square because the xy-term is absent. For example, the equation ${x}^{2}+2x+{y}^{2}+4y+2=0$ may be rewritten as ${(x+1)}^{2}+{(y+2)}^{2}=3$, which describes a circle with radius $\sqrt{3}$ and center at $(-1,\text{}-2)$ in the xy-coordinate system. If we consider the transformation of coordinates $(x,\text{}y)\to ({x}^{\prime},\text{}{y}^{\prime})$, where ${x}^{\prime}=x+1$ and ${y}^{\prime}=y+2$, then our equation simplifies to ${({x}^{\prime})}^{2}+{({y}^{\prime})}^{2}=3$. This change of variable allows us to eliminate the x- and y-terms.
We now concentrate solely on the elimination of the xy-term. To accomplish this, we consider the expression
$$a{x}^{2}+2bxy+c{y}^{2},\phantom{\rule{2em}{0ex}}(3)$$
which is called the associated quadratic form of (2). Quadratic forms are studied in more generality in Section 6.8.
If we let
$$A=\left(\begin{array}{cc}a& b\\ b& c\end{array}\right)\phantom{\rule{1em}{0ex}}\text{and}\phantom{\rule{1em}{0ex}}X=\left(\begin{array}{c}x\\ y\end{array}\right),$$
then (3) may be written as ${X}^{t}AX=\langle AX,\text{}X\rangle $. For example, the quadratic form $3{x}^{2}+4xy+6{y}^{2}$ may be written as
$$\left(\begin{array}{cc}x& y\end{array}\right)\left(\begin{array}{cc}3& 2\\ 2& 6\end{array}\right)\left(\begin{array}{c}x\\ y\end{array}\right).$$
The fact that A is symmetric is crucial in our discussion. For, by Theorem 6.20, we may choose an orthogonal matrix P and a diagonal matrix D with real diagonal entries ${\lambda}_{1}$ and ${\lambda}_{2}$ such that ${P}^{t}AP=D$. Now define
$${X}^{\prime}=\left(\begin{array}{c}{x}^{\prime}\\ {y}^{\prime}\end{array}\right)$$
by ${X}^{\prime}={P}^{t}X$ or, equivalently, by $P{X}^{\prime}=P{P}^{t}X=X$. Then
$${X}^{t}AX={(P{X}^{\prime})}^{t}A(P{X}^{\prime})={({X}^{\prime})}^{t}({P}^{t}AP){X}^{\prime}={({X}^{\prime})}^{t}D{X}^{\prime}={\lambda}_{1}{({x}^{\prime})}^{2}+{\lambda}_{2}{({y}^{\prime})}^{2}.$$
Thus the transformation $(x,\text{}y)\to ({x}^{\prime},\text{}{y}^{\prime})$ allows us to eliminate the xy-term in (3), and hence in (2).
Furthermore, since P is orthogonal, we have by Theorem 6.23 (with $\text{T}={\text{L}}_{P}$) that $\mathrm{det}(P)=\pm 1$. If $\mathrm{det}(P)=-1$, we may interchange the columns of P to obtain a matrix Q. Because the columns of P form an orthonormal basis of eigenvectors of A, the same is true of the columns of Q. Therefore
$${Q}^{t}AQ=\left(\begin{array}{cc}{\lambda}_{2}& 0\\ 0& {\lambda}_{1}\end{array}\right).$$
Notice that $\mathrm{det}(Q)=-\mathrm{det}(P)=1$. So, if $\mathrm{det}(P)=-1$, we can take Q for our new P; consequently, we may always choose P so that $\mathrm{det}(P)=1$. It then follows from Theorem 6.23 (with $\text{T}={\text{L}}_{P}$) that the matrix P represents a rotation.
In summary, the xy-term in (2) may be eliminated by a rotation of the x-axis and y-axis to new axes ${x}^{\prime}$ and ${y}^{\prime}$ given by $X=P{X}^{\prime}$, where P is an orthogonal matrix and $\mathrm{det}(P)=1$. Furthermore, the coefficients of ${({x}^{\prime})}^{2}$ and ${({y}^{\prime})}^{2}$ are the eigenvalues of A.
This result is a restatement of a result known as the principal axis theorem for ${\text{R}}^{2}$. The arguments above, of course, are easily extended to quadratic equations in n variables. For example, in the case $n=3$, by special choices of the coefficients, we obtain the quadratic surfaces—the elliptic cone, the ellipsoid, the hyperbolic paraboloid, etc.
As an illustration of the preceding transformation, consider the quadratic equation
$$2{x}^{2}-4xy+5{y}^{2}=36,$$
for which the associated quadratic form is $2{x}^{2}-4xy+5{y}^{2}$. In the notation we have been using,
$$A=\left(\begin{array}{rr}2& -2\\ -2& 5\end{array}\right),$$
so that the eigenvalues of A are 1 and 6 with associated eigenvectors
$$\left(\begin{array}{c}2\\ 1\end{array}\right)\phantom{\rule{1em}{0ex}}\text{and}\phantom{\rule{1em}{0ex}}\left(\begin{array}{r}-1\\ 2\end{array}\right).$$
As expected (from Theorem 6.15(d), p. 368), these vectors are orthogonal. The corresponding orthonormal basis of eigenvectors
$$\beta =\left\{{\displaystyle \frac{1}{\sqrt{5}}}\left(\begin{array}{c}2\\ 1\end{array}\right),\text{}{\displaystyle \frac{1}{\sqrt{5}}}\left(\begin{array}{r}-1\\ 2\end{array}\right)\right\}$$
determines new axes ${x}^{\prime}$ and ${y}^{\prime}$ as in Figure 6.4. Hence if
$$P={\displaystyle \frac{1}{\sqrt{5}}}\left(\begin{array}{rr}2& -1\\ 1& 2\end{array}\right),$$
then
$${P}^{t}AP=\left(\begin{array}{cc}1& 0\\ 0& 6\end{array}\right).$$
Under the transformation $X=P{X}^{\prime}$ or
$$x={\displaystyle \frac{2{x}^{\prime}-{y}^{\prime}}{\sqrt{5}}}\phantom{\rule{1em}{0ex}}\text{and}\phantom{\rule{1em}{0ex}}y={\displaystyle \frac{{x}^{\prime}+2{y}^{\prime}}{\sqrt{5}}},$$
we have the new quadratic form ${({x}^{\prime})}^{2}+6{({y}^{\prime})}^{2}$. Thus the original equation $2{x}^{2}-4xy+5{y}^{2}=36$ may be written in the form ${({x}^{\prime})}^{2}+6{({y}^{\prime})}^{2}=36$ relative to a new coordinate system with the ${x}^{\prime}$- and ${y}^{\prime}$-axes in the directions of the first and second vectors of $\beta $, respectively. It is clear that this equation represents an ellipse. (See Figure 6.4.) Note that the preceding matrix P has the form
$$P=\left(\begin{array}{rr}\mathrm{cos}\text{}\theta & -\mathrm{sin}\text{}\theta \\ \mathrm{sin}\text{}\theta & \mathrm{cos}\text{}\theta \end{array}\right),$$
where $\theta ={\mathrm{cos}}^{-1}\left({\displaystyle \frac{2}{\sqrt{5}}}\right)\approx 26.6°$. So P is the matrix representation of a rotation of ${\text{R}}^{2}$ through the angle $\theta $. Thus the change of variable $X=P{X}^{\prime}$ can be accomplished by this rotation of the x- and y-axes. There is another possibility for P, however. If the eigenvector of A corresponding to the eigenvalue 6 is taken to be $(1,\text{}-2)$ instead of $(-1,\text{}2)$, and the eigenvalues are interchanged, then we obtain the matrix
$${\displaystyle \frac{1}{\sqrt{5}}}\left(\begin{array}{rr}1& 2\\ -2& 1\end{array}\right),$$
which is the matrix representation of a rotation through the angle $\theta ={\mathrm{sin}}^{-1}\left(-{\displaystyle \frac{2}{\sqrt{5}}}\right)\approx -63.4°$. This possibility produces the same ellipse as the one in Figure 6.4, but interchanges the names of the ${x}^{\prime}$- and ${y}^{\prime}$-axes.
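The whole computation for the ellipse can be re-checked in a few lines (a numerical sketch, using A as determined by the quadratic form $2x^{2}-4xy+5y^{2}$ and the rotation P found above):

```python
import numpy as np

A = np.array([[2.0, -2.0],
              [-2.0, 5.0]])
P = np.array([[2.0, -1.0],
              [1.0,  2.0]]) / np.sqrt(5.0)

is_rotation = np.isclose(np.linalg.det(P), 1.0)    # det P = 1
D = P.T @ A @ P
diagonal_form = np.allclose(D, np.diag([1.0, 6.0]))

# The form X^t A X agrees with (x')^2 + 6(y')^2 under X' = P^t X.
X = np.array([6.0, 0.0])           # a sample point
Xp = P.T @ X
same_value = np.isclose(X @ A @ X, Xp[0]**2 + 6.0 * Xp[1]**2)
```

Because P is a rotation, the new $x^{\prime}y^{\prime}$-axes are just the old axes turned through the angle $\theta$, so the geometry of the conic is unchanged.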
Exercises

Label the following statements as true or false. Assume that the underlying inner product spaces are finite-dimensional.

(a) Every unitary operator is normal.

(b) Every orthogonal operator is diagonalizable.

(c) A matrix is unitary if and only if it is invertible.

(d) If two matrices are unitarily equivalent, then they are also similar.

(e) The sum of unitary matrices is unitary.

(f) The adjoint of a unitary operator is unitary.

(g) If T is an orthogonal operator on V, then ${[\text{T}]}_{\beta}$ is an orthogonal matrix for any ordered basis $\beta $ for V.

(h) If all the eigenvalues of a linear operator have absolute value 1, then the operator must be unitary or orthogonal.

(i) A linear operator may preserve norms without preserving inner products.


For each of the following matrices A, find an orthogonal or unitary matrix P and a diagonal matrix D such that $P\text{*}AP=D$.

(a) $\left(\begin{array}{rr}1& 2\\ 2& 1\end{array}\right)$

(b) $\left(\begin{array}{rr}0& 1\\ 1& 0\end{array}\right)$

(c) $\left(\begin{array}{cc}2& 3-3i\\ 3+3i& 5\end{array}\right)$

(d) $\left(\begin{array}{rrr}0& 2& 2\\ 2& 0& 2\\ 2& 2& 0\end{array}\right)$

(e) $\left(\begin{array}{rrr}2& 1& 1\\ 1& 2& 1\\ 1& 1& 2\end{array}\right)$


Prove that the composite of unitary [orthogonal] operators is unitary [orthogonal].

For $z\in C$, define ${\text{T}}_{z}:\text{}C\to C$ by ${\text{T}}_{z}(u)=zu$. Characterize those z for which ${\text{T}}_{z}$ is normal, selfadjoint, or unitary.

Which of the following pairs of matrices are unitarily equivalent?

(a) $\left(\begin{array}{rr}1& 0\\ 0& 1\end{array}\right)$ and $\left(\begin{array}{rr}0& 1\\ 1& 0\end{array}\right)$

(b) $\left(\begin{array}{rr}0& 1\\ 1& 0\end{array}\right)$ and $\left(\begin{array}{rr}0& {\displaystyle {\displaystyle \frac{1}{2}}}\\ {\displaystyle {\displaystyle \frac{1}{2}}}& 0\end{array}\right)$

(c) $\left(\begin{array}{rrr}0& 1& 0\\ 1& 0& 0\\ 0& 0& 1\end{array}\right)$ and $\left(\begin{array}{rrr}2& 0& 0\\ 0& 1& 0\\ 0& 0& 0\end{array}\right)$

(d) $\left(\begin{array}{rrr}0& 1& 0\\ -1& 0& 0\\ 0& 0& 1\end{array}\right)$ and $\left(\begin{array}{rrr}1& 0& 0\\ 0& i& 0\\ 0& 0& -i\end{array}\right)$

(e) $\left(\begin{array}{rrr}1& 1& 0\\ 0& 2& 2\\ 0& 0& 3\end{array}\right)$ and $\left(\begin{array}{rrr}1& 0& 0\\ 0& 2& 0\\ 0& 0& 3\end{array}\right)$


Let V be the inner product space of complexvalued continuous functions on [0, 1] with the inner product
$$\langle f,\text{}g\rangle ={\displaystyle {\int}_{0}^{1}f(t)\overline{g(t)}\text{}dt.}$$Let $h\in \text{V}$, and define $\text{T}:\text{V}\to \text{V}$ by $\text{T}(f)=hf$. Prove that T is a unitary operator if and only if $h(t)=1$ for $0\le t\le 1$.
Hint for the “only if” part: Suppose that T is unitary. Set $f(t)=1-|h(t)|{}^{2}$ and $g(t)=1$. Show that
$${\int}_{0}^{1}{(1-|h(t)|{}^{2})}^{2}\text{}dt=0,$$and use the fact that if the integral of a nonnegative continuous function is zero, then the function is identically zero.

Prove that if T is a unitary operator on a finite-dimensional inner product space V, then T has a unitary square root; that is, there exists a unitary operator U such that $\text{T}={\text{U}}^{2}$. Visit goo.gl/jADTaS for a solution. 
Let T be a self-adjoint linear operator on a finite-dimensional inner product space. Prove that $(\text{T}+i\text{I}){(\text{T}-i\text{I})}^{-1}$ is unitary using Exercise 10 of Section 6.4.

Let U be a linear operator on a finite-dimensional inner product space V. If $\|\text{U}(x)\|=\|x\|$ for all x in some orthonormal basis for V, must U be unitary? Justify your answer with a proof or a counterexample.

Let A be an $n\times n$ real symmetric or complex normal matrix. Prove that
$$\text{tr}(A)={\displaystyle \sum _{i=1}^{n}{\lambda}_{i}}\phantom{\rule{1em}{0ex}}\text{and}\phantom{\rule{1em}{0ex}}\text{tr}(A\text{*}A)={\displaystyle \sum _{i=1}^{n}{|{\lambda}_{i}|}^{2},}$$where the ${\lambda}_{i}$’s are the (not necessarily distinct) eigenvalues of A.

Find an orthogonal matrix whose first row is $\left({\displaystyle {\displaystyle \frac{1}{3}}},\text{}{\displaystyle {\displaystyle \frac{2}{3}}},\text{}{\displaystyle {\displaystyle \frac{2}{3}}}\right)$.

Let A be an $n\times n$ real symmetric or complex normal matrix. Prove that
$$\mathrm{det}(A)={\displaystyle \prod _{i=1}^{n}{\lambda}_{i},}$$where the ${\lambda}_{i}$ ‘s are the (not necessarily distinct) eigenvalues of A.

Suppose that A and B are diagonalizable matrices. Prove or disprove that A is similar to B if and only if A and B are unitarily equivalent.

Prove that if A and B are unitarily equivalent matrices, then A is positive definite [semidefinite] if and only if B is positive definite [semidefinite]. (See the definitions in the exercises in Section 6.4.)

Let U be a unitary operator on an inner product space V, and let W be a finite-dimensional U-invariant subspace of V. Prove that

(a) $\text{U}(\text{W})=\text{W;}$

(b) ${\text{W}}^{\perp}$ is U-invariant.
Contrast (b) with Exercise 16.


Find an example of a unitary operator U on an inner product space and a U-invariant subspace W such that ${\text{W}}^{\perp}$ is not U-invariant.

Prove that a matrix that is both unitary and upper triangular must be a diagonal matrix.

Show that “is unitarily equivalent to” is an equivalence relation on ${\text{M}}_{n\times n}(C)$.

Let W be a finite-dimensional subspace of an inner product space V. By Theorem 6.7 (p. 349) and the exercises of Section 1.3, $\text{V}=\text{W}\oplus {\text{W}}^{\perp}$. Define $\text{U}:\text{V}\to \text{V}$ by $\text{U}({v}_{1}+{v}_{2})={v}_{1}-{v}_{2}$, where ${v}_{1}\in \text{W}$ and ${v}_{2}\in {\text{W}}^{\perp}$. Prove that U is a self-adjoint unitary operator.

Let V be a finite-dimensional inner product space. A linear operator U on V is called a partial isometry if there exists a subspace W of V such that $\|\text{U}(x)\|=\|x\|$ for all $x\in \text{W}$ and $\text{U}(x)=0$ for all $x\in {\text{W}}^{\perp}$. Observe that W need not be U-invariant. Suppose that U is such an operator and $\{{v}_{1},\text{}{v}_{2},\text{}\dots ,\text{}{v}_{k}\}$ is an orthonormal basis for W. Prove the following results.

(a) $\langle \text{U}(x),\text{U}(y)\rangle =\langle x,\text{}y\rangle $ for all $x,\text{}y\in \text{W}$. Hint: Use Exercise 20 of Section 6.1.

(b) $\{\text{U}({v}_{1}),\text{U}({v}_{2}),\text{}\dots ,\text{U}({v}_{k})\}$ is an orthonormal basis for R(U).

(c) There exists an orthonormal basis $\gamma $ for V such that the first k columns of ${[\text{U}]}_{\gamma}$ form an orthonormal set and the remaining columns are zero.

(d) Let $\{{w}_{1},\text{}{w}_{2},\text{}\dots ,\text{}{w}_{j}\}$ be an orthonormal basis for $\text{R}{(\text{U})}^{\perp}$ and $\beta =\{\text{U}({v}_{1}),\text{U}({v}_{2}),\text{}\dots ,\text{U}({v}_{k}),\text{}{w}_{1},\text{}\dots ,\text{}{w}_{j}\}$. Then $\beta $ is an orthonormal basis for V.

(e) Let T be the linear operator on V that satisfies $\text{T}(\text{U}({v}_{i}))={v}_{i}(1\le i\le k)$ and $\text{T}({w}_{i})=0\text{}(1\le i\le j)$. Then T is well defined, and $\text{T}=\text{U*}$. Hint: Show that $\langle \text{U}(x),\text{}y\rangle =\langle x,\text{T}(y)\rangle $ for all $x,\text{}y\in \beta $. There are four cases.

(f) U* is a partial isometry.
This exercise is continued in Exercise 9 of Section 6.6.


Let A and B be $n\times n$ matrices that are unitarily equivalent.

(a) Prove that $\text{tr}(A\text{*}A)=\text{tr}(B\text{*}B)$.

(b) Use (a) to prove that
$$\sum _{i,\text{}j=1}^{n}{|{A}_{ij}|}^{2}={\displaystyle \sum _{i,\text{}j=1}^{n}{|{B}_{ij}|}^{2}.}$$ 
(c) Use (b) to show that the matrices
$$\left(\begin{array}{rr}1& 2\\ 2& i\end{array}\right)\phantom{\rule{1em}{0ex}}\text{and}\phantom{\rule{1em}{0ex}}\left(\begin{array}{rr}i& 4\\ 1& 1\end{array}\right)$$are not unitarily equivalent.
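Part (c) can be checked numerically: the two matrices give different values of $\text{tr}(A\text{*}A)$, so by (a) they cannot be unitarily equivalent. A quick NumPy sketch (illustrative only; the exercise intends a hand computation):

```python
import numpy as np

# The two matrices from part (c).
A = np.array([[1, 2], [2, 1j]])
B = np.array([[1j, 4], [1, 1]])

def abs_sq_sum(M):
    """tr(M*M), which equals the sum of |M_ij|^2."""
    return np.trace(M.conj().T @ M).real

print(abs_sq_sum(A))  # 1 + 4 + 4 + 1 = 10
print(abs_sq_sum(B))  # 1 + 16 + 1 + 1 = 19
```

Since $10\ne 19$, the matrices are not unitarily equivalent.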


Let V be a real inner product space.

(a) Prove that any translation on V is a rigid motion.

(b) Prove that the composite of any two rigid motions on V is a rigid motion on V.


Prove the following variant of Theorem 6.22: If $f:\text{V}\to \text{V}$ is a rigid motion on a finite-dimensional real inner product space V, then there exists a unique orthogonal operator T on V and a unique translation g on V such that $f=\text{T}\circ g$. (Note that the conclusion of Theorem 6.22 has $f=g\circ \text{T}$).

Let T and U be orthogonal operators on ${\text{R}}^{2}$ . Use Theorem 6.23 to prove the following results.

(a) If T and U are both reflections about lines through the origin, then UT is a rotation.

(b) If T is a rotation and U is a reflection about a line through the origin, then both UT and TU are reflections about lines through the origin.


Suppose that T and U are reflections of ${\text{R}}^{2}$ about the respective lines L and ${\text{L}}^{\prime}$ through the origin and that $\varphi $ and $\psi $ are the angles from the positive x-axis to L and ${\text{L}}^{\prime}$, respectively. By Exercise 24, UT is a rotation. Find its angle of rotation.

Suppose that T and U are orthogonal operators on ${\text{R}}^{2}$ such that T is the rotation by the angle $\varphi $ and U is the reflection about the line L through the origin. Let $\psi $ be the angle from the positive x-axis to L. By Exercise 24, both UT and TU are reflections about lines ${\text{L}}_{1}$ and ${\text{L}}_{2}$, respectively, through the origin.

(a) Find the angle $\theta $ from the positive xaxis to ${\text{L}}_{1}$.

(b) Find the angle $\theta $ from the positive xaxis to ${\text{L}}_{2}$.
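The dichotomy behind the three preceding exercises (a rotation of ${\text{R}}^{2}$ has determinant $+1$, a reflection has determinant $-1$) can be checked numerically with the standard matrices of these operators. A sketch with arbitrary illustrative angles; it confirms which composites are rotations and which are reflections, without giving the angles the exercises ask for:

```python
import numpy as np

def rotation(phi):
    """Matrix of the rotation of R^2 by the angle phi."""
    return np.array([[np.cos(phi), -np.sin(phi)],
                     [np.sin(phi),  np.cos(phi)]])

def reflection(psi):
    """Matrix of the reflection of R^2 about the line through the origin at angle psi."""
    return np.array([[np.cos(2 * psi),  np.sin(2 * psi)],
                     [np.sin(2 * psi), -np.cos(2 * psi)]])

T = rotation(0.7)                        # arbitrary angles for illustration
U, U2 = reflection(1.1), reflection(0.3)

# det +1 marks a rotation, det -1 a reflection.
print(int(round(np.linalg.det(U2 @ U))))  # 1: a composite of two reflections is a rotation
print(int(round(np.linalg.det(U @ T))))   # -1: reflection composed with rotation is a reflection
```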


Find new coordinates ${x}^{\prime},\text{}{y}^{\prime}$ so that the following quadratic forms can be written as ${\lambda}_{1}{({x}^{\prime})}^{2}+{\lambda}_{2}{({y}^{\prime})}^{2}$.

(a) ${x}^{2}+4xy+{y}^{2}$

(b) $2{x}^{2}+2xy+2{y}^{2}$

(c) ${x}^{2}-12xy-4{y}^{2}$

(d) $3{x}^{2}+2xy+3{y}^{2}$

(e) ${x}^{2}-2xy+{y}^{2}$


Consider the expression ${X}^{t}AX$, where ${X}^{t}=(x,\text{}y,\text{}z)$ and A is as defined in Exercise 2(e). Find a change of coordinates ${x}^{\prime},\text{}{y}^{\prime}\text{},{z}^{\prime}$ so that the preceding expression is of the form ${\lambda}_{1}{({x}^{\prime})}^{2}+{\lambda}_{2}{({y}^{\prime})}^{2}+{\lambda}_{3}{({z}^{\prime})}^{2}$.
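In each of these exercises, the change of coordinates comes from orthogonally diagonalizing the symmetric matrix of the quadratic form. A NumPy sketch for part (a) of the previous exercise (the numerical computation, not the hand derivation the exercise intends):

```python
import numpy as np

# Part (a): x^2 + 4xy + y^2 = X^t A X with symmetric A.
A = np.array([[1.0, 2.0],
              [2.0, 1.0]])

# eigh returns the eigenvalues and an orthogonal matrix P whose columns are
# orthonormal eigenvectors; the substitution X = P X' diagonalizes the form.
eigenvalues, P = np.linalg.eigh(A)
print(np.allclose(sorted(eigenvalues), [-1.0, 3.0]))   # True
print(np.allclose(P.T @ A @ P, np.diag(eigenvalues)))  # True
# So x^2 + 4xy + y^2 = -(x')^2 + 3(y')^2 in the new coordinates.
```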

QR-Factorization. Let ${w}_{1},\text{}{w}_{2},\text{}\dots ,\text{}{w}_{n}$ be linearly independent vectors in ${\text{F}}^{n}$, and let ${v}_{1},\text{}{v}_{2},\text{}\dots ,\text{}{v}_{n}$ be the orthogonal vectors obtained from ${w}_{1},\text{}{w}_{2},\text{}\dots ,\text{}{w}_{n}$ by the Gram-Schmidt process. Let ${u}_{1},\text{}{u}_{2},\text{}\dots ,\text{}{u}_{n}$ be the orthonormal basis obtained by normalizing the ${v}_{i}$’s.

(a) Solving (1) in Section 6.2 for ${w}_{k}$ in terms of ${u}_{k}$, show that
$${w}_{k}=\|{v}_{k}\|{u}_{k}+{\displaystyle \sum _{j=1}^{k-1}\langle {w}_{k},\text{}{u}_{j}\rangle {u}_{j}\phantom{\rule{1em}{0ex}}(1\le k\le n).}$$ 
(b) Let A and Q denote the $n\times n$ matrices in which the kth columns are ${w}_{k}$ and ${u}_{k}$, respectively. Define $R\in {\text{M}}_{n\times n}(F)$ by
$${R}_{jk}=\begin{cases}\|{v}_{j}\| & \text{if } j=k\\ \langle {w}_{k},\text{}{u}_{j}\rangle & \text{if } j<k\\ 0 & \text{if } j>k.\end{cases}$$Prove $A=QR$.

(c) Compute Q and R as in (b) for the $3\times 3$ matrix whose columns are the vectors (1, 1, 0), (2, 0, 1), and (2, 2, 1).

(d) Since Q is unitary [orthogonal] and R is upper triangular in (b), we have shown that every invertible matrix is the product of a unitary [orthogonal] matrix and an upper triangular matrix. Suppose that $A\in {\text{M}}_{n\times n}(F)$ is invertible and $A={Q}_{1}{R}_{1}={Q}_{2}{R}_{2}$, where ${Q}_{1},\text{}{Q}_{2}\in {\text{M}}_{n\times n}(F)$ are unitary and ${R}_{1},{R}_{2}\in {\text{M}}_{n\times n}(F)$ are upper triangular. Prove that $D={R}_{2}{R}_{1}^{-1}$ is a unitary diagonal matrix. Hint: Use Exercise 17.

(e) The QR factorization described in (b) provides an orthogonalization method for solving a linear system $Ax=b$ when A is invertible. Factor A as $A=QR$, by the Gram-Schmidt process or other means, where Q is unitary and R is upper triangular. Then $QRx=b$, and hence $Rx=Q\text{*}b$. This last system can be solved easily since R is upper triangular.
Use the orthogonalization method and (c) to solve the system
$$\begin{array}{rcrcrcr}{x}_{1}& +& 2{x}_{2}& +& 2{x}_{3}& =& 1\\ {x}_{1}& & & +& 2{x}_{3}& =& 11\\ & & {x}_{2}& +& {x}_{3}& =& 1\mathrm{.}\end{array}$$
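The factorization in (b) is straightforward to implement. The sketch below runs the Gram-Schmidt construction on the matrix of part (c) and verifies $A=QR$; part (e)'s system can then be solved from $Rx=Q\text{*}b$ by back substitution. (This is an illustrative implementation, not the hand computation the exercise asks for.)

```python
import numpy as np

def qr_gram_schmidt(A):
    """QR via the Gram-Schmidt process of part (b): A = QR with Q having
    orthonormal columns and R upper triangular, where R_kk = ||v_k|| and
    R_jk = <w_k, u_j> for j < k."""
    n = A.shape[1]
    Q = np.zeros_like(A, dtype=float)
    R = np.zeros((n, n))
    for k in range(n):
        v = A[:, k].copy()
        for j in range(k):
            R[j, k] = Q[:, j] @ A[:, k]   # <w_k, u_j>
            v -= R[j, k] * Q[:, j]
        R[k, k] = np.linalg.norm(v)       # ||v_k||
        Q[:, k] = v / R[k, k]
    return Q, R

# Matrix of part (c): columns (1, 1, 0), (2, 0, 1), (2, 2, 1).
A = np.array([[1.0, 2.0, 2.0],
              [1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0]])
Q, R = qr_gram_schmidt(A)
print(np.allclose(A, Q @ R))             # True
print(np.allclose(Q.T @ Q, np.eye(3)))   # True: Q is orthogonal
```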


Suppose that $\beta $ and $\gamma $ are ordered bases for an n-dimensional real [complex] inner product space V. Prove that if Q is an orthogonal [unitary] $n\times n$ matrix that changes $\gamma $-coordinates into $\beta $-coordinates, then $\beta $ is orthonormal if and only if $\gamma $ is orthonormal.
The following definition is used in Exercises 31 and 32.
Definition.
Let V be a finite-dimensional complex [real] inner product space, and let u be a unit vector in V. Define the Householder operator ${\text{H}}_{u}:\text{V}\to \text{V}$ by ${\text{H}}_{u}(x)=x-2\langle x,\text{}u\rangle u$ for all $x\in \text{V}$.

Let ${\text{H}}_{u}$ be a Householder operator on a finite-dimensional inner product space V. Prove the following results.

(a) ${\text{H}}_{u}$ is linear.

(b) ${\text{H}}_{u}(x)=x$ if and only if x is orthogonal to u.

(c) ${\text{H}}_{u}(u)=-u.$

(d) ${\text{H}}_{u}^{\text{*}}={\text{H}}_{u}$ and ${\text{H}}_{u}^{2}=\text{I}$, and hence ${\text{H}}_{u}$ is a unitary [orthogonal] operator on V.
(Note: If V is a real inner product space, then in the language of Section 6.11, ${\text{H}}_{u}$ is a reflection.)


Let V be a finite-dimensional inner product space over F. Let x and y be linearly independent vectors in V such that $\|x\|=\|y\|$.

(a) If $F=C$, prove that there exists a unit vector u in V and a complex number $\theta $ with $|\theta| =1$ such that ${\text{H}}_{u}(x)=\theta y$. Hint: Choose $\theta $ so that $\langle x,\text{}\theta y\rangle $ is real, and set $u={\displaystyle {\displaystyle \frac{1}{\|x-\theta y\|}}}(x-\theta y)$.

(b) If $F=R$, prove that there exists a unit vector u in V such that ${\text{H}}_{u}(x)=y$.
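The real case (b) is easy to verify numerically: with $u=(x-y)/\|x-y\|$, the Householder operator maps x to y. A small NumPy sketch (the vectors are arbitrary choices with equal norms, assumed for illustration):

```python
import numpy as np

def householder(u, x):
    """H_u(x) = x - 2 <x, u> u for a unit vector u (real case)."""
    return x - 2 * (x @ u) * u

# Vectors with ||x|| = ||y|| = 5 (illustrative choice).
x = np.array([3.0, 4.0, 0.0])
y = np.array([0.0, 0.0, 5.0])
u = (x - y) / np.linalg.norm(x - y)   # the unit vector of part (b)

print(np.allclose(householder(u, x), y))   # True: H_u(x) = y
print(np.allclose(householder(u, u), -u))  # True: H_u(u) = -u, as in (c)
```

Geometrically, ${\text{H}}_{u}$ reflects through the hyperplane orthogonal to $x-y$, which swaps x and y when they have the same norm.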
