Appendix E Polynomials – Linear Algebra, 5th Edition

Appendix E Polynomials

In this appendix, we discuss some useful properties of the polynomials with coefficients from a field. For the definition of a polynomial, refer to Section 1.2. Throughout this appendix, we assume that all polynomials have coefficients from a fixed field F.

Definition.

A polynomial f(x) divides a polynomial g(x) if there exists a polynomial q(x) such that g(x)=f(x)q(x).

Our first result shows that the familiar long division process for polynomials with real coefficients is valid for polynomials with coefficients from an arbitrary field.

Theorem E.1 (The Division Algorithm for Polynomials).

Let f(x) be a polynomial of degree n, and let g(x) be a polynomial of degree m0. Then there exist unique polynomials q(x) and r(x) such that

f(x)=q(x)g(x)+r(x), (1)

where the degree of r(x) is less than m.

Proof.

We begin by establishing the existence of q(x) and r(x) that satisfy (1). If n<m, take q(x)=0 and r(x)=f(x) to satisfy (1). Otherwise, if 0mn, we apply mathematical induction on n. First suppose that n=0. Then m=0, and it follows that f(x) and g(x) are nonzero constants. Hence we may take q(x)=f(x)/g(x) and r(x)=0 to satisfy (1).

Now suppose that the result is valid for all polynomials with degree less than n for some fixed n>0, and assume that f(x) has degree n. Suppose that

f(x)=anxn+an1xn1++a1x+a0

and

g(x)=bmxm+bm1xm1++b1x+b0,

and let h(x) be the polynomial defined by

h(x)=f(x)anbm1xnmg(x). (2)

Then h(x) is a polynomial of degree k<n, and therefore we may apply the induction hypothesis or the case where k<m (whichever is relevant) to obtain polynomials q1(x) and r(x) such that r(x) has degree less than m and

h(x)=q1(x)g(x)+r(x). (3)

Combining (2) and (3) and solving for f(x) gives us f(x)=q(x)g(x)+r(x) with q(x)=anbm1xnm+q1(x), which establishes the existence of q(x) and r(x) by mathematical induction.

We now show the uniqueness of q(x) and r(x). Suppose that q1(x),q2(x),r1(x), and r2(x) exist such that r1(x) and r2(x) each has degree less than m and

f(x)=q1(x)g(x)+r1(x)=q2(x)g(x)+r2(x).

Then

[q1(x)q2(x)]g(x)=r2(x)r1(x). (4)

The right side of (4) is a polynomial of degree less than m. Since g(x) has degree m, it must follow that q1(x)q2(x) is the zero polynomial. Hence q1(x)=q2(x), and thus r1(x)=r2(x) by (4).

In the context of Theorem E.1, we call q(x) and r(x) the quotient and remainder, respectively, for the division of f(x) by g(x). For example, suppose that F is the field of complex numbers. Then the quotient and remainder for the division of

f(x)=(3+i)x5(1i)x4+6x3+(6+2i)x2+(2+i)x+1

by

g(x)=(3+i)x22ix+4

are, respectively,

q(x)=x3+ix22andr(x)=(23i)x+9.

Corollary 1.

Let f(x) be a polynomial of positive degree, and let aF. Then f(a)=0 if and only if xa divides f(x).

Proof.

Suppose that xa divides f(x). Then there exists a polynomial q(x) such that f(x)=(xa)q(x). Thus f(a)=(aa)q(a)=0q(a)=0.

Conversely, suppose that f(a)=0. By the division algorithm, there exist polynomials q(x) and r(x) such that r(x) has degree less than 1 and

f(x)=q(x)(xa)+r(x).

Substituting a for x in the equation above, we obtain r(a)=0. Since r(x) has degree less than 1, it must be the constant polynomial r(x)=0. Thus f(x)=q(x)(xa).

For any polynomial f(x) with coefficients from a field F, an element aF is called a zero of f(x) if f(a)=0. With this terminology, the preceding corollary states that a is a zero of f(x) if and only if xa divides f(x).

Corollary 2.

Any polynomial of degree n1 has at most n distinct zeros.

Proof.

The proof is by mathematical induction on n. The result is obvious if n=1. Now suppose that the result is true for some positive integer n, and let f(x) be a polynomial of degree n+1. If f(x) has no zeros, then there is nothing to prove. Otherwise, if a is a zero of f(x), then by Corollary 1 we may write f(x)=(xa)q(x) for some polynomial q(x). Note that q(x) must be of degree n; therefore, by the induction hypothesis, q(x) can have at most n distinct zeros. Since any zero of f(x) distinct from a is also a zero of q(x), it follows that f(x) can have at most n+1 distinct zeros.

Polynomials having no common divisors arise naturally in the study of canonical forms. (See Chapter 7.)

Definition.

Polynomials f1(x), f2(x),, fn(x) are called relatively prime if no polynomial of positive degree divides all of them.

For example, the polynomials with real coefficients f(x)=x2(x1) and h(x)=(x1)(x2) are not relatively prime because x1 divides each of them. On the other hand, consider f(x) and g(x)=(x2)(x3), which do not appear to have common factors. Could other factorizations of f(x) and g(x) reveal a hidden common factor? We will soon see (Theorem E.9) that the preceding factors are the only ones. Thus f(x) and g(x) are relatively prime because they have no common factors of positive degree.

Lemma.

If f1(x) and f2(x) are relatively prime polynomials, there exist polynomials q1(x) and q2(x) such that

q1(x)f1(x)+q2(x)f2(x)=1,

where 1 denotes the constant polynomial with value 1.

Proof.

Without loss of generality, assume that the degree of f1(x) is greater than or equal to the degree of f2(x). The proof is by mathematical induction on the degree of f2(x). If f2(x) has degree 0, then f2(x) is a nonzero constant c. In this case, we can take q1(x)=0 and q2(x)=1/c.

Now suppose that the theorem holds whenever the polynomial of lesser degree has degree less than n for some positive integer n, and suppose that f2(x) has degree n. By the division algorithm, there exist polynomials q(x) and r(x) such that r(x) has degree less than n and

f1(x)=q(x)f2(x)+r(x). (5)

Since f1(x) and f2(x) are relatively prime, r(x) is not the zero polynomial. We claim that f2(x) and r(x) are relatively prime. Suppose otherwise; then there exists a polynomial g(x) of positive degree that divides both f2(x) and r(x). Hence, by (5), g(x) also divides f1(x), contradicting the fact that f1(x) and f2(x) are relatively prime. Since r(x) has degree less than n, we may apply the induction hypothesis to f2(x) and r(x). Thus there exist polynomials g1(x) and g2(x) such that

g1(x)f2(x)+g2(x)r(x)=1. (6)

Combining (5) and (6), we have

1=g1(x)f2(x)+g2(x)[f1(x)q(x)f2(x)]=g2(x)f1(x)+[g1(x)g2(x)q(x)]f2(x).

Thus, setting q1(x)=g2(x) and q2(x)=g1(x)g2(x)q(x), we obtain the desired result.

Theorem E.2.

If f1(x), f2(x),, fn(x) are relatively prime polynomials, there exist polynomials q1(x), q2(x),, qn(x) such that

q1(x)f1(x)+q2(x)f2(x)++qn(x)fn(x)=1,

where 1 denotes the constant polynomial with value 1.

Proof.

The proof is by mathematical induction on n, the number of polynomials. The preceding lemma establishes the case n=2.

Now assume the result is true for fewer than n polynomials, for some n3. We will first consider the trivial case where the first n1 polynomials are relatively prime. By the induction hypothesis, there exist polynomials q1(x), q2(x),, qn1(x) such that

q1(x)f1(x)+q2(x)f2(x)++qn1(x)fn1(x)=1.

Setting qn(x)=0, the zero polynomial, we have

q1(x)f1(x)+q2(x)f2(x)++qn(x)fn(x)=1,

which proves the result if the first n1 polynomials are relatively prime.

Now suppose that the first n1 polynomials are not relatively prime, and let g(x) be the monic polynomial of maximum positive degree that divides each of these polynomials. For k=1, 2,, n1, let hk(x) be the polynomial defined by fk(x)=g(x)hk(x). Then the polynomials h1(x), h2(x),, hn1(x) are relatively prime. So by the induction hypothesis, there exist polynomials ϕ1(x), ϕ2(x),, ϕn1(x) such that

ϕ1(x)h1(x)+ϕ2(x)h2(x)++ϕn1(x)hn1(x)=1.

Multiplying both sides of this equation by g(x), we obtain

ϕ1(x)f1(x)+ϕ2(x)f2(x)++ϕn1(x)fn1(x)=g(x).

Note that g(x) and fn(x) are relatively prime, and hence there exist polynomials p(x) and qn(x) such that

p(x)g(x)+qn(x)fn(x)=1.

Let qi(x)=p(x)ϕi(x) for i=1, 2,, n1. Then combining the two preceding equations yields

q1(x)f1(x)+q2(x)f2(x)++qn1(x)fn1(x)+qn(x)fn(x)=1,

which completes the induction argument.

Example 1

Let f1(x)=x1, f2(x)=x2+x2 f3(x)=x2x1, and f4(x)=x+1. As polynomials with real coefficients, f1(x), f2(x), f3(x) and f4(x) are relatively prime. It is easily verified that the polynomials q1(x)=x2, q2(x)=x, q3(x)=3, and q4(x)=x2 satisfy

q1(x)f1(x)+q2(x)f2(x)+q3(x)f3(x)+q4(x)f4(x)=1,

and hence these polynomials satisfy the conclusion of Theorem E.2.

Throughout Chapters 5, 6, and 7, we consider linear operators that are polynomials in a particular operator T and matrices that are polynomials in a particular matrix A. For these operators and matrices, the following notation is convenient.

Definitions.

Let

f(x)=a0+a1(x)++anxn

be a polynomial with coefficients from a field F. If T is a linear operator on a vector space V over F, we define

f(T)=a0I+a1T++anTn.

Similarly, if A is an n×n matrix with entries from F, we define

f(A)=a0I+a1A++anAn.

Example 2

Let T be the linear operator on R2 defined by T(a, b)=(2a+b, ab), and let f(x)=x2+2x3. It is easily checked that T2(a, b)=(5a+b, a+2b); so

f(T)(a, b)=(T2+2T3I)(a, b)=(5a+b, a+2b)+(4a+2b, 2a2b)3(a, b)=(6a+3b, 3a3b).

Similarly, if

A=(2111),

then

f(A)=A2+2A3I=(5112)+2(2111)3(1001)=(6333).

The next three results use this notation.

Theorem E.3.

Let f(x) be a polynomial with coefficients from a field F, and let T be a linear operator on a vector space V over F. Then the following statements are true.

  1. f(T) is a linear operator on V.

  2. If β is a finite ordered basis for V and A=[T]β, then [f(T)]β=f(A).

Proof.

Exercise.

Theorem E.4.

Let T be a linear operator on a vector space V over a field F, and let A be a square matrix with entries from F. Then, for any polynomials f1(x) and f2(x) with coefficients from F,

  1. f1(T)f2(T)=f2(T)f1(T).

  2. f1(A)f2(A)=f2(A)f1(A).

Proof.

Exercise.

Theorem E.5.

Let T be a linear operator on a vector space V over a field F, and let A be an n×n matrix with entries from F. If f1(x) and f2(x) are relatively prime polynomials with entries from F, then there exist polynomials q1(x) and q2(x) with entries from F such that

  1. q1(T)f1(T)+q2(T)f2(T)=I.

  2. q1(A)f1(A)+q2(A)f2(A)=I.

Proof.

Exercise.

In Chapters 5 and 7, we are concerned with determining when a linear operator T on a finite-dimensional vector space can be diagonalized and with finding a simple (canonical) representation of T. Both of these problems are affected by the factorization of a certain polynomial determined by T (the characteristic polynomial of T). In this setting, particular types of polynomials play an important role.

Definitions.

A polynomial f(x) with coefficients from a field F is called monic if its leading coefficient is 1. If f(x) has positive degree and cannot be expressed as a product of polynomials with coefficients from F each having positive degree, then f(x) is called irreducible.

Observe that whether a polynomial is irreducible depends on the field F from which its coefficients come. For example, f(x)=x2+1 is irreducible over the field of real numbers, but it is not irreducible over the field of complex numbers since x2+1=(x+i)(xi).

Clearly any polynomial of degree 1 is irreducible. Moreover, for polynomials with coefficients from an algebraically closed field, the polynomials of degree 1 are the only irreducible polynomials.

The following facts are easily established.

Theorem E.6.

Let ϕ(x) and f(x) be polynomials. If ϕ(x) is irreducible and ϕ(x) does not divide f(x), then ϕ(x) and f(x) are relatively prime.

Proof.

Exercise.

Theorem E.7.

Any two distinct irreducible monic polynomials are relatively prime.

Proof.

Exercise.

Theorem E.8.

Let f(x), g(x), and ϕ(x) be polynomials. If ϕ(x) is irreducible and divides the product f(x)g(x), then ϕ(x) divides f(x) or ϕ(x) divides g(x).

Proof.

Suppose that ϕ(x) does not divide f(x). Then ϕ(x) and f(x) are relatively prime by Theorem E.6, and so there exist polynomials q1(x) and q2(x) such that

1=q2(x)ϕ(x)+q2(x)f(x).

Multiplying both sides of this equation by g(x) yields

g(x)=q1(x)ϕ(x)g(x)+q2(x)f(x)g(x). (7)

Since ϕ(x) divides f(x)g(x), there is a polynomial h(x) such that f(x)g(x)=ϕ(x)h(x). Thus (7) becomes

g(x)=q1(x)ϕ(x)g(x)+q2(x)ϕ(x)h(x)=ϕ(x)[q1(x)g(x)+q2(x)h(x)].

So ϕ(x) divides g(x).

Corollary.

Let ϕ(x),ϕ1(x),ϕ2(x),,ϕn(x) be irreducible monic polynomials. If ϕ(x) divides the product ϕ1(x)ϕ2(x)ϕn(x), then ϕ(x)=ϕi(x) for some i (i=1, 2,, n).

Proof.

We prove the corollary by mathematical induction on n. For n=1, the result is an immediate consequence of Theorem E.7. Suppose then that for some n>1, the corollary is true for any n1 irreducible monic polynomials, and let ϕ1(x),ϕ2(x),,ϕn(x) be n irreducible polynomials. If ϕ(x) divides

ϕ1(x)ϕ2(x)ϕn(x)=[ϕ1(x)ϕ2(x)ϕn1(x)]ϕn(x),

then ϕ(x) divides the product ϕ1(x)ϕ2(x)ϕn1(x) or ϕ(x) divides ϕn(x) by Theorem E.8. In the first case, ϕ(x)=ϕi(x) for some i (i=1, 2,, n1) by the induction hypothesis; in the second case, ϕ(x)=ϕn(x) by Theorem E.7.

We are now able to establish the unique factorization theorem, which is used throughout Chapters 5 and 7. This result states that every polynomial of positive degree is uniquely expressible as a constant times a product of irreducible monic polynomials.

Theorem E.9 (Unique Factorization Theorem for Polynomials).

For any polynomial f(x) of positive degree, there exist a unique constant c; unique distinct irreducible monic polynomials ϕ1(x), ϕ2(x),, ϕk(x); and unique positive integers n1, n2,, nk such that

f(x)=c[ϕ1(x)]n1[ϕ2(x)]n2[ϕk(x)]nk.

Proof.

We begin by showing the existence of such a factorization using mathematical induction on the degree of f(x). If f(x) is of degree 1, then f(x)=ax+b for some constants a and b with a0. Setting ϕ(x)=x+b/a, we have f(x)=aϕ(x). Since ϕ(x) is an irreducible monic polynomial, the result is proved in this case. Now suppose that the conclusion is true for any polynomial with positive degree less than some integer n>1, and let f(x) be a polynomial of degree n. Then

f(x)=anxn++a1x+a0

for some constants ai with an0. If f(x) is irreducible, then

f(x)=an(xn+an1anxn1++a1an+a0an)

is a representation of f(x) as a product of an and an irreducible monic polynomial. If f(x) is not irreducible, then f(x)=g(x)h(x) for some polynomials g(x) and h(x), each of positive degree less than n. The induction hypothesis guarantees that both g(x) and h(x) factor as products of a constant and powers of distinct irreducible monic polynomials. Consequently f(x)=g(x)h(x) also factors in this way. Thus, in either case, f(x) can be factored as a product of a constant and powers of distinct irreducible monic polynomials.

It remains to establish the uniqueness of such a factorization. Suppose that

f(x)=c[ϕ1(x)]n1[ϕ2(x)]n2[ϕk(x)]nk=d[ψ1(x)]m1[ψ2(x)]m2[ψr(x)]mr, (8)

where c and d are constants, ϕi(x) and ψj(x) are irreducible monic polynomials, and ni and mj are positive integers for i=1, 2,, k and j=1, 2,, r. Clearly both c and d must be the leading coefficient of f(x); hence c=d. Dividing by c, we find that (8) becomes

[ϕ1(x)]n1[ϕ2(x)]n2[ϕk(x)]nk=[ψ1(x)]m1[ψ2(x)]m2[ψr(x)]mr. (9)

So ϕi(x) divides the right side of (9) for i=1, 2,, k. Consequently, by the corollary to Theorem E.8, each ϕi(x) equals some ψj(x), and similarly, each ψj(x) equals some ϕi(x). We conclude that r=k and that, by renumbering if necessary, ϕi(x)=ψi(x) for i=1, 2,, k. Suppose that nimi for some i. Without loss of generality, we may suppose that i=1 and n1>m1. Then by canceling [ϕ1(x)]m1 from both sides of (9), we obtain

[ϕ1(x)]n1m1[ϕ2(x)]n2[ϕk(x)]nk=[ϕ2(x)]m2[ϕk(x)]mk. (10)

Since n1m1>0, ϕ1(x) divides the left side of (10) and hence divides the right side also. So ϕ1(x)=ϕi(x) for some i=2,, k by the corollary to Theorem E.8. But this contradicts that ϕ1(x), ϕ2(x),, ϕk(x) are distinct. Hence the factorizations of f(x) in (8) are the same.

It is often useful to regard a polynomial f(x)=anxn++a1x+a0 with coefficients from a field F as a function f:FF. In this case, the value of f at cF is f(c)=ancn++a1c+a0. Unfortunately, for arbitrary fields there is not a one-to-one correspondence between polynomials and polynomial functions. For example, if f(x)=x2 and g(x)=x are two polynomials over the field Z2 (defined in Example 4 of Appendix C), then f(x) and g(x) have different degrees and hence are not equal as polynomials. But f(a)=g(a) for all aZ2, so that f and g are equal polynomial functions. Our final result shows that this anomaly cannot occur over an infinite field.

Theorem E.10.

Let f(x) and g(x) be polynomials with coefficients from an infinite field F. If f(a)=g(a) for all aF, then f(x) and g(x) are equal.

Proof.

Suppose that f(a)=g(a) for all aF. Define h(x)=f(x)g(x), and suppose that h(x) is of degree n1. It follows from Corollary 2 to Theorem E.1 that h(x) can have at most n zeroes. But

h(a)=f(a)g(a)=0

for every aF, contradicting the assumption that h(x) has positive degree. Thus h(x) is a constant polynomial, and since h(a)=0 for each aF, it follows that h(x) is the zero polynomial. Hence f(x)=g(x).