# 5.3* Matrix Limits and Markov Chains

In this section, we apply what we have learned thus far in Chapter 5 to study the limit of a sequence of powers $A, A^2, A^3, \ldots$, where $A$ is a square matrix with complex entries. Such sequences and their limits have practical applications in the natural and social sciences.

We assume familiarity with limits of sequences of real numbers. The limit of a sequence of complex numbers $z_1, z_2, \ldots$ can be defined in terms of the limits of the sequences of the real and imaginary parts: If $z_m = r_m + i s_m$, where $r_m$ and $s_m$ are real numbers, and $i$ is the imaginary number such that $i^2 = -1$, then

$$\lim_{m\to\infty} z_m = \lim_{m\to\infty} r_m + i \lim_{m\to\infty} s_m,$$

provided that $\lim_{m\to\infty} r_m$ and $\lim_{m\to\infty} s_m$ exist.

# Definition.

Let $L, A_1, A_2, \ldots$ be $n \times p$ matrices having complex entries. The sequence $A_1, A_2, \ldots$ is said to converge to the $n \times p$ matrix $L$, called the limit of the sequence, if

$$\lim_{m\to\infty} (A_m)_{ij} = L_{ij}$$

for all $1 \le i \le n$ and $1 \le j \le p$. To denote that $L$ is the limit of the sequence, we write

$$\lim_{m\to\infty} A_m = L.$$

# Example 1

If



then



where e is the base of the natural logarithm.
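Since the matrices of Example 1 are not reproduced above, the following sketch checks the entrywise definition on a hypothetical sequence with the same flavor (one entry tends to $e$); the sequence `term` and the limit `L` below are our own illustrative choices, not Example 1 itself.

```python
import numpy as np

def term(m: int) -> np.ndarray:
    """m-th matrix of a hypothetical convergent sequence (not Example 1 itself)."""
    return np.array([
        [1.0 / m, (1.0 + 1.0 / m) ** m],   # entry limits: 0 and e
        [(2.0 * m) / (m + 1.0), 3.0],      # entry limits: 2 and 3
    ])

# The limit matrix, obtained entry by entry:
L = np.array([[0.0, np.e],
              [2.0, 3.0]])

print(np.abs(term(10**6) - L).max())  # tiny: the sequence has converged entrywise
```

Convergence of a matrix sequence is nothing more than simultaneous convergence of $np$ scalar sequences, which is what the final comparison verifies.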

A simple, but important, property of matrix limits is contained in the next theorem. Note the analogy with the familiar property of limits of sequences of real numbers that asserts that if $\lim_{m\to\infty} a_m$ exists, then for any real number $c$,

$$\lim_{m\to\infty} c\,a_m = c \lim_{m\to\infty} a_m.$$

# Theorem 5.11.

Let $A_1, A_2, \ldots$ be a sequence of $n \times p$ matrices with complex entries that converges to the matrix $L$. Then for any $P \in M_{r \times n}(\mathbb{C})$ and $Q \in M_{p \times s}(\mathbb{C})$,

$$\lim_{m\to\infty} P A_m = PL \quad \text{and} \quad \lim_{m\to\infty} A_m Q = LQ.$$

# Proof.

For any $1 \le i \le r$ and $1 \le j \le p$,

$$\lim_{m\to\infty} (P A_m)_{ij} = \lim_{m\to\infty} \sum_{k=1}^{n} P_{ik} (A_m)_{kj} = \sum_{k=1}^{n} P_{ik} L_{kj} = (PL)_{ij}.$$

Hence $\lim_{m\to\infty} P A_m = PL$. The proof that $\lim_{m\to\infty} A_m Q = LQ$ is similar.

# Corollary.

Let $A \in M_{n \times n}(\mathbb{C})$ be such that $\lim_{m\to\infty} A^m = L$. Then for any invertible matrix $Q \in M_{n \times n}(\mathbb{C})$,

$$\lim_{m\to\infty} (Q A Q^{-1})^m = Q L Q^{-1}.$$

# Proof.

Since

$$(Q A Q^{-1})^m = Q A^m Q^{-1},$$

we have

$$\lim_{m\to\infty} (Q A Q^{-1})^m = \lim_{m\to\infty} Q A^m Q^{-1} = Q \left( \lim_{m\to\infty} A^m \right) Q^{-1} = Q L Q^{-1}$$

by applying Theorem 5.11 twice.

In the discussion that follows, we frequently encounter the set

$$S = \{ \lambda \in \mathbb{C} : |\lambda| < 1 \text{ or } \lambda = 1 \}.$$

Geometrically, this set consists of the complex number 1 and the interior of the unit disk (the disk of radius 1 centered at the origin). This set is of interest because if $\lambda$ is a complex number, then $\lim_{m\to\infty} \lambda^m$ exists if and only if $\lambda \in S$. This fact, which is obviously true if $\lambda$ is real, can be shown to be true for complex numbers also.
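A quick numerical illustration of this scalar fact; `power_limit_exists` below is a crude convergence test of our own (comparing two consecutive high powers), not a rigorous criterion:

```python
def power_limit_exists(lam: complex, m: int = 500) -> bool:
    """Crude check of whether lam**m converges: compare consecutive high powers."""
    return abs(lam ** (m + 1) - lam ** m) < 1e-8

# lam**m converges exactly when |lam| < 1 or lam == 1, i.e. when lam is in S.
for lam in [0.5, -0.5 + 0.3j, 1.0, 1j, -1.0, 1.2]:
    in_S = abs(lam) < 1 or lam == 1
    print(lam, in_S, power_limit_exists(lam))  # the last two columns agree
```

Points on the unit circle other than 1 (such as $-1$ and $i$) oscillate forever, and points outside the disk blow up; both failures show up in the comparison.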

The following important result gives necessary and sufficient conditions for the existence of the type of limit under consideration.

# Theorem 5.12.

Let $A$ be a square matrix with complex entries. Then $\lim_{m\to\infty} A^m$ exists if and only if both of the following conditions hold.

1. (a) Every eigenvalue of A is contained in S.

2. (b) If 1 is an eigenvalue of A, then the dimension of the eigenspace corresponding to 1 equals the multiplicity of 1 as an eigenvalue of A.

One proof of this theorem, which relies on the theory of Jordan canonical forms (Section 7.2), can be found in Exercise 19 of Section 7.2. A second proof, which makes use of Schur’s theorem (Theorem 6.14 of Section 6.4), can be found in the article by S. H. Friedberg and A. J. Insel, “Convergence of matrix powers,” Int. J. Math. Educ. Sci. Technol., 1992, Vol. 23, no. 5, pp. 765-769.

The necessity of condition (a) is easily justified. For suppose that $\lambda$ is an eigenvalue of $A$ such that $\lambda \notin S$. Let $v$ be an eigenvector of $A$ corresponding to $\lambda$. Regarding $v$ as an $n \times 1$ matrix, we see that

$$\lim_{m\to\infty} A^m v = L v$$

by Theorem 5.11, where $L = \lim_{m\to\infty} A^m$. But $A^m v = \lambda^m v$, and the sequence $\lambda v, \lambda^2 v, \ldots$ diverges because $\lim_{m\to\infty} \lambda^m$ does not exist. Hence if $\lim_{m\to\infty} A^m$ exists, then condition (a) of Theorem 5.12 must hold.

Although we are unable to prove the necessity of condition (b) here, we consider an example for which this condition fails. Observe that the characteristic polynomial for the matrix

$$B = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$$

is $(t - 1)^2$, and hence $B$ has eigenvalue $\lambda = 1$ with multiplicity 2. It can easily be verified that the eigenspace corresponding to $\lambda = 1$ has dimension 1, so that condition (b) of Theorem 5.12 is violated. A simple mathematical induction argument can be used to show that

$$B^m = \begin{pmatrix} 1 & m \\ 0 & 1 \end{pmatrix},$$

and therefore that $\lim_{m\to\infty} B^m$ does not exist. We see in Chapter 7 that if $A$ is a matrix for which condition (b) fails, then $A$ is similar to a matrix whose upper left $2 \times 2$ submatrix is precisely this matrix $B$.
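The divergence is easy to see numerically. Assuming $B$ is the $2 \times 2$ matrix with ones on the diagonal and in the upper-right corner (the form consistent with the discussion above: eigenvalue 1 of multiplicity 2 but a one-dimensional eigenspace), its powers grow without bound in one entry:

```python
import numpy as np

# Assumed matrix: eigenvalue 1 with multiplicity 2, one-dimensional eigenspace.
B = np.array([[1, 1],
              [0, 1]])

for m in (1, 2, 5, 100):
    print(m, np.linalg.matrix_power(B, m).tolist())
# The upper-right entry of B^m equals m, so the sequence B, B^2, ... diverges.
```

This is the canonical obstruction: both eigenvalues lie in $S$, yet the limit fails to exist because condition (b) fails.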

In most of the applications involving matrix limits, the matrix is diagonalizable, and so condition (b) of Theorem 5.12 is automatically satisfied. In this case, Theorem 5.12 reduces to the following theorem, which can be proved using our previous results.

# Theorem 5.13.

Let $A \in M_{n \times n}(\mathbb{C})$ satisfy the following two conditions.

1. (a) Every eigenvalue of A is contained in S.

2. (b) A is diagonalizable.

Then $\lim_{m\to\infty} A^m$ exists.

# Proof.

Since $A$ is diagonalizable, there exists an invertible matrix $Q$ such that $Q^{-1} A Q = D$ is a diagonal matrix. Suppose that

$$D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}.$$

Because $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of $A$, condition (a) requires that for each $i$, either $\lambda_i = 1$ or $|\lambda_i| < 1$. Thus

$$\lim_{m\to\infty} \lambda_i^m = \begin{cases} 1 & \text{if } \lambda_i = 1 \\ 0 & \text{otherwise.} \end{cases}$$

But since

$$D^m = \begin{pmatrix} \lambda_1^m & 0 & \cdots & 0 \\ 0 & \lambda_2^m & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n^m \end{pmatrix},$$

the sequence $D, D^2, \ldots$ converges to a limit $L$. Hence

$$\lim_{m\to\infty} A^m = \lim_{m\to\infty} (Q D Q^{-1})^m = Q L Q^{-1}$$

by the corollary to Theorem 5.11.

The technique for computing  used in the proof of Theorem 5.13 can be employed in actual computations, as we now illustrate. Let



Using the methods in Sections 5.1 and 5.2, we obtain an invertible matrix $Q$ and a diagonal matrix $D$ such that $Q^{-1} A Q = D$. Hence

$$\lim_{m\to\infty} A^m = Q \left( \lim_{m\to\infty} D^m \right) Q^{-1}.$$

Next, we consider an application that uses the limit of powers of a matrix. Suppose that the population of a certain metropolitan area remains constant but there is a continual movement of people between the city and the suburbs. Specifically, let the entries of the following matrix A represent the probabilities that someone living in the city or in the suburbs on January 1 will be living in each region on January 1 of the next year.

$$A = \begin{pmatrix} 0.90 & 0.02 \\ 0.10 & 0.98 \end{pmatrix}$$

For instance, the probability that someone living in the city (on January 1) will be living in the suburbs next year (on January 1) is 0.10. Notice that since the entries of A are probabilities, they are nonnegative. Moreover, the assumption of a constant population in the metropolitan area requires that the sum of the entries of each column of A be 1.

Any square matrix having these two properties (nonnegative entries and columns that sum to 1) is called a transition matrix or a stochastic matrix. For an arbitrary $n \times n$ transition matrix $M$, the rows and columns correspond to $n$ states, and the entry $M_{ij}$ represents the probability of moving from state $j$ to state $i$ in one stage.

In our example, there are two states (residing in the city and residing in the suburbs). So, for example, $A_{21} = 0.10$ is the probability of moving from the city to the suburbs in one stage, that is, in one year. We now determine the probability that a city resident will be living in the suburbs after 2 years. There are two different ways in which such a move can be made: remaining in the city for 1 year and then moving to the suburbs, or moving to the suburbs during the first year and remaining there the second year. (See Figure 5.3.) The probability that a city dweller remains in the city for the first year is 0.90, whereas the probability that the city dweller moves to the suburbs during the first year is 0.10. Hence the probability that a city dweller stays in the city for the first year and then moves to the suburbs during the second year is the product (0.90)(0.10). Likewise, the probability that a city dweller moves to the suburbs in the first year and remains in the suburbs during the second year is the product (0.10)(0.98). Thus the probability that a city dweller will be living in the suburbs after 2 years is the sum of these products, $(0.90)(0.10) + (0.10)(0.98) = 0.188$. Observe that this number is obtained by the same calculation as that which produces $(A^2)_{21}$, and hence $(A^2)_{21} = 0.188$ represents the probability that a city dweller will be living in the suburbs after 2 years. In general, for any transition matrix $M$, the entry $(M^m)_{ij}$ represents the probability of moving from state $j$ to state $i$ in $m$ stages.

Suppose additionally that in year 2000, 70% of the population of the metropolitan area lived in the city and 30% lived in the suburbs. We record these data as a column vector:

$$P = \begin{pmatrix} 0.70 \\ 0.30 \end{pmatrix}$$

Notice that the rows of P correspond to the states of residing in the city and residing in the suburbs, respectively, and that these states are listed in the same order as the listing in the transition matrix A. Observe also that the column vector P contains nonnegative entries that sum to 1; such a vector is called a probability vector. In this terminology, each column of a transition matrix is a probability vector. It is often convenient to regard the entries of a transition matrix or a probability vector as proportions or percentages instead of probabilities, as we have already done with the probability vector P.

In the vector $AP$, the first coordinate is the sum $(0.90)(0.70) + (0.02)(0.30)$. The first term of this sum, $(0.90)(0.70)$, represents the proportion of the 2000 metropolitan population that remained in the city during the next year, and the second term, $(0.02)(0.30)$, represents the proportion of the 2000 metropolitan population that moved into the city during the next year. Hence the first coordinate of $AP$ represents the proportion of the metropolitan population that was living in the city in 2001. Similarly, the second coordinate of

$$AP = \begin{pmatrix} 0.90 & 0.02 \\ 0.10 & 0.98 \end{pmatrix} \begin{pmatrix} 0.70 \\ 0.30 \end{pmatrix} = \begin{pmatrix} 0.636 \\ 0.364 \end{pmatrix}$$

represents the proportion of the metropolitan population that was living in the suburbs in 2001. This argument can be easily extended to show that the coordinates of

$$A(AP) = A^2 P$$

represent the proportions of the metropolitan population that were living in each location in 2002. In general, the coordinates of $A^m P$ represent the proportions of the metropolitan population that will be living in the city and suburbs, respectively, after $m$ stages ($m$ years after 2000).

Will the city eventually be depleted if this trend continues? In view of the preceding discussion, it is natural to define the eventual proportion of the city dwellers and suburbanites to be the first and second coordinates, respectively, of $\lim_{m\to\infty} A^m P$. We now compute this limit. It is easily shown that $A$ is diagonalizable, and so there is an invertible matrix $Q$ and a diagonal matrix $D$ such that $Q^{-1} A Q = D$. In fact,

$$Q = \begin{pmatrix} 1 & 1 \\ 5 & -1 \end{pmatrix} \quad \text{and} \quad D = \begin{pmatrix} 1 & 0 \\ 0 & 0.88 \end{pmatrix}.$$

Therefore

$$L = \lim_{m\to\infty} A^m = \lim_{m\to\infty} (Q D Q^{-1})^m = Q \left( \lim_{m\to\infty} D^m \right) Q^{-1} = Q \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} Q^{-1} = \begin{pmatrix} \frac{1}{6} & \frac{1}{6} \\ \frac{5}{6} & \frac{5}{6} \end{pmatrix}.$$

Consequently

$$LP = \begin{pmatrix} \frac{1}{6} & \frac{1}{6} \\ \frac{5}{6} & \frac{5}{6} \end{pmatrix} \begin{pmatrix} 0.70 \\ 0.30 \end{pmatrix} = \begin{pmatrix} \frac{1}{6} \\ \frac{5}{6} \end{pmatrix}.$$

Thus, eventually, $\frac{1}{6}$ of the population will live in the city and $\frac{5}{6}$ will live in the suburbs each year. Note that the vector $LP$ satisfies $A(LP) = LP$. Hence $LP$ is both a probability vector and an eigenvector of $A$ corresponding to the eigenvalue 1. Since the eigenspace of $A$ corresponding to the eigenvalue 1 is one-dimensional, there is only one such vector, and $LP$ is independent of the initial choice of probability vector $P$. (See Exercise 15.) For example, had the 2000 metropolitan population consisted entirely of city dwellers, the limiting outcome would be the same.
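This computation can be reproduced numerically. The sketch below diagonalizes the city–suburb matrix with `numpy` (entries taken from the probabilities stated in the text) and replaces the limit of the diagonal powers by the diagonal matrix that keeps only the eigenvalue 1:

```python
import numpy as np

A = np.array([[0.90, 0.02],
              [0.10, 0.98]])   # city-suburb transition matrix
P = np.array([0.70, 0.30])     # initial probability vector (year 2000)

# Diagonalize: the eigenvalues are 1 and 0.88, so D^m -> diag(1, 0).
w, Q = np.linalg.eig(A)
D_limit = np.diag(np.where(np.isclose(w, 1.0), 1.0, 0.0))
L = Q @ D_limit @ np.linalg.inv(Q)   # numerical L = lim A^m

print(L)       # each column is approximately (1/6, 5/6)
print(L @ P)   # approximately (1/6, 5/6), independent of P
```

Replacing `P` by any other probability vector leaves `L @ P` unchanged, which mirrors the remark that the limiting outcome is independent of the initial distribution.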

In analyzing the city–suburb problem, we gave probabilistic interpretations of  and AP, showing that  is a transition matrix and AP is a probability vector. In fact, the product of any two transition matrices is a transition matrix, and the product of any transition matrix and probability vector is a probability vector. A proof of these facts is a simple corollary of the next theorem, which characterizes transition matrices and probability vectors.

# Theorem 5.14.

Let $M$ be an $n \times n$ matrix having real nonnegative entries, let $v$ be a column vector in $\mathbb{R}^n$ having nonnegative coordinates, and let $u \in \mathbb{R}^n$ be the column vector in which each coordinate equals 1. Then

1. (a) $M$ is a transition matrix if and only if $M^t u = u$;

2. (b) $v$ is a probability vector if and only if $u^t v = (1)$.

# Proof.

Exercise.

# Corollary.

1. (a) The product of two $n \times n$ transition matrices is an $n \times n$ transition matrix. In particular, any power of a transition matrix is a transition matrix.

2. (b) The product of a transition matrix and a probability vector is a probability vector.

# Proof.

Exercise.
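Theorem 5.14 gives a mechanical test: multiply by the all-ones vector $u$. A small sketch, using the city–suburb matrix from earlier in the section (the helper names are ours):

```python
import numpy as np

def is_transition_matrix(M: np.ndarray) -> bool:
    # Nonnegative entries whose columns each sum to 1 (equivalently, M^t u = u).
    u = np.ones(M.shape[0])
    return bool(np.all(M >= 0) and np.allclose(M.T @ u, u))

def is_probability_vector(v: np.ndarray) -> bool:
    # Nonnegative coordinates summing to 1 (equivalently, u^t v = 1).
    return bool(np.all(v >= 0) and np.isclose(v.sum(), 1.0))

A = np.array([[0.90, 0.02],
              [0.10, 0.98]])
P = np.array([0.70, 0.30])

# The corollary in action: products inherit both properties.
print(is_transition_matrix(A), is_transition_matrix(A @ A))    # True True
print(is_probability_vector(P), is_probability_vector(A @ P))  # True True
```

The two `True True` lines are exactly parts (a) and (b) of the corollary applied to $A^2$ and $AP$.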

The city–suburb problem is an example of a process in which elements of a set are each classified as being in one of several fixed states that can switch over time. In general, such a process is called a stochastic process. The switching to a particular state is described by a probability, and in general this probability depends on such factors as the state in question, the time in question, some or all of the previous states in which the object has been (including the current state), and the states that other objects are in or have been in.

For instance, the object could be an American voter, and the state of the object could be his or her preference of political party; or the object could be a molecule of $\mathrm{H_2O}$, and the states could be the three physical states in which $\mathrm{H_2O}$ can exist (solid, liquid, and gas). In these examples, all four of the factors mentioned above influence the probability that an object is in a particular state at a particular time.

If, however, the probability that an object in one state changes to a different state in a fixed interval of time depends only on the two states (and not on the time, earlier states, or other factors), then the stochastic process is called a Markov process. If, in addition, the number of possible states is finite, then the Markov process is called a Markov chain. We treated the city–suburb example as a two-state Markov chain. Of course, a Markov process is usually only an idealization of reality because the probabilities involved are almost never constant over time.

With this in mind, we consider another Markov chain. A certain community college would like to obtain information about the likelihood that students in various categories will graduate. The school classifies a student as a sophomore or a freshman depending on the number of credits that the student has earned. Data from the school indicate that, from one fall semester to the next, 40% of the sophomores will graduate, 30% will remain sophomores, and 30% will quit permanently. For freshmen, the data show that 10% will graduate by next fall, 50% will become sophomores, 20% will remain freshmen, and 20% will quit permanently. During the present year, 50% of the students at the school are sophomores and 50% are freshmen. Assuming that the trend indicated by the data continues indefinitely, the school would like to know

1. the percentage of the present students who will graduate, the percentage who will be sophomores, the percentage who will be freshmen, and the percentage who will quit school permanently by next fall;

2. the same percentages as in item 1 for the fall semester two years hence; and

3. the percentage of its present students who will eventually graduate.

The preceding paragraph describes a four-state Markov chain with the following states:

1. having graduated

2. being a sophomore

3. being a freshman

4. having quit permanently.

The given data provide us with the transition matrix

$$A = \begin{pmatrix} 1 & 0.4 & 0.1 & 0 \\ 0 & 0.3 & 0.5 & 0 \\ 0 & 0 & 0.2 & 0 \\ 0 & 0.3 & 0.2 & 1 \end{pmatrix}$$

of the Markov chain. (Notice that students who have graduated or have quit permanently are assumed to remain indefinitely in those respective states. Thus a freshman who quits the school and returns during a later semester is not regarded as having changed states—the student is assumed to have remained in the state of being a freshman during the time he or she was not enrolled.) Moreover, we are told that the present distribution of students is half in each of states 2 and 3 and none in states 1 and 4. The vector

$$P = \begin{pmatrix} 0 \\ 0.5 \\ 0.5 \\ 0 \end{pmatrix}$$

that describes the initial probability of being in each state is called the initial probability vector for the Markov chain.

To answer question 1, we must determine the probabilities that a present student will be in each state by next fall. As we have seen, these probabilities are the coordinates of the vector

$$AP = \begin{pmatrix} 0.25 \\ 0.40 \\ 0.10 \\ 0.25 \end{pmatrix}.$$

Hence by next fall, 25% of the present students will graduate, 40% will be sophomores, 10% will be freshmen, and 25% will quit the school permanently. Similarly,

$$A^2 P = A(AP) = \begin{pmatrix} 0.42 \\ 0.17 \\ 0.02 \\ 0.39 \end{pmatrix}$$

provides the information needed to answer question 2: within two years 42% of the present students will graduate, 17% will be sophomores, 2% will be freshmen, and 39% will quit school.
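Both computations can be checked directly; the matrix below is assembled from the enrollment data in the text, with the states ordered: graduated, sophomore, freshman, quit:

```python
import numpy as np

# Column j holds the one-year transition probabilities out of state j.
A = np.array([[1.0, 0.4, 0.1, 0.0],   # graduated (absorbing)
              [0.0, 0.3, 0.5, 0.0],   # sophomore
              [0.0, 0.0, 0.2, 0.0],   # freshman
              [0.0, 0.3, 0.2, 1.0]])  # quit (absorbing)
P = np.array([0.0, 0.5, 0.5, 0.0])    # initial probability vector

print(A @ P)      # one year out:  [0.25 0.4  0.1  0.25]
print(A @ A @ P)  # two years out: [0.42 0.17 0.02 0.39]
```

Note that the columns for the graduated and quit states are standard basis vectors, which is what makes those states absorbing.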

Finally, the answer to question 3 is provided by the vector $LP$, where $L = \lim_{m\to\infty} A^m$. For the matrices

$$Q = \begin{pmatrix} 1 & 0 & -4 & 19 \\ 0 & 0 & 7 & -40 \\ 0 & 0 & 0 & 8 \\ 0 & 1 & -3 & 13 \end{pmatrix} \quad \text{and} \quad D = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0.3 & 0 \\ 0 & 0 & 0 & 0.2 \end{pmatrix},$$

we have $Q^{-1} A Q = D$. Thus

$$L = \lim_{m\to\infty} A^m = Q \left( \lim_{m\to\infty} D^m \right) Q^{-1} = \begin{pmatrix} 1 & \frac{4}{7} & \frac{27}{56} & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & \frac{3}{7} & \frac{29}{56} & 1 \end{pmatrix}.$$

So

$$LP = \begin{pmatrix} \frac{59}{112} \\ 0 \\ 0 \\ \frac{53}{112} \end{pmatrix},$$

and hence the probability that one of the present students will graduate is $\frac{59}{112} \approx 0.53$.

In the preceding two examples, we saw that $\lim_{m\to\infty} A^m P$, where $A$ is the transition matrix and $P$ is the initial probability vector of the Markov chain, gives the eventual proportions in each state. In general, however, the limit of powers of a transition matrix need not exist. For example, if

$$M = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},$$

then $\lim_{m\to\infty} M^m$ does not exist because odd powers of $M$ equal $M$ and even powers of $M$ equal $I$. The reason that the limit fails to exist is that condition (a) of Theorem 5.12 does not hold for $M$ ($-1$ is an eigenvalue). In fact, it can be shown (see Exercise 20 of Section 7.2) that the only transition matrices $A$ such that $\lim_{m\to\infty} A^m$ does not exist are precisely those matrices for which condition (a) of Theorem 5.12 fails to hold.
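Assuming $M$ is the $2 \times 2$ matrix that swaps the two states, the oscillation is immediate:

```python
import numpy as np

M = np.array([[0, 1],
              [1, 0]])  # a transition matrix whose powers oscillate

print(np.linalg.matrix_power(M, 3).tolist())  # [[0, 1], [1, 0]] -- equals M
print(np.linalg.matrix_power(M, 4).tolist())  # [[1, 0], [0, 1]] -- equals I
print(np.sort(np.linalg.eigvals(M).real))     # [-1.  1.]: -1 is not in S
```

The eigenvalue $-1$ sits on the unit circle but is not 1, so condition (a) of Theorem 5.12 fails, exactly as the text describes.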

But even if the limit of powers of the transition matrix exists, the computation of the limit may be quite difficult. (The reader is encouraged to work Exercise 6 to appreciate the truth of the last sentence.) Fortunately, there is a large and important class of transition matrices for which this limit exists and is easily computed—this is the class of regular transition matrices.

# Definition.

A transition matrix is called regular if some power of the matrix contains only nonzero (i.e., positive) entries.

# Example 2

The transition matrix

$$A = \begin{pmatrix} 0.90 & 0.02 \\ 0.10 & 0.98 \end{pmatrix}$$

of the Markov chain used in the city–suburb problem is clearly regular because each entry is positive. On the other hand, the transition matrix

$$A = \begin{pmatrix} 1 & 0.4 & 0.1 & 0 \\ 0 & 0.3 & 0.5 & 0 \\ 0 & 0 & 0.2 & 0 \\ 0 & 0.3 & 0.2 & 1 \end{pmatrix}$$

of the Markov chain describing community college enrollments is not regular because the first column of $A^m$ is

$$\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}$$

for any power $m$.

Observe that a regular transition matrix may contain zero entries. For example,



is regular because every entry of $A^2$ is positive.
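Regularity can be tested by scanning powers for strict positivity. The matrices below are hypothetical (the one displayed above is not reproduced here), chosen so that one is regular despite a zero entry and the other is not:

```python
import numpy as np

def is_regular(M: np.ndarray, max_power: int = 50) -> bool:
    """True if some power of the transition matrix M has all positive entries."""
    P = np.eye(M.shape[0])
    for _ in range(max_power):
        P = P @ M
        if np.all(P > 0):
            return True
    return False

R = np.array([[0.0, 0.5],    # zero entry, but R @ R is strictly positive
              [1.0, 0.5]])
N = np.array([[1.0, 0.4],    # state 1 is absorbing: entry (2,1) of N^m is always 0
              [0.0, 0.6]])

print(is_regular(R), is_regular(N))  # True False
```

The fixed power bound `max_power` is an assumption for this sketch; once a zero pattern stabilizes (as in `N`), no larger power can become positive.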

The remainder of this section is devoted to proving that, for a regular transition matrix A, the limit of the sequence of powers of A exists and has identical columns. From this fact, it is easy to compute this limit. In the course of proving this result, we obtain some interesting bounds for the magnitudes of eigenvalues of any square matrix. These bounds are given in terms of the sum of the absolute values of the rows and columns of the matrix. The necessary terminology is introduced in the definitions that follow.

# Definitions.

Let $A \in M_{n \times n}(\mathbb{C})$. For $1 \le i, j \le n$, define $\rho_i(A)$ to be the sum of the absolute values of the entries of row $i$ of $A$, and define $\nu_j(A)$ to be equal to the sum of the absolute values of the entries of column $j$ of $A$. Thus

$$\rho_i(A) = \sum_{j=1}^{n} |A_{ij}| \quad \text{for } i = 1, 2, \ldots, n$$

and

$$\nu_j(A) = \sum_{i=1}^{n} |A_{ij}| \quad \text{for } j = 1, 2, \ldots, n.$$

The row sum of $A$, denoted $\rho(A)$, and the column sum of $A$, denoted $\nu(A)$, are defined as

$$\rho(A) = \max \{ \rho_i(A) : 1 \le i \le n \} \quad \text{and} \quad \nu(A) = \max \{ \nu_j(A) : 1 \le j \le n \}.$$

# Example 3

For the matrix



, and . Hence  and .

Our next results show that the smaller of $\rho(A)$ and $\nu(A)$ is an upper bound for the absolute values of the eigenvalues of $A$. In the preceding example, for instance, $A$ has no eigenvalue with absolute value greater than the smaller of these two numbers.

To obtain a geometric view of the following theorem, we introduce some terminology. For an $n \times n$ matrix $A$, we define the $i$th Gershgorin disk $C_i$ to be the disk in the complex plane with center $A_{ii}$ and radius $r_i = \rho_i(A) - |A_{ii}|$; that is,

$$C_i = \{ z \in \mathbb{C} : |z - A_{ii}| \le r_i \}.$$

For example, consider the matrix



For this matrix,  is the disk with center  and radius 1, and  is the disk with center  and radius 2. (See Figure 5.4.)

Gershgorin’s circle theorem, stated below, tells us that all the eigenvalues of A are located within these two disks. In particular, we see that 0 is not an eigenvalue, and hence by Exercise 9(c) of Section 5.1, A is invertible.
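Gershgorin disks are easy to compute. The sketch below uses a hypothetical matrix (the one in Figure 5.4 is not reproduced here) and verifies that every eigenvalue falls inside some disk:

```python
import numpy as np

# Hypothetical 2x2 complex matrix for illustration.
A = np.array([[2.0 + 0.0j, 1.0],
              [-1.5, 5.0 + 0.5j]])

centers = np.diag(A)                             # disk i is centered at A[i, i]
radii = np.abs(A).sum(axis=1) - np.abs(centers)  # r_i = rho_i(A) - |A[i, i]|

for lam in np.linalg.eigvals(A):
    in_some_disk = any(abs(lam - c) <= r + 1e-9 for c, r in zip(centers, radii))
    print(lam, in_some_disk)   # True for every eigenvalue
```

If 0 lies outside every disk, as in this example, the matrix cannot have 0 as an eigenvalue and is therefore invertible, which is how the theorem is used in the remark above.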

# Theorem 5.15. (Gershgorin’s Circle Theorem).

Let . Then every eigenvalue of A is contained in a Gershgorin disk.

# Proof.

Let $\lambda$ be an eigenvalue of $A$ with the corresponding eigenvector

$$v = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix}.$$

Then $v$ satisfies the matrix equation $Av = \lambda v$, which can be written

$$\sum_{j=1}^{n} A_{ij} v_j = \lambda v_i \quad (i = 1, 2, \ldots, n). \tag{3}$$

Suppose that $v_k$ is a coordinate of $v$ having the largest absolute value; note that $v_k \ne 0$ because $v$ is an eigenvector of $A$.

We show that $\lambda$ lies in $C_k$, that is, $|\lambda - A_{kk}| \le r_k$. For $i = k$, it follows from (3) that

$$|\lambda v_k - A_{kk} v_k| = \left| \sum_{j \ne k} A_{kj} v_j \right| \le \sum_{j \ne k} |A_{kj}| \, |v_j| \le \sum_{j \ne k} |A_{kj}| \, |v_k| = |v_k| \, r_k.$$

Thus

$$|v_k| \, |\lambda - A_{kk}| \le |v_k| \, r_k,$$

so

$$|\lambda - A_{kk}| \le r_k$$

because $|v_k| > 0$.

# Corollary 1.

Let $\lambda$ be any eigenvalue of $A \in M_{n \times n}(\mathbb{C})$. Then $|\lambda| \le \rho(A)$.

# Proof.

By Gershgorin's circle theorem, $|\lambda - A_{kk}| \le r_k$ for some $k$. Hence

$$|\lambda| \le |\lambda - A_{kk}| + |A_{kk}| \le r_k + |A_{kk}| = \rho_k(A) \le \rho(A).$$

# Corollary 2.

Let $\lambda$ be any eigenvalue of $A \in M_{n \times n}(\mathbb{C})$. Then

$$|\lambda| \le \min \{ \rho(A), \nu(A) \}.$$

# Proof.

Since $|\lambda| \le \rho(A)$ by Corollary 1, it suffices to show that $|\lambda| \le \nu(A)$. By Exercise 15 of Section 5.1, $\lambda$ is an eigenvalue of $A^t$, and so $|\lambda| \le \rho(A^t)$ by Corollary 1. But the rows of $A^t$ are the columns of $A$; consequently $\rho(A^t) = \nu(A)$. Therefore $|\lambda| \le \nu(A)$.

The next corollary is immediate from Corollary 2.

# Corollary 3.

If $\lambda$ is an eigenvalue of a transition matrix, then $|\lambda| \le 1$.

The next result asserts that the upper bound in Corollary 3 is attained.

# Theorem 5.16.

Every transition matrix has 1 as an eigenvalue.

# Proof.

Let $A$ be an $n \times n$ transition matrix, and let $u \in \mathbb{R}^n$ be the column vector in which each coordinate is 1. Then $A^t u = u$ by Theorem 5.14, and hence $u$ is an eigenvector of $A^t$ corresponding to the eigenvalue 1. But since $A$ and $A^t$ have the same eigenvalues, it follows that 1 is also an eigenvalue of $A$.

Suppose that $A$ is a transition matrix for which some eigenvector corresponding to the eigenvalue 1 has only nonnegative coordinates. Then some multiple of this vector is a probability vector $P$ as well as an eigenvector of $A$ corresponding to eigenvalue 1. It is interesting to observe that if $P$ is the initial probability vector of a Markov chain having $A$ as its transition matrix, then the Markov chain is completely static. For in this situation, $A^m P = P$ for every positive integer $m$; hence the probability of being in each state never changes. Consider, for instance, the city-suburb problem with

$$P = \begin{pmatrix} \frac{1}{6} \\ \frac{5}{6} \end{pmatrix}.$$

# Theorem 5.17.

Let $A \in M_{n \times n}(\mathbb{C})$ be a matrix in which each entry is a positive real number, and let $\lambda$ be a complex eigenvalue of $A$ such that $|\lambda| = \rho(A)$. Then $\lambda = \rho(A)$ and $\{u\}$ is a basis for $E_\lambda$, where $u \in \mathbb{C}^n$ is the column vector in which each coordinate equals 1.

# Proof.

Let $v$ be an eigenvector of $A$ corresponding to $\lambda$, with coordinates $v_1, v_2, \ldots, v_n$. Suppose that $v_k$ is the coordinate of $v$ having the largest absolute value, and let $b = |v_k|$. Then

$$\rho(A)\,b = |\lambda|\,b = |\lambda v_k| = \left| \sum_{j=1}^{n} A_{kj} v_j \right| \le \sum_{j=1}^{n} |A_{kj} v_j| \le \sum_{j=1}^{n} A_{kj}\, b = \rho_k(A)\, b \le \rho(A)\, b. \tag{4}$$

Since $|\lambda| = \rho(A)$, the three inequalities in (4) are actually equalities; that is,

1. (a) $\left| \sum_{j=1}^{n} A_{kj} v_j \right| = \sum_{j=1}^{n} |A_{kj} v_j|$,

2. (b) $\sum_{j=1}^{n} |A_{kj} v_j| = \sum_{j=1}^{n} A_{kj}\, b$, and

3. (c) $\rho_k(A) = \rho(A)$.

We will see in Exercise 15(b) of Section 6.1 that (a) holds if and only if all the terms $A_{kj} v_j$ are obtained by multiplying some nonzero complex number $z$ by nonnegative real numbers. Without loss of generality, we assume that $|z| = 1$. Thus there exist nonnegative real numbers $c_1, c_2, \ldots, c_n$ such that

$$A_{kj} v_j = c_j z. \tag{5}$$

By (b) and the assumption that $A_{kj} > 0$ for all $k$ and $j$, we have

$$|v_j| = b \quad \text{for all } j. \tag{6}$$

Combining (5) and (6), we obtain

$$c_j = |c_j z| = |A_{kj} v_j| = A_{kj} |v_j| = A_{kj}\, b,$$

and therefore by (5), we have $v_j = bz$ for all $j$. So

$$v = bzu,$$

and hence $\{u\}$ is a basis for $E_\lambda$.

Finally, observe that all of the entries of $Au$ are positive because the same is true for the entries of both $A$ and $u$. But $Au = \lambda u$, and hence $\lambda > 0$. Therefore, $\lambda = \rho(A)$.

# Corollary 1.

Let $A \in M_{n \times n}(\mathbb{C})$ be a matrix in which each entry is positive, and let $\lambda$ be an eigenvalue of $A$ such that $|\lambda| = \nu(A)$. Then $\lambda = \nu(A)$, and the dimension of $E_\lambda$ equals 1.

# Proof.

Exercise.

# Corollary 2.

Let $A \in M_{n \times n}(\mathbb{C})$ be a transition matrix in which each entry is positive, and let $\lambda$ be any eigenvalue of $A$ other than 1. Then $|\lambda| < 1$. Moreover, the eigenspace corresponding to the eigenvalue 1 has dimension 1.

# Proof.

Exercise.

Our next result extends Corollary 2 to regular transition matrices and thus shows that regular transition matrices satisfy condition (a) of Theorems 5.12 and 5.13.

# Theorem 5.18.

Let $A$ be a regular transition matrix, and let $\lambda$ be an eigenvalue of $A$. Then

1. (a) $|\lambda| \le 1$.

2. (b) If $|\lambda| = 1$, then $\lambda = 1$, and $\dim(E_\lambda) = 1$.

# Proof.

Statement (a) was proved as Corollary 3 to Theorem 5.15.

(b) Since $A$ is regular, there exists a positive integer $s$ such that $A^s$ has only positive entries. Because $A$ is a transition matrix and the entries of $A^s$ are positive, the entries of $A^{s+1} = A^s A$ are positive. Suppose that $|\lambda| = 1$. Then $\lambda^s$ and $\lambda^{s+1}$ are eigenvalues of $A^s$ and $A^{s+1}$, respectively, having absolute value 1. So by Corollary 2 to Theorem 5.17, $\lambda^s = \lambda^{s+1} = 1$. Thus $\lambda = 1$. Let $E_\lambda$ and $E'_\lambda$ denote the eigenspaces of $A$ and $A^s$, respectively, corresponding to $\lambda = 1$. Then $E_\lambda \subseteq E'_\lambda$ and, by Corollary 2 to Theorem 5.17, $\dim(E'_\lambda) = 1$. Hence $\dim(E_\lambda) = 1$, and $E_\lambda = E'_\lambda$.

# Corollary.

Let $A$ be a regular transition matrix that is diagonalizable. Then $\lim_{m\to\infty} A^m$ exists.

The preceding corollary, which follows immediately from Theorems 5.18 and 5.13, is not the best possible result. In fact, it can be shown that if $A$ is a regular transition matrix, then the multiplicity of 1 as an eigenvalue of $A$ is 1. Thus, by Theorem 5.7 (p. 264), condition (b) of Theorem 5.12 is satisfied. So if $A$ is a regular transition matrix, $\lim_{m\to\infty} A^m$ exists regardless of whether $A$ is or is not diagonalizable. As with Theorem 5.12, however, the fact that the multiplicity of 1 as an eigenvalue of $A$ is 1 cannot be proved at this time. Nevertheless, we state this result here (leaving the proof until Exercise 21 of Section 7.2) and deduce further facts about $\lim_{m\to\infty} A^m$ when $A$ is a regular transition matrix.

# Theorem 5.19.

Let A be a regular transition matrix. Then

1. (a) The multiplicity of 1 as an eigenvalue of A is 1.

2. (b) $\lim_{m\to\infty} A^m$ exists.

3. (c) $L = \lim_{m\to\infty} A^m$ is a transition matrix.

4. (d) $AL = LA = L$.

5. (e) The columns of $L$ are identical. In fact, each column of $L$ is equal to the unique probability vector $v$ that is also an eigenvector of $A$ corresponding to the eigenvalue 1.

6. (f) For any probability vector $w$, $\lim_{m\to\infty} (A^m w) = v$.

# Proof.

1. (a) See Exercise 21 of Section 7.2.

2. (b) This follows from (a) and Theorems 5.18 and 5.12.

3. (c) By Theorem 5.14, we must show that $L^t u = u$, where $u$ is the column vector in which each coordinate equals 1. Now $A^m$ is a transition matrix by the corollary to Theorem 5.14, so

$$L^t u = \left( \lim_{m\to\infty} A^m \right)^t u = \lim_{m\to\infty} (A^m)^t u = \lim_{m\to\infty} u = u,$$

and it follows that $L$ is a transition matrix.

4. (d) By Theorem 5.11,

$$AL = A \lim_{m\to\infty} A^m = \lim_{m\to\infty} A^{m+1} = L.$$

Similarly, $LA = L$.

5. (e) Since $AL = L$ by (d), each column of $L$ is an eigenvector of $A$ corresponding to the eigenvalue 1. Moreover, by (c), each column of $L$ is a probability vector. Thus, by (a), each column of $L$ is equal to the unique probability vector $v$ corresponding to the eigenvalue 1 of $A$.

6. (f) Let $w$ be any probability vector, and set $y = Lw = \lim_{m\to\infty} (A^m w)$. Then $y$ is a probability vector by the corollary to Theorem 5.14, and also $Ay = ALw = Lw = y$ by (d). Hence $y$ is also an eigenvector corresponding to the eigenvalue 1 of $A$. So $y = v$ by (e).

# Definition.

The vector v in Theorem 5.19(e) is called the fixed probability vector or stationary vector of the regular transition matrix A.

Theorem 5.19 can be used to deduce information about the eventual distribution in each state of a Markov chain having a regular transition matrix.

# Example 4

Because of an expected El Niño in the coming fall, the National Weather Service predicted that, during the following winter, the probability of above-average, average, and below-average snowfall for a particular region will be 0.5, 0.3, and 0.2, respectively. Historical weather records for this region show that if there is above-average snowfall in a given winter, then the probabilities of above-average, average, and below-average snowfall for the next winter are 0.4, 0.2, and 0.2, respectively. In addition, if there is average snowfall in a given winter, then the probabilities of above-average, average, and below-average snowfall for the next winter are 0.1, 0.7, and 0.2, respectively. Finally, if there is below-average snowfall in a given winter, then the probabilities of above-average, average, and below-average snowfall for the next winter are 0.5, 0.1, and 0.6, respectively.

Assuming that this historical weather trend continues, the situation described in the preceding paragraph is a three-state Markov chain in which the states are above-average, average, and below-average winter snowfall. Here the probability vector that gives the initial probability of being in each state during the coming winter is

$$P = \begin{pmatrix} 0.5 \\ 0.3 \\ 0.2 \end{pmatrix}$$

and the transition matrix is



The probabilities of being in each state $m$ winters after the original National Weather Service prediction are the coordinates of the vector $A^m P$. The reader may check that



Note the apparent convergence of the vectors $A^m P$.

Since $A$ is regular, the long-range prediction for the region's winter snowfall can be found by computing the fixed probability vector for $A$. This vector is the unique probability vector $v$ such that $Av = v$. Letting

$$v = \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix},$$

we see that the matrix equation $Av = v$ yields the following system of linear equations:



It is easily shown that

$$\left\{ \begin{pmatrix} 5 \\ 7 \\ 8 \end{pmatrix} \right\}$$

is a basis for the solution space of this system. Hence the unique fixed probability vector for A is

$$v = \frac{1}{20} \begin{pmatrix} 5 \\ 7 \\ 8 \end{pmatrix} = \begin{pmatrix} 0.25 \\ 0.35 \\ 0.40 \end{pmatrix}.$$

Therefore, in the long run, the probabilities of above-average, average, and below-average winter snowfall for this region are 0.25, 0.35, and 0.40, respectively.

Note that if



then



So



# Example 5

Farmers in Lamron plant one crop per year—either corn, soybeans, or wheat. Because they believe in the necessity of rotating their crops, these farmers do not plant the same crop in successive years. In fact, of the total acreage on which a particular crop is planted, exactly half is planted with each of the other two crops during the succeeding year. This year, 300 acres of corn, 200 acres of soybeans, and 100 acres of wheat were planted.

The situation just described is another three-state Markov chain in which the three states correspond to the planting of corn, soybeans, and wheat, respectively. In this problem, however, the amount of land devoted to each crop, rather than the percentage of the total acreage (600 acres), is given. By converting these amounts into fractions of the total acreage, we see that the transition matrix A and the initial probability vector P of the Markov chain are

$$A = \begin{pmatrix} 0 & \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & 0 & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} & 0 \end{pmatrix} \quad \text{and} \quad P = \begin{pmatrix} \frac{1}{2} \\ \frac{1}{3} \\ \frac{1}{6} \end{pmatrix}.$$

The fraction of the total acreage devoted to each crop in $m$ years is given by the coordinates of $A^m P$, and the eventual proportions of the total acreage used for each crop are the coordinates of $LP = \lim_{m\to\infty} (A^m P)$. Thus the eventual amounts of land devoted to each crop are found by multiplying this limit by the total acreage; that is, the eventual amounts of land used for each crop are the coordinates of $600 \cdot LP$.

Since $A$ is a regular transition matrix, Theorem 5.19 shows that $\lim_{m\to\infty} A^m$ is a matrix $L$ in which each column equals the unique fixed probability vector for $A$. It is easily seen that the fixed probability vector for $A$ is

$$v = \begin{pmatrix} \frac{1}{3} \\ \frac{1}{3} \\ \frac{1}{3} \end{pmatrix}.$$

Hence

$$LP = \begin{pmatrix} \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \end{pmatrix} \begin{pmatrix} \frac{1}{2} \\ \frac{1}{3} \\ \frac{1}{6} \end{pmatrix} = \begin{pmatrix} \frac{1}{3} \\ \frac{1}{3} \\ \frac{1}{3} \end{pmatrix},$$

so

$$600 \cdot LP = \begin{pmatrix} 200 \\ 200 \\ 200 \end{pmatrix}.$$

Thus, in the long run, we expect 200 acres of each crop to be planted each year. (For a direct computation of this limit, see Exercise 14.)
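A high power of $A$ is a convenient numerical stand-in for the limit (here $A$ is regular, with eigenvalues $1$ and $-\tfrac{1}{2}$, so convergence is fast):

```python
import numpy as np

A = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])          # crop-rotation transition matrix
P = np.array([300, 200, 100]) / 600      # initial acreage as proportions

L = np.linalg.matrix_power(A, 100)       # numerically indistinguishable from lim A^m
print(600 * (L @ P))                     # approximately [200. 200. 200.] acres
```

Every column of the computed `L` is (up to rounding) the fixed probability vector $(\tfrac13, \tfrac13, \tfrac13)$, matching Theorem 5.19(e).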

In this section, we have concentrated primarily on the theory of regular transition matrices. There is another interesting class of transition matrices that can be represented in the form

$$\begin{pmatrix} I & B \\ O & C \end{pmatrix},$$

where I is an identity matrix and O is a zero matrix. (Such transition matrices are not regular since the lower left block remains O in any power of the matrix.) The states corresponding to the identity submatrix are called absorbing states because such a state is never left once it is entered. A Markov chain is called an absorbing Markov chain if it is possible to go from an arbitrary state into an absorbing state in a finite number of stages. Observe that the Markov chain that describes the enrollment pattern in a community college is an absorbing Markov chain with states 1 and 4 as its absorbing states. (See page 291.) Readers interested in learning more about absorbing Markov chains are referred to Introduction to Finite Mathematics (third edition) by J. Kemeny, J. Snell, and G. Thompson (Prentice-Hall, Inc., Englewood Cliffs, N. J., 1974) or Discrete Mathematical Models by Fred S. Roberts (Prentice-Hall, Inc., Englewood Cliffs, N. J., 1976).

# Applications*

Visit goo.gl/cKTyHi for an application of Markov chains to internet searches.

In species that reproduce sexually, the characteristics of an offspring with respect to a particular genetic trait are determined by a pair of genes, one inherited from each parent. The genes for a particular trait are of two types, which are denoted by G and g. The gene G represents the dominant characteristic, and g represents the recessive characteristic. Offspring with genotypes GG or Gg exhibit the dominant characteristic, whereas offspring with genotype gg exhibit the recessive characteristic. For example, in humans, brown eyes are a dominant characteristic and blue eyes are the corresponding recessive characteristic; thus the offspring with genotypes GG or Gg are brown-eyed, whereas those of type gg are blue-eyed.

Let us consider the probability of offspring of each genotype for a male fruit fly of genotype Gg. (We assume that the population under consideration is large, that mating is random with respect to genotype, and that the distribution of each genotype within the population is independent of sex and life expectancy.) Let

$$P = \begin{pmatrix} p \\ q \\ r \end{pmatrix}$$

denote the present proportion of the fruit fly population with genotypes GG, Gg, and gg, respectively. This situation can be described by a three-state Markov chain with the following transition matrix:

$$B = \begin{pmatrix} \frac{1}{2} & \frac{1}{4} & 0 \\ \frac{1}{2} & \frac{1}{2} & \frac{1}{2} \\ 0 & \frac{1}{4} & \frac{1}{2} \end{pmatrix}$$

It is easily checked that $B^2$ contains only positive entries; so $B$ is regular. Thus, if only males of genotype Gg survive until reproductive maturity, the proportion of offspring in the population having a certain genotype will stabilize at the fixed probability vector for $B$, which is

$$\begin{pmatrix} \frac{1}{4} \\ \frac{1}{2} \\ \frac{1}{4} \end{pmatrix}.$$

Now suppose that similar experiments are to be performed with males of genotypes GG and gg. As already mentioned, these experiments are three-state Markov chains with transition matrices

$$A = \begin{pmatrix} 1 & \frac{1}{2} & 0 \\ 0 & \frac{1}{2} & 1 \\ 0 & 0 & 0 \end{pmatrix} \quad \text{and} \quad C = \begin{pmatrix} 0 & 0 & 0 \\ 1 & \frac{1}{2} & 0 \\ 0 & \frac{1}{2} & 1 \end{pmatrix},$$

respectively. In order to consider the case where all male genotypes are permitted to reproduce, we must form the transition matrix $M$, which is the linear combination of $A$, $B$, and $C$ weighted by the proportion of males of each genotype. Thus



To simplify the notation, let $a = p + \frac{q}{2}$ and $b = r + \frac{q}{2}$. (The numbers a and b represent the proportions of G and g genes, respectively, in the population.) Then

$$M = \begin{pmatrix} a & \frac{a}{2} & 0 \\ b & \frac{1}{2} & a \\ 0 & \frac{b}{2} & b \end{pmatrix},$$

where $a + b = p + q + r = 1$.

Let $p'$, $q'$, and $r'$ denote the proportions of the first-generation offspring having genotypes GG, Gg, and gg, respectively. Then

$$\begin{pmatrix} p' \\ q' \\ r' \end{pmatrix} = M \begin{pmatrix} p \\ q \\ r \end{pmatrix} = \begin{pmatrix} a^2 \\ 2ab \\ b^2 \end{pmatrix} = MP,$$

where $P$ denotes the probability vector of present genotype proportions.
In order to consider the effects of unrestricted matings among the first-generation offspring, a new transition matrix $M'$ must be determined based upon the distribution of first-generation genotypes. As before, we find that

$$M' = \begin{pmatrix} a' & \frac{a'}{2} & 0 \\ b' & \frac{1}{2} & a' \\ 0 & \frac{b'}{2} & b' \end{pmatrix},$$

where $a' = p' + \frac{q'}{2}$ and $b' = r' + \frac{q'}{2}$. However,

$$a' = a^2 + ab = a(a + b) = a \quad\text{and}\quad b' = b^2 + ab = b(a + b) = b.$$

Thus $M' = M$; so the distribution of second-generation offspring among the three genotypes is

$$M(MP) = \begin{pmatrix} a^2 \\ 2ab \\ b^2 \end{pmatrix},$$

the same as the distribution of the first-generation offspring. In other words, MP is the fixed probability vector for M, and genetic equilibrium is achieved in the population after only one generation. (This result is called the Hardy-Weinberg law.) Notice that in the important special case that $p = r$ (or equivalently, that $a = b = \frac{1}{2}$), the distribution at equilibrium is

$$\begin{pmatrix} \frac{1}{4} \\ \frac{1}{2} \\ \frac{1}{4} \end{pmatrix}.$$

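A minimal NumPy sketch of the Hardy-Weinberg computation, assuming the form of M in terms of a and b derived in the text and an arbitrary sample initial distribution:

```python
import numpy as np

# Sample initial genotype proportions (GG, Gg, gg); any probability
# vector leads to the same conclusion.
p, q, r = 0.3, 0.5, 0.2
a, b = p + q / 2, r + q / 2     # proportions of G and g genes; a + b = 1

# Transition matrix M in terms of a and b, matching the derivation above.
M = np.array([[a,   a / 2, 0.0],
              [b,   0.5,   a],
              [0.0, b / 2, b]])

P = np.array([p, q, r])
P1 = M @ P                       # first-generation proportions
assert np.allclose(P1, [a**2, 2 * a * b, b**2])

# Genetic equilibrium after one generation: MP is fixed by M.
assert np.allclose(M @ P1, P1)
```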
# Exercises

1. Label the following statements as true or false.

(a) If $A \in M_{n\times n}(C)$ and $\lim_{m\to\infty} A^m = L$, then for any invertible matrix $Q \in M_{n\times n}(C)$, we have $\lim_{m\to\infty} (QAQ^{-1})^m = QLQ^{-1}$.

(b) If 2 is an eigenvalue of $A \in M_{n\times n}(C)$, then $\lim_{m\to\infty} A^m$ does not exist.

(c) Any vector

$$\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$$

such that $x_1 + x_2 + \cdots + x_n = 1$ is a probability vector.

(d) The sum of the entries of each row of a transition matrix equals 1.

(e) The product of a transition matrix and a probability vector is a probability vector.

(f) Let z be any complex number such that $|z| < 1$. Then the matrix

does not have 3 as an eigenvalue.

(g) Every transition matrix has 1 as an eigenvalue.

(h) No transition matrix can have $-1$ as an eigenvalue.

(i) If A is a transition matrix, then $\lim_{m\to\infty} A^m$ exists.

(j) If A is a regular transition matrix, then $\lim_{m\to\infty} A^m$ exists and has rank 1.

2. Determine whether $\lim_{m\to\infty} A^m$ exists for each of the following matrices A, and compute the limit if it exists.

(a)

(b)

(c)

(d)

(e)

(f)

(g)

(h)

(i)

(j)

3. Prove that if  is a sequence of  matrices with complex entries such that , then .

4. Prove that if $A \in M_{n\times n}(C)$ is diagonalizable and $L = \lim_{m\to\infty} A^m$ exists, then either $L = I_n$ or $\operatorname{rank}(L) < n$.

5. Find $2 \times 2$ matrices A and B having real entries such that $\lim_{m\to\infty} A^m$, $\lim_{m\to\infty} B^m$, and $\lim_{m\to\infty} (AB)^m$ all exist, but

$$\lim_{m\to\infty} (AB)^m \neq \left(\lim_{m\to\infty} A^m\right) \left(\lim_{m\to\infty} B^m\right).$$

6. In the week beginning June 1, 30% of the patients who arrived by helicopter at a hospital trauma unit were ambulatory and 70% were bedridden. One week after arrival, 60% of the ambulatory patients had been released, 20% remained ambulatory, and 20% had become bedridden. After the same amount of time, 10% of the bedridden patients had been released, 20% had become ambulatory, 50% remained bedridden, and 20% had died. Determine the percentages of helicopter arrivals during the week of June 1 who were in each of the four states one week after arrival. Assuming that the given percentages continue in the future, also determine the percentages of patients who eventually end up in each of the four states.
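The setup of this exercise can be checked numerically. The sketch below is an illustration, not the book's solution: the state ordering (released, ambulatory, bedridden, dead) is chosen here for convenience, and each column holds the one-week transition probabilities from one state, matching the column convention of this section.

```python
import numpy as np

# States: released, ambulatory, bedridden, dead.
# Released and dead are absorbing states.
A = np.array([[1.0, 0.6, 0.1, 0.0],
              [0.0, 0.2, 0.2, 0.0],
              [0.0, 0.2, 0.5, 0.0],
              [0.0, 0.0, 0.2, 1.0]])

P = np.array([0.0, 0.3, 0.7, 0.0])   # arrivals: 30% ambulatory, 70% bedridden

one_week = A @ P
print(one_week)                      # approximately [0.25, 0.20, 0.41, 0.14]

# Eventual percentages: the transient states empty out as the power grows.
eventual = np.linalg.matrix_power(A, 200) @ P
print(eventual.round(4))             # approximately [0.6556, 0, 0, 0.3444]
```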

7. A player begins a game of chance by placing a marker in box 2, marked Start. (See Figure 5.5.) A die is rolled, and the marker is moved one square to the left if a 1 or a 2 is rolled and one square to the right if a 3, 4, 5, or 6 is rolled. This process continues until the marker lands in square 1, in which case the player wins the game, or in square 4, in which case the player loses the game. What is the probability of winning this game? Hint: Instead of diagonalizing the appropriate transition matrix A, it is easier to represent the initial probability vector as a linear combination of eigenvectors of A and then apply $\lim_{m\to\infty} A^m$ to the result.
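A quick numerical sanity check for this game (a sketch; the transition matrix below is read directly off the rules, with squares 1 and 4 absorbing and the marker moving left with probability 1/3 and right with probability 2/3 from squares 2 and 3):

```python
import numpy as np

# Columns give the move probabilities from squares 1 through 4.
A = np.array([[1.0, 1/3, 0.0, 0.0],
              [0.0, 0.0, 1/3, 0.0],
              [0.0, 2/3, 0.0, 0.0],
              [0.0, 0.0, 2/3, 1.0]])

start = np.array([0.0, 1.0, 0.0, 0.0])        # marker begins in square 2
limit = np.linalg.matrix_power(A, 200) @ start
print(limit[0])                               # winning probability, approximately 3/7
```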

8. Which of the following transition matrices are regular?

(a)

(b)

(c)

(d)

(e)

(f)

(g)

(h)

9. Compute $\lim_{m\to\infty} A^m$, if it exists, for each matrix A in Exercise 8.

10. Each of the matrices that follow is a regular transition matrix for a three-state Markov chain. In all cases, the initial probability vector is



For each transition matrix, compute the proportions of objects in each state after two stages and the eventual proportions of objects in each state by determining the fixed probability vector.

(a)

(b)

(c)

(d)

(e)

(f)

11. In 1940, a county land-use survey showed that 10% of the county land was urban, 50% was unused, and 40% was agricultural. Five years later, a follow-up survey revealed that 70% of the urban land had remained urban, 10% had become unused, and 20% had become agricultural. Likewise, 20% of the unused land had become urban, 60% had remained unused, and 20% had become agricultural. Finally, the 1945 survey showed that 20% of the agricultural land had become unused while 80% remained agricultural. Assuming that the trends indicated by the 1945 survey continue, compute the percentages of urban, unused, and agricultural land in the county in 1950 and the corresponding eventual percentages.

12. A diaper liner is placed in each diaper worn by a baby. If, after a diaper change, the liner is soiled, then it is replaced by a new liner. Otherwise, the liner is washed with the diapers and reused, except that each liner is replaced by a new liner after its second use, even if it has never been soiled. The probability that the baby will soil any diaper liner is one-third. If there are only new diaper liners at first, eventually what proportions of the diaper liners being used will be new, once-used, and twice-used?

13. In 1975, the automobile industry determined that 40% of American car owners drove large cars, 20% drove intermediate-sized cars, and 40% drove small cars. A second survey in 1985 showed that 70% of the large-car owners in 1975 still owned large cars in 1985, but 30% had changed to an intermediate-sized car. Of those who owned intermediate-sized cars in 1975, 10% had switched to large cars, 70% continued to drive intermediate-sized cars, and 20% had changed to small cars in 1985. Finally, of the small-car owners in 1975, 10% owned intermediate-sized cars and 90% owned small cars in 1985. Assuming that these trends continue, determine the percentages of Americans who own cars of each size in 1995 and the corresponding eventual percentages.

14. Show that if A and P are as in Example 5, then



where



Deduce that


15. Prove that if a 1-dimensional subspace W of $R^n$ contains a nonzero vector with all nonnegative entries, then W contains a unique probability vector.

16. Prove Theorem 5.14 and its corollary.

17. Prove the two corollaries of Theorem 5.17. Visit goo.gl/V5Hsou for a solution.

18. Prove the corollary of Theorem 5.18.

19. Suppose that M and $M'$ are $n \times n$ transition matrices.

(a) Prove that if M is regular, N is any $n \times n$ transition matrix, and c is a real number such that $0 < c \le 1$, then $cM + (1 - c)N$ is a regular transition matrix.

(b) Suppose that for all i, j, we have that $M'_{ij} > 0$ whenever $M_{ij} > 0$. Prove that there exist a transition matrix N and a real number c with $0 < c \le 1$ such that $M' = cM + (1 - c)N$.

(c) Deduce that if the nonzero entries of M and $M'$ occur in the same positions, then M is regular if and only if $M'$ is regular.

The following definition is used in Exercises 20-24.

# Definition.

For $A \in M_{n\times n}(C)$, define $e^A = \lim_{m\to\infty} B_m$, where

$$B_m = I + A + \frac{A^2}{2!} + \cdots + \frac{A^m}{m!}.$$

This limit exists by Exercise 22 of Section 7.2. Thus $e^A$ is the sum of the infinite series

$$I + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots,$$

and $B_m$ is the mth partial sum of this series. (Note the analogy with the power series

$$e^a = 1 + a + \frac{a^2}{2!} + \frac{a^3}{3!} + \cdots,$$

which is valid for all complex numbers a.)
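The partial sums $B_m$ are easy to compute directly. The sketch below uses a hypothetical helper `matrix_exp` (an illustrative name, not a library routine) and checks it on a diagonal matrix, where the series acts entrywise on the diagonal:

```python
import numpy as np

def matrix_exp(A, m=30):
    """Return the partial sum B_m = I + A + A^2/2! + ... + A^m/m!."""
    total = np.zeros_like(A, dtype=float)
    term = np.eye(A.shape[0])     # holds A^k / k!, starting with k = 0
    for k in range(m + 1):
        total += term
        term = term @ A / (k + 1)
    return total

# For a diagonal matrix, the series exponentiates each diagonal entry.
D = np.diag([1.0, -2.0])
assert np.allclose(matrix_exp(D), np.diag(np.exp([1.0, -2.0])))
```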

20. Compute $e^O$ and $e^I$, where O and I denote the $n \times n$ zero and identity matrices, respectively.

21. Let $A \in M_{n\times n}(C)$ be a diagonal matrix. Prove that $e^A$ is the diagonal matrix whose $i$th diagonal entry is $e^{A_{ii}}$.

22. Let $A \in M_{n\times n}(C)$ be diagonalizable. Use the result of Exercise 21 to show that $e^A$ exists. (Exercise 22 of Section 7.2 shows that $e^A$ exists for every $A \in M_{n\times n}(C)$.)

23. Find $A, B \in M_{2\times 2}(R)$ such that $e^A e^B \neq e^{A+B}$.

24. Prove that a differentiable function $x \colon R \to R^n$ is a solution to the system of differential equations defined in Exercise 16 of Section 5.2 if and only if $x(t) = e^{tA}v$ for some $v \in R^n$, where A is defined in that exercise.