# 20.6 Exercises

Let $X_1$ and $X_2$ be two independent tosses of a fair coin. Find the entropy $H(X_1)$ and the joint entropy $H(X_1, X_2)$. Why is $H(X_1, X_2) = H(X_1) + H(X_2)$?
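For a quick numerical check of the entropies in this and several later exercises, a small helper can be used (a sketch; the `entropy` function is ours, not from the text):

```python
from math import log2

def entropy(probs):
    """Shannon entropy, in bits, of a probability distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

# One fair coin toss: two outcomes of probability 1/2 each.
print(entropy([0.5, 0.5]))       # 1.0

# Two independent tosses: four equally likely outcomes.
print(entropy([0.25] * 4))       # 2.0
```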

Consider an unfair coin where the two outcomes, heads and tails, have probabilities $p(\text{heads}) = p$ and $p(\text{tails}) = 1 - p$.

(a) If the coin is flipped two times, what are the possible outcomes, along with their respective probabilities?

(b) Show that the entropy in part (a) is $-2p\log_2(p) - 2(1-p)\log_2(1-p)$. How could this have been predicted without calculating the probabilities in part (a)?

A random variable $X$ takes the values $1, 2, \dots, n, \dots$ with probabilities $\frac{1}{2}, \frac{1}{2^2}, \dots, \frac{1}{2^n}, \dots$. Calculate the entropy $H(X)$.
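Since each term of the entropy sum is $-\frac{1}{2^n}\log_2 \frac{1}{2^n} = n/2^n$, a truncated numerical sum hints at the closed-form answer (a sketch; the cutoff at $n = 60$ is an arbitrary choice):

```python
# Each entropy term is -(1/2^n) * log2(1/2^n) = n / 2^n; sum the series.
h = sum(n / 2**n for n in range(1, 61))
print(h)  # the infinite series converges to 2
```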

Let $X$ be a random variable taking on integer values. The probability is $1/2$ that $X$ is in the range $[0, 2^8 - 1]$, with all such values being equally likely, and the probability is $1/2$ that the value is in the range $[2^8, 2^{32} - 1]$, with all such values being equally likely. Compute $H(X)$.
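The sum defining $H(X)$ has only two distinct term values, one per range, so it can be evaluated by grouping rather than summing $2^{32}$ terms (a sketch; the variable names are ours):

```python
from math import log2

n_low = 2**8                 # number of values in [0, 2^8 - 1]
n_high = 2**32 - 2**8        # number of values in [2^8, 2^32 - 1]
p_low = 0.5 / n_low          # probability of each low value
p_high = 0.5 / n_high        # probability of each high value

# Group the identical terms instead of summing 2^32 values directly.
h = -(n_low * p_low * log2(p_low) + n_high * p_high * log2(p_high))
print(h)
```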

Let $X$ be a random event taking on the values $-2, -1, 0, 1, 2$, all with positive probability. What is the general inequality/equality between $H(X)$ and $H(Y)$, where $Y$ is the following?

(a) $Y = 2^X$

(b) $Y = X^2$

In this problem we explore the relationship between the entropy of a random variable $X$ and the entropy of a function $f(X)$ of the random variable.

(a) The following is a short proof that $H(f(X)) \le H(X)$. Explain what principles are used in each of the steps:

$$H(X, f(X)) = H(X) + H(f(X)|X) = H(X),$$

$$H(X, f(X)) = H(f(X)) + H(X|f(X)) \ge H(f(X)).$$

(b) Letting $X$ take on the values $\pm 1$ and letting $f(x) = x^2$, show that it is possible to have $H(f(X)) < H(X)$.

(c) In part (a), show that you have equality if and only if $f$ is a one-to-one function (more precisely, $f$ is one-to-one on the set of outputs of $X$ that have nonzero probability).

(d) The preceding results can be used to study the behavior of the run length coding of a sequence. Run length coding is a technique that is commonly used in data compression. Suppose that $X_1, X_2, \dots, X_n$ are random variables that take the values $0$ or $1$. This sequence of random variables can be thought of as representing the output of a binary source. The run length coding of $X_1, X_2, \dots, X_n$ is a sequence $L = (L_1, L_2, \dots, L_k)$ that represents the lengths of consecutive symbols with the same value. For example, the sequence $110000100111$ has a run length sequence of $L = (2, 4, 1, 2, 3)$. Observe that $L$ is a function of $X_1, X_2, \dots, X_n$. Show that $L$ and $X_1$ uniquely determine $X_1, X_2, \dots, X_n$. Do $L$ and $X_n$ determine $X_1, X_2, \dots, X_n$? Using these observations and the preceding results, compare $H(X_1, X_2, \dots, X_n)$, $H(L)$, and $H(L, X_1)$.
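The encoding, and the reconstruction of the sequence from $L$ together with its first symbol, can be sketched as follows (the helper names are ours):

```python
from itertools import groupby

def run_lengths(bits):
    """Run length coding L: lengths of maximal runs of equal symbols."""
    return [len(list(group)) for _, group in groupby(bits)]

def reconstruct(lengths, first_bit):
    """Rebuild the 0/1 sequence from L and its first symbol."""
    out, bit = [], first_bit
    for run in lengths:
        out.extend([bit] * run)
        bit = 1 - bit            # consecutive runs alternate between 0 and 1
    return out

seq = [1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1]   # the sequence 110000100111
L = run_lengths(seq)
print(L)                          # [2, 4, 1, 2, 3], as in the text
assert reconstruct(L, seq[0]) == seq
```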

A bag contains five red balls, three white balls, and two black balls that are identical to each other in every manner except color.

(a) Choose two balls from the bag with replacement. What is the entropy of this experiment?

(b) What is the entropy of choosing two balls without replacement? (Note: In both parts, the order matters; i.e., red then white is not the same as white then red.)
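Both entropies can be checked numerically by listing the nine ordered color pairs (a sketch; we model the counts 5, 3, 2 directly):

```python
from math import log2
from itertools import product

def entropy(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

counts = {'red': 5, 'white': 3, 'black': 2}
total = sum(counts.values())
colors = list(counts)

# (a) With replacement: the two ordered draws are independent.
with_repl = [counts[c1] * counts[c2] / total**2
             for c1, c2 in product(colors, repeat=2)]

# (b) Without replacement: the second draw sees one fewer ball,
#     and one fewer of the first color.
without_repl = [counts[c1] / total * (counts[c2] - (c1 == c2)) / (total - 1)
                for c1, c2 in product(colors, repeat=2)]

print(entropy(with_repl), entropy(without_repl))
```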

We often run into situations where we have a sequence of $n$ random events. For example, a piece of text is a long sequence of letters. We are concerned with the rate of growth of the joint entropy as $n$ increases. Define the entropy rate of a sequence $X = \{X_k\}$ of random events as

$$H_{\infty}(X) = \lim_{n \to \infty} \frac{1}{n} H(X_1, X_2, \dots, X_n).$$

(a) A very crude model for a language is to assume that subsequent letters in a piece of text are independent and come from identical probability distributions. Using this, show that the entropy rate equals $H(X_1)$.

(b) In general, there is dependence among the random variables. Assume that $X_1, X_2, \dots, X_n$ have the same probability distribution but are somehow dependent on each other (for example, if I give you the letters TH, you can guess that the next letter is E). Show that

$$H(X_1, X_2, \dots, X_n) \le \sum_k H(X_k),$$

and thus that

$$H_{\infty}(X) \le H(X_1)$$

(if the limit defining $H_{\infty}$ exists).
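The subadditivity in part (b) can be illustrated with a small dependent pair (a toy joint distribution of our own choosing, in which $X_2$ usually copies $X_1$):

```python
from math import log2

def entropy(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

# X2 usually copies X1, so the pair is strongly dependent.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

h_joint = entropy(joint.values())
h_marginals = entropy([0.5, 0.5]) + entropy([0.5, 0.5])  # H(X1) + H(X2)
print(h_joint, h_marginals)   # the joint entropy is strictly smaller
```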

Suppose we have a cryptosystem with only two possible plaintexts. The plaintext $a$ occurs with probability $1/3$ and $b$ occurs with probability $2/3$. There are two keys, ${k}_{1}$ and ${k}_{2}$, and each is used with probability $1/2$. Key ${k}_{1}$ encrypts $a$ to $A$ and $b$ to $B$. Key ${k}_{2}$ encrypts $a$ to $B$ and $b$ to $A$.

(a) Calculate $H(P)$, the entropy for the plaintext.

(b) Calculate $H(P|C)$, the conditional entropy for the plaintext given the ciphertext. (*Optional hint:* This can be done with no additional calculation by matching up this system with another well-known system.)
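The system is small enough to compute $H(P|C)$ exhaustively, as a check on the hint (a sketch; the dictionaries simply encode the system as described):

```python
from math import log2

p_plain = {'a': 1/3, 'b': 2/3}
p_key = {'k1': 1/2, 'k2': 1/2}
encrypt = {('k1', 'a'): 'A', ('k1', 'b'): 'B',
           ('k2', 'a'): 'B', ('k2', 'b'): 'A'}

# Joint distribution of (plaintext, ciphertext); the key is chosen
# independently of the plaintext.
joint = {}
for (k, m), c in encrypt.items():
    joint[(m, c)] = joint.get((m, c), 0) + p_plain[m] * p_key[k]

p_cipher = {}
for (m, c), pr in joint.items():
    p_cipher[c] = p_cipher.get(c, 0) + pr

h_p = -sum(pr * log2(pr) for pr in p_plain.values())
# H(P|C) = -sum over (m, c) of Pr(m, c) * log2 Pr(m | c)
h_p_given_c = -sum(pr * log2(pr / p_cipher[c]) for (m, c), pr in joint.items())
print(h_p, h_p_given_c)
```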

Consider a cryptosystem $\{P, K, C\}$.

(a) Explain why $H(P, K) = H(C, P, K) = H(P) + H(K)$.

(b) Suppose the system has perfect secrecy. Show that

$$H(C, P) = H(C) + H(P)$$

and

$$H(C) = H(K) - H(K|C, P).$$

(c) Suppose the system has perfect secrecy and, for each pair of plaintext and ciphertext, there is at most one corresponding key that does the encryption. Show that $H(C) = H(K)$.

Prove that for a cryptosystem $\{P, K, C\}$ we have

$$H(C|P) = H(P, K, C) - H(P) - H(K|C, P) = H(K) - H(K|C, P).$$

Consider a Shamir secret sharing scheme where any five people of a set of 20 can determine the secret $K$, but no fewer can do so. Let $H(K)$ be the entropy of the choice of $K$, and let $H(K|S_1)$ be the conditional entropy of $K$, given the information supplied to the first person. What are the relative sizes of $H(K)$ and $H(K|S_1)$? (Larger, smaller, equal?)

Let $X$ be a random event taking on the values $1, 2, 3, \dots, 36$, all with equal probability.

(a) What is the entropy $H(X)$?

(b) Let $Y \equiv X^{36} \pmod{37}$. What is $H(Y)$?
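A short numerical experiment settles part (b) before any entropy calculation (a sketch; Python's built-in three-argument `pow` computes the modular power):

```python
from math import log2

# (a) X is uniform on 36 values.
h_x = log2(36)

# (b) Tabulate the distinct values taken by Y = X^36 mod 37.
values = {pow(x, 36, 37) for x in range(1, 37)}
print(h_x, values)
```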

(a) Show that the maximum of $-p\log_2 p - (1-p)\log_2(1-p)$ for $0 \le p \le 1$ occurs when $p = 1/2$.

(b) Let $p_i \ge 0$ for $1 \le i \le n$. Show that the maximum of

$$-\sum_i p_i \log_2 p_i,$$

subject to the constraint $\sum_i p_i = 1$, occurs when $p_1 = \cdots = p_n$. (Hint: Lagrange multipliers could be useful in this problem.)
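Random sampling supports the claim of part (b) before turning to Lagrange multipliers (a sketch; the choice of $n = 5$ and the number of trials are arbitrary):

```python
from math import log2
import random

def entropy(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

n = 5
h_uniform = entropy([1 / n] * n)    # log2(n) bits

# No randomly sampled distribution on n points beats the uniform one.
random.seed(0)
for _ in range(1000):
    w = [random.random() for _ in range(n)]
    s = sum(w)
    assert entropy([x / s for x in w]) <= h_uniform + 1e-9

print(h_uniform)
```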

Suppose we define $\tilde{H}(Y|X) = -\sum_{x, y} p_Y(y|x) \log_2 p_Y(y|x)$.

(a) Show that if $X$ and $Y$ are independent, and $X$ has $|\mathit{X}|$ possible outputs, then $\tilde{H}(Y|X) = |\mathit{X}| H(Y) \ge H(Y)$.

(b) Use part (a) to show that $\tilde{H}(Y|X)$ is not a good description of the uncertainty of $Y$ given $X$.