# 11.3 The Merkle-Damgård Construction

Until recently, most hash functions used a form of the Merkle-Damgård construction. It was invented independently by Ralph Merkle in 1979 and Ivan Damgård in 1989. The main ingredient is a function $f$, usually called a **compression function**. It takes two bitstrings as inputs, call them $H$ and $M$, and outputs a bitstring ${H}^{\prime}=f(H\text{,}\text{}M)$ of the same length as $H$. For example, $M$ could have length 512 and $H$ could have length $256$. These are the sizes that the hash function SHA-256 uses, and we’ll use them for concreteness. The message that is to be hashed is suitably padded so that its length is a multiple of 512, and then broken into $n$ blocks of length 512:

An initial value $IV$ is set. Then the blocks are fed one-by-one into $f$ and the final output is the hash value:

This construction is very natural: The blocks are read from the message one at a time and stirred into the mix with the previous blocks. The final result is the hash value.

Over the years, some disadvantages of the method have been discovered. One is called the **length extension attack**. For example, suppose Alice wants to ensure that her message $M$ to Bob has not been tampered with. They both have a secret key $K$, so Alice prepends $M$ with $K$ to get $K||M$. She sends both $M$ and $H(K||M)$ to Bob. Since Bob also knows $K$, he computes $H(K||M)$ and checks that it agrees with the hash value sent by Alice. If so, Bob concludes that the message is authentic.

Since Eve does not know $K$, she cannot send her own message ${M}^{\prime}$ along with $H(K||{M}^{\prime})$. But, because of the iterative form of the hash function, Eve can append blocks ${M}^{\prime \prime}$ to $M$ if she intercepts Alice’s communication. Then Eve sends

to Bob. Since she knows $H(K||M)$, she can produce a message that Bob will regard as authentic. Of course, this attack can be thwarted by using $M||K$ instead of $K||M$, but it points to a weakness in the construction.

However, using $H(M||K)$ might also cause problems if Eve discovers a way of producing collisions for $H$. Namely, Eve finds ${M}_{1}$ and ${M}_{2}$ with $H({M}_{1})=H({M}_{2})$. If Eve can arrange this so that ${M}_{1}$ is a good message and ${M}_{2}$ is a bad message (see Section 12.1), then Eve arranges for Alice to authenticate ${M}_{1}$ by computing $H({M}_{1}||K)$, which equals $H({M}_{2}||K)$. This means that Alice has also authenticated ${M}_{2}$.

Another attack was given by Daum and Luks in 2005. Suppose Alice is using a high-level document language such as PostScript, which is really a program rather than just a text file. The file begins with a preamble that identifies the file as PostScript and gives some instructions. Then the content of the file follows.

Suppose Eve is able to find random strings ${R}_{1}$ and ${R}_{2}$ such that

where $\text{put}({R}_{i})$ instructs the PostScript program to put the string ${R}_{i}$ in a certain register. In other words, we are assuming that Eve has found a collision of this form. If any string $S$ is appended to these messages, there is still a collision

because of the iterative nature of the hash algorithm (we are ignoring the effects of padding).

Of course, Eve has an evil document ${T}_{1}$, perhaps saying that Alice (who is a bank president) gives Eve access to the bank’s vault. Eve also produces a document ${T}_{2}$ that Alice will be willing to sign, for example, a petition to give bank presidents tax breaks. Eve then produces two messages:

For example, ${Y}_{2}$ puts ${R}_{1}$ into a stack, then puts in ${R}_{2}$. They are not equal, so ${T}_{2}$ is produced. Eve now has two Postscript files, ${Y}_{1}$ and ${Y}_{2}$, with $H({Y}_{1})=H({Y}_{2})$. As we’ll see in Chapter 13, it is standard for Alice to sign the hash of a message rather than the message itself. Eve shows ${Y}_{2}$ to Alice, who compiles it. The output is the petition that Alice is happy to sign. So Alice signs $H({Y}_{2})$. But this means Alice has also signed $H({Y}_{1})$. Eve takes ${Y}_{1}$ to the bank, along with Alice’s signature on its hash value. The security officer at the bank checks that the signature is valid, then opens the document, which says that Alice grants Eve access to the bank’s vault. This potentially costly forgery relies on Eve being able to find a collision, but again it shows a weakness in the construction if there is a possibility of finding collisions.