# 11.5 SHA-3/Keccak

In 2006, NIST announced a competition to produce a new hash function to serve alongside SHA-2. The new function was required to be at least as secure as SHA-2 and to have the same four output possibilities. Fifty-one entries were submitted, and in 2012, Keccak was announced as the winner. It was certified as a standard by NIST in 2012 in FIPS-202. (The name is pronounced “ketchak”. It has been suggested that the name is related to “Kecak,” a type of Balinese dance. Perhaps the movement of the dancers is analogous to the movement of the bits during the algorithm.) This algorithm became the hash function SHA-3.

The SHA-3 algorithm was developed by Guido Bertoni, Joan Daemen, and Gilles Van Assche from STMicroelectronics and Michaël Peeters from NXP Semiconductors. It differs from the Merkle-Damgård construction and is based on the theory of Sponge Functions. The idea is that the first part of the algorithm absorbs the message, and then the hash value is squeezed out. Here is how it works. The state of the machine is a string of  bits, which is fed to a function  that takes an input of  bits and outputs a string of  bits, thus producing a new state of the machine. In contrast to the compression functions in the Merkle-Damgård construction, the function  is a one-to-one function. Such a function could not be used in the Merkle-Damgård situation since the number of input bits (from  and the previous step) is greater than the number of output bits. But the different construction in the present case allows it.

Parameters  (“the rate”) and  (“the capacity”) are chosen so that . The message (written in binary) is padded so that its length is a multiple of , then is broken into  blocks of length :



To start, the state is initialized to all 0s. The absorption stage is the first part of Figure 11.2.

After the absorption is finished, the hash value is squeezed out:  bits are output and truncated to the  bits that are used as the hash value. This is the second part of Figure 11.2.

Producing a SHA-3 hash value requires only one squeeze. However, the algorithm can also be used with multiple squeezes to produce arbitrarily long pseudorandom bitstrings. When it is used this way, it is often called SHAKE (= Secure Hash Algorithm with Keccak).

The “length extension” and collision-based attacks of Section 11.3 are less likely to succeed. Suppose two messages yield the same hash value. This means that when the absorption has finished and the squeezing stage is starting, there are  bits of the state that agree with these bits for the other message. But there are at least  bits that are not output, and there is no reason that these  bits match. If, instead of starting the squeezing, you do another round of absorption, the differing  bits will cause the subsequent states and the outputted hash values to differ. In other words, there are at least  possible internal states for any given -bit output .

SHA-3 has four different versions, named SHA3-224, SHA3-256, SHA3-384, and SHA3-512. For SHA3-, the  denotes the security level, measured in bits. For example, SHA3-256 is expected to require around  operations to find a collision or a preimage. Since , this should be impossible well into the future. The parameters are taken to be



For SHA3-256, these are



The same function  is used in all versions, which means that it is easy to change from one security level to another. Note that there is a trade-off between speed and security. If the security parameter  is increased, then  decreases, so the message is read slower, since it is read in blocks of  bits.

In the following, we concentrate on SHA3-256. The other versions are obtained by suitably varying the parameters. For more details, see [FIPS 202].

The Padding. We start with a message . The message is read in blocks of  bits, so we want the message to have length that is a multiple of 1088. But first the message is padded to . This is for “domain separation.” There are other uses of the Keccak algorithm such as SHAKE (mentioned above), and for these other purposes,  is padded differently, for example with `1111`. This initial padding makes it very likely that using  in different situations yields different outputs. Next, “`10*1` padding” is used. This means that first a 1 is appended to 011 to yield 011 . Then sufficiently many 01 s are appended to make the total length one less than a multiple of 1088. Finally, a 1 is appended. We can now divide the result into blocks of length 1088.

Why are these choices made for the padding? Why not simply append enough `0`s s to get the desired length? Suppose that . Then . Now append 1079 zeros to get the block to be hashed. If  is being used in SHAKE, then  is padded with 1080 zeros to yield the same block. This means that the outputs for  and  are equal. The padding is designed to avoid all such situations.

Absorption and Squeezing. From now on, we assume that the padding has been done and we have  blocks of length :



The absorption now proceeds as in Figure 11.2 (we describe the function  later).

1. The initial state  is a string of 0s s of length 1600.

2. For  to , let , where  denotes a string of 0s of length . What this does is XOR the message block  with the first  bits of , and then apply the function . This yields an updated state , which is modified during each iteration of the index .

3. Return .

The squeezing now proceeds as in Figure 11.2:

1. Input  and let  be the empty string.

2. While  (where  is the output size)

1. Let , where  denotes the first  bits of .

2. .

3. Return .

The bitstring  is the 256-bit hash value SHA-. For the hash value, we need only one squeeze to obtain . But the algorithm could also be used to produce a much longer pseudorandom bitstring, in which case several squeezes might be needed.

The function . The main component of the algorithm is the function , which we now describe. The input to  is the 1600-bit state of the machine



where each  is a bit. It’s easiest to think of these bits as forming a three-dimensional  array  with coordinates  satisfying



A “column” consists of the five bits with fixed . A “row” consists of the five bits with fixed . A “lane” consists of the 64 bits with fixed .

When we write “for all ” we mean for  and , and similarly for other combinations of .

The correspondence between  and  is given by



for all . For example, . The ordering of the indices could be described as “lexicographic” order using  (not ), since the index of  corresponding to  is smaller than the index for  if  precedes  in “alphabetic order.”

The coordinates  are taken to be numbers mod 5, and  is mod 64. For example  is taken to be , since  mod 5,  mod 5, and  mod 64.

The computation of  proceeds in several steps. The steps receive the array  as input and they output a modified array  to replace .

The following steps  through  are repeated for  to :

1. The first step XORs the bits in a column with the parities of two nearby columns.

1. For all , let



This gives the “parity” of the bitstring formed by the five bits in the  column.

2. For all , let .

3. For all , let .

2. The second step rotates the 64 bits in each lane by an amount depending on the lane:

1. For all , let .

2. Let .

3. For  to 

1. For all , let .

2. Let 

4. Return .

For example, consider the bits with coordinates of the form . They are handled by the case  and we have , so the bits in this lane are rotated by one position. Then, in Step 3(b),  is changed to  for the iteration with . We have , so this lane is rotated by three positions. Then  is changed to , which is reduced mod 5 to , and we pass to , which gives a rotation by six (the rotations are by what are known as “triangular numbers”). After , all of the lanes have been rotated.

3. The third step rearranges the positions of the lanes:

1. For all , let .

Again, the coordinate  should be reduced mod 5.

4. The next step is the only nonlinear step in the algorithm. It XORs each bit with an expression formed from two other bits in its row.

1. For all , let

2. 

3. The multiplication is multiplying two binary bits, hence is the same as the AND operator.

5. Finally, some bits in the  lane are modified.

1. For all , let .

2. Set .

3. For  to 6, let , where  is an auxiliary function defined below.

4. For all , let .

5. Return .

After I through V are completed for one value of , the next value of  is used and I through V are repeated for the new , through . The final output is the new array , which yields a new bitstring  of length .

This completes the description of the function , except that we still need to describe the auxiliary function .

The function  takes an integer  mod 255 as input and outputs a bit according to the following algorithm:

1. If  mod 255, return 1 . Else

2. 10000000

3. For  to  mod 255

1. Let 

2. Let 

3. Let 

4. Let 

5. Let 

6. Let .

4. Return .

The bit that is outputted is .