8.2 The Layers – Introduction to Cryptography with Coding Theory, 3rd Edition

8.2 The Layers

We now describe the steps in more detail. The 128 input bits are grouped into 16 bytes of eight bits each, call them

a0, 0,  a1, 0,  a2, 0,  a3, 0,  a0, 1,  a1, 1, ,  a3, 3.

These are arranged into a 4×4 matrix

a0, 0a0, 1a0, 2a0, 3a1, 0a1, 1a1, 2a1, 3a2, 0a2, 1a2, 2a2, 3a3, 0a3, 1a3, 2a3, 3.

In the following, we’ll occasionally need to work with the finite field GF(28). This is covered in Section 3.11. However, for the present purposes, we only need the following facts. The elements of GF(28) are bytes, which consist of eight bits. They can be added by XOR. They can also be multiplied in a certain way (i.e., the product of two bytes is again a byte), but this process is more complicated. Each byte b except the zero byte has a multiplicative inverse; that is, there is a byte b such that bb=00000001. Since we can do arithmetic operations on bytes, we can work with matrices whose entries are bytes.

As a technical point, we note that the model of GF(28) depends on a choice of irreducible polynomial of degree 8. The choice for Rijndael is X8+X4+X3+X+1. This is also the polynomial used in the examples in Section 3.11. Other choices for this polynomial would presumably give equally good algorithms.

8.2.1 The SubBytes Transformation

In this step, each of the bytes in the matrix is changed to another byte by Table 8.1, called the S-box.

Table 8.1 S-Box for Rijndael

Write a byte as eight bits: abcdefgh. Look for the entry in the abcd row and efgh column (the rows and columns are numbered from 0 to 15). This entry, when converted to binary, is the output. For example, if the input byte is 10001011, we look in row 8 (the ninth row) and column 11 (the twelfth column). The entry is 61, which is 111101 in binary. This is the output of the S-box.

The output of SubBytes is again a 4×4 matrix of bytes, let’s call it

b0, 0b0, 1b0, 2b0, 3b1, 0b1, 1b1, 2b1, 3b2, 0b2, 1b2, 2b2, 3b3, 0b3, 1b3, 2b3, 3.

8.2.2 The ShiftRows Transformation

The four rows of the matrix are shifted cyclically to the left by offsets of 0, 1, 2, and 3, to obtain

c0, 0c0, 1c0, 2c0, 3c1, 0c1, 1c1, 2c1, 3c2, 0c2, 1c2, 2c2, 3c3, 0c3, 1c3, 2c3, 3=b0, 0b0, 1b0, 2b0, 3b1, 1b1, 2b1, 3b1, 0b2, 2b2, 3b2, 0b2, 1b3, 3b3, 0b3, 1b3, 2.

8.2.3 The MixColumns Transformation

Regard a byte as an element of GF(28), as in Section 3.11. Then the output of the ShiftRows step is a 4×4 matrix (ci, j) with entries in GF(28). Multiply this by a matrix, again with entries in GF(28), to produce the output (di, j), as follows:

00000010000000110000000100000001000000010000001000000011000000010000000100000001000000100000001100000011000000010000000100000010c0, 0c0, 1c0, 2c0, 3c1, 0c1, 1c1, 2c1, 3c2, 0c2, 1c2, 2c2, 3c3, 0c3, 1c3, 2c3, 3
=d0, 0d0, 1d0, 2d0, 3d1, 0d1, 1d1, 2d1, 3d2, 0d2, 1d2, 2d2, 3d3, 0d3, 1d3, 2d3, 3.

8.2.4 The RoundKey Addition

The round key, derived from the key in a way we’ll describe later, consists of 128 bits, which are arranged in a 4×4 matrix (ki, j) consisting of bytes. This is XORed with the output of the MixColumns step:

d0, 0d0, 1d0, 2d0, 3d1, 0d1, 1d1, 2d1, 3d2, 0d2, 1d2, 2d2, 3d3, 0d3, 1d3, 2d3, 3k0, 0k0, 1k0, 2k0, 3k1, 0k1, 1k1, 2k1, 3k2, 0k2, 1k2, 2k2, 3k3, 0k3, 1k3, 2k3, 3
=e0, 0e0, 1e0, 2e0, 3e1, 0e1, 1e1, 2e1, 3e2, 0e2, 1e2, 2e2, 3e3, 0e3, 1e3, 2e3, 3.

This is the final output of the round.

8.2.5 The Key Schedule

The original key consists of 128 bits, which are arranged into a 4×4 matrix of bytes. This matrix is expanded by adjoining 40 more columns, as follows. Label the first four columns W(0), W(1), W(2), W(3). The new columns are generated recursively. Suppose columns up through W(i1) have been defined. If i is not a multiple of 4, then


If i is a multiple of 4, then


where T(W(i1)) is the transformation of W(i1) obtained as follows. Let the elements of the column W(i1) be a, b, c, d. Shift these cyclically to obtain b, c, d, a. Now replace each of these bytes with the corresponding element in the S-box from the SubBytes step, to get 4 bytes e, f, g, h. Finally, compute the round constant


in GF(28) (recall that we are in the case where i is a multiple of 4). Then T(W(i1)) is the column vector

(er(i), f, g, h).

In this way, columns W(4), , W(43) are generated from the initial four columns.

The round key for the ith round consists of the columns

W(4i), W(4i+1), W(4i+2), W(4i+3).

8.2.6 The Construction of the S-Box

Although the S-box is implemented as a lookup table, it has a simple mathematical description. Start with a byte x7x6x5x4x3x2x1x0, where each xi is a binary bit. Compute its inverse in GF(28), as in Section 3.11. If the byte is 00000000, there is no inverse, so we use 00000000 in place of its inverse. The resulting byte y7y6y5y4y3y2y1y0 represents an eight-dimensional column vector, with the rightmost bit y0 in the top position. Multiply by a matrix and add the column vector (1, 1, 0, 0, 0, 1, 1, 0) to obtain a vector (z0, z1, z2, z3, z4, z5, z6, z7) as follows:


The byte z7z6z5z4z3z2z1z0 is the entry in the S-box.

For example, start with the byte 11001011. Its inverse in GF(28) is 00000100, as we calculated in Section 3.11. We now calculate


This yields the byte 00011111. The first four bits 1100 represent 12 in binary and the last four bits 1011 represent 11 in binary. Add 1 to each of these numbers (since the first row and column are numbered 0) and look in the 13th row and 12th column of the S-box. The entry is 31, which in binary is 00011111.

Some of the considerations in the design of the S-box were the following. The map xx1 was used to achieve nonlinearity. However, the simplicity of this map could possibly allow certain attacks, so it was combined with multiplication by the matrix and adding the vector, as described previously. The matrix was chosen mostly because of its simple form (note how the rows are shifts of each other). The vector was chosen so that no input ever equals its S-box output or the complement of its S-box output (complementation means changing each 1 to 0 and each 0 to 1).