12.4 Using Hash Functions to Encrypt – Introduction to Cryptography with Coding Theory, 3rd Edition

12.4 Using Hash Functions to Encrypt

Cryptographic hash functions are some of the most widely used cryptographic tools, perhaps second only to block ciphers. They find applications in many different areas of information security. Later, in Chapter 13, we shall see an application of hash functions to digital signatures, where the fact that they shrink the representation of data makes the operation of creating a digital signature more efficient. We now look at how they may be used to serve the role of a cipher by providing data confidentiality.

A cryptographic hash function takes an input of arbitrary length and provides a fixed-size output that appears random. In particular, if we have two distinct inputs, then their hashes should be different. Generally, their hashes are very different. This is a property that hash functions share with good ciphers and is a property that allows us to use a hash function to perform encryption.

Using a hash function to perform encryption is very similar to a stream cipher in which the output of a pseudorandom number generator is XORed with the plaintext. We saw such an example when we studied the output feedback mode (OFB) of a block cipher. Much like the block cipher did for OFB, the hash function creates a pseudorandom bit stream that is XORed with the plaintext to create a ciphertext.

In order to make a cryptographic hash function operate as a stream cipher, we need two components: a key shared between Alice and Bob, and an initialization vector. We shall soon address the issue of the initialization vector, but for now let us begin by assuming that Alice and Bob have established a shared secret key KAB.

Now, Alice could create a pseudorandom byte x1 by taking the leftmost byte of the hash of KAB; that is, x1=L8h(KAB). She could then encrypt a byte of plaintext p1 by XORing with the random byte x1 to produce a byte of ciphertext


But if she has more than one byte of plaintext, then how should continue? We use feedback, much like we did in OFB mode. The next pseudorandom byte should be created by x2=L8h(KABx1). Then the next ciphertext byte can be created by


In general, the pseudorandom byte xj is created by xj=L8h(KABxj1), and encryption is simply XORing xj with the plaintext pj. Decryption is a simple matter, as Bob must merely recreate the bytes xj and XOR with the ciphertext cj to get out the plaintext pj.

There is a simple problem with this procedure for encryption and decryption. What if Alice wants to encrypt a message on Monday, and a different message on Wednesday? How should she create the pseudorandom bytes? If she starts all over, then the pseudorandom sequence xj on Monday and Wednesday will be the same. This is not desirable.

Instead, we must introduce some randomness to make certain the two bit streams are different. Thus, each time Alice sends a message, she should choose a random initialization vector, which we denote by x0. She then starts by creating x1=L8h(KABx0) and proceeding as before. But now she must send x0 to Bob, which she can do when she sends c1. If Eve intercepts x0, she is still not able to compute x1 since she doesn’t know KAB. In fact, if h is a good hash function, then x0 should give no information about x1.

The idea of using a hash function to create an encryption procedure can be modified to create an encryption procedure that incorporates the plaintext, much in the same way as the CFB mode does.