2.7 Enigma – Introduction to Cryptography with Coding Theory, 3rd Edition

2.7 Enigma

Mechanical encryption devices known as rotor machines were developed in the 1920s by several people. The best known was designed by Arthur Scherbius and became the famous Enigma machine used by the Germans in World War II.

It was believed to be very secure and several attempts at breaking the system ended in failure. However, a group of three Polish cryptologists, Marian Rejewski, Henryk Zygalski, and Jerzy Różycki, succeeded in breaking early versions of Enigma during the 1930s. Their techniques were passed to the British in 1939, two months before Germany invaded Poland. The British extended the Polish techniques and successfully decrypted German messages throughout World War II.

The fact that Enigma had been broken remained a secret for almost 30 years after the end of the war, partly because the British had sold captured Enigma machines to former colonies and didn’t want them to know that the system had been broken.

In the following, we give a brief description of Enigma and then describe an attack developed by Rejewski. For more details, see for example [Kozaczuk], which contains appendices by Rejeweski giving details of attacks on Enigma.

We give a basic schematic diagram of the machine in Figure 2.1. For more details, we urge the reader to visit some of the many websites that can be found on the Internet that give pictures of actual Enigma machines and extensive diagrams of the internal workings of these machines. There are also several online Enigma simulators. Try one of them to get a better understanding of how Enigma works.

Figure 2.1 A Schematic Diagram of the Enigma Machine

L, M, N are the rotors. On one side of each rotor are 26 fixed electrical contacts, arranged in a circle. On the other side are 26 spring-loaded contacts, again arranged in a circle so as to touch the fixed contacts of the adjacent rotor. Inside each rotor, the fixed contacts are connected to the spring-loaded contacts in a somewhat random manner. These connections are different in each rotor. Each rotor has 26 possible initial settings.

R is the reversing drum. It has 26 spring-loaded contacts, connected in pairs.

K is the keyboard and is the same as a typewriter keyboard.

S is the plugboard. It has approximately six pairs of plugs that can be used to interchange six pairs of letters.

When a key is pressed, the first rotor N turns 1/26 of a turn. Then, starting from the key, electricity passes through S, then through the rotors N, M, L. When it reaches the reversing drum R, it is sent back along a different path through L, M, N, then through S. At this point, the electricity lights a bulb corresponding to a letter on the keyboard, which is the letter of the ciphertext.

Since the rotor N rotates before each encryption, this is much more complicated than a substitution cipher. Moreover, the rotors L and M also rotate, but much less often, just like the wheels on a mechanical odometer.

Decryption uses exactly the same method. Suppose a sender and receiver have identical machines, both set to the same initial positions. The sender encrypts the message by typing it on the keyboard and recording the sequence of letters indicated by the lamps. This ciphertext is then sent to the receiver, who types the ciphertext into the machine. The sequence of letters appearing in the lamps is the original message. This can be seen as follows. Lamp “a” and key “a” are attached to a wire coming out of the plugboard. Lamp “h” and key “h” are attached to another wire coming out of the plugboard. If the key “a” is pressed and the lamp “h” lights up, then the electrical path through the machine is also connecting lamp “a” to key “h”. Therefore, if the “h” key were pressed instead, then the “a” key would light.

Similar reasoning shows that no letter is ever encrypted as itself. This might appear to be a good idea, but actually it is a weakness since it allows a cryptanalyst to discard many possibilities at the start. See Chapter 14.

The security of the system rests on the keeping secret the initial settings of the rotors, the setting of the plugs on the plugboard, and the internal wiring of the rotors and reversing drum. The settings of the rotors and the plugboard are changed periodically (for example, daily).

We’ll assume the internal wiring of the rotors is known. This would be the case if a machine were captured, for example. However, there are ways to deduce this information, given enough ciphertext, and this is what was actually done in some cases.

How many combinations of settings are there? There are 26 initial settings for each of the three rotors. This gives 263=17576 possibilities. There are six possible orderings of the three rotors. This yields 6×17576=105456 possible ways to initialize the rotors. In later versions of Enigma, there were five rotors available, and each day three were chosen. This made 60 possible orderings of the rotors and therefore 1054560 ways to initialize the rotors.

On the plugboard, there are 100391791500 ways of interchanging six pairs of letters.

In all, there seem to be too many possible initializations of the machine to have any hope of breaking the system. Techniques such as frequency analysis fail since the rotations of the rotors change the substitution for each character of the message.

So, how was Enigma attacked? We don’t give the whole attack here, but rather show how the initial settings of the rotors were determined in the years around 1937. This attack depended on a weakness in the protocol being used at that time, but it gives the general flavor of how the attacks proceeded in other situations.

Each Enigma operator was given a codebook containing the daily settings to be used for the next month. However, if these settings had been used without modification, then each message sent during a given day would have had its first letter encrypted by the same substitution cipher. The rotor would then have turned and the second letter of each text would have corresponded to another substitution cipher, and this substitution would have been the same for all messages for that day. A frequency analysis on the first letter of each intercepted message during a day would probably allow a decryption of the first letter of each text. A second frequency analysis would decrypt the second letters. Similarly, the remaining letters of the ciphertexts (except for the ends of the longest few ciphertexts) could be decrypted.

To avoid this problem, for each message the operator chose a message key consisting of a sequence of three letters, for example, r, f, u. He then used the daily setting from the codebook to encrypt this message key. But since radio communications were prone to error, he typed in rfu twice, therefore encrypting rfurfu to obtain a string of six letters. The rotors were then set to positions r, f, and u and the encryption of the actual message began. So the first six letters of the transmitted message were the encrypted message key, and the remainder was the ciphertext. Since each message used a different key, frequency analysis didn’t work.

The receiver simply used the daily settings from the codebook to decrypt the first six letters of the message. He then reset the rotors to the positions indicated by the decrypted message key and proceeded to decrypt the message.

The duplication of the key was a great aid to the cryptanalysts. Suppose that on some day you intercept several messages, and among them are three that have the following initial six letters:

dmqvbn

vonpuy

pucfmq

All of these were encrypted with the same daily settings from the codebook. The first encryption corresponds to a permutation of the 26 letters; let’s call this permutation A. Before the second letter is encrypted, a rotor turns, so the second letter uses another permutation; call it B. Similarly, there are permutations C, D, E, F for the remaining four letters. The strategy is to look at the products AD, BE, and CF.

We need a few conventions and facts about permutations. When we write AD for two permutations A and D, we mean that we apply the permutation A then D (some books use the reverse ordering). The permutation that maps a to b, b to c, and c to a will be denoted as the 3-cycle (abc). A similar notation will be used for cycles of other lengths. For example, (ab) is the permutation that switches a and b. A permutation can be written as a product of cycles. For example, the permutation

(dvpfkxgzyo)(eijmunqlht)(bc)(rw)(a)(s)

is the permutation that maps d to v, v to p, t to e, r to w, etc., and fixes a and s. If the cycles are disjoint (meaning that no two cycles have letters in common), then this decomposition into cycles is unique.

Let’s look back at the intercepted texts. We don’t know the letters of any of the three message keys, but let’s call the first message key xyz. Therefore, xyzxyz encrypts to dmqvbn. We know that permutation A sends x to d. Also, the fourth permutation D sends x to v. But we know more. Because of the internal wiring of the machine, A actually interchanges x and d and D interchanges x and v. Therefore, the product of the permutations, AD, sends d to v (namely, A sends d to x and then D sends x to v). The unknown x has been eliminated. Similarly, the second intercepted text tells us that AD sends v to p, and the third tells us that AD sends p to f. We have therefore determined that

AD=(dvpf).

In the same way, the second and fifth letters of the three messages tell us that

BE=(oumb)

and the third and sixth letters tell us that

CF=(cqny).

With enough data, we can deduce the decompositions of AD, BE, and CF into products of cycles. For example, we might have

AD=(dvpfkxgzyo)(eijmunqlht)(bc)(rw)(a)(s)BE=(blfqveoum)(hjpswizrn)(axt)(cgy)(d)(k)CF=(abviktjgfcqny)(duzrehlxwpsmo).

This information depends only on the daily settings of the plugboard and the rotors, not on the message key. Therefore, it relates to every machine used on a given day.

Let’s look at the effect of the plugboard. It introduces a permutation S at the beginning of the process and then adds the inverse permutation S1 at the end. We need another fact about permutations: Suppose we take a permutation P and another permutation of the form SPS1 for some permutation S (where S1 denotes the inverse permutation of S; in our case, S=S1) and decompose each into cycles. They will usually not have the same cycles, but the lengths of the cycles in the decompositions will be the same. For example, AD has cycles of length 10, 10, 2, 2, 1, 1. If we decompose SADS1 into cycles for any permutation S, we will again get cycles of lengths 10, 10, 2, 2, 1, 1. Therefore, if the plugboard settings are changed, but the initial positions of the rotors remain the same, then the cycle lengths remain unchanged.

You might have noticed that in the decomposition of AD, BE, and CF into cycles, each cycle length appears an even number of times. This is a general phenomenon. For an explanation, see Appendix E of the aforementioned book by Kozaczuk.

Rejewski and his colleagues compiled a catalog of all 105456 initial settings of the rotors along with the set of cycle lengths for the corresponding three permutations AD, BE, CF. In this way, they could take the ciphertexts for a given day, deduce the cycle lengths, and find the small number of corresponding initial settings for the rotors. Each of these substitutions could be tried individually. The effect of the plugboard (when the correct setting was used) was then merely a substitution cipher, which was easily broken. This method worked until September 1938, when a modified method of transmitting message keys was adopted. Modifications of the above technique were again used to decrypt the messages. The process was also mechanized, using machines called “bombes” to find daily keys, each in around two hours.

These techniques were extended by the British at Bletchley Park during World War II and included building more sophisticated “bombes.” These machines, designed by Alan Turing, are often considered to have been the first electronic computers.