see also:



My favourite book was Fermat’s Last Theorem by Simon Singh. So when in 1999 I saw his new book - The Code Book - in shops I just had to buy it, despite it being incredibly expensive (for a very poor student like myself). I then took my new prize procession with me expectantly on the summer holiday. And when I reached the end of the book that I discovered it contained a Cipher Challenge!

I set about cracking the ciphers with gusto. By chance I happened to have my 286 computer with me, and my hobby language of choice - Turbo Pascal 7.

I think I skipped the ADFGVX cipher, so excited was I at the prospect of cracking the Enigma stage 8 of the contest. Unfortunately, when I reached the Enigma, I was stumped. I couldn’t make any headway with simulating the Enigma. I wrongly assumed that the message would begin with the message key encrypted with the day key twice. But my biggest mistake was not understanding how rings affected things. I didn’t understand how rings affected things because The Code Book didn’t actually explain them! The book instructed the intrepid explorer to search the Internet for more details and I didn’t take any heed of this because a) I couldn’t imagine the book wouldn’t actually give me everything I needed to crack the challenge, and b) I didn’t have any Internet. The Enigma I was trying to crack wasn’t even an accurate Enigma.

So 17 years later I find myself on a whim tacking the Enigma once again. And my computer is a bit faster too. But I want to attack it my way, putting myself as much as possible in the shoes of my old self 17 years ago, only this time with Internet so I can double-check the correctness and completeness of my implementation of the Engima cipher machine.

This is my attack on the Enigma. I will describe my attack on an M3 Enigma (used by the army and air force) and then I will generalize it to attack the M4 (used by the U-boats; what Alan Turing is famous for cracking).

The best way to understand how the Enigma works is to make one from paper, following instructions you find on the web. Each time the user presses a key the key goes (via a plugboard) into the rightmost rotor. Here is is translated to another letter and goes into the middle rotor where it gets translated again. In then goes to the leftmost rotor where, after being translated again, it goes into a reflector. The reflector sends it back into the leftmost rotor at another point, which translates it to another letter and sends it to the middle wheel and eventually back through the rightmost rotor and back through the plugboard to the lamp panel:

The rotors advance before each keypress, like the dials in odometer. The rightmost rotor advances each keypress and notches on it advance the middle rotor once per full revolution. The middle rotor also has notches on it and can knock the other wheels once each full revolution.

The Enigma has a lot of possible settings.

First, there are the combinations of 3 rotors and 1 reflector. The M3 started with 3 rotors and two reflectors; then the Navy added 2 more, and the army followed suit, and then the navy upped it to 8. There are 6 ways to order 3 rotors, 60 ways to pick from 5 rotors and 336 ways to pick from 8. There were two reflectors to choose from. So worst case 672 combinations of rotor and reflector.

Then each rotor has 26 possible starting positions (A through Z). 26 x 26 x 26 is 17,576.

Then the middle and right rotors have 26 possible ring positions, which affects when they ‘knock’ the other rotors into rotating. (The leftmost rotor has a ring too, but has no impact on encryption when its the leftmost ring.) There are therefore 26 x 26 = 676 ring combinations. However, as a slight detail, not all messages are long enough to cause the middle ring to knock and therefore the impact of the rings is a function of the length of the message. To simplify things I am going to ignore this, meaning my attack is going to blindly iterate over equivalent rotor settings on short messages.

Finally, we have the plugboard. Early on 6 pairs of letters were swapped and later this was changed to 10 pairs. There are 100,391,791,500 ways to order 6 pairs and 150,738,274,937,250 ways to order 10.

So the early M3 Enigmas had:

2 x 6 x 17,576 x 676 x 100,391,791,500 = 14,313,511,465,501,248,000 keys

And the later M3s had:

2 x 336 x 17,576 x 676 x 150,738,274,937,250 = 1,203,537,298,065,206,936,832,000 keys

These are very very big numbers! The earliest M3 is providing 64-bits key and the later M3s 80-bits. (As usual, I find slightly different numbers quoted in various texts where e.g. the number of reflectors is disregarded. Corrections welcome!)

The World War 2 attacks on the Enigma were very very very clever and, in hindsight, the Germans didn’t understand the weaknesses nor make things as difficult for the allies as they could have. The Germans considered the number of combinations as astronomically more than those massive numbers above because they didn’t realise that the rotor wirings were known to the allies. The Poles correctly determined and guessed the wirings just by analyzing a manual a spy gave them. Later, as more wheels were added, the allies set out to capture them by boarding captured boats and such. This was despite the fact that they themselves captured the equivalent British TypeX cipher machine at Dunkirk. Let this be a message to those relying on the obscurity of the algorithm itself! (When Admiral Dönitz commissioned the M4 Engima for his U-boat fleet, was he thinking there was a spy with access to the surface fleet day keys?)

The allies took advantage of the fact that messages were predictable and that the Enigma never ever encrypted a letter as itself. This provided them with ‘cribs’ that dramatically cut down the search space.

However, my attack is thoroughly modern and uncunning. I’m going to try and use computing brute force which, unavailable in World War 2, is now at my fingertips in my oldish slowish laptop! What takes world-beating cleverness in the wartime can now be done by dumb repetition by a schoolboy-level-programmer in 2016! :)

So, if we get a computer to iterate over all the possible keys, how does it know which key is the correct one? A human would have to know German and read each decrypt and see if it made sense. A computer can look the decrypt up in a German dictionary and German grammar too. Or we could short cut that and say that, if we have the key wrong, then the outcome should be seemingly random. If the outcome isn’t random, i.e. there’s a very uneven distribution of letter frequencies, then its a good candidate decrypt for human intervention.

Having show you how big the numbers are, you can see how daunting a brute-force attack would be. However, I’m going to disregard the plugboard. If we disregard the plugboard then suddenly the numbers become much much smaller.

The thing about the plugboard is that it doesn’t swap all the letters, and letters it doesn’t swap have the usual frequency in the decrypt. The swapped letters get garbled in the decrypt, but if sufficient letters go through ungarbled then the frequency counts of the letters should still be uneven enough to trigger it being treated as a candidate solution. The job then becomes to discover a plugboard that makes sense of the garbled parts of the message.



In the Cipher Challenge, the Enigma stage 8 has the three rotors and the reflector specified. These are actually the I, II and III rings and the B reflector. We just don’t know the order of the rotors nor the starting positions nor the rings.

1 x 6 x 17,576 x 676 = 71,288,256

71M is a very small number for a modern computer. My computer can, on a single core, iterate through all 71M possible rotor start positions and decrypt the message in under 10 minutes, using just one core. I have four cores, so wall-clock time could be under 3 minutes, and this is a 2014 laptop. (It would have been feasible - given days to weeks - to tackle this on a 286 too. I would have optimised harder.)

To judge the randomness of the decrypt I just work out the frequency of the most common letter. Here is the distribution for the Cipher Challenge text:

42 = 13

41 = 27

40 = 18

39 = 56

38 = 92

37 = 277

36 = 536

35 = 1132

34 = 3228

33 = 8697

32 = 21143

31 = 49767

30 = 115703

29 = 256947

28 = 550083

27 = 1124947

26 = 2194175

25 = 4036371

24 = 6904511

23 = 10651956

22 = 14097448

21 = 14880108

20 = 10951117

19 = 4615207

18 = 792354

17 = 32251

16 = 92

There are 367 characters in the message and there are 26 characters in the alphabet so the average would be 14. Most of the decrypts are rather noisy. Only a handful are not noisy and worthy of further consideration. (In fact, most of these are actually duplicate results caused by the middle ring not rotating. If we just greedily decrypt the highest candidates we happen hit the true solution immediately.)

This is the decrypt of the Cipher Challenge text, picking the rotor settings with highest maximum frequency in the decrypt, ignoring the plugboard:

VRA

x

KTZYIJ

g

A

w

YKNJRAOI

p

KCOT

xx

AOM

f

Q

x

DIMINZPO

h

SIS

t

MCEJI

x

UEOVIEKMJ

g

P

d

ET

x

FSO

x

YIA

x

L

n

MLT

d

EI

r

OOE

s

DK

x

W

ch

RKJ

b

UL

da

HHKEJW

ssteh

IJ

d

I

xb

HVI

x

NIAEA

c h

KVIYAIKA

x

IJ

t

JI

c

LO

xx

IP

x

EFORIEHA

x

IHJA

x

HT

r

TIIZJT

xz

I

r

TPWN

r

GYIBJUZIEJ AZIEJX

x

MET

h

MUXN

g

PSUEGT

r

IIVEI

sx

MJWN

en

O

d

I

c

LRIT

d

PNOZ

d

SF

x

RTDO

x

IRDMV

g

M

rx

NIGJEIQ

x

XWOY

d

BU

xz

W

gr

RAVPKXN

g

MJNIJ

x

A

ch

KMIAAIRKJJOLTAZIXZT

w

UKZ

xal

X

xr

YDDDOSOM

d

EI

x

A

ch

PE

f

OMIEWJUMLNJOOG

x

K

r

IM

b

O

The 91 lowercase letters are actually correct and the uppercase letters garbled, its just that at this point we don’t know that. We have to find a plugboard that turns this into German (where, as was the convention, X means space and XX means full stop).

Here are the frequencies in this decrypt:

A = 20, B = 5, C = 8, D = 16, E = 21, F = 5, G = 10

H = 13, I = 41, J = 22, K = 14, L = 8, M = 16, N = 13

O = 21, P = 9, Q = 2, R = 18, S = 10, T = 17, U = 8

V = 7, W = 9, X = 36, Y = 7, Z = 11

Now for some detective work. Lets attack this manually by guessing. We know that in German E is the most common letter (16%) followed by D, C and A (5%, 3%, 2% respectively). However, this being Enigma plaintext, we expect X to be really common too.

We see in our decrypt that we have several X, several XX and no XXX, so we can guess that X is unplugged! If we replace the X with spaces, here’s what we get:

VRA

KTZYIJ

g

A

w

YKNJRAOI

p

KCOT

AOM

f

Q

DIMINZPO

h

SIS

t

MCEJI

UEOVIEKMJ

g

P

d

ET

FSO

YIA

L

n

MLT

d

EI

r

OOE

s

DK

W

ch

RKJ

b

UL

da

HHKEJW

ssteh

IJ

d

I

b

HVI

NIAEA

c h

KVIYAIKA

IJ

t

JI

c

LO

IP

EFORIEHA

IHJA

HT

r

TIIZJT

z

I

r

TPWN

r

GYIBJUZIEJ AZIEJ

MET

h

MU N

g

PSUEGT

r

IIVEI

s

MJWN

en

O

d

I

c

LRIT

d

PNOZ

d

SF

RTDO

IRDMV

g

M

r

NIGJEIQ

WOY

d

BU

z

W

gr

RAVPK N

g

MJNIJ

A

ch

KMIAAIRKJJOLTAZI ZT

w

UKZ

al

r

YDDDOSOM

d

EI

A

ch

PE

f

OMIEWJUMLNJOOG

K

r

IM

b

O

The most common letter in our decrypt is I, so lets assume this is E. If we set up the plugboard to swap E and I, lets see what we get:

VRA

KTZAEJ

g

A

w

YKN

RAOE

p

KCOT

AOM

f

Q

J

e

ME

ZPO

h

S

e

S

t

MC

i

J

e

U

i

OV

ei

KMJ

g

P

d i

T

FSO

Y

e

A

L

n

MLT

dier

OO

is

D

W

ch

RKS

b

U

da

HHK

i

JW

sst

I

he

J

de b

HV

e

N

e

AIA

c h

KV

e

YA

e

KA

e

J

t

J

ec

LO

e

P

i

FOR

ei

HA

e

HJA

HT

r

TE

e

ZJT

zer

TPWN

r

G

e

BJAZ

ei

J A

ei

J

M

i

T

h

MU T

gr

SUIGT

r

E

e

VI

es

MJWN

en

O

dec

LR

e

T

d

PAAZ

d

SF

RTDO

ERDMV

g

M

r

N

e

GJI

e

A

UWO

d

BU

z

W

gr

MJVPK N

g

MJN

e

J

A

ch

KM

e

AA

e

R

JJOLTAZ

e

ZT

w

UKZ

al

A

r

YDDKOSOM

die

A

chrif

OM

ei

W

h

UMLNJOOG

K

r

EM

b

O

The 150 lowercase letters are now correct, but we still don’t know that. We just think we are getting somewhere because we look at the frequencies E is now the most common:

A = 25, B = 5, C = 8, D = 14, E = 42, F = 5, G = 10

H = 14, I = 20, J = 21, K = 13, L = 7, M = 17, N = 10

O = 20, P = 7, Q = 1, R = 19, S = 11, T = 18, U = 8

V = 7, W = 9, X = 42, Y = 4, Z = 10

However, D is not looking so good. We search now to see what D should be swapped with…

I searched in vain. D isn’t swapped with anything, its just that we are attacking the noise at this point.

My next idea was that SS ought occur in the output. So far it occurs once. The only other double letters that occur more often are two AAs and two OOs. I speculated that A was swapped with S, meaning to try again with O swapped with S if I ended up getting stuck later. Here is what A swapped with S looks like:

VR

s

KT

es

MJ

gsw

Y

r

O

R

s

OE

p

KCOT

s

OM

f

Q

J

e

ME

ZPO

hae

A

t

MC

i

J

e

U

i

OV

ei

KMJ

g

P

d i

T

FAO

Y

es

L

n

MLT

dier

OO

i

AD

ich

RK

abe da

H

K

i

JWA

s

O

ehe

J

de b

HO

e

N

es

I

sc h

KM

e

Y

se

K

s e

J

t

J

ec

LO

e

P

i

FOR

ei

H

s e

HJ

s

H

er

TE

e

ZJT

zer

TPW

er

G

e

BJ

s ei

J

s ei

J

M

ich

MU T

gra

UIGT

r

E

e

I

es

MJWN

en

O

dec

LR

e

T

d

P

ss

Z

da

F

w

TDO

ERDMV

ge r

N

e

GJI

es

UWO

de

U

z

M

gr

MJV

e

K N

ge

JN

e

J

sch

KM

esse

R

JJOLTSZ

er

O

wi

KZ

SK

s r

YDDKO

a

O

die schrif

O

zei

W

h

UM

NJOOJ

K

r

E

ib

O

You can clearly see words popping out; my own eye is drawn to the “die” on the bottom line. Other repeated words such as “EIJ” are shouting out for a J and N swap.

And so I attacked the plugboard manually, despite not knowing German. I could have automated this, but I didn’t need to. The final text fell away before my eyes:

DAS LOESUNGSWORT IST PLUTO. STUFE NEUN ENTHAELT EINE MITTEILUNG DIE MIT DES ENTKODIERT IST. ICH HABE DAS LINKSSTEHENDE BYTE DES

SCHLUESSELS ENTDECKT. ES IST EINS EINS ZERO EINS ZERO ZERO EINS EINS EINS. ICH PROGRAMMIERTE DES UND ENTDECKTE DASS DAS WORT DEB

UGGER WENN ES MIT DEM ZUGRUNDELIEGENDEN SCHLUESSEL ENTKODIERT WIRD ALS RESULTAT DIE SCHRIFTZEICHEN UNTEN ERGIBT

The rotor order was III, II then I; the wheel positions were A, F and L and the rings were 26 and 1 respectively. The plugboard was EI AS NJ KL TO and UM.

Google translate tells me that:

THE SOLUTION WORD IS PLUTO. STAGE NINE CONTAINS A MESSAGE WITH THE ENTKODIERT IS. I HAVE THE LEFT STANDING BYTE OF THE KEY DISCOVERED. IT IS ONE ONE ZERO ONE ZERO ZERO ONE ONE ONE. I PROGRAM OF AND DISCOVERED THAT THE WORD DEBUGGER IF IT WITH THE UNDERLYING KEY ENTKODIERT IS AS A RESULT THE CHARACTERS BELOW SHOWS

Now to extend the attack against the M4 used by the U-boats:

The M4 had a two ‘greek’ wheels which could be combined in one of 26 positions with two ‘thin’ reflectors to make a composite reflector. The M4 had the full 8 normal rotors to choose between too, so the total number of combinations without the 10 pair plugboard is:

2 x 26 x 26 x 336 x 17,576 x 676 = 5,397,376,438,272

That’s actually a bit beyond where I want to be with my laptop in the summer!

However, lets look at how random messages are before they have entered the reflector. Here are the frequencies for the Cipher Challenge text again, but counted before they enter the reflector:

36 = 156

35 = 1482

34 = 4706

33 = 11024

32 = 23400

31 = 49218

30 = 117182

29 = 252122

28 = 526396

27 = 1107028

26 = 2156674

25 = 3930160

24 = 6800664

23 = 10674482

22 = 14215630

21 = 15105272

20 = 10923458

19 = 4577118

18 = 783770

17 = 28210

16 = 104

Again a few combinations are far less random-seeming than the rest! (In fact, the real solution is again top of the pile.) The computer can focus on just the top candidates and crank through the 2 x 26 x 26 possible composite reflectors for each.

Notes

"share"