Cryptic Crossword

Amateur Crypto and Reverse Engineering

Introduction

Reverse engineering is a special subgenre of computer programming. It's about the closest that I as a programmer get to being a scientist. Gather data, formulate a hypothesis, test, refine, repeat: reverse engineering is basically applying the scientific method to a very, very small knowledge domain. If you've never tried to reverse-engineer a program before, you may be wondering how one goes about such a task. The following essay retraces one of the more colorful reverse-engineering problems that I've pursued.

How all this started was that a friend of mine was contributing to an iOS application for crossword puzzles. This was in the early days of the first iPhone, and they wanted to get their app into the store quickly, before this tiny niche was saturated. The program was working, but it was missing one very small feature — namely that it couldn't unscramble scrambled crossword puzzles.

Let me back up a bit and explain. There is a widely used file format for crossword puzzles, called the .puz file format. (Or at least that's what I called it; I don't know if it has a better name.) The file format was created back in the 1990s by a software company called Literate Software, and they used it in their own application, "Across Lite", which allowed you to both create and solve crossword puzzles in their file format. Apparently they had the good fortune to become a de facto standard, so it's pretty much the main file format to use, if you're a crossword application. As far as I know they never published a spec for their format; however, other people had taken the time to reverse-engineer it and make the description publicly available. My understanding is that the company actually made its money not from the software, but by licensing the right to sell crossword files made with its software. So, even though they may not have officially approved of the reverse engineering, they probably didn't mind, since it would only help cement their file format as a standard. I'm fuzzy on the historical details here, but the upshot is that the only available descriptions of this file format, which is pretty much the standard format for crossword puzzles, were written by people outside of the company.

Fortunately, the file format is pretty straightforward. As you might expect, it's a binary file format. The header includes things like the width and height of the grid, and the number of clues.

    0000000: FC25 4143 524F 5353 2644 4F57 4E00 02EA  .%ACROSS&DOWN...
    0000010: 4BD0 B00C ABE5 B845 312E 3200 0000 0000  K......E1.2.....
    0000020: 0000 0000 0000 0000 0000 0000 0F0F 4E00  ..............N.
    0000030: 0100 0000 4641 5445 2E41 5741 5348 2E41  ....FATE.AWASH.A
    0000040: 574F 4C4C 4945 532E 4355 5249 4F2E 5348  WOLLIES.CURIO.SH
    0000050: 4F45 454C 4543 544F 5241 5445 2E53 495A  OEELECTORATE.SIZ
    0000060: 4541 5353 2E45 5253 542E 4449 4554 4544  EASS.ERST.DIETED
    0000070: 2E2E 2E43 454E 542E 484F 5354 4553 5352  ...CENT.HOSTESSR

After the header, the file provides all the strings — the first one being the completed grid, which you can see the beginning of in the sample here. Black squares are indicated with ASCII periods. After the answer grid would come the player's working grid, then the list of clues, and then some miscellaneous string data such as the crossword's title and author.

Here's a breakdown of the file header.

    Description        Length  Details
    File checksum      2       short
    Magic              12      The ASCII text: "ACROSS&DOWN\0"
    Base checksum      2       short
    Masked checksums   8       short[4]
    File Version       4       The ASCII text: "1.N\0" (N varies)
    Unused             2       set to uninitialized garbage values
    Unknown            2       zero unless scrambled
    Reserved           12      set to uninitialized garbage values
    Width              1       char
    Height             1       char
    Number of clues    2       short
    Bitmask            2       normally set to 0x0001
    Bitmask            2       0x0004 = scrambled
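The table maps fairly directly onto a fixed-size record. As a minimal sketch, the 52-byte header could be unpacked in Python with the struct module. The field names below are illustrative labels, not names from any official spec, and the little-endian byte order is an inference from the sample dump, where the clue count of 78 is stored as 4E 00.

```python
import struct

# Field layout taken from the header table; the names are illustrative labels.
# "<" selects little-endian with no padding (inferred from the sample dump).
HEADER = struct.Struct("<H12sH8s4s2sH12sBBHHH")
FIELDS = (
    "file_checksum", "magic", "base_checksum", "masked_checksums",
    "version", "unused", "unknown", "reserved",
    "width", "height", "num_clues", "bitmask1", "bitmask2",
)

def parse_header(data: bytes) -> dict:
    """Unpack the first 52 bytes of a .puz file into a dict of named fields."""
    return dict(zip(FIELDS, HEADER.unpack_from(data)))

def is_scrambled(header: dict) -> bool:
    """The 0x0004 bit of the second bitmask marks a scrambled solution."""
    return bool(header["bitmask2"] & 0x0004)

# First 52 bytes of the unscrambled sample dump shown earlier.
sample = bytes.fromhex(
    "FC254143524F535326444F574E0002EA"    # checksum, magic, base checksum
    + "4BD0B00CABE5B845312E320000000000"  # masked checksums, version, unused, unknown
    + "00" * 12 + "0F0F4E00"              # reserved, width, height, clue count
    + "01000000"                          # the two bitmasks
)
header = parse_header(sample)
print(header["width"], header["height"], header["num_clues"], is_scrambled(header))
```

For the sample file this reports a 15 by 15 grid with 78 clues and no scrambling flag.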

I don't know why anybody thought the file needed six different checksums. (Even better, four of the six checksums are masked: their value is XORed with the ASCII characters "ICHEATED". I'm guessing this was meant to discourage people from manually modifying existing files to make their own crossword files without using the software.) Fortunately, someone besides me had already figured this stuff out. There was one aspect of the file format, however, that was missing from the available descriptions.

One feature of the Across Lite program is that once you had created a crossword, you could opt to have it "scrambled". Normally, the completed grid (i.e. the crossword with all of the answers filled in) was stored in the file in plaintext. But that meant that a motivated user could examine the binary file contents to look up the answers (as we just did). Scrambling gave the puzzle author a way to prevent that.

After scrambling a puzzle, the file contents might look like this:

    0000000: EDFB 4143 524F 5353 2644 4F57 4E00 04EA  ..ACROSS&DOWN...
    0000010: 4DF5 B00C AB05 B845 312E 3200 0000 8B6B  M......E1.2....k
    0000020: 0000 0000 0000 0000 0000 0000 0F0F 4E00  ..............N.
    0000030: 0100 0400 4846 4A49 2E46 4241 4746 2E50  ....HFJI.FBAGF.P
    0000040: 5A44 4C49 4342 5A2E 534D 4549 492E 485A  ZDLICBZ.SMEII.HZ
    0000050: 4F45 514C 564E 4E50 4E4B 4D56 2E54 414D  OEQLVNNPNKMV.TAM
    0000060: 5258 5358 2E4F 5557 4F2E 5848 5951 4C56  RXSX.OUWO.XHYQLV
    0000070: 2E2E 2E46 5144 4C2E 5952 4659 4F4D 4157  ...FQDL.YRFYOMAW

When you ask to have a puzzle scrambled, the application provides you with a key in the form of a four-digit number. (In this particular case, the key is 5274.) If someone later asks to unscramble the puzzle, the Across Lite application will prompt them for the key. If they don't provide the correct key, they can't unscramble the solution.

The reverse-engineered description of the .puz file format had absolutely nothing to say about the nature of scrambled files. And this was my friend's problem. Without that information, their app wouldn't be able to unscramble a file, at all. The user could still work on the crossword, of course: the scrambling doesn't affect the grid's layout, or the clues. But users wouldn't be able to validate their answers.

F A T E A W A S H A W O L L I E S C U R I O S H O E E L E C T O R A T E S I Z E A S S E R S T D I E T E D C E N T H O S T E S S R E F I T S J E W I S H A R I T H K E R N S O A F N I L E A N N E S D U P E D E I O V E N S L O S E R B O D I L Y R A C E R S G L U T E A L P E P S R E S I S T S L U E S K I O T T O R E P U B L I C A N O M E S I R A T E R A M S M E R E X E N O N A B E T ➡ H F J I F B A G F P Z D L I C B Z S M E I I H Z O E Q L V N N P N K M V T A M R X S X O U W O X H Y Q L V F Q D L Y R F Y O M A W M I V L G I E X A G C Y M U X J E X Z Q G H H Z M H I A D G F K V O J I I S C Y R C S E A J X E W Q X I A M K Z X M U H Y P W G U C F A L K Q O I X L R I L V M U I L O B U F C A Z B W F C J C S D H Z V W W U D C A Z R X G L T T I C T B Z D N M B H M T

One of the few advantages that crossword apps have over paper crosswords is that at the end, you can check your answer, and the program can tell you whether or not it's correct. Or it can tell you how many letters are wrong, or it can highlight the squares that are wrong, or just one wrong square, etcetera etcetera. But their app couldn't do that with a scrambled file, even if the user had the four-digit key!

As it happened, the majority of crossword publishers didn't bother to scramble their files, so my friend was tempted to just release the app without proper support for that feature. Unfortunately, one of the few publishers that did use scrambled files was the New York Times. (They would publish the scrambled file, and then publish the four-digit key the next day, on the same schedule as the print version published the answer grids.) Having a feature that works on everyone's crosswords except the New York Times is kind of like a wedding band that can play everything except the Wedding March. For a lot of potential users, that would be a deal-breaker.

So, my friend contacted me and described the situation, and asked: Do you think you might be able to reverse-engineer this scrambling algorithm? My response was: maybe. Hard to say, but I'm willing to try. Privately, though, my reaction was THIS IS MY DREAM PROJECT AND THERE IS NO WAY I'M NOT SPENDING ALL AVAILABLE FREE TIME ON THIS.

Getting Started

The first step, as it turned out, was just getting the Across Lite program running for myself. To my surprise, they actually had a Linux binary available. Unfortunately, it had a library dependency on a remarkably ancient version of the C++ standard library that pre-dated ANSIfication. After an unsuccessful attempt to track down a copy of this library that would run on a modern kernel, I wound up just using the Windows version of the program, running it under Wine. It was a bit slow to start up, but otherwise ran fine.

So this is what it looks like when you bring up the Across Lite program with a .puz file. The main focus of this interface is for working on solving the crossword, but this program also provides the ability to scramble the crossword, as you can see from the opened menu.

When you use it, the program selects a four-digit unscrambling key, which it gives to you in a message box. You can then save the scrambled .puz file to disk. The key is not available after you close that message box, so it's up to you to make a note of it.

Once that was working, my next step was to write my own program to create unscrambled .puz files as input. I could have used someone else's code, but since I was already studying the file format, it was easy enough for me to slap together a script that generated a .puz file from some basic input.

With that preparation out of the way, I was ready to actually start work. After brainstorming for a bit on where to begin, here are some of the things that I initially tried:

Conduct a thorough web search. Though I really wanted to solve this one entirely by myself, I knew that if anyone else had already solved it, I shouldn't be wasting other people's time. So I spent the better part of a day going through various web searches. The closest I came was a forum in which some people were discussing scrambled .puz files. One person there claimed to have cracked the scrambling algorithm, but had since lost the source code. Could be BS, but if not, then at least I knew that it could be done.

Look for easy-to-calculate invariants between a scrambled grid and its original contents. For example, is there a checksum or a parity value for a grid that remains the same after scrambling? Something like this could reveal possible mechanisms used in the scrambling algorithm. I tried a handful of possibilities; none of them worked.

Look for a way to control the selection of the four-digit key. The annoying aspect of the scrambling feature is that the program selects the key for you; you don't get to choose what it will be. If I could compare two or more grids that were scrambled with the same key, that might be a good place to start.

Finally, one approach that may seem obvious which I explicitly did not pursue: examining the Across Lite program directly with a debugger and/or disassembler. I took this approach off the list of available options right from the start, due to the project's ultimate goal. The law around reverse engineering is murky (though IANAL and things may well have changed since I last looked into it), but black-box reverse engineering generally appears to be on safer legal ground. By "black box", I mean figuring out how a program works by studying it from the outside, examining only those aspects that the program explicitly makes visible. By contrast, looking at a program's disassembly is a bit too similar to looking at source code, and I imagine this could be a thorny distinction to have to mount a defensive legal case with. Since the ultimate goal was to incorporate this into a program that would be sold for profit in Apple's online store, it only made sense that I should avoid potential gray areas as much as possible.

Something that my friend had noticed was that when we scrambled a puzzle twice in a row, the two keys would be different, but only in the first half. The third and fourth digits were the same. At first I thought that this might be due to scrambling the same grid, but further exploration suggested that it was entirely due to temporal proximity. So naturally, I tried running two instances of the Across Lite program at the same time, and hit Alt-S S on both of them as quickly as possible. In this way I obtained two grids scrambled with the same key.

With this technique, I had my first inroad, a way to start making some actual progress in the investigation. I could now create two crossword grids that differed in some specific way, scramble them both with the same key, and then compare the results, seeing directly how a change in input affected the scrambled output. (In fact, I found I could do up to four grids at once, and still have about a 50% chance of all four being assigned the same key.)

Initial Discoveries

One of the first things I tested was scrambling a grid of all As and an otherwise identical grid of all Bs.

A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A ➡ S O W S S N O U V A N N S O T W X S W W S X X K P L Q Y S K O R W S S X K U Q U X P T W X U Z V P M K P U Q V V U Z V R V Z V V M R Q V V M N U Z Q N S R W X T R S S V M R R N S R S R T V W M Y P U Q X X N S S X T T Y P L P T P X L Q P X U Q V N Q U Y P L Q P P U Q A S P Q Q Q Q Q B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B ➡ T P X T T O P V W B O O T P U X Y T X X T Y Y L Q M R Z T L P S X T T Y L V R V Y Q U X Y V A W Q N L Q V R W W V A W S W A W W N S R W W N O V A R O T S X Y U S T T W N S S O T S T S U W X N Z Q V R Y Y O T T Y U U Z Q M Q U Q Y M R Q Y V R W O R V Z Q M R Q Q V R B T Q R R R R R

The results were exactly what I had been hoping (though not really expecting): the scrambled all-B grid's contents were exactly one letter ahead of the contents of the scrambled all-A grid (with Z wrapping around to A). This shows that the grid's contents were not an input into the scrambling process, except at the very end. The scrambling therefore can only depend on the shape of the grid (i.e. its size and the placement of black squares), and of course on the four-digit key. While that's still a big space to explore, it's nowhere near as big as it would be if each letter in the grid could contribute to the scrambling of the other letters.
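This kind of comparison is easy to automate. The following is a small sketch (the function name is made up for illustration) that computes the per-position shift between two scrambled grids, using the opening letters of the all-A and all-B results shown above.

```python
def letter_shifts(before: str, after: str) -> list[int]:
    """Per-position difference (after - before) modulo 26, for A-Z strings."""
    return [(ord(b) - ord(a)) % 26 for a, b in zip(before, after)]

# Opening letters of the two scrambled grids (all-A first, then all-B).
scrambled_a = "SOWSSNOUVA"
scrambled_b = "TPXTTOPVWB"
print(letter_shifts(scrambled_a, scrambled_b))

# The modulo handles the wraparound case, where Z in one grid becomes A.
print(letter_shifts("Z", "A"))
```

A constant shift of 1 at every position is what an additive final step would produce; any position-dependent mixing of the grid contents would show up as a varying shift.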

This meant that from now on, I could examine nothing but all-A grids, and not have to worry that I might be overlooking some important factor in the scrambling process. This was the first point that I was willing to say aloud that I thought that I would be able to solve it. I still couldn't say how long it would take, but I felt confident in predicting that it was ultimately doable.

The next thing to try, of course, was scrambling two grids of the same size but with different shapes.

A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A ➡ S O W S S N O U V A N N S O T W X S W W S X X K P L Q Y S K O R W S S X K U Q U X P T W X U Z V P M K P U Q V V U Z V R V Z V V M R Q V V M N U Z Q N S R W X T R S S V M R R N S R S R T V W M Y P U Q X X N S S X T T Y P L P T P X L Q P X U Q V N Q U Y P L Q P P U Q A S P Q Q Q Q Q A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A ➡ S P U Q U W S Q X W S X N O W S M Y N U P R W S X S O T V Q X W V T W X S K P R S S O U Q V U X K L M U K M R Q Z V P X V X P Q V M R V S T N Z K S V U Z V Q X R V T V W N S R P X U L S X M R N S L Q T S Q P T Y P U W N U N Z P L Q X T Y P R A P P R Q Q Q P U Q V Q S O Y S V A N Q

The total number of white and black squares is the same in each grid; only their positions are different. As you can see, this time I was not so lucky: the two grids produced completely different encryptions, despite both of them containing all As. Or mostly different: as you might be able to tell, there seem to be a lot of similarities between the two scrambled grids, even though it's not clear that there's an actual pattern present.

To study this further, I created a series of very small grids (experimenting showed that the minimum grid size was twelve letters) with only minor variations in their layout. Once I had succeeded in scrambling all of them with the same key, I soon found the method for how the scrambling algorithm was reading the layout. Can you spot the rule?

    A A A A A A A A A A A A  ➡  Z S M O U T P S S Y S P
    A A A A A A A A A A A A  ➡  Z S Y S U S M O T P S P
    A A A A A A A A A A A A  ➡  Z S T P U S Y S M O S P
    A A A A A A A A A A A A  ➡  Z S T M U S Y P S O S P

It is simply that the letters are read from the grid vertically, not horizontally. Top to bottom, then left to right. The letters are assembled into a plain old one-dimensional string, scrambled, and then the result is put back into the grid the same way: top to bottom, left to right. The actual positioning of the black squares is completely unimportant.
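The reading order can be sketched in a few lines. This is only an illustration of the column-by-column traversal described above: the function names and the tiny 3x3 grid are hypothetical, and black squares are written as periods, as in the file.

```python
def grid_to_string(rows: list[str]) -> str:
    """Read white squares column by column (top to bottom, then left to right)."""
    height, width = len(rows), len(rows[0])
    return "".join(
        rows[r][c] for c in range(width) for r in range(height) if rows[r][c] != "."
    )

def string_to_grid(rows: list[str], letters: str) -> list[str]:
    """Write letters back into the white squares of rows, in the same order."""
    grid = [list(row) for row in rows]
    supply = iter(letters)
    for c in range(len(rows[0])):
        for r in range(len(rows)):
            if grid[r][c] != ".":
                grid[r][c] = next(supply)
    return ["".join(row) for row in grid]

rows = ["AB.", "CDE", ".FG"]  # hypothetical 3x3 grid; "." marks a black square
flat = grid_to_string(rows)
print(flat)
print(string_to_grid(rows, "TUVWXYZ"))
```

Because black squares are simply skipped in both directions, two grids with the same number of white squares produce strings of the same length regardless of layout.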

(This was further vindicated when I returned to considering the unidentified value in the header. The header contains a run of 8 values, each two bytes long, that are unused — and in fact many files contain random values here. Except, that is, for the second entry: This entry (labeled "Unknown" in the table up above) is always zero for normal .puz files, and non-zero for scrambled files. What its value meant, though, was an open question. I had guessed that it was a checksum representing the unscrambled grid, since Across Lite could tell if you tried to unscramble a grid with an incorrect key. But the number didn't actually match the checksum of the unscrambled grid, so I had set aside that hypothesis for the time being. Now, though, I tried taking the checksum of the unscrambled grid contents rearranged as a one-dimensional string, top-to-bottom left-to-right, and this matched the mysterious header value perfectly.)

So, I now knew that the main scrambling process was determined entirely by the number of letters in the grid and the four-digit key. No other aspect of the grid's contents or arrangement was an important factor. Again, this was a huge decrease in the potential solution space to explore.

The next test, though, was a point where things turned out to be less simple than I had expected. I scrambled a series of grids that differed in size by only one letter:

    A A A A A A A A A A A A A      ➡  T Q N N X W W U M V V P T
    A A A A A A A A A A A A A A    ➡  N X W N Q W O V M V S P P T
    A A A A A A A A A A A A A A A  ➡  T Q T Q L C X X Q U M P Q X P

Although there are clear hints of shared patterns, the basic fact is that changing the grid size can affect every single letter in the scrambled grid. I had been hoping, now that I understood the ordering of the scrambled grids' contents, that I would find that even the size of the grid wasn't actually an important factor, and that the contents of smaller grids would just prove to be a subset of the larger ones. No such luck.

The next test also showed me that things were still more complicated than I had been expecting. Here are four crosswords, all scrambled with the same key, in which each grid differs from the previous one by only one letter. (Colors indicate the changed letter.)

A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A ➡ S O W S S N O U V A N N S O T W X S W W S X X K P L Q Y S K O R W S S X K U Q U X P T W X U Z V P M K P U Q V V U Z V R V Z V V M R Q V V M N U Z Q N S R W X T R S S V M R R N S R S R T V W M Y P U Q X X N S S X T T Y P L P T P X L Q P X U Q V N Q U Y P L Q P P U Q A S P Q Q Q Q Q B A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A ➡ S O X S S N O U V A N N S O T W X S W W S X X K P L Q Y S K O R W S S X K U Q U X P T W X U Z V P M K P U Q V V U Z V R V Z V V M R Q V V M N U Z Q N S R W X T R S S V M R R N S R S R T V W M Y P U Q X X N S S X T T Y P L P T P X L Q P X U Q V N Q U Y P L Q P P U Q A S P Q Q Q Q Q B A A A A A A A A A A B A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A ➡ S O X S S N O U V A N N S O T W X S W W S X X K P L Q Y S K O R W S S X K U Q U X P T W X U Z V P M K P U Q V V U Z W R V Z V V M R Q V V M N U Z Q N S R W X T R S S V M R R N S R S R T V W M Y P U Q X X N S S X T T Y P L P T P X L Q P X U Q V N Q U Y P L Q P P U Q A S P Q Q Q Q Q B A A A A A A A A A A B A A A A A A A A A A B A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A ➡ S O X S S N O U V 
A N N S O T W X S W W S X X K P L Q Z S K O R W S S X K U Q U X P T W X U Z V P M K P U Q V V U Z W R V Z V V M R Q V V M N U Z Q N S R W X T R S S V M R R N S R S R T V W M Y P U Q X X N S S X T T Y P L P T P X L Q P X U Q V N Q U Y P L Q P P U Q A S P Q Q Q Q Q

As you can see, in addition to being encrypted, the letters of the grid are also being reordered. So there were (at least) two steps to the scrambling process. My hunch was that the letters were encrypted first, and then reordered. But that meant that I would need to be able to undo the reordering step before I could even start to tackle the decryption. Of course it was entirely possible that I was wrong and the reordering was done first, or that there were multiple interleaved steps of encryption and reordering. But as usual, we start by assuming that things are simple until they are proven to be otherwise.

Collecting Data: Script Everything

The other thing I found is that trying to obtain four separate files scrambled with the same key was annoying. I had to launch four separate windows, and then quickly select the scrambling menu option in each of them. Even if I didn't make any slips, I found that it would fail almost half the time, leaving some of the files with one key and the rest with a second key. By now I was doing lots of different comparisons, in search of more data, and so I realized that I needed a more reliable technique.

I found a handy command-line program called xdotool, which allows one to identify windows by their title text and inject mouse and keyboard events. A simple shell script then allowed me to reliably scramble two or more files in parallel.

    for w in $(xdotool search --title 'across lite') ; do
        xdotool windowactivate "$w"
        sleep 0.1
        xdotool key alt+s s 2> /dev/null
        sleep 0.2
    done

This wasn't perfect — every once in a while I would still get a split of two different keys, but it was much, much rarer. And when it did happen, I just deleted the output files and ran it again.

With the ability to scramble four or more files with the same key easily, I realized I now had a way to fully expose the reordering that had been done on a grid. It works like this. Make one grid with all As. Make another grid with the first 25 letters replaced with B, C, D, on up to Z. Make a third grid with the next 25 letters replaced with B through Z. Make a fourth grid that replaces the next set of 25 letters, and so on until every square is set to something other than A in exactly one file. Scramble them all at once so they have the same key. Thanks to the additive nature of the encryption, you can compare each of the B-through-Z files with the all-As file to identify where each of the B-through-Z letters was moved to.

    A A A A A A A A       T S O N V U Q P
    A A A A A A A A       T S O N V U Q P
    A A A A A A A A       Q P L K S R N M
    A A A A A A A A   ➡   Q P L K S R N M
    A A A A A A A A       X T S A Z V U Y
    A A A A A A A A       X T S A Z V U Y
    A A A A A A A A       U Q P X W S R V
    A A A A A A A A       U Q P X W S R V

    B J R Z A A A A       T N O J V B Q F
    C K S A A A A A       T J O F V N Q J
    D L T A A A A A       Q C L Y S G N U
    E M U A A A A A   ➡   Q Y L U S C N Y
    F N V A A A A A       X T Y A W V S Z
    G O W A A A A A       X V S D Z Z U D
    H P X A A A A A       U Q P X W S R V
    I Q Y A A A A A       T Q P X W S R V

Thus, for example, looking at the two scrambled grids on the right, the Y in the middle of the rightmost column of the top grid is a Z in the second grid. A difference of 1 means that it must match the square that went from A to B in the unscrambled grid. The S at the top of the second column becomes an N in the next grid, for a difference of 21. This corresponds to the square that contains a V (i.e. the letter shifted 21 from A) in the unscrambled grid. And finally the U in the bottom left corner that becomes a T corresponds to the A that was replaced with Z.

    B J R Z I Q Y A       T N O J J B F F
    C K S B J R Z A       R J N F V N Q J
    D L T C K S A A       K C G Y O G K U
    E M U D L T A A   ➡   G Y C U K C G Y
    F N V E M U A A       J G Y A W V S Z
    G O W F N V A A       F V B D J Z F D
    H P X G O W A A       Y Q U X C S Y V
    I Q Y H P X A A       T Q Q X Y S U V

    B J R Z I Q Y H       W N S J J B F F
    C K S B J R Z I       R J N F W N S J
    D L T C K S B J       K C G Y O G K U
    E M U D L T C K   ➡   G Y C U K C G Y
    F N V E M U D L       J G Y F W B S Z
    G O W F N V E M       F V B D J Z F D
    H P X G O W F N       Y C U K C G Y C
    I Q Y H P X G O       T Y Q G Y C U G

By scrambling enough grids simultaneously, I could figure out where every position in the original grid went to in the scrambled grid. (And it was necessary to scramble them simultaneously; I verified that grids assigned different four-digit keys were scrambled completely differently.)
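The comparison itself is mechanical. Below is a minimal sketch of it, applied to the first pair of example grids above (the scrambled all-A grid and the scrambled grid whose first 25 squares held B through Z). It assumes those grids are 8 by 8 with no black squares, which is consistent with the description of where the Y, S, and U squares sit. The function names are invented for illustration, and positions are counted in the top-to-bottom, left-to-right reading order.

```python
def column_major(rows: list[str]) -> str:
    """Flatten a grid with no black squares, top to bottom then left to right."""
    return "".join(row[c] for c in range(len(rows[0])) for row in rows)

def find_moves(base: str, variant: str, chunk: int = 0) -> dict[int, int]:
    """Map original index -> scrambled index for one B-through-Z file.

    base is the scrambled all-A grid, variant the scrambled grid whose
    chunk-th run of 25 squares held B..Z. A shift of d at a scrambled
    position means the square that held chr(ord('A') + d) landed there.
    """
    moves = {}
    for pos, (a, v) in enumerate(zip(base, variant)):
        d = (ord(v) - ord(a)) % 26
        if d:
            moves[25 * chunk + d - 1] = pos
    return moves

base = column_major(["TSONVUQP", "TSONVUQP", "QPLKSRNM", "QPLKSRNM",
                     "XTSAZVUY", "XTSAZVUY", "UQPXWSRV", "UQPXWSRV"])
variant = column_major(["TNOJVBQF", "TJOFVNQJ", "QCLYSGNU", "QYLUSCNY",
                        "XTYAWVSZ", "XVSDZZUD", "UQPXWSRV", "TQPXWSRV"])
moves = find_moves(base, variant)
print(len(moves), moves[0], moves[20], moves[24])
```

All 25 replaced squares are located. The three printed positions correspond to the B, V, and Z squares discussed above: position 60 is the fifth square of the rightmost column, position 8 is the top of the second column, and position 7 is the bottom-left corner.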

So this meant that I could completely separate out the encryption and reordering steps, and study them independently. I cobbled together another script that would take a set of such files, locate all of the reordered letters, and then spit out the encrypted string in its unscrambled order, along with the scrambled ordering sequence, all in a single line of text. This had become easy enough to do that it was little effort to accumulate multiple examples of the same size grid scrambled with various keys. With this, I could then start looking more widely for patterns.

1228 EGAAZOUUNIONMGHTNNINGGHMF 1 12 23 9 20 6 17 3 14 0 11 22 8 19 5 16 2 13 24 10 21 7 18 4 15

1462 HKKJHMMOLUSQNMTIMLSJRMTIM 21 7 18 4 15 1 12 23 9 20 6 17 3 14 0 11 22 8 19 5 16 2 13 24 10

1849 LZPTMAFJCDAZWVUPKSZBWVUPO 14 0 11 22 8 19 5 16 2 13 24 10 21 7 18 4 15 1 12 23 9 20 6 17 3

2415 KJINPSORNQLQMLFIMNLMKLHNL 18 4 15 1 12 23 9 20 6 17 3 14 0 11 22 8 19 5 16 2 13 24 10 21 7

2439 QLMXTAUDRYMESZLSLPNUKNJXP 19 5 16 2 13 24 10 21 7 18 4 15 1 12 23 9 20 6 17 3 14 0 11 22 8

2532 KMOKPRSOOOMILLKKJPMMIMJKJ 3 14 0 11 22 8 19 5 16 2 13 24 10 21 7 18 4 15 1 12 23 9 20 6 17

2699 PTADTHHHXEADTHDKWEAEQBTWI 9 20 6 17 3 14 0 11 22 8 19 5 16 2 13 24 10 21 7 18 4 15 1 12 23

3247 LMNPUSZXVPTPQKSRMORTNLPPM 19 5 16 2 13 24 10 21 7 18 4 15 1 12 23 9 20 6 17 3 14 0 11 22 8

3461 NRQLKMJHGPOROSRNSOTJOMWNS 1 12 23 9 20 6 17 3 14 0 11 22 8 19 5 16 2 13 24 10 21 7 18 4 15

3627 PROTQSXBYVSWSURPKONXSWMSK 8 19 5 16 2 13 24 10 21 7 18 4 15 1 12 23 9 20 6 17 3 14 0 11 22

3719 QSOSOQYEAYUASAWYIOGCUAOYQ 21 7 18 4 15 1 12 23 9 20 6 17 3 14 0 11 22 8 19 5 16 2 13 24 10

4312 IKEIJLKKOMLMNLKJNJKINKLJL 16 2 13 24 10 21 7 18 4 15 1 12 23 9 20 6 17 3 14 0 11 22 8 19 5

4722 KUNPRPPPUSSNUPUPTROMMNKIM 12 23 9 20 6 17 3 14 0 11 22 8 19 5 16 2 13 24 10 21 7 18 4 15 1

5338 SRTWQMRRTMTTXTYYVQOYTTTAV 12 23 9 20 6 17 3 14 0 11 22 8 19 5 16 2 13 24 10 21 7 18 4 15 1

6174 AUZVYVQQJHKPTRSSRSAWRNSVW 0 11 22 8 19 5 16 2 13 24 10 21 7 18 4 15 1 12 23 9 20 6 17 3 14

6561 TSYSXWXOWRRIRRXOWRXTYSOJO 17 3 14 0 11 22 8 19 5 16 2 13 24 10 21 7 18 4 15 1 12 23 9 20 6

6699 BYEBBBEHBHHHBEEHBHHKEEEEY 13 24 10 21 7 18 4 15 1 12 23 9 20 6 17 3 14 0 11 22 8 19 5 16 2

6956 AEWYWAZXZZCEHHDBBBXXYZWWB 10 21 7 18 4 15 1 12 23 9 20 6 17 3 14 0 11 22 8 19 5 16 2 13 24

7216 XRVUBLQQVMPVAROPUGJPMNGWR 19 5 16 2 13 24 10 21 7 18 4 15 1 12 23 9 20 6 17 3 14 0 11 22 8

8181 SSGZZSZZSLLELLZSSLSSGSZLS 5 16 2 13 24 10 21 7 18 4 15 1 12 23 9 20 6 17 3 14 0 11 22 8 19

8241 SQUTWUYZAMIEMILKQJMOCQPIT 17 3 14 0 11 22 8 19 5 16 2 13 24 10 21 7 18 4 15 1 12 23 9 20 6

8345 WURVTSPZXTPSTPOWWTSAGXZTA 18 4 15 1 12 23 9 20 6 17 3 14 0 11 22 8 19 5 16 2 13 24 10 21 7

9436 APZWKWBYEUVUZSYZEUSTYTMWU 10 21 7 18 4 15 1 12 23 9 20 6 17 3 14 0 11 22 8 19 5 16 2 13 24

9911 CUUMCUUEUUEMMKUCCKUCUCMUU 2 13 24 10 21 7 18 4 15 1 12 23 9 20 6 17 3 14 0 11 22 8 19 5 16

I stored my collected information in files like this. Each file was devoted to a specific grid size. (This one is of grids of size 25.) This made it easy to write scripts to iterate over the grids of a particular size, to search for specific patterns, or to verify hypothesized invariants, or just to display information for me to stare at. Any time I had a new idea, I could quickly produce a script to try it out.

The Reordering Process: A Working Hypothesis

It didn't take much looking before some patterns started to become clear in the reordering. I noticed that, typically, a letter in the grid would wind up about 16 positions away from the previous letter (in the original grid), wrapping around from the end to the beginning in the usual fashion. Take any line in the sample shown above, find the 0 among the numbers on the right, and count 16 entries forward from there: you'll find yourself on the 1. Count 16 again to get to the 2, and so on.
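Counting entries by hand gets old quickly, so here is a short sketch of the same measurement. It parses one line of the data file format shown above (key, encrypted letters, ordering sequence) and reports the distance from each original letter to the next. The function names are illustrative.

```python
def parse_record(line: str) -> tuple[str, str, list[int]]:
    """Split one data line into (key, encrypted letters, ordering sequence)."""
    key, letters, *order = line.split()
    return key, letters, [int(n) for n in order]

def strides(order: list[int]) -> list[int]:
    """Distance, with wraparound, from each original letter to the next one."""
    n = len(order)
    where = {value: index for index, value in enumerate(order)}
    return [(where[k + 1] - where[k]) % n for k in range(n - 1)]

line = ("1228 EGAAZOUUNIONMGHTNNINGGHMF 1 12 23 9 20 6 17 3 14 0 11 22 8 "
        "19 5 16 2 13 24 10 21 7 18 4 15")
key, letters, order = parse_record(line)
print(key, set(strides(order)))
```

For this size-25 sample every stride comes out as 16, so the set printed contains that single value.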

The pattern of intervals — or "strides", as I wound up calling them — wasn't obvious until I started focusing on grids larger than size 32, but once I did it leaped out. In fact, the most common stride was exactly 16, with 17 and 15 coming in a distant second and third place. Looking at more examples showed that the stride could range all the way from 7 to 30, but extreme examples never appeared more than once or twice in a given grid. Almost all of a grid's strides would be in the range of 14–18.

As I looked at grids of various sizes to see if the pattern continued to hold, I found that the reordering behavior was different for even-sized grids and odd-sized grids. If the grid size was odd, then the stride was exactly 16, every time, reliably. Irregular strides only occurred in even-sized grids. I was surprised to see that the code was making a distinction like that, but in a way it makes a kind of sense. Because 16 is a power of 2, it will always be coprime with an odd number, and therefore a stride of 16 is guaranteed to visit every square in an odd grid. With an even-sized grid, you have to take at least one odd stride in order to avoid revisiting a square before the grid is filled.
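The coprimality argument is easy to check numerically. This sketch (with a made-up helper name) steps through a grid of a given size with a fixed stride of 16 and counts how many distinct squares it reaches before repeating. The count is the grid size divided by gcd(size, 16).

```python
from math import gcd

def squares_visited(size: int, stride: int = 16) -> int:
    """Count distinct positions reached by repeatedly stepping stride, mod size."""
    seen, pos = set(), 0
    while pos not in seen:
        seen.add(pos)
        pos = (pos + stride) % size
    return len(seen)

for size in (25, 63, 64, 66):
    print(size, gcd(size, 16), squares_visited(size))
```

The odd sizes 25 and 63 are covered completely, while a fixed stride of 16 would reach only 4 squares of a 64-square grid and 33 squares of a 66-square grid.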

In any case, I found it strange that the reordering of even grids was so much more complicated, but at least I could say that I had actually figured out one small piece of the puzzle.

(None of these patterns applied to the initial position, by the way. From what I could see, the first letter could go anywhere in the grid, with equal probability.)

Collecting Data: No Seriously, Script Everything

As I realized that I needed to collect grids scrambled with as many different keys as possible, I wrote scripts to automate more and more of the collection process. Eventually I had all of it scripted except for one part — namely, retrieving the four-digit key. The key was never output anywhere; it was simply displayed to the user in a message box. There wasn't a way to copy the key to the clipboard or anything like that. So, I would run scripts that would do everything up to the scrambling of the files, then I would read the displayed key and enter it on the terminal command-line (or, if some of the keys failed to match, abort and start over), and then the second set of scripts would grab the scrambled files, extract the grids, determine the reordering and store the information in the appropriate data file. I lived with this for a while, until I was forced to acknowledge that this just wasn't going to work for the amount of data that I wanted to collect. So … I started investigating OCR programs for Linux.

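# Visit each "Across Lite info" message box that is currently open.
for w in `xdotool search --title 'across lite info' | tac` ; do
    xdotool windowactivate $w
    sleep 0.3
    # Screenshot the window, OCR it, and extract the four-digit key
    # (a letter I in the OCR output is read as the digit 1).
    k=`xwd -id $w -silent | xwdtopnm --quiet | gocr - | sed -ne 'y/I/1/;s/ //g;s/.*keytounscrambleis\([1-9]\{4\}\).*/\1/p'`
    # Every box should report the same key; give up if one differs.
    if test "${key:=$k}" != "$k" ; then
        test $quiet || echo Warning: inconsistent keys \($key vs $k\). >&2
        unset key
        break
    fi
    # Dismiss the message box.
    xdotool key Return 2> /dev/null
    sleep 0.2
done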

I found a very simple one called gocr that worked on the command line. It took a pixmap file as input and returned plain text on standard output. The venerable xwd utility can capture a window selected by window ID, and its output could be turned into a pixmap via one of the Netpbm utilities (xwdtopnm), which gocr could then turn back into text.

(Side note: I don't hate GUIs per se. What I hate are GOUIs: graphical-only user interfaces. They are the real blight. The above snippet demonstrates one of the things I love about being a Unix programmer: even when faced with a needless GOUI, there is always some way to turn it back into a text-based interface. And then we are unstoppable.)

So: I now had a fully automated system that could collect scrambled grids without oversight. I couldn't use the computer for anything else while it was running, of course, since it depended on screenshots and simulated keyboard events. But in the morning I would set it running in an infinite loop, collecting grids of a specific size, and when I came home from work I would find hundreds of new examples in my data file. A veritable mother lode.

The Reordering Process: Visualizing Strides

Looking at the rows of raw numbers and hoping to notice a pattern was futile, I quickly realized. So I started thinking about how I could display them in ways that would make patterns show up more clearly. Since I was storing all the data in easy-to-parse text files, it was a simple matter to write a script that could display the contents in different ways.

The first thing I wanted to do was display the strides instead of the absolute positions. In order to pack the information more densely, I displayed the stride value as a single character in base 36 — i.e. A being 10, B being 11, C being 12, and so on. (The four-digit key is shown on the far left, and next to it is displayed the initial position in base 16.)

1117 3A UHGGGGGFGGGGGEHGGGGGFGGGGCJGGGGGGFGGGGEHGGGGGFGGGG8MHGGGGGFGGGGEHGGGGGGFGGGGEHGGGGGFGGGGGEHGGGGGFGGG

1544 14 OGKIHGGGGGFGGGGGFGGGGGFGGGGCIHGGGGGFGGGGGFGGGGGGFG8GGKJGGGGGFGGGGGEHGGGGGFGGGGCJGGGGGGFGGGGEHGGGGGFG

1936 59 NGGGKIHGGGGGFGGGGEHGGGGGFGGGGCIHGGGGGFGGGGGFGGGGGG7GGGGMHGGGGGFGGGGGEHGGGGGFGGGGCJGGGGGGFGGGGEHGGGGG

2172 1C GSGJGGGGGFGGGGGEHGGGGGFGGGCGJGGGGGGFGGGGGFGGGGGFGGG8KIHGGGGGFGGGGGFGGGGGFGGGGCGJGGGGGFGGGGGFGGGGGGFG

2746 55 FOGGKIHGGGGGFGGGGGFGGGGGGFGGGCIHGGGGGFGGGGGEHGGGGGF8GGGKJGGGGGGFGGGGEHGGGGGFGGGGCIHGGGGGFGGGGEHGGGGG

2882 45 GNGGGKGJGGGGGFGGGGGFGGGGGGFGGGCGJGGGGGFGGGGGEHGGGGG7GGGKGJGGGGGGFGGGGGFGGGGGFGGGCGIHGGGGGFGGGGGFGGGG

3147 0E GGSIHGGGGGFGGGGGEHGGGGGFGGGCIHGGGGGFGGGGGEHGGGGGFGGG8KJGGGGGGFGGGGEHGGGGGFGGGGCIHGGGGGFGGGGEHGGGGGGF

3921 47 GFOGGGKJGGGGGFGGGGGFGGGGGGFGGGGCJGGGGGFGGGGGEHGGGGGF8GGGGNGGGGGFGGGGGGFGGGGGFGGGGGFGGGGGGFGGGGGFGGGG

4312 04 GGGOMHGGGGGFGGGGGFGGGGGFGGGGGCJGGGGGFGGGGGFGGGGGGFGGG8GNGGGGGFGGGGGEHGGGGGFGGGGGFGGGGGGFGGGGGFGGGGGF

4581 3F GGFOGKGJGGGGGFGGGGGGFGGGGGFGGGCGJGGGGGGFGGGGGFGGGGGFG8GGKGJGGGGGFGGGGGFGGGGGGFGGGCGJGGGGGFGGGGGEHGGG

4759 2B GGGNGGKIHGGGGGGFGGGGEHGGGGGFGGGCGIHGGGGGFGGGGEHGGGGGG7GGGKIHGGGGGFGGGGEGHGGGGGFGGGCIHGGGGGFGGGGGEHGG

4816 39 GGFOGGGMHGGGGGFGGGGEHGGGGGFGGGGGEHGGGGGFGGGGGFGGGGGGF8GGGMHGGGGGFGGGGGEHGGGGGFGGGGCJGGGGGGFGGGGEHGGG

4915 33 GGGNGGGMHGGGGGFGGGGGEHGGGGGFGGGGCJGGGGGFGGGGGEHGGGGGF8GGGGNGGGGGGFGGGGEHGGGGGFGGGGGEHGGGGGFGGGGGFGGG

4988 11 GGGOFGGKGIHGGGGGFGGGGGEHGGGGGFGGCGIHGGGGGGFGGGGEHGGGG8GFGGKGIHGGGGGFGGGGEHGGGGGFGGGCGIHGGGGGFGGGGEHG

5596 21 GGGGNGKGIHGGGGGFGGGGGEHGGGGGFGGCGGJGGGGGGFGGGGEHGGGGGF8GGKGIHGGGGGFGGGGEHGGGGGFGGGCGIHGGGGGFGGGGGFGG

5738 25 GGGFOGGKIHGGGGGFGGGGEHGGGGGGFGGGCIHGGGGGFGGGGGEHGGGGGF8GGGMHGGGGGGFGGGGEHGGGGGFGGGGCIHGGGGGFGGGGEHGG

6147 43 GGFGGSIHGGGGGFGGGGGEHGGGGGFGGGCIHGGGGGFGGGGGEHGGGGGFGGG8KJGGGGGGFGGGGEHGGGGGFGGGGCIHGGGGGFGGGGEHGGGG

6165 3F GGFGGSGJGGGGGFGGGGGEHGGGGGFGGGCGJGGGGGGFGGGGEHGGGGGFGGG8KIHGGGGGFGGGGGFGGGGGGFGGGCIHGGGGGFGGGGGEHGGG

6673 17 GGGGFOGGKIHGGGGGFGGGGGFGGGGGGFGGGCGJGGGGGFGGGGGEHGGGGGF8GGKGJGGGGGFGGGGGEHGGGGGFGGGCGJGGGGGGFGGGGGFG

6911 1B GGGGFOGGGNGGGGGGFGGGGGFGGGGGFGGGGGCJGGGGGFGGGGGFGGGGGGF8GGGGNGGGGGFGGGGGEHGGGGGFGGGGGFGGGGGFGGGGGGFG

6947 03 GGGGGNGGGKIHGGGGGFGGGGGEHGGGGGFGGGCIHGGGGGFGGGGGEHGGGGG8FGGGKJGGGGGGFGGGGEHGGGGGFGGGGCIHGGGGGFGGGGEH

7163 33 GGGFGGSGJGGGGGFGGGGGEHGGGGGFGGGCGJGGGGGFGGGGGEHGGGGGFGGG8KJGGGGGGFGGGGGFGGGGGFGGGGCIHGGGGGFGGGGGFGGG

7356 21 GGGGFGOKIHGGGGGFGGGGGEHGGGGGFGGGCGJGGGGGGFGGGGEHGGGGGFGG8GKIHGGGGGFGGGGEHGGGGGFGGGGCIHGGGGGFGGGGGFGG

7529 17 GGGGFGOGKIHGGGGGFGGGGEHGGGGGGFGGGCIHGGGGGFGGGGEGHGGGGGFG8GGMHGGGGGFGGGGGEHGGGGGFGGGGEHGGGGGGFGGGGEHG

8381 0F GGGGGFGOKGJGGGGGFGGGGGGFGGGGGFGGGCGJGGGGGGFGGGGGFGGGGGFGG8GKGJGGGGGFGGGGGFGGGGGGFGGGCGJGGGGGFGGGGGEH

8421 1F GGGGFGGOGNGGGGGFGGGGGGFGGGGGFGGGGGFGGGGGGFGGGGGFGGGGGFGGG8GKJGGGGGFGGGGGFGGGGGGFGGGGCJGGGGGFGGGGGEHG

8583 5E HGGGGGFOGKGJGGGGGGFGGGGGFGGGGGFGGGCGIHGGGGGFGGGGGFGGGGGGF8GGKGJGGGGGFGGGGGEHGGGGGFGGGCGJGGGGGFGGGGGE

9312 19 GGGGFGGGOMHGGGGGFGGGGGFGGGGGFGGGGGCJGGGGGFGGGGGFGGGGGGFGGG8GNGGGGGFGGGGGEHGGGGGFGGGGGFGGGGGGFGGGGGFG

9324 11 GGGGGFGGOKJGGGGGFGGGGGEHGGGGGFGGGGCJGGGGGGFGGGGEHGGGGGFGGG8GMHGGGGGFGGGGGFGGGGGFGGGGGEHGGGGGFGGGGGFG

9711 60 FGGGGGFGOGGNGGGGGGFGGGGGFGGGGGFGGGGGCJGGGGGFGGGGGFGGGGGGFG8GGGNGGGGGFGGGGGEHGGGGGFGGGGGFGGGGGFGGGGGG

G, representing 16, is clearly the most common value, as I already knew. But other than that, it's hard to see much of anything. You can tell that the non-G values tend to cluster in various places, but that's about it. Obviously, this wasn't the right way to visualize my data.
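For reference, the one-character-per-stride encoding used in that listing takes only a few lines of Python. This is a sketch, not the script I actually used; the function names are invented for the example.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode_strides(strides):
    """Pack a list of strides (each below 36) into one character apiece."""
    return "".join(DIGITS[s] for s in strides)

def decode_strides(text):
    """Turn a base-36 stride string back into a list of integers."""
    return [int(ch, 36) for ch in text]

# First eight characters of the row for key 1117.
print(decode_strides("UHGGGGGF"))  # [30, 17, 16, 16, 16, 16, 16, 15]
```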

My next thought was to use color to highlight values. Part of me disliked having to use colorized output, because it pretty much makes it impossible to pipe the output to anything else: it makes the program an endpoint of any pipeline. But I was desperate, so I did it anyway. This time, instead of displaying the stride's absolute value, I displayed the stride's offset from 16. I chose to display a positive offset in bright blue, and a negative offset in magenta. Zero would be displayed in dark blue, so that the non-zero values would stand out.
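A sketch of that color scheme in Python, using standard ANSI escape sequences, might look like this. Printing only the magnitude of the offset as a single digit and letting the color carry the sign is an assumption of the sketch, as are the function names.

```python
# ANSI SGR escape sequences for the three colors, plus a reset.
BRIGHT_BLUE, MAGENTA, DARK_BLUE, RESET = "\033[94m", "\033[35m", "\033[34m", "\033[0m"
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def colorize(strides, base=16):
    """Render each stride as its distance from `base`, colored by sign."""
    out = []
    for s in strides:
        offset = s - base
        if offset > 0:
            color = BRIGHT_BLUE
        elif offset < 0:
            color = MAGENTA
        else:
            color = DARK_BLUE
        out.append(color + DIGITS[abs(offset)] + RESET)
    return "".join(out)

def plain(strides, base=16):
    """The same digits without color codes, for piping or testing."""
    return "".join(DIGITS[abs(s - base)] for s in strides)

print(colorize([30, 17, 16, 15, 14]))
print(plain([30, 17, 16, 15, 14]))  # E1012
```

Keeping an uncolored variant around softens the pipeline problem somewhat, since the escape codes can be left out whenever the output is not going to a terminal.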

Immediately I knew that, whatever problems it introduced, colorizing the output was the right thing to do.

1117 3A E1 00000 1 00000 2 1 00000 1 0000 4 3 000000 1 0000 2 1 00000 1 0000 8 61 00000 1 0000 2 1 000000 1 0000 2 1 00000 1 00000 2 1 00000 1 000

1544 14 8 0 421 00000 1 00000 1 00000 1 0000 4 21 00000 1 00000 1 000000 1 0 8 00 43 00000 1 00000 2 1 00000 1 0000 4 3 000000 1 0000 2 1 00000 1 0

1936 59 7 000 421 00000 1 0000 2 1 00000 1 0000 4 21 00000 1 00000 1 000000 9 0000 61 00000 1 00000 2 1 00000 1 0000 4 3 000000 1 0000 2 1 00000

2172 1C 0 C 0 3 00000 1 00000 2 1 00000 1 000 4 0 3 000000 1 00000 1 00000 1 000 8 421 00000 1 00000 1 00000 1 0000 4 0 3 00000 1 00000 1 000000 1 0

2746 55 1 8 00 421 00000 1 00000 1 000000 1 000 4 21 00000 1 00000 2 1 00000 18 000 43 000000 1 0000 2 1 00000 1 0000 4 21 00000 1 0000 2 1 00000

2882 45 0 7 000 4 0 3 00000 1 00000 1 000000 1 000 4 0 3 00000 1 00000 2 1 00000 9 000 4 0 3 000000 1 00000 1 00000 1 000 4 0 21 00000 1 00000 1 0000

3147 0E 00 C21 00000 1 00000 2 1 00000 1 000 4 21 00000 1 00000 2 1 00000 1 000 8 43 000000 1 0000 2 1 00000 1 0000 4 21 00000 1 0000 2 1 000000 1

3921 47 0 1 8 000 43 00000 1 00000 1 000000 1 0000 4 3 00000 1 00000 2 1 00000 18 0000 7 00000 1 000000 1 00000 1 00000 1 000000 1 00000 1 0000

4312 04 000 861 00000 1 00000 1 00000 1 00000 4 3 00000 1 00000 1 000000 1 000 8 0 7 00000 1 00000 2 1 00000 1 00000 1 000000 1 00000 1 00000 1

4581 3F 00 1 8 0 4 0 3 00000 1 000000 1 00000 1 000 4 0 3 000000 1 00000 1 00000 1 0 8 00 4 0 3 00000 1 00000 1 000000 1 000 4 0 3 00000 1 00000 2 1 000

4759 2B 000 7 00 421 000000 1 0000 2 1 00000 1 000 4 0 21 00000 1 0000 2 1 000000 9 000 421 00000 1 0000 2 0 1 00000 1 000 4 21 00000 1 00000 2 1 00

4816 39 00 1 8 000 61 00000 1 0000 2 1 00000 1 00000 2 1 00000 1 00000 1 000000 18 000 61 00000 1 00000 2 1 00000 1 0000 4 3 000000 1 0000 2 1 000

4915 33 000 7 000 61 00000 1 00000 2 1 00000 1 0000 4 3 00000 1 00000 2 1 00000 18 0000 7 000000 1 0000 2 1 00000 1 00000 2 1 00000 1 00000 1 000

4988 11 000 8 1 00 4 0 21 00000 1 00000 2 1 00000 1 00 4 0 21 000000 1 0000 2 1 0000 8 0 1 00 4 0 21 00000 1 0000 2 1 00000 1 000 4 0 21 00000 1 0000 2 1 0

5596 21 0000 7 0 4 0 21 00000 1 00000 2 1 00000 1 00 4 00 3 000000 1 0000 2 1 00000 18 00 4 0 21 00000 1 0000 2 1 00000 1 000 4 0 21 00000 1 00000 1 00

5738 25 000 1 8 00 421 00000 1 0000 2 1 000000 1 000 4 21 00000 1 00000 2 1 00000 18 000 61 000000 1 0000 2 1 00000 1 0000 4 21 00000 1 0000 2 1 00

6147 43 00 1 00 C21 00000 1 00000 2 1 00000 1 000 4 21 00000 1 00000 2 1 00000 1 000 8 43 000000 1 0000 2 1 00000 1 0000 4 21 00000 1 0000 2 1 0000

6165 3F 00 1 00 C 0 3 00000 1 00000 2 1 00000 1 000 4 0 3 000000 1 0000 2 1 00000 1 000 8 421 00000 1 00000 1 000000 1 000 4 21 00000 1 00000 2 1 000

6673 17 0000 1 8 00 421 00000 1 00000 1 000000 1 000 4 0 3 00000 1 00000 2 1 00000 18 00 4 0 3 00000 1 00000 2 1 00000 1 000 4 0 3 000000 1 00000 1 0

6911 1B 0000 1 8 000 7 000000 1 00000 1 00000 1 00000 4 3 00000 1 00000 1 000000 18 0000 7 00000 1 00000 2 1 00000 1 00000 1 00000 1 000000 1 0

6947 03 00000 7 000 421 00000 1 00000 2 1 00000 1 000 4 21 00000 1 00000 2 1 00000 81 000 43 000000 1 0000 2 1 00000 1 0000 4 21 00000 1 0000 2 1

7163 33 000 1 00 C 0 3 00000 1 00000 2 1 00000 1 000 4 0 3 00000 1 00000 2 1 00000 1 000 8 43 000000 1 00000 1 00000 1 0000 4 21 00000 1 00000 1 000

7356 21 0000 1 0 8421 00000 1 00000 2 1 00000 1 000 4 0 3 000000 1 0000 2 1 00000 1 00 8 0 421 00000 1 0000 2 1 00000 1 0000 4 21 00000 1 00000 1 00

7529 17 0000 1 0 8 0 421 00000 1 0000 2 1 000000 1 000 4 21 00000 1 0000 2 0 1 00000 1 0 8 00 61 00000 1 00000 2 1 00000 1 0000 2 1 000000 1 0000 2 1 0

8381 0F 00000 1 0 84 0 3 00000 1 000000 1 00000 1 000 4 0 3 000000 1 00000 1 00000 1 00 8 0 4 0 3 00000 1 00000 1 000000 1 000 4 0 3 00000 1 00000 2 1

8421 1F 0000 1 00 8 0 7 00000 1 000000 1 00000 1 00000 1 000000 1 00000 1 00000 1 000 8 0 43 00000 1 00000 1 000000 1 0000 4 3 00000 1 00000 2 1 0

8583 5E 1 00000 1 8 0 4 0 3 000000 1 00000 1 00000 1 000 4 0 21 00000 1 00000 1 000000 18 00 4 0 3 00000 1 00000 2 1 00000 1 000 4 0 3 00000 1 00000 2

9312 19 0000 1 000 861 00000 1 00000 1 00000 1 00000 4 3 00000 1 00000 1 000000 1 000 8 0 7 00000 1 00000 2 1 00000 1 00000 1 000000 1 00000 1 0

9324 11 00000 1 00 843 00000 1 00000 2 1 00000 1 0000 4 3 000000 1 0000 2 1 00000 1 000 8 0 61 00000 1 00000 1 00000 1 00000 2 1 00000 1 00000 1 0

9711 60 1 00000 1 0 8 00 7 000000 1 00000 1 00000 1 00000 4 3 00000 1 00000 1 000000 1 0 8 000 7 00000 1 00000 2 1 00000 1 00000 1 00000 1 000000

I quickly began noticing patterns in the data. I realized that there was a "seam" of large positive numbers running down the left-hand edge, with values like +7, +8, and +12. And down the middle of the sequence list was a seam of large negative values, usually –8. No other numbers in the stride sequence went outside the range of ±5.

Going down the list, I noticed that the positive number seam slowly moved away from the edge, and that the negative number seam moved at the exact same rate, so that they were always halfway across the sequence from each other. Closer inspection suddenly brought forth the realization: The position of the positive number seam was equal to the first digit in the key. So, a key of 2358 would mean that the maximum-width stride would be between the second and third square. No exceptions.
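This is exactly the kind of hypothesized invariant that a short script can confirm. The Python sketch below checks it against the first twelve stride characters of six rows copied from the base-36 stride listing earlier in this section; the helper name seam_position is made up for the illustration.

```python
# First twelve stride characters of six rows from the base-36 listing.
SAMPLE = {
    "1117": "UHGGGGGFGGGG",
    "2172": "GSGJGGGGGFGG",
    "3147": "GGSIHGGGGGFG",
    "4312": "GGGOMHGGGGGF",
    "8583": "HGGGGGFOGKGJ",
    "9711": "FGGGGGFGOGGN",
}

def seam_position(stride_text):
    """1-based index of the largest stride in a base-36 stride string."""
    strides = [int(ch, 36) for ch in stride_text]
    return strides.index(max(strides)) + 1

for key, text in SAMPLE.items():
    print(key, seam_position(text), seam_position(text) == int(key[0]))
```

Each line prints True: the largest stride sits at the position named by the first digit of the key.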

Furthermore, I saw that the largest positive values, +11 and +12, only occurred with keys with a second digit of 1. Any other second digit, and the big positive value stayed in the range +7 to +10, with another +4 somewhere nearby. Eventually I realized that the +12 was actually a separate +8 event and a +4 event that had coincided. There was another +4 event near the halfway point, not far away from the –8 event.

It appeared that the placement of these exceptions to the 16-stride rule was partly based on the size of the grid and partly based on the individual digits of the key. By looking at a larger grid size, I was able to see the individual events more clearly:

1318 44 861 0000000 1 000000 2 1 0000000 1 000000 4 21 0000000 1 000000 2 1 0000000 1 0000 8 0 61 0000000 1 000000 2 1 0000000 1 000000 2 1 0000000 1 000000 2 1 0000000 1 0000

1561 2E 8 0 4 0 3 0000000 1 0000000 1 0000000 1 00000 4 0 3 0000000 1 0000000 1 0000000 1 000 8 00 43 0000000 1 0000000 1 0000000 1 000000 4 3 0000000 1 0000000 2 1 0000000 1 00

2156 38 0 C21 0000000 1 000000 2 1 0000000 1 00000 4 0 3 0000000 1 0000000 1 0000000 1 00000 8 421 0000000 1 000000 2 1 0000000 1 00000 4 21 0000000 1 000000 2 1 0000000 1 000

2172 38 0 C 0 3 0000000 1 0000000 1 0000000 1 00000 4 0 3 0000000 1 0000000 1 0000000 1 00000 8 421 0000000 1 000000 2 1 0000000 1 00000 4 0 3 0000000 1 0000000 1 0000000 1 000

2346 2C 0 8421 0000000 1 000000 2 1 0000000 1 00000 4 21 0000000 1 000000 2 1 0000000 1 0000 8 0 43 0000000 1 0000000 1 0000000 1 000000 4 21 0000000 1 000000 2 1 0000000 1 00

3432 20 00 8 0 61 0000000 1 000000 2 1 0000000 1 000000 4 3 0000000 1 0000000 1 0000000 1 0000 8 0 43 0000000 1 0000000 1 0000000 1 000000 4 3 0000000 1 0000000 1 0000000 1 00

3626 0C 00 8 00 61 0000000 1 000000 2 1 0000000 1 000000 2 1 0000000 1 000000 2 1 0000000 1 000 8 00 43 0000000 1 0000000 1 0000000 1 000000 4 21 0000000 1 000000 2 1 0000000 1

4296 00 000 84 0 21 0000000 1 000000 2 1 0000000 1 0000 4 0 21 0000000 1 000000 2 1 0000000 1 000 8 4 0 21 0000000 1 000000 2 1 0000000 1 0000 4 00 3 0000000 1 0000000 1 0000000 1

4362 0C 000 84 0 3 0000000 1 0000000 1 0000000 1 00000 4 0 3 0000000 1 0000000 1 0000000 1 0000 8 0 43 0000000 1 0000000 1 0000000 1 000000 4 21 0000000 1 000000 2 1 0000000 1

4655 73 1 00 8 00 43 0000000 1 0000000 2 1 0000000 1 00000 4 21 0000000 1 000000 2 1 0000000 1 00 8 00 421 0000000 1 000000 2 1 0000000 1 00000 4 0 3 0000000 1 0000000 1 0000000

5217 0E 0000 861 0000000 1 000000 2 1 0000000 1 000000 2 1 0000000 1 000000 2 1 0000000 1 00000 8 61 0000000 1 000000 2 1 0000000 1 000000 4 3 0000000 1 0000000 2 1 0000000 1

5519 73 1 000 8 0 61 0000000 1 000000 2 0 1 0000000 1 00000 4 21 0000000 1 000000 2 1 0000000 1 000 8 00 61 0000000 1 000000 2 1 0000000 1 000000 2 1 0000000 1 000000 2 1 0000000

5537 6F 0 1 00 8 0 421 0000000 1 000000 2 1 0000000 1 00000 4 21 0000000 1 000000 2 1 0000000 1 000 8 00 61 0000000 1 000000 2 1 0000000 1 000000 4 3 0000000 1 0000000 2 1 000000

5563 6B 0 1 00 8 0 4 0 3 0000000 1 0000000 1 0000000 1 00000 4 0 3 0000000 1 0000000 1 0000000 1 000 8 00 43 0000000 1 0000000 2 1 0000000 1 00000 4 21 0000000 1 000000 2 1 000000

5682 5D 00 1 0 8 00 4 0 3 0000000 1 0000000 1 0000000 1 00000 4 0 3 0000000 1 0000000 1 0000000 1 00 8 00 4 0 3 0000000 1 0000000 1 0000000 1 00000 4 0 21 0000000 1 000000 2 1 00000

5781 57 00 1 0 8 00 4 0 3 0000000 1 0000000 1 0000000 1 00000 4 0 3 0000000 1 0000000 2 1 0000000 1 0 8 000 4 0 3 0000000 1 0000000 1 0000000 1 00000 4 0 3 0000000 1 0000000 1 00000

5936 51 00 1 0 8 000 421 0000000 1 000000 2 1 0000000 1 00000 4 21 0000000 1 000000 2 1 0000000 1 0 8 0000 61 0000000 1 000000 2 1 0000000 1 000000 4 3 0000000 1 0000000 1 00000

5975 43 000 1 8 000 4 0 3 0000000 1 0000000 2 1 0000000 1 0000 4 0 21 0000000 1 000000 2 1 0000000 18 0000 421 0000000 1 000000 2 1 0000000 1 00000 4 0 3 0000000 1 0000000 1 0000

6264 71 1 0000 8421 0000000 1 000000 2 1 0000000 1 00000 4 21 0000000 1 000000 2 1 0000000 1 0000 8 4 0 3 0000000 1 0000000 1 0000000 1 00000 4 0 3 0000000 1 0000000 1 0000000

6645 57 00 1 00 8 00 43 0000000 1 0000000 1 0000000 1 000000 4 3 0000000 1 0000000 2 1 0000000 1 00 8 00 421 0000000 1 000000 2 1 0000000 1 00000 4 21 0000000 1 000000 2 1 00000

6794 3D 0000 1 8 00 4 0 21 0000000 1 000000 2 1 0000000 1 0000 4 00 3 0000000 1 0000000 1 0000000 1 0 8 000 4 0 3 0000000 1 0000000 1 0000000 1 00000 4 0 21 0000000 1 000000 2 1 000

7252 69 0 1 0000 843 0000000 1 0000000 1 0000000 1 000000 4 3 0000000 1 0000000 1 0000000 1 00000 8 421 0000000 1 000000 2 1 0000000 1 00000 4 0 3 0000000 1 0000000 1 000000

8376 41 000 1 000 84 0 21 0000000 1 000000 2 1 0000000 1 0000 4 0 21 0000000 1 000000 2 1 0000000 1 000 8 0 421 0000000 1 000000 2 1 0000000 1 00000 4 0 3 0000000 1 0000000 1 0000

8729 2F 00000 1 0 8 00 421 0000000 1 000000 2 1 0000000 1 00000 4 21 0000000 1 000000 2 1 0000000 1 00 8 000 61 0000000 1 000000 2 1 0000000 1 000000 2 1 0000000 1 000000 2 0 1 00

9421 47 000 1 0000 8 0 7 0000000 1 0000000 1 0000000 1 0000000 1 0000000 1 0000000 2 1 0000000 1 0000 8 0 43 0000000 1 0000000 1 0000000 1 000000 4 3 0000000 1 0000000 1 0000

9548 29 00000 1 00 8 0 421 0000000 1 000000 2 1 0000000 1 00000 4 21 0000000 1 000000 2 1 0000000 1 000 8 00 421 0000000 1 000000 2 1 0000000 1 00000 4 21 0000000 1 000000 2 1 00

9581 27 00000 1 00 8 0 4 0 3 0000000 1 0000000 1 0000000 1 00000 4 0 3 0000000 1 0000000 2 1 0000000 1 00 8 00 4 0 3 0000000 1 0000000 1 0000000 1 00000 4 0 3 0000000 1 0000000 1 00

9634 2D 00000 1 00 8 00 61 0000000 1 000000 2 1 0000000 1 000000 4 3 0000000 1 0000000 1 0000000 1 000 8 00 43 0000000 1 0000000 1 0000000 1 000000 4 21 0000000 1 000000 2 1 00

9782 15 000000 1 0 8 00 4 0 3 0000000 1 0000000 1 0000000 1 00000 4 0 21 0000000 1 000000 2 1 0000000 1 0 8 000 4 0 3 0000000 1 0000000 1 0000000 1 00000 4 0 3 0000000 1 0000000 1 0

9895 03 0000000 1 8 000 4 0 3 0000000 1 0000000 2 1 0000000 1 0000 4 0 21 0000000 1 000000 2 1 0000000 18 000 4 0 21 0000000 1 000000 2 1 0000000 1 0000 4 00 3 0000000 1 0000000 1

I now could see that there was a consistent pattern of one +8 event, two +4 events, four +2 events, and eight +1 events. These were balanced by an equal number of negative events, offset for maximum distance from the matching positive events, more or less. And the events could overlap each other, in which case they would just add together, like waves passing through each other.

I found myself imagining the underlying mechanism using physical imagery. Specifically, I imagined the four digits of the key as being like four timing wheels that would turn with each step of the reordering process, periodically firing to create the positive and negative changes in the stride. I therefore wound up referring to the events as "firings". The +8 and –8 wheels fire once, the ±4 wheels fire twice, etc.

I labelled the four separate digits of the key a, b, c, and d, and set out to create formulas that gave the location of every firing. The +8 firing occurred at position a. The –8 firing could be expressed as a + N / 2 (where N stood for the size of the grid). The +4 mechanism produced two firings, and it didn't take long to determine that they took place b / 2 positions after the ±8 firings. Although if b was odd, then the second firing would actually be at (b + 1) / 2. Or, put another way, the value was (b + i) / 2, where i was 0 or 1 depending on the firing. (Note that the slash specifically represents integer division, where any remainders are discarded.)
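The three formulas in that paragraph translate directly into code. The Python sketch below covers only the +8, –8, and +4 firings as just described, and assumes that the first +4 firing follows the +8 firing while the second follows the –8 firing. Positions are 1-based stride indices, and the name big_firings is invented for the example.

```python
def big_firings(key, n):
    """Predicted positions of the +8, -8 and +4 firings for an even grid size n."""
    a, b = int(key[0]), int(key[1])
    plus8 = a
    minus8 = a + n // 2          # integer division, remainders discarded
    plus4 = [
        plus8 + (b + 0) // 2,    # first +4 firing, after the +8 firing
        minus8 + (b + 1) // 2,   # second +4 firing, after the -8 firing
    ]
    return {"+8": plus8, "-8": minus8, "+4": plus4}

print(big_firings("1117", 100))
print(big_firings("2746", 100))
```

For key 1117 this puts +8 at 1, –8 at 51, and +4 at 1 and 52. That agrees with the 1117 row of the first listing, provided those grids are of size 100, which is what the row length suggests but is an inference on my part here.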

The two –4 firings proved to be a little more complicated, but not intractable. Encouraged by these results, I dove headfirst into the ±2 and ±1 firings. When I was done, I had this:

F_{+8} = a
F_{–8} = a + N / 2
F^{i}_{+4} = F_{±8} + (b + i) / 2
F^{i}_{–4} = F_{±8} + (b + (N / 2) % 2 + i) / 2
F^{i}_{+2} = F_{±4} + (c + (2 · (b % 2) + i · (2 · ((N / 2) % 2) + 1)) % 4) / 4
F^{i}_{–2} = F_{±4} + (c + (N / 2) % 4 + (2 · (b % 2) + i · (2 · ((N / 2) % 2) + 1)) % 4) / 4
F^{i}_{+1} = F_{±2} + (d + (2 · ((2 · (b % 2) + c) % 4) + i · (N % 8 + 1)) % 8) / 8
F^{i}_{–1} = F_{±2} + (d + (N / 2) % 8 + (2 · ((2 · (b % 2) + c) % 4) + i · (N % 8 + 1)) % 8) / 8

As you can see, it got a lot uglier pretty quickly. And I'm skipping over a bunch of time here. There were some failed attempts that only worked some of the time, and there were formulas that were much more complicated than these. But this was the first set of formulas that accurately predicted the reordering of every single grid of size 128 that I had collected so far. I then tried it on a larger grid size, size 140, and again it correctly predicted every single one. I also tried it on a smaller grid size, size 122 — and it got exactly one grid wrong, in one place.

I wanted to cry.

But this wasn't an anomaly. The more I looked at the smaller grids, the more divergences I found from my model. By focusing on the larger grids, I had apparently missed some further set of complications. Not that this model was simple, mind you. But apparently it was still too simple. After all, it did work consistently for larger grids. That couldn't be accidental. So the fact that it worked sometimes but not all the time seemed to indicate that the real answer was more complicated still.

In parallel with this, I had been trying to figure out how the initial position was selected. Since that was just a single number, it was a little easier to explore. I looked for scrambled grids where the keys differed from each other by only one number, and could see some obvious patterns. In fact, I soon realized that, once again, the odd-sized grids followed a relatively simple rule:

IP = 15 – 2 · (8a + 4b + 2c + d) % N

But even-sized grids followed a slightly more complicated rule, one that usually produced similar results, but with occasional disruptions. (Sound familiar?)

IP = –2 · (8a + 4b + 2c + d)
parity = 1
while IP < 0
    IP += N
    parity = 1 – parity
IP += parity (8< 0
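As a runnable illustration, here is a Python sketch of the wrap-around loop and the parity adjustment from that rule. It covers only those two steps and makes no attempt to model any further correction; the function name is invented for the example.

```python
def initial_position_even(key, n):
    """Wrap-and-parity part of the even-grid rule for a four-digit key."""
    a, b, c, d = (int(ch) for ch in key)
    ip = -2 * (8 * a + 4 * b + 2 * c + d)
    parity = 1
    # Wrap back into range, flipping the parity on every wrap.
    while ip < 0:
        ip += n
        parity = 1 - parity
    return ip + parity

for key in ("1117", "1544", "1936"):
    print(key, format(initial_position_even(key, 100), "02X"))
```

With N = 100 this prints 3A, 14, and 59, matching the initial positions shown for those keys in the first stride listing (again assuming that listing is of size-100 grids). It does not reproduce every row, though: for key 9711 it yields 5E where the listing shows 60, so on its own this part of the rule is incomplete.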

Oh, and also, when I tried to apply the rule to smaller grids, I found more exceptions where these rules stopped working.

Interlude: Epicycles

At this point, I felt like I was inventing epicycles. Are you familiar with epicycles? They were circles-upon-circles that earlier astronomers theorized to explain the motion of the sun, moon, and planets, and specifically to explain why they periodically sped up and slowed down, even to the point of going backwards at times. I mean, if they were rotating at constant speed in perfect circles, how could there be all these variations? Epicycles were the theory that the celestial bodies were really attached to a secondary circle that rotated around the point of attachment to the main circle. A slight complication to the original picture. It did a great job of explaining the orbits of the sun and moon, for example. But when you apply the same approach to the planets, this simple model isn't quite enough. More epicycles are needed, but also things like deferents and equants. In the end the (known) solar system requires dozens of epicycles to cover all the observed motions.

Despite the complexity, this system did a pretty good job of predicting the motion of the heavenly bodies. But even so, there were hints that it wasn't the right answer. For example, all the planets have completely independent motions, yet why do the orbits of the two fastest planets, Mercury and Venus, never take them very far away from the position of the sun? Under the epicycle model, this could only be considered a coincidence.

Like the Ptolemaic astronomer, I found myself unable to explain certain features of my stride equations. You'll notice that certain terms, like 2 · (b % 2), crop up in several places, while most terms only appear once. That suggests that there's something in there that hasn't been factored out correctly. But there's something else that's much more troubling. Would you guess, just from looking at these equations and knowing how they're used to guide the reordering step, that they would always assign exactly one letter to every position in the output, never accidentally trying to put two letters into the same position? I mean, it just so happens to be true, but it's far from obvious. When a sensible person writes something like a reordering algorithm, a necessary property like that will usually be plainly visible in the code.

Things like this convinced me that what I had created was an effective system of epicycles. But then the next question was: how do I go from epicycles to ellipses? If this system of stride formulas isn't the right answer, then how do I get from here to a better answer?

Of course, during this time I wasn't just looking at the reordering process. There was also the actual encryption. I worked on both in parallel, so that when I got frustrated with one I would switch over to the other. So, let's set the reordering problem aside for now, rewind a bit, and cover the other half of the scrambling process.

The Encryption Process: Various Visualizations

Remembering how color had helped me to see patterns in the reordering process, I started by writing scripts to colorize the encrypted grids in various ways, looking for an entry point. One of the first I tried was to color each letter based on how distant it was from the previous letter in the grid.

1117 K QK KKK EKQKEKQK KK E Q K KKKK EKQ E KQK K EK KK QK KKKKKK E Q Q KEKQ E KQ E K KKK EK KK Q E KQK KKK EKQK KK QK KK E Q K KKKK E E Q K KKKK EK K E Q

1544 S K O O K T L N QL S O MNO TOLQ P K R Q N NNOP KO O R T H N R KS O ON K T L M Q P S R M NO KO PQO K R Q K NROS K O O L TL N QL S O HN O T S L Q P K R Q NQN OP KO QR N HO R

1936 CLW P N Z QZ L N KO W N T HT WTQ KR W V R K B Z N OP Z Z NZN Q CO W N Q HQ T L Q KR TVT MTZT O M Z W L R N B C LW P N Z QZ L N KO NN T H C W TQ QR W VT KT Z M O W Z R N BN

2172 M SK HM N S F H R L N XG H M N WG M M I S F G N M R S L H H N X F I R M N KG M M S W H M L I X F H N N R G L M H S S G I M M SK H N N S F LRL N XG H M M WG M M I S F H N M R X L H HG XM I

2746 R X UX O Q A S QQ W ZWR Q V KU V SV K Y W P URW X PXP R YS W Q R Z UQ O V A U Q S W MW W Q U M W V NVP Y Y NW R Q X UX O Q A S P Q W Z Y R Q VRU V SOK Y W M U VWV P YP N Y R W

2882 IU UU G I U A AA UO O A A O U U AUAU I G OU U OUOUA OA A OAO O U U G OU UUUU O O GAUO O A UA A IAUOU UUUUU IU A OA UO A AA O O UAU GUI G U UU O OO UAUAUO

3147 O Y T L Q P I S PKP O W J O Q GV V P S M R P M T H Q Y N LNP S SR K O O T J Q Q I V P PP M W P O T G Q V N SNR S M L H O Y T L Q PQS PKN O W J R Q GV T PS M OP G T V Q S NRN M S HR

3921 HPJPW GP KN E K H P O V J O QR DQ G H R UP Q IP CP Q GX J O E H HJ O W K P Q N D K G P J V P O I J C Q Y HX U O YH P KP W GP KN E J H P O V J O QR DQ G H R UPK I P C V Q GX Y O

4312 OJ K HK LLN J H N K JKLKMM I IG L KJ O F L N J L HK LK N G H O K KKKKLM J I N LJJL FM N ILGK KK O G LOJ K H H L LN J H N K MKLKMM I I F L KJ O F L N J L HKOK L G

4581 T P RS Z P P VPV VTS L V Z I W O RW MW K O A YA P K SY P L R W VTTR L Z V PWP R V M S O V A IA O K W UW L O W UT PK SZ PA V PV YT S L W Z I W R R W M PKO A VA P K V Y P L U W

4759 C E W B ZA T X D YB TZF U ZWA B U E X Y DZC V Y EY Z WA G X ZY C S W F Z Z SA D VB X Z D VCW W BY EW W G Z U V C EDBZAYX D Y W T ZFZ Z WA B U Z XV DWC B Y EY WW Z G V Z

4816 W L Z SM Y TX R O A Q Q UR C WOV T KT V YT KA T O V SX Y OX P M W Q Z U M CT Q RT AT Q YR N W T VV N X V L T P A W L Z S O Y TX R O A Q O U R C WOV T MT V Y W K V T NV VX TO A P

4915 W U O M CT TR M GA P TQ HX XW PH F T ST O C B S U L I X TY R I G W TO Q C XTW M K A T TT K C X P P L FXP YO M B W U O M CT P R M GA P TQ HX XW PH F T TT O CP S F L P X OY B I

4988 YC IA Z C I HI Z Y H U E DC H UI EZ G D E H ZHZ DI AG C D HE Y Y H I E Y C I YI E YGYE D Z HZ I E VG D U H EH G C IA Z C I HI Z Y H D EDC Y Y I E YG D E H VI Z V I DG H CHE

5596 W K V W DVD ZUZW H D V AVZD U WY DZVZ YZA G V W DV G Z V DW H V V DVD C U WW D C V A ZZA U V ZDZDZVZD K V WY V D ZVZW H G V AV W D U W D D ZVZ Y U A Z VZDZ GZ V

5738 OX BXA M B C T ZTZ A V C R T B YB V T C A WTX DX WXY OZ C V Z OZ B TA RB B TBT S A AC T S D Y XV YCZ XVX MX BXA M B C W Z TZ Z V C R O B YB A T C A S T YD V W C Y XZXV

6147 VY QW N P TW P P T RTR V N O Z V PW M U X Q Q M A Y KWK P A W U P V R QR N N TZ P P T MT X V Q O A V K WK U A Q W MV Y QW N P A W P P K R TR U N O ZQ P W MV X O Q VAW K UK Q A M U

6165 N N T WT N R TRT X N O V X TQ O SXS M O W W O YO N W WY N R TST N N T VTTR OR X X M O W X O QO SWS Y O RWS Y N N W W T N O T RT Y N O V S T Q O TXS M RW W O SOSW O Y W R YS

6673 WW A P X ZYX SW U ZW PW XY T Z VQ Z ZZ WT B W WV P B Z VA S W W Z A P X UYT S V U ZWAWTY W AVQ A Z V W S A W W Z P X Z W X SWB Z W P SX Y T A VQ Z YZ WT UWWV Z B W V A S

6911 O C ZW M JMR R U T E C W J O M U Z R W E OE J R R O C R R J JH R R U O M Z W M OM U R J TE CE J R MJ Z R W J JHJ W RO C ZW M JE R R U T E C W RO M U Z R W E OE J R R O C R J JJH R R

6947 DY YW D F Z W X F B ZBZX D E ZV X E CA X Y G C AYA UA FAW U F DB Y Z D DBZXV B CB X V G EC V AEA CAYW C D Y YW D F AW X F A ZBZ U D E Z YX E C V X E G V AEA CAYAC W

7163 SQ V V I R Q W R GR SS P AQ K T T Z O M X R U N J Y Q T V L R O W X G S S V PIQ Q T R ZR M S R A N K Y T T OL X O U V J S Q V V I R Y W R G L S S P XQK T V Z O M Q R U N R YQ TX L U O J X

7356 A U T W XXX R Q Z W VUT YT Z X M A V XV PV SVYUX W TX W R T Z A V TT XTXUQ A W X UP Y VZ Y M X VTVUVTV A U T WSX X RX Z W VW TYT A X M A X XV P Z SM Y V XVTVWVT

7529 I L V VX E U Z Y C C P T CS Y BX NX AW X S Z W E B L E V ZE Q VR C IPVC XC U X YX C W T N SWB B N E AE XQ Z R E IL Z V X E U Z Y CZ P T CR Y BXCXTWS S B W N B A E X ZZ Q ER

8381 IW S Z W GW USU G WN S U G P P B SB P N BNB Z P W L Z B G W U ZU IW S S W GW P S S G P N B UB P P B LB Z N WN Z Z I W B Z W GP U SU B W N S ZG P P S S B P W B NB G P W LU B G W Z Z

8421 O IQ M C I I WIW W F M L W T H S M Q S E J PVP AP IP M G I L W JW O FQ L CTISI QWEM P WPHP MPS A J L V J A O I W MC I I WIW T F M L W T H S P QS E J PVPWPIPW G I L AJ

8583 D V T DTD B V A RB D O A Y BY Q A W W A GA V WW GV T D YD V V D RD B TATB B QA Y B A OA YWY GAT W Y G V VWW DV AD TD G V A R Y D O A D B Y Q TW W A BA V W BG V T G Y V VW D

9312 SVQ FO RORV N M W P NP LP QPO E XQ HXM J R V W F R R NRN N SW Q N O LO Q V O M X PHP MP RP WE R Q N XN J S VQ F N RORV N M W Q N P LP QPOM X Q HXM J R P W F R X N JN

9324 R Y N Z Z K R OL Y T S PS T MS W N S FT JU M Y UQ Y LZ V K X O S YR S NS Z M RW L S TT PU T Y S Q N L F V J Y M S U R Y N Z Y K R OL Y T S WS T MS W N S ZT JUL Y UQ F L J V MX U S

9711 E Y Y MYM M E U S MS W M Y U M C A SKSMSYS U E YSM AM Q E U S E M YMY U M C U Y M S WS YSM M A SK A M Q Y M U E Y Y MYM S E U S MS W M S U M C A SKS ES YS U E YSY A M QU U

Red for closest, violet for most distant. Since the plaintext letter is always A, any variance or pattern from letter to letter should be entirely due to the encryption process. Note, however, that this display is showing the grids as they appeared in the scrambled output file. In contrast, here is the same colorization filter applied to the encrypted grids after undoing the reordering step:

1117 E E K KK EK KK EK KK EK KK EK KK EK KK EKQ Q KQ QQ KQ QQ KQK K EK KK EK KK EK KKKK QK KK QK KK QK KK QK KK QK KK QK KK QK KK QK KK QK K EKE EE KE EE K

1544 K MK S OT RS OT RS OT RS OT RS OT RS OT RS HQ OP LQ OP LQ OP LQ K L HM KL HM KKKL QN NOQ N NOQ N NOQ N NOQ N NOQ N NOQ K O PR O OPR O OPR OK LN K KLN K

1936 K W N TL C B HZ KB HZ KB HZ KT ZR CT ZR CT ZR CNW O ZQ WO ZQ WO ZQ W LWN TL WN N K Q M N TZ V W T Z V W T Z V W T R N O L R N O L R N O LQ PQ N T P Q N T P Q N TM N K Q M N

2172 S W L H FF KG FF KG FF KG GG LH GG LH GG L NM HMI HH MI HH MI HH R N MM RN MM RNS M MM HL MM HL MM HL MN IM NN IM NN IM X X N R SS NR SS NR SSS W XX SW XX

2746 K R PQ N Y YZ WA YZ WA YZ WA YX UY WX UY WX UY P V SW UV SW UV SW UV N R PQ NR PQ KR MQ O X SW QX SW QX SW Q VQU OV QU OV QU O R R V PW RV PW RV PW M Q KR MQ

2882 IU A O U U A O OO U I O O U I O O U IU U A O U U A O U U A O UAG U A A G U A A G U AUA O U U A O I U O O U GU U O A U U O A U U O A UAU G A A U G A A U U OUO A U U O A U U O A U U IU O O

3147 O LP N S K T Q S K T Q S K T Q S HQ N P H Q N P H Q NQ GP M O G P M O G P M O G R O Q I R O Q I R O Q L T YVSW Y VSW Y VSW V SPT V SPT V SPT R MJN P MJN P MJN PO LP R OLP R

3921 K RJQK Y J V O C N UO C N UO C N U P D O VP D O VP D O V K E P WQ E P WQ E P WQY J QK Y J QKRJ JK R G GH OG GH OG GH OG HI PH HI PH HI PHP Q XP PQ XP PQ XP P K RJ J

4312 K G HF M LKK N LKK N LKK N LLL O MLL O MLL O M J K N LKK N LKK N LKJ M KJJ M KJJ M K H F N J KI NJ KI NJ KI O K LJ OK LJ OK LJ O H IG LH IG LH IG LH HF KG HF

4581 U Z Y VRO A P P M T I P M T I P M T I P T A P W T A P W T A K R P W L S P W L S P W L S P V K R O V K R O Y V UWV L K P O L K P O L K P O S R W V S R W V S R W Y V V A Z W V A Z W V A Z W U Z Y V

4759 S WTYUZV D ZEA GZ EA GZ EA GZ B X DW BX DW BX D U Z Z FY DZ FY DZ FY D VBU ZV BU Z T Y S W W EY CZE Y CZE Y CZ B V ZWB V ZWB V ZW YW AXC W AXC W AXC W W TY

4816 N O K MOQ LO A C XA AC XA AC XAW Y TW WY TW WY TW WV QT TV QT TV QT TV L O OQ LO OQ K MNO V XYZ V XYZ V XYZ V TUV R TUV R TUV RM S T P RST P RST P R N O K M

4915 K M H I S T P P F G C C F G C C F G C C FBX X A B X X A B X X A T TT W X T T W X T T W X P P S T P P S T H IKM H U WY T UWY T UWY TP RT O PRT O PRT O POQ L MOQ L MOQ L MOM H I

4988 UD YC V Z ZY E I IH EI IH EI IH EI IH EI IH EI IH EZ Z DAE ED AE ED AE ED V Z ZY VZ Y C U D Y GY H CG Y H CG Y H CG Y H CG Y H CG Y H CG U D D H Z I DH Z I DH Z IY C

5596 G C KD G Y D V VU ZV VU ZV VU ZV VU ZV VU ZV VU ZVZ Y D W WV AW WV AW WV A ZZY DZ ZY D D GC D W ZV DW ZV DW ZV DW ZV DW ZV DW ZV D D G ZHA DZ HA DZ HA DZ K D

5738 O S MT O W TX Z CZD BC ZD BC ZD BA XB ZA XB ZA XB Z W Y C AB YC AB YC AB Y XVW TX VW T T OS M C XB VC XB VC XB VCV Z TA VZ TA VZ TA O X RY TX RY TX RY T S MT

6147 W P U VW P K W P Q N W P Q N W P Q N WM N K T M N K T M N K A O P M V O P M V O P M V OU R A T U R A T U R A W QY ZA T Y ZA T Y ZA T VWX Q V WX Q V WX Q UQ R K P QR K P QR K P V W P U V

6165 Y TY XY TORN MN QN MN QN MN Q ONO RO NO RO NO W T N O RO NO RO NO ROS T WT ST WT ST W Y S W VW RW VW RW VW RW WX SX WX SX WX S YS T OT ST OT ST OT S Y TY X

6673 Z AA X ZAB UAS W PV SW PV SW PV SW PV SW PV SW PV S B U W TX QW TX QW TX Q A XB UA XB U Z AA W YZZ W YZZ W YZZ W YZZ W YZZ W YZZ WZ A WT VWW T VWW T VWWX

6911 O M JE O M R J W R C U H C C U H C C U H C C R E Z Z R E Z Z R E Z Z J W R R J W R R J W R R J W R R J W R R J O M JE O U RM W U RM W U RM W ROJ T R OJ T R OJ T R J E O M JE O M JE O M JE

6947 W X U VWXZABCA E FG D EFG D EFG D ECD A BCD A BCD A A E F C DEF C DEF C DEC Z ABC Z A W X U VWYYZAB Y ZAB Y ZAB V WXY V WXY V WXY UY ZA X YZA X YZA X V

7163 A VZ X AVZ L R G N I OGN I OGN I O J Q L RJQ L RJQ L Y Q R M SKR M SKR M SK X S YQX S YQX S Y V V QTOS Q TOS Q TOS Q W RV T WRV T WRV X UPT R UPT R UPT R U V Z X

7356 WUU X WUUVX M R P T M R P T M R P T M V T X Q V T X Q V T XSX W A T Y W A T Y W A T YVZ S X V Z S X V ZU U A V TT W VTT W VTT WZ XX A ZXX A ZXX AW VV Y XVV Y XVV Y XVU X

7529 SV N Z S V N C E R Z C E Y Z C E Y Z C E YS V X R S V X R S V X RZ AC W X A C W X A C W X C E Y Z C E Y Z C E V N Z B E W I B E W I B E W I B X P B U X P B U X P B U T L X Q T L X Q T L X Q T N Z

8381 G BG ZG BG Z B P W GP IP GP IP GP IP GW PW NW PW NW P B S W PW NW PW NW PW NW U B SB UB SB U G Z Z U S LS NS LS NS LS N Z SZ UZ SZ UZ SZ B G U B WB UB WB UB WB Z

8421 GW A T G W A T W PI E O H I E O H I E O H I F P I J F P I J F P I JI S L M I S L M I S L M I W P Q M W P Q M W P A T W L P I V L P I V L P I V M Q J W M Q J W M Q J W WW P C S W P C S W P C S AT

8583 G DGBGDGB D Y A T V QV OV QV OV QV OV V A TA VA TA VAW DV A TA VA TA VA TA V DWD YD WD Y G BG Y B R WTWRWTWRWTW W BYBWBYBWBY G B D ADYDADYDADY

9312 M O EN MO EN M Q J L R MJ L R MJ L R MJ RX SP R X SP R X SP P W RO Q W RO Q W RO Q VQN P V QN P V QN NM R HQ PR HQ PR HQ P X NW VX NW VX NW VX F O NP FO NP FO NP F N

9324 W X P Y WX P Y WY L O UN NO UN NO UN NM SL LM SL LM SL L Z Z S ST ZS ST ZS ST F Y YZ FY YZ FY YYWU M V TU M V TU M V TU KT RS K T RS K T RS JS QR J S QR J S QR J Y

9711 MS E M M S E M M YM SY C Q W Y C Q W Y C Q W Y E S Y A E S Y A E S Y AY M S U Y M S U Y M S U Y M S U Y M S U Y M MM S ES S Y K S S Y K S S Y K U U A M U U A M U U A M UMS E M M S E M M S E M

As you can see, this output has more suggestion of patterns being present, in the form of short columns of colors as patterns align across grids. I had originally guessed, just as a hunch, that the encryption step came first, followed by the reordering step. This output seemed to confirm that my hunch was correct.

I also couldn't help noticing that a few of the entries had long sections of red, and that they happened to be entries with repeated digits in the key. So I tried an experiment: instead of sorting my grids by the key's numerical order, I sorted them so that keys whose digits were close together came earlier than keys with widely varying digits.

4312 K G HF M LKK N LKK N LKK N LLL O MLL O MLL O M J K N LKK N LKK N LKJ M KJJ M KJJ M K H F N J KI NJ KI NJ KI O K LJ OK LJ OK LJ O H IG LH IG LH IG LH HF KG HF

1544 K MK S OT RS OT RS OT RS OT RS OT RS OT RS HQ OP LQ OP LQ OP LQ K L HM KL HM KKKL QN NOQ N NOQ N NOQ N NOQ N NOQ N NOQ K O PR O OPR O OPR OK LN K KLN K

6673 Z AA X ZAB UAS W PV SW PV SW PV SW PV SW PV SW PV S B U W TX QW TX QW TX Q A XB UA XB U Z AA W YZZ W YZZ W YZZ W YZZ W YZZ W YZZ WZ A WT VWW T VWW T VWWX

5596 G C KD G Y D V VU ZV VU ZV VU ZV VU ZV VU ZV VU ZVZ Y D W WV AW WV AW WV A ZZY DZ ZY D D GC D W ZV DW ZV DW ZV DW ZV DW ZV DW ZV D D G ZHA DZ HA DZ HA DZ K D

7356 WUU X WUUVX M R P T M R P T M R P T M V T X Q V T X Q V T XSX W A T Y W A T Y W A T YVZ S X V Z S X V ZU U A V TT W VTT W VTT WZ XX A ZXX A ZXX AW VV Y XVV Y XVV Y XVU X

4988 UD YC V Z ZY E I IH EI IH EI IH EI IH EI IH EI IH EZ Z DAE ED AE ED AE ED V Z ZY VZ Y C U D Y GY H CG Y H CG Y H CG Y H CG Y H CG Y H CG U D D H Z I DH Z I DH Z IY C

6165 Y TY XY TORN MN QN MN QN MN Q ONO RO NO RO NO W T N O RO NO RO NO ROS T WT ST WT ST W Y S W VW RW VW RW VW RW WX SX WX SX WX S YS T OT ST OT ST OT S Y TY X

6947 W X U VWXZABCA E FG D EFG D EFG D ECD A BCD A BCD A A E F C DEF C DEF C DEC Z ABC Z A W X U VWYYZAB Y ZAB Y ZAB V WXY V WXY V WXY UY ZA X YZA X YZA X V

2746 K R PQ N Y YZ WA YZ WA YZ WA YX UY WX UY WX UY P V SW UV SW UV SW UV N R PQ NR PQ KR MQ O X SW QX SW QX SW Q VQU OV QU OV QU O R R V PW RV PW RV PW M Q KR MQ

4759 S WTYUZV D ZEA GZ EA GZ EA GZ B X DW BX DW BX D U Z Z FY DZ FY DZ FY D VBU ZV BU Z T Y S W W EY CZE Y CZE Y CZ B V ZWB V ZWB V ZW YW AXC W AXC W AXC W W TY

5738 O S MT O W TX Z CZD BC ZD BC ZD BA XB ZA XB ZA XB Z W Y C AB YC AB YC AB Y XVW TX VW T T OS M C XB VC XB VC XB VCV Z TA VZ TA VZ TA O X RY TX RY TX RY T S MT

1117 E E K KK EK KK EK KK EK KK EK KK EK KK EKQ Q KQ QQ KQ QQ KQK K EK KK EK KK EK KKKK QK KK QK KK QK KK QK KK QK KK QK KK QK KK QK KK QK K EKE EE KE EE K

2172 S W L H FF KG FF KG FF KG GG LH GG LH GG L NM HMI HH MI HH MI HH R N MM RN MM RNS M MM HL MM HL MM HL MN IM NN IM NN IM X X N R SS NR SS NR SSS W XX SW XX

8583 G DGBGDGB D Y A T V QV OV QV OV QV OV V A TA VA TA VAW DV A TA VA TA VA TA V DWD YD WD Y G BG Y B R WTWRWTWRWTW W BYBWBYBWBY G B D ADYDADYDADY

3147 O LP N S K T Q S K T Q S K T Q S HQ N P H Q N P H Q NQ GP M O G P M O G P M O G R O Q I R O Q I R O Q L T YVSW Y VSW Y VSW V SPT V SPT V SPT R MJN P MJN P MJN PO LP R OLP R

6147 W P U VW P K W P Q N W P Q N W P Q N WM N K T M N K T M N K A O P M V O P M V O P M V OU R A T U R A T U R A W QY ZA T Y ZA T Y ZA T VWX Q V WX Q V WX Q UQ R K P QR K P QR K P V W P U V

7163 A VZ X AVZ L R G N I OGN I OGN I O J Q L RJQ L RJQ L Y Q R M SKR M SKR M SK X S YQX S YQX S Y V V QTOS Q TOS Q TOS Q W RV T WRV T WRV X UPT R UPT R UPT R U V Z X

4581 U Z Y VRO A P P M T I P M T I P M T I P T A P W T A P W T A K R P W L S P W L S P W L S P V K R O V K R O Y V UWV L K P O L K P O L K P O S R W V S R W V S R W Y V V A Z W V A Z W V A Z W U Z Y V

9324 W X P Y WX P Y WY L O UN NO UN NO UN NM SL LM SL LM SL L Z Z S ST ZS ST ZS ST F Y YZ FY YZ FY YYWU M V TU M V TU M V TU KT RS K T RS K T RS JS QR J S QR J S QR J Y

4816 N O K MOQ LO A C XA AC XA AC XAW Y TW WY TW WY TW WV QT TV QT TV QT TV L O OQ LO OQ K MNO V XYZ V XYZ V XYZ V TUV R TUV R TUV RM S T P RST P RST P R N O K M

7529 SV N Z S V N C E R Z C E Y Z C E Y Z C E YS V X R S V X R S V X RZ AC W X A C W X A C W X C E Y Z C E Y Z C E V N Z B E W I B E W I B E W I B X P B U X P B U X P B U T L X Q T L X Q T L X Q T N Z

8421 GW A T G W A T W PI E O H I E O H I E O H I F P I J F P I J F P I JI S L M I S L M I S L M I W P Q M W P Q M W P A T W L P I V L P I V L P I V M Q J W M Q J W M Q J W WW P C S W P C S W P C S AT

2882 IU A O U U A O OO U I O O U I O O U IU U A O U U A O U U A O UAG U A A G U A A G U AUA O U U A O I U O O U GU U O A U U O A U U O A UAU G A A U G A A U U OUO A U U O A U U O A U U IU O O

3921 K RJQK Y J V O C N UO C N UO C N U P D O VP D O VP D O V K E P WQ E P WQ E P WQY J QK Y J QKRJ JK R G GH OG GH OG GH OG HI PH HI PH HI PHP Q XP PQ XP PQ XP P K RJ J

4915 K M H I S T P P F G C C F G C C F G C C FBX X A B X X A B X X A T TT W X T T W X T T W X P P S T P P S T H IKM H U WY T UWY T UWY TP RT O PRT O PRT O POQ L MOQ L MOQ L MOM H I

9312 M O EN MO EN M Q J L R MJ L R MJ L R MJ RX SP R X SP R X SP P W RO Q W RO Q W RO Q VQN P V QN P V QN NM R HQ PR HQ PR HQ P X NW VX NW VX NW VX F O NP FO NP FO NP F N

8381 G BG ZG BG Z B P W GP IP GP IP GP IP GW PW NW PW NW P B S W PW NW PW NW PW NW U B SB UB SB U G Z Z U S LS NS LS NS LS N Z SZ UZ SZ UZ SZ B G U B WB UB WB UB WB Z

1936 K W N TL C B HZ KB HZ KB HZ KT ZR CT ZR CT ZR CNW O ZQ WO ZQ WO ZQ W LWN TL WN N K Q M N TZ V W T Z V W T Z V W T R N O L R N O L R N O LQ PQ N T P Q N T P Q N TM N K Q M N

6911 O M JE O M R J W R C U H C C U H C C U H C C R E Z Z R E Z Z R E Z Z J W R R J W R R J W R R J W R R J W R R J O M JE O U RM W U RM W U RM W ROJ T R OJ T R OJ T R J E O M JE O M JE O M JE

9711 MS E M M S E M M YM SY C Q W Y C Q W Y C Q W Y E S Y A E S Y A E S Y AY M S U Y M S U Y M S U Y M S U Y M S U Y M MM S ES S Y K S S Y K S S Y K U U A M U U A M U U A M UMS E M M S E M M S E M

This output confirmed that general pattern, with red colors at the top slowly shifting across the spectrum into more blue and violet colors at the bottom.

Perhaps the most striking example was a single grid I had managed to find in which the key was 8888 — a single repeated digit. The scrambled grid for this key was all Gs. That suggested a number of possible underlying mechanisms for the encryption process, which unfortunately proved not to hold up. For example, you might naively expect that a key of three repeated digits might tend to have three out of every four letters be identical, but this was not the case. (Although such grids did tend to have a lot of repeated letters, demonstrating that this idea likely wasn't entirely wrong, just oversimplified.)

Still, you can see that there does seem to be an erratically recurring rhythm of fours in the grids. With the key being four digits long, there was definitely an attraction to the idea of the encryption process being based on a cycle of four steps, rotating through the digits of the key in some way or another. I found myself imagining a series of rotors, sort of like parts of the Enigma machine, for those of you who've read about that.

So the next colorizing display I tried out specifically indicated when a sequence of four letters was repeated.

1117 EEKKK EKKK EKKK EKKK EKKK EKKK EKQQKQ QQKQ QQKQ KKEK KKEK KKEK KKKKQ KKKQ KKKQ KKKQ KKKQ KKKQ KKKQKKKQKKKQ KKEKEE EKEE EK

1544 KMKSOTR SOTR SOTR SOTR SOTR SOTR SHQOPL QOPL QOPL QKLHM KLHM KKKLQNNO QNNO QNNO QNNO QNNO QNNO QKOPRO OPRO OPRO KLNK KLNK

1936 KWNTLCBHZK BHZK BHZK TZRC TZRC TZRC NWOZQ WOZQ WOZQ WLWNTLWNNKQMNTZVW TZVW TZVW TRNOL RNOL RNOL QPQNT PQNT PQNT MNKQMN

2172 SWLHFFKG FFKG FFKG GGLH GGLH GGLNMHMIH HMIH HMIH HRNMM RNMM RNSMMMHL MMHL MMHL MNIMN NIMN NIMXXNRSS NRSS NRSS SWXX SWXX

2746 KRPQNYYZWA YZWA YZWA YXUYW XUYW XUYPVSWU VSWU VSWU VNRPQ NRPQ KRMQOXSWQ XSWQ XSWQ VQUO VQUO VQUO RRVPW RVPW RVPW MQKRMQ

2882 IUAOUUAOOOUI OOUI OOUI UUAO UUAO UUAO UAGUA AGUA AGUA UAOUUAOIUOOUGUUOA UUOA UUOA UAUGA AUGA AUUOUOAU UOAU UOAU UIUOO

3147 OLPNSKTQ SKTQ SKTQ SHQNP HQNP HQNQGPMO GPMO GPMO GROQI ROQI ROQLTYVSW YVSW YVSW VSPT VSPT VSPT RMJNP MJNP MJNP OLPR OLPR

3921 KRJQKYJVOCNU OCNU OCNU PDOV PDOV PDOV KEPWQ EPWQ EPWQ YJQK YJQK RJJKRGGHO GGHO GGHO GHIPH HIPH HIPH PQXP PQXP PQXP PKRJJ

4312 KGHFMLKKN LKKN LKKN LLLOM LLOM LLOM JKNLK KNLK KNLK JMKJ JMKJ JMKHFNJKI NJKI NJKI OKLJ OKLJ OKLJ OHIGL HIGL HIGL HHFKGHF

4581 UZYVROAPPMTI PMTI PMTI PTAPW TAPW TAKRPWLS PWLS PWLS PVKRO VKRO YVUWVLKPO LKPO LKPO SRWV SRWV SRWYVVAZW VAZW VAZW UZYV

4759 SWTYUZVDZEAG ZEAG ZEAG ZBXDW BXDW BXDUZZFYD ZFYD ZFYD VBUZ VBUZ TYSWWEYCZ EYCZ EYCZ BVZW BVZW BVZW YWAXC WAXC WAXC WWTY

4816 NOKMOQLOACXA ACXA ACXA WYTW WYTW WYTW WVQTT VQTT VQTT VLOOQ LOOQ KMNOVXYZ VXYZ VXYZ VTUVR TUVR TUVR MSTPR STPR STPR NOKM

4915 KMHISTPPFGCC FGCC FGCC FBXXA BXXA BXXA TTTWX TTWX TTWX PPST PPST HIKMHUWYT UWYT UWYT PRTO PRTO PRTO POQLM OQLM OQLM OMHI

4988 UDYCVZZYEIIH EIIH EIIH EIIH EIIH EIIH EZZDAEE DAEE DAEE DVZZYVZYCUDYGYHC GYHC GYHC GYHC GYHC GYHC GUDDHZI DHZI DHZI YC

5596 GCKDGYDVVUZ VVUZ VVUZ VVUZ VVUZ VVUZ VZYDWWVA WWVA WWVA ZZYD ZZYD DGCDWZV DWZV DWZV DWZV DWZV DWZV DDGZHAD ZHAD ZHAD ZKD

5738 OSMTOWTXZCZDB CZDB CZDB AXBZ AXBZ AXBZ WYCAB YCAB YCAB YXVWT XVWT TOSMCXBV CXBV CXBV CVZTA VZTA VZTA OXRYT XRYT XRYT SMT

6147 WPUVWPKWPQN WPQN WPQN WMNKT MNKT MNKAOPMV OPMV OPMV OURAT URAT URAWQYZAT YZAT YZAT VWXQ VWXQ VWXQ UQRKP QRKP QRKP VWPUV

6165 YTYXYTORNMNQ NMNQ NMNQ ONOR ONOR ONOWTNORO NORO NORO STWT STWT STWYSWVWR WVWR WVWR WWXSX WXSX WXSYSTOT STOT STOT SYTYX

6673 ZAAXZABUASWPV SWPV SWPV SWPV SWPV SWPV SBUWTXQ WTXQ WTXQ AXBU AXBU ZAAWYZZ WYZZ WYZZ WYZZ WYZZ WYZZ WZAWTVW WTVW WTVW WX

6911 OMJEOMRJWRCUHC CUHC CUHC CREZZ REZZ REZZ JWRR JWRR JWRR JWRR JWRR JOMJEOURMW URMW URMW ROJT ROJT ROJT RJEOM JEOM JEOM JE

6947 WXUVWXZABCAEFGD EFGD EFGD ECDAB CDAB CDAAEFCD EFCD EFCD ECZABCZAWXUVWYYZAB YZAB YZAB VWXY VWXY VWXY UYZAX YZAX YZAX V

7163 AVZXAVZLRGNIO GNIO GNIO JQLR JQLR JQLYQRMSK RMSK RMSK XSYQ XSYQ XSYVVQTOS QTOS QTOS QWRVT WRVT WRVXUPTR UPTR UPTR UVZX

7356 WUUXWUUVXMRPT MRPT MRPT MVTXQ VTXQ VTXSXWATY WATY WATY VZSX VZSX VZUUAVTTW VTTW VTTW ZXXA ZXXA ZXXA WVVYX VVYX VVYX VUX

7529 SVNZSVNCERZCEY ZCEY ZCEY SVXR SVXR SVXR ZACWX ACWX ACWX CEYZ CEYZ CEVNZBEWI BEWI BEWI BXPBU XPBU XPBU TLXQ TLXQ TLXQ TNZ

8381 GBGZ GBGZ BPWGPIP GPIP GPIP GWPWN WPWN WPBSWPWN WPWN WPWN WUBSB UBSB UGZZUSLSN SLSN SLSN ZSZU ZSZU ZSZBGUBWB UBWB UBWB Z

8421 GWAT GWAT WPIEOH IEOH IEOH IFPIJ FPIJ FPIJ ISLM ISLM ISLM IWPQM WPQM WPATWLPIV LPIV LPIV MQJW MQJW MQJW WWPCS WPCS WPCS AT

8583 GDGB GDGB DYATVQVO VQVO VQVO VVATA VATA VAWDVATA VATA VATA VDWDY DWDY GBGYBRWTW RWTW RWTW WBYB WBYB WBYGBDADY DADY DADY

9312 MOEN MOEN MQJLRM JLRM JLRM JRXSP RXSP RXSP PWROQ WROQ WROQ VQNP VQNP VQNNMRHQP RHQP RHQP XNWV XNWV XNWV XFONP FONP FONP FN

9324 WXPY WXPY WYLOUNN OUNN OUNN MSLL MSLL MSLL ZZSST ZSST ZSST FYYZ FYYZ FYYYWUMVT UMVT UMVT UKTRS KTRS KTRS JSQR JSQR JSQR JY

9711 MSEM MSEM MYMSYCQW YCQW YCQW YESYA ESYA ESYA YMSU YMSU YMSU YMSU YMSU YMMMSESSYK SSYK SSYK UUAM UUAM UUAM UMSEM MSEM MSEM

In this display, the bland cyan is the background color. Green shows when a sequence of four letters is repeated once. Magenta indicates a "three-peat", i.e. another repetition, followed by yellow, then red, and so on. Once again, we see that there are vague patterns occurring across grids, and roughly moving towards the right as the digits in the key get larger.
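The repeat counting behind a display like this can be sketched in a few lines of C. This is only a minimal illustration, not the code that produced the output above: the function name quartet_repeat_depth is made up for this example, and it assumes the grid is a plain NUL-terminated string of letters. It reports how many immediately preceding four-letter blocks are identical to the block at a given position, so 0 would map to the background color, 1 to green, 2 to magenta, and so on.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count how many 4-letter blocks directly before
 * `pos` are identical to the 4-letter block starting at `pos`.
 * 0 means the block is not a repeat of its predecessor. */
size_t quartet_repeat_depth(const char *grid, size_t pos)
{
    size_t len = strlen(grid);
    size_t depth = 0;

    assert(pos + 4 <= len);     /* the block itself must fit in the grid */

    /* Step backwards four letters at a time while the blocks match. */
    while (pos >= 4 * (depth + 1) &&
           memcmp(grid + pos, grid + pos - 4 * (depth + 1), 4) == 0)
        depth++;

    return depth;
}
```

For instance, the 1117 grid begins EEKKKEKKKEKKK, so the block at offset 5 has depth 1 and the block at offset 9 has depth 2.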

However, what eventually caught my eye in this display was the stumpy little column of green down in the lower left. It surprised me because, unlike the other patterns, it was exactly vertical, straight up and down. And it was so close to the left edge, showing that these grids had a repeating pattern of four right from the start.

Upon studying this anomaly more closely, I realized that the column began right when the leftmost digit of the key increased to 8. Looking at the first eight letters of grids with a key starting with 7 revealed that they repeated three of the first four letters, but then the last letter invariably diverged. And likewise, keys starting with 6 suggested that they had the repeating-fours pattern as well, but interrupted after the sixth letter in the grid. Even more exciting, I noticed that when the key started with 4, the first four letters of the grid were repeated at the end of the grid.
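One way to check this kind of observation mechanically is to measure how far a grid follows a period-four pattern from its left edge. The sketch below is an illustration under my own assumptions rather than the tool used at the time: period4_prefix_len is a hypothetical name, and the grid is assumed to be a NUL-terminated string of letters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the longest prefix of `grid` in which
 * every letter from the fifth onward equals the letter four places
 * before it. Grids of four letters or fewer count as fully periodic. */
size_t period4_prefix_len(const char *grid)
{
    size_t len, i;

    assert(grid != NULL);
    len = strlen(grid);

    for (i = 4; i < len; i++)
        if (grid[i] != grid[i - 4])
            break;              /* first letter that breaks the pattern */

    return i < len ? i : len;
}
```

Applied to the opening letters of the sample grids shown earlier, this gives 8 for key 8381 (GBGZGBGZBPWG), 7 for key 7163 (AVZXAVZLRGNI) and 6 for key 6147 (WPUVWPKWPQN).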

Based on these observations, I theorized that this "quartet" encryption pattern actually ran all the way through the grid, from beginning to end. Mostly it was obscured with another layer of encryption, but it was visible along the leftmost edge, as long as the first digit of the key was greater than or equal to 4. I further theorized that if I isolated the underlying quartet pattern, I could subtract it out of the encrypted grid, and hopefully what remained would be simplified as a result. So instead of thinking of rotor wheels, I began to imagine the encryption process as a set of layers, each one applied in turn atop the other.

If I could completely associate the pattern of the underlying quartet to the four-digit key (and possibly the grid size), that would be a real milestone. In order to do that, I needed more data.

Interlude: Epicycles, Revisited

A while ago I drew a comparison between my reordering model and the Ptolemaic system of planetary orbits, and I asked rhetorically, how does one go from epicycles to ellipses? I mentioned some of the facts that hinted at the incompleteness of the older model (Mercury and Venus, and the fixed stars), but really those were hints that the geocentric model was wrong. Those hints led to the heliocentric model, but heliocentrism alone doesn't actually get rid of epicycles. You still need them to match the orbits of the planets, because you're still using circles for everything. In fact, when Nicolaus Copernicus advanced his heliocentric system, he required more epicycles, not fewer, because he used them instead of equants, which geocentrism made necessary. Copernicus advocated for heliocentrism in part because it explained the aforementioned coincidences, but mainly because it allowed him to get rid of the equant points, which were hard to calculate with and, in Copernicus's opinion, aesthetically ugly. They sullied the perfection of the circles.

But if you're looking to ditch epicycles — if you want to make the leap to elliptical orbits, as Johannes Kepler did — those facts aren't enough. So: what did Kepler have that Copernicus didn't, that led him to leave circles behind completely, and consider a completely different model in search of simplicity? Probably more than anything, he had Tycho Brahe's data. Tycho Brahe compiled some of the most precise and thorough astronomical observations made before the invention of the telescope. It was by studying his numbers, and the story that they told, that Kepler could see the flaws in even the best epicyclical model, and was forced to abandon that system entirely, and to finally consider the possibility of orbits in the shape of imperfect, lopsided ellipses.

Collecting Data: Time For A New Approach

Like Kepler, I needed more data. In particular, I was now occasionally wanting to see the grid for a specific key. For example, I had noticed that most of the grids where my reordering algorithm failed were ones that had 1 or 2 for the first digit and 8 or 9 for the remaining digits. But I only had a few examples of these. I needed more of them to see if this pattern held, or if it was just a coincidence. In order to do that, I either had to be extremely patient, or I had to figure out how to control which four-digit key was selected. Knowing that it was somehow related to the current time, I sat down and started doing some experiments. And in very little time I worked out the exact process by which the key was selected:

Take the current Unix time as a decimal number. Read the digits from right to left. Skip over the first (lowest) digit, and any zeros. Stop once four digits have been obtained.

For example:

time(0) = 1 2 1 8 8 6 8 0 9 5 ➡ 9 8 6 8
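The rule above is short enough to write out as code. The following is a sketch of the procedure as described, not Across Lite's actual implementation. The function name key_from_time is my own, as is the choice to return -1 when the timestamp has fewer than four usable digits.

```c
#include <assert.h>

/* Derive the four-digit key from a Unix timestamp by reading its
 * decimal digits from right to left, skipping the lowest digit and
 * any zeros. Returns the key as a number (e.g. 9868), or -1 if four
 * nonzero digits could not be found (an assumption of this sketch). */
int key_from_time(long long unix_time)
{
    int key = 0;
    int digits = 0;

    assert(unix_time >= 0);

    unix_time /= 10;                    /* skip the first (lowest) digit */
    while (digits < 4 && unix_time > 0) {
        int digit = (int)(unix_time % 10);
        unix_time /= 10;
        if (digit == 0)
            continue;                   /* zeros are skipped */
        key = key * 10 + digit;         /* first digit read is the leftmost */
        digits++;
    }
    return digits == 4 ? key : -1;
}
```

Calling key_from_time(1218868095) returns 9868, matching the example above.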

So now I knew how to pick a time that would make Across Lite use a specific key. I could have set the computer clock just before scrambling a grid, but of course there's a much easier way. You may already be familiar with Unix's $LD_PRELOAD environment variable. By providing a path to a shared-object library in this variable, you can force that library to be loaded ahead of any other dynamic libraries when a program is run, even before system libraries like libc. It's sometimes used to replace malloc() et al. with debugging versions, but a very common use, at least at one time, was to change the system time for a single program, typically demo programs that were set to expire at the end of some trial period. I did some poking around online and quickly found sample code that showed how to create a library that provided a replacement gettimeofday() function. With the help of $LD_PRELOAD, I set it up so that this library was feeding the current time to my wine process, and thus by extension to the Across Lite Windows program.

(Interesting aside: My first version of this library simply always returned the same time every time it was invoked, but to my surprise this caused the Across Lite program to malfunction, producing a negative value for the scrambling key. So I then tried just having my library set the initial time, but then count time normally afterwards. However, wine could sometimes be slow to initialize, and I found that starting too many wine processes simultaneously could cause one of them to crash, presumably due to some race-condition bug. So finally I modified my library so that it advanced the clock one microsecond every time gettimeofday() was invoked, and that worked perfectly. I also discovered that just starting a single wine process running Across Lite and then immediately shutting it down caused gettimeofday() to be invoked over 3000 times.)
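The final behavior described in that aside, a clock that starts at a chosen time and ticks forward one microsecond per call, can be sketched as follows. This is not the library I actually used. The function is deliberately named fake_gettimeofday so the sketch compiles on its own; a real shim would export it under the name gettimeofday and be built as a shared object (for example with gcc -shared -fPIC) before being listed in $LD_PRELOAD. The FAKE_TIME_START environment variable is a made-up name for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/time.h>

static struct timeval fake_now;     /* the pretend current time */
static int fake_now_ready = 0;      /* has the start time been read yet? */

/* Stand-in for gettimeofday(): reports a fake clock that advances by
 * one microsecond on every call. The start time in seconds is read
 * once from the hypothetical FAKE_TIME_START variable (default 0). */
int fake_gettimeofday(struct timeval *tv, void *tz)
{
    (void)tz;                       /* the timezone argument is ignored */
    assert(tv != NULL);

    if (!fake_now_ready) {
        const char *start = getenv("FAKE_TIME_START");
        fake_now.tv_sec = start ? (time_t)strtoll(start, NULL, 10) : 0;
        fake_now.tv_usec = 0;
        fake_now_ready = 1;
    }

    *tv = fake_now;

    /* Advance the clock so that no two calls see the same time. */
    if (++fake_now.tv_usec >= 1000000) {
        fake_now.tv_usec = 0;
        fake_now.tv_sec++;
    }
    return 0;
}
```

Advancing on every call avoids the frozen-clock problem, and it does not depend on how long wine takes to start.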

Since I was already making changes to how I collected my data, I took this opportunity to do a major overhaul on the whole process. As proud as I was of my Rube Goldberg solution of cobbled-together scripts and obscure utilities, it really was a complicated and brittle solution. I have some experience with Windows code, and I knew that a Windows program would be able to handle all of the interactions with the Across Lite application, and it would be able to do so in a much more direct manner, for example not having to use OCR in order to read the text in a message box.

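    HWND wnd = NULL;
    int counter = 0;

    /* Walk every top-level window of Across Lite's window class. */
    for (;;) {
        wnd = FindWindowEx(NULL, wnd, AL_WINDOW_CLASS, NULL);
        if (!wnd)
            break;

        /* The buffer holds only the title prefix, so GetWindowText()
           truncates longer titles and strcmp() compares just the prefix. */
        char buf[AL_WINDOW_TITLE_PREFIX_LEN + 1];
        GetWindowText(wnd, buf, sizeof buf);
        if (strcmp(buf, AL_WINDOW_TITLE_PREFIX))
            continue;

        /* Select the "scramble" menu command in this instance. */
        if (!PostMessage(wnd, WM_COMMAND, AL_IDM_SCRAMBLE, 0)) {
            warn("PostMessage to window %p: error %lu",
                 (void *)wnd, (unsigned long)GetLastError());
            ++failures;
            break;
        }
        ++counter;
    }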

Here's a brief sample from the C program I wrote, which selects the "scramble" menu command on all running instances of Across Lite. Not only was this faster and more reliable, it was also much simpler.

Armed with my Windows program and my time-setting script, I set up a new data collection process. Now, instead of just fishing for grids at random, my scripts specifically gathered grids for every key in order (skipping over keys that were already in my collection). And this turned out to be extremely important. Because once I had long stretches of adjacent keys, I finally was able to see the patterns in explicit detail.

The Encryption Process: Layers Upon Layers

Here's an example of what some of the new data looked like:

7581 EBFXEBFUWXWMOQPMOQPMOQPMVXWTVXWTVXXUUWVSUWVSUWVSUYXUWYXUWBFXYVSKROSKROSKROZRYVZRYVZRYBFVCZDVCZDVCZDX

7582 EBF Y EBF VX X X OQ R ROQ R ROQ R RO W X XUW X XUW X YVV W WTV W WTV W WTV Y YVX Y YVX BF YZW U N TQU N TQU N TQ A T ZWA T ZWA T Z BF W CZD W CZD W CZD Y

7583 EBF Z EBF WY X Y QS S TQS S TQS S TQ X X YVX X YVX X ZWW W XUW W XUW W XUW Y ZWY Y ZWY BF ZAX W Q VSW Q VSW Q VS B V AXB V AXB V A BF X CZD X CZD X CZD Z

7584 EBF A EBF XZ X Z SU T VSU T VSU T VS Y X ZWY X ZWY X AXX W YVX W YVX W YVX Y AXZ Y AXZ BF ABY Y T XUY T XUY T XU C X BYC X BYC X B BF Y CZD Y CZD Y CZD A

7585 EBF B EBF YA X A UW U XUW U XUW U XU Z X AXZ X AXZ X BYY W ZWY W ZWY W ZWY Y BYA Y BYA BF BCZ A W ZWA W ZWA W ZW D Z CZD Z CZD Z C BF Z CZD Z CZD Z CZD B

7586 EBF C EBF ZB X B WY V ZWY V ZWY V ZW A X BYA X BYA X CZZ W AXZ W AXZ W AXZ Y CZB Y CZB BF CDA C Z BYC Z BYC Z BY E B DAE B DAE B D BF A CZD A CZD A CZD C

7587 EBF D EBF AC X C YA W BYA W BYA W BY B X CZB X CZB X DAA W BYA W BYA W BYA Y DAC Y DAC BF DEB E C DAE C DAE C DA F D EBF D EBF D E BF B CZD B CZD B CZD D

7588 EBF E EBF BD X D AC X DAC X DAC X DA C X DAC X DAC X EBB W CZB W CZB W CZB Y EBD Y EBD BF EFC G F FCG F FCG F FC G F FCG F FCG F F BF C CZD C CZD C CZD E

7589 EBF F EBF CE X E CE Y FCE Y FCE Y FC D X EBD X EBD X FCC W DAC W DAC W DAC Y FCE Y FCE BF FGD I I HEI I HEI I HE H H GDH H GDH H G BF D CZD D CZD D CZD F

7591 I E KA I E KWY W WQSUUQSUUQSUUQU W WSU W WSU W AWYSSOQSSOQSSOQ A AWY A AWY E KAYUYOWSYOWSYOWSAQYUAQYUAQY E KSAW C SAW C SAW C S

7592 IEK B IEK XZ W XRT U VRT U VRT U VRV W XTV W XTV W BXZT UQS T UQS T UQS A BXZ A BXZ EK BZVZ Q XTZ Q XTZ Q XTB S ZVB S ZVB S Z EK U BXD U BXD U BXD U

7593 IEK C IEK YA W YSU U WSU U WSU U WSW W YUW W YUW W CYAU WSU U WSU U WSU A CYA A CYA EK CAWA S YUA S YUA S YUC U AWC U AWC U A EK W CYE W CYE W CYE W

7594 IEK D IEK ZB W ZTV U XTV U XTV U XTX W ZVX W ZVX W DZBV YUW V YUW V YUW A DZB A DZB EK DBXB U ZVB U ZVB U ZVD W BXD W BXD W B EK Y DZF Y DZF Y DZF Y

7595 IEK E IEK AC W AUW U YUW U YUW U YUY W AWY W AWY W EACW AWY W AWY W AWY A EAC A EAC EK ECYC W AWC W AWC W AWE Y CYE Y CYE Y C EK A EAG A EAG A EAG A

7596 IEK F IEK BD W BVX U ZVX U ZVX U ZVZ W BXZ W BXZ W FBDX CYA X CYA X CYA A FBD A FBD EK FDZD Y BXD Y BXD Y BXF A DZF A DZF A D EK C FBH C FBH C FBH C

7597 IEK G IEK CE W CWY U AWY U AWY U AWA W CYA W CYA W GCEY EAC Y EAC Y EAC A GCE A GCE EK GEAE A CYE A CYE A CYG C EAG C EAG C E EK E GCI E GCI E GCI E

7598 IEK H IEK DF W DXZ U BXZ U BXZ U BXB W DZB W DZB W HDFZ GCE Z GCE Z GCE A HDF A HDF EK HFBF C DZF C DZF C DZH E FBH E FBH E F EK G HDJ G HDJ G HDJ G

7599 IEK I IEK EG W EYA U CYA U CYA U CYC W EAC W EAC W IEGA IEG A IEG A IEG A IEG A IEG EK IGCG E EAG E EAG E EAI G GCI G GCI G G EK I IEK I IEK I IEK I

7611 QV K QQV K J K OJOP T JOP T JOP T KPQUKPQUKPQU K J K OEJ K OEJ K OEJ K OEJ K OEJ K V K QUZOUU Z OUU Z OUU A PVVAPVVAPVVV K QQV K QQV K QQV K Q

7612 QVK R QVK KL O LQR U LQR U LQR U MRS V MRS V MRS V MLM P GLM P GLM P GL L O FKL O FKL VK R UZO V UZO V UZO V UAP W VAP W VAP W VV L S RWL S RWL S RW K R

7613 QVK S QVK LM O NST V NST V NST V OTU W OTU W OTU W ONO Q INO Q INO Q IN M O GLM O GLM VK S UZO W UZO W UZO W UAP X VAP X VAP X VV M U SXM U SXM U SX K S

7614 QVK T QVK MN O PUV W PUV W PUV W QVW X QVW X QVW X QPQ R KPQ R KPQ R KP N O HMN O HMN VK T UZO X UZO X UZO X UAP Y VAP Y VAP Y VV N W TYN W TYN W TY K T

7615 QVK U QVK NO O RWX X RWX X RWX X SXY Y SXY Y SXY Y SRS S MRS S MRS S MR O O INO O INO VK U UZO Y UZO Y UZO Y UAP Z VAP Z VAP Z VV O Y UZO Y UZO Y UZ K U

7616 QVK V QVK OP O TYZ Y TYZ Y TYZ Y UZA Z UZA Z UZA Z UTU T OTU T OTU T OT P O JOP O JOP VK V UZO Z UZO Z UZO Z UAP A VAP A VAP A VV P A VAP A VAP A VA K V

7617 QVK W QVK PQ O VAB Z VAB Z VAB Z WBC A WBC A WBC A WVW U QVW U QVW U QV Q O KPQ O KPQ VK W UZO A UZO A UZO A UAP B VAP B VAP B VV Q C WBQ C WBQ C WB K W

7618 QVK X QVK QR O XCD A XCD A XCD A YDE B YDE B YDE B YXY V SXY V SXY V SX R O LQR O LQR VK X UZO B UZO B UZO B UAP C VAP C VAP C VV R E XCR E XCR E XC K X

7619 QVK Y QVK RS O ZEF B ZEF B ZEF B AFG C AFG C AFG C AZA W UZA W UZA W UZ S O MRS O MRS VK Y UZO C UZO C UZO C UAP D VAP D VAP D VV S G YDS G YDS G YD K Y

7621 XBSWXBS QR VFJKOFJKOFJKOFKLPGKLPGKLPGPQULPQULPQULP R V M QR V M QR BSWRULPQULPQULPQVMQRVMQRVMQRB R VWA R VWA R VWASW

7622 XBS X XBS RS V IMN Q IMN Q IMN Q I MN Q IMN Q IMN Q I QR U MQR U MQR U MQS V NRS V NRS BS X R VM R RVM R RVM R R VM R RVM R RVM R RBR W WAR W WAR W WAS X

7623 XBS Y XBS ST V LPQ S LPQ S LPQ S L OP R KOP R KOP R K RS U NRS U NRS U NRT V OST V OST BS Y R WN T SWN T SWN T S VM S RVM S RVM S RBR X WAR X WAR X WAS Y

7624 XBS Z XBS TU V OST U OST U OST U O QR S MQR S MQR S M ST U OST U OST U OSU V PTU V PTU BS Z R XO V TXO V TXO V T VM T RVM T RVM T RBR Y WAR Y WAR Y WAS Z

7625 XBS A XBS UV V RVW W RVW W RVW W R ST T OST T OST T O TU U PTU U PTU U PTV V QUV V QUV BS A R YP X UYP X UYP X U VM U RVM U RVM U RBR Z WAR Z WAR Z WAS A

7626 XBS B XBS VW V UYZ Y UYZ Y UYZ Y U UV U QUV U QUV U Q UV U QUV U QUV U QUW V RVW V RVW BS B R ZQ Z VZQ Z VZQ Z V VM V RVM V RVM V RBR A WAR A WAR A WAS B

7627 XBS C XBS WX V XBC A XBC A XBC A X WX V SWX V SWX V S VW U RVW U RVW U RVX V SWX V SWX BS C R AR B WAR B WAR B W VM W RVM W RVM W RBR B WAR B WAR B WAS C

7628 XBS D XBS XY V AEF C AEF C AEF C A YZ W UYZ W UYZ W U WX U SWX U SWX U SWY V TXY V TXY BS D R BS D XBS D XBS D X VM X RVM X RVM X RBR C WAR C WAR C WAS D

7629 XBS E XBS YZ V DHI E DHI E DHI E D AB X WAB X WAB X W XY U TXY U TXY U TXZ V UYZ V UYZ BS E R CT F YCT F YCT F Y VM Y RVM Y RVM Y RBR D WAR D WAR D WAS E

Since I now had every single key value in sequence, I introduced a different coloring process. Here, the color of each letter indicates its relationship with the letter in the same position in the previous grid. Cyan indicates a letter that's the same as the one above it. Green marks a letter that's exactly one after the one above it. Yellow if it jumps by two, magenta if it jumps by three. Dark blue if it appears to be unrelated. The grids making those dark blue dividing lines correspond, as you can see, to the points where the rightmost digit rolls back from 9 to 1, and the second-to-last digit increments.
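The classification behind this coloring is a one-line calculation per letter. Here is a hedged sketch of it. The names step_color and classify_step are invented for this example, the letters are assumed to be uppercase ASCII, and treating a step from Z to A as "one after" (wrapping around the alphabet) is my assumption about how the display behaved.

```c
#include <assert.h>

/* One value per display color described in the text. */
enum step_color {
    STEP_SAME,        /* cyan: same letter as the one above */
    STEP_PLUS_ONE,    /* green: exactly one after the letter above */
    STEP_PLUS_TWO,    /* yellow: two after */
    STEP_PLUS_THREE,  /* magenta: three after */
    STEP_UNRELATED    /* dark blue: anything else */
};

/* Classify a letter by its alphabetic distance from the letter in the
 * same position of the previous grid, wrapping from Z back to A. */
enum step_color classify_step(char above, char current)
{
    int step;

    assert(above >= 'A' && above <= 'Z');
    assert(current >= 'A' && current <= 'Z');

    step = (current - above + 26) % 26;
    switch (step) {
    case 0:  return STEP_SAME;
    case 1:  return STEP_PLUS_ONE;
    case 2:  return STEP_PLUS_TWO;
    case 3:  return STEP_PLUS_THREE;
    default: return STEP_UNRELATED;
    }
}
```

For example, the fourth letter goes from Y in the 7582 grid to Z in the 7583 grid, which this function classifies as STEP_PLUS_ONE.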

Look at this. When only the rightmost digit is changing, its effect is plain to see. And the effect remains consistent as long as the other digits are consistent. (The effect is not the same regardless of the other digits, though. It's not quite that simple.)

Anyway, the original motivation for collecting such complete data was trying to study the "underlying quartet" pattern. With this extra data, all it took was a single all-nighter, and I managed to work out the formulas that completely described the underlying quartet for all grid sizes.

if N % 4 == 0 or 1:
    z = a · (2 - b % 2 - c % 2) + c · (0 + b % 2 + c % 2)
else if N % 4 == 2 or 3:
    z = a · (0 + b % 2 + c % 2) + c · (2 - b % 2 - c % 2)

then, if N % 4 == 0 or 3:
    q0 = z + a · (2 - a % 2) + c · (0 + a % 2)
    q1 = z + a · (0 + a % 2) + c · (1 - a % 2) + b
    q2 = z + a · (1 - a % 2) + c · (1 + a % 2)
    q3 = z + a · (0 + a % 2) + c · (1 - a % 2) + d
else if N % 4 == 1 or 2:
    q0 = z + a · (1 + a % 2) + c · (1 - a % 2)
    q1 = z + a · (1 - a % 2) + c · (0 + a % 2) + b
    q2 = z + a · (0 + a % 2) + c · (2 - a % 2)
    q3 = z + a · (1 - a % 2) + c · (0 + a % 2) + d


Plugging in the grid size for N and the four digits of the key for a, b, c, and d, this spits out four numbers that consistently matched the underlying quartet for every grid I could examine.
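For anyone who wants to play with the formulas, here is a direct transcription into Python. Treat it as a sketch: the function names are mine, the grid size of 16 in the example is a stand-in chosen only so that N % 4 == 0, and the letter mapping (reduce mod 26, with A as 0) is an assumption layered on top of the formulas rather than part of them.

```python
def underlying_quartet(n: int, a: int, b: int, c: int, d: int) -> tuple[int, int, int, int]:
    """Evaluate the quartet formulas for grid size n and key digits a, b, c, d."""
    pa, pb, pc = a % 2, b % 2, c % 2  # parities of the first three digits

    # Step 1: the shared offset z depends on N % 4 being in {0, 1} or {2, 3}.
    if n % 4 in (0, 1):
        z = a * (2 - pb - pc) + c * (0 + pb + pc)
    else:
        z = a * (0 + pb + pc) + c * (2 - pb - pc)

    # Step 2: the four values depend on N % 4 being in {0, 3} or {1, 2}.
    if n % 4 in (0, 3):
        q0 = z + a * (2 - pa) + c * (0 + pa)
        q1 = z + a * (0 + pa) + c * (1 - pa) + b
        q2 = z + a * (1 - pa) + c * (1 + pa)
        q3 = z + a * (0 + pa) + c * (1 - pa) + d
    else:
        q0 = z + a * (1 + pa) + c * (1 - pa)
        q1 = z + a * (1 - pa) + c * (0 + pa) + b
        q2 = z + a * (0 + pa) + c * (2 - pa)
        q3 = z + a * (1 - pa) + c * (0 + pa) + d
    return q0, q1, q2, q3


def as_letters(quartet: tuple[int, int, int, int]) -> str:
    """Assumption: each value is reduced mod 26 and mapped with A = 0."""
    return "".join(chr(ord("A") + q % 26) for q in quartet)


# Hypothetical grid size 16 (so N % 4 == 0), key 8891.
print(as_letters(underlying_quartet(16, 8, 8, 9, 1)))  # HIIB
```

Under those assumptions, key 8891 gives HIIB, key 7627 gives XBSC, and key 8911 gives SMLE, which are the four-letter groups that recur in the grids listed for those keys.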

8891 HIIBHIIBIBJUZSALZSALZSALZSALZSALZSALZSALZSALIBBMATBMATBMATBMATJUIBJUIBJUHIIBGHHAGHHAGHHAGHHAGHHAGHHAGHHAGHHAHIITZAATZAATZAATZAAB

8892 HII C HII C I C J W A U B O A U B O A U B O A U B O A U B O A U B O A U B O A U B O I CC P B V C P B V C P B V C P B V J W I C J W I C J W HII C GHH B GHH B GHH B GHH B GHH B GHH B GHH B GHH B HII V ABB V ABB V ABB V ABBC

8893 HII D HII D I D J Y B W C R B W C R B W C R B W C R B W C R B W C R B W C R B W C R I DD S C X D S C X D S C X D S C X J Y I D J Y I D J Y HII D GHH C GHH C GHH C GHH C GHH C GHH C GHH C GHH C HII X BCC X BCC X BCC X BCCD

8894 HII E HII E I E J A C Y D U C Y D U C Y D U C Y D U C Y D U C Y D U C Y D U C Y D U I EE V D Z E V D Z E V D Z E V D Z J A I E J A I E J A HII E GHH D GHH D GHH D GHH D GHH D GHH D GHH D GHH D HII Z CDD Z CDD Z CDD Z CDDE

8895 HII F HII F I F J C D A E X D A E X D A E X D A E X D A E X D A E X D A E X D A E X I FF Y E B F Y E B F Y E B F Y E B J C I F J C I F J C HII F GHH E GHH E GHH E GHH E GHH E GHH E GHH E GHH E HII B DEE B DEE B DEE B DEEF

8896 HII G HII G I G J E E C F A E C F A E C F A E C F A E C F A E C F A E C F A E C F A I GG B F D G B F D G B F D G B F D J E I G J E I G J E HII G GHH F GHH F GHH F GHH F GHH F GHH F GHH F GHH F HII D EFF D EFF D EFF D EFFG

8897 HII H HII H I H J G F E G D F E G D F E G D F E G D F E G D F E G D F E G D F E G D I HH E G F H E G F H E G F H E G F J G I H J G I H J G HII H GHH G GHH G GHH G GHH G GHH G GHH G GHH G GHH G HII F FGG F FGG F FGG F FGGH

8898 HII I HII I I I J I G G H G G G H G G G H G G G H G G G H G G G H G G G H G G G H G I II H H H I H H H I H H H I H H H J I I I J I I I J I HII I GHH H GHH H GHH H GHH H GHH H GHH H GHH H GHH H HII H GHH H GHH H GHH H GHHI

8899 HII J HII J I J J K H I I J H I I J H I I J H I I J H I I J H I I J H I I J H I I J I JJ K I J J K I J J K I J J K I J J K I J J K I J J K HII J GHH I GHH I GHH I GHH I GHH I GHH I GHH I GHH I HII J HII J HII J HII J HIIJ

8911 S M L ES M L EATT L J CCU J CCU J CCU J CCU I BBT I BBT I BBT I BBTATT L ATT L ATT L ATT L ATT L ATT L ATT L S M L ESUT M AUT M AUT M AUT M ATS L ZTS L ZTS L ZTS L Z M L ES M L ES M L ES M L ES M L E

8912 SML F SML F A U T N J D C W J D C W J D C W J D C W I C B V I C B V I C B V I C B V A V U O B V U O B V U O B V U O BU T N A U T N A U T N SML F S VU O BVU O BVU O BVU O BUT N AUT N AUT N AUT N ANM G TNM G TNM G TNM G T ML F

8913 SML G SML G A V T P J E C Y J E C Y J E C Y J E C Y I D B X I D B X I D B X I D B X A X V R C X V R C X V R C X V R CV T P A V T P A V T P SML G S WV Q CWV Q CWV Q CWV Q CVU P BVU P BVU P BVU P BON I UON I UON I UON I U ML G

8914 SML H SML H A W T R J F C A J F C A J F C A J F C A I E B Z I E B Z I E B Z I E B Z A Z W U D Z W U D Z W U D Z W U DW T R A W T R A W T R SML H S XW S DXW S DXW S DXW S DWV R CWV R CWV R CWV R CPO K VPO K VPO K VPO K V ML H

8915 SML I SML I A X T T J G C C J G C C J G C C J G C C I F B B I F B B I F B B I F B B A B X X E B X X E B X X E B X X EX T T A X T T A X T T SML I S YX U EYX U EYX U EYX U EXW T DXW T DXW T DXW T DQP M WQP M WQP M WQP M W ML I

8916 SML J SML J A Y T V J 