Let’s see, I’m not completely sure how these work (I sound like an 87-year-old), but it all began when a professor mentioned that our assignments would have questions that might ask us to crack the Vigenère cipher.

For some reason, later that night, I was bored and thinking of things to do, and decided I wanted to teach a Neural Network how to decode the Vigenère cipher.

Wait a minute: all I’ve worked with are CNNs to classify images, I’m very bad with math, and I’m a little fuzzy on how the CNNs even worked, so should I actually attempt something like this?

Obviously, without a doubt, yes.

(This is the point in the story where I mention how I regret every decision I make because of how poorly thought out they are)

THE BEGINNING

So I began my work from the first step: ideating.

My initial approach was that I just needed it to brute force all 26 shifts and then make the network (which I would train with some magical dataset of valid and invalid words) mark words as valid or invalid, and the sentence with the most valid words is probably the plaintext.
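For what it’s worth, the brute-force half of that plan is easy enough to sketch. Here’s a minimal Python version, where the placeholder VALID_WORDS set stands in for the “magical dataset” (and the network that was supposed to learn it); it assumes uppercase text:

```python
VALID_WORDS = {"HELLO", "WORLD"}  # placeholder for the "magical dataset"

def caesar_shift(text, key):
    """Shift every letter forward by `key` positions, wrapping around Z."""
    return "".join(
        chr((ord(c) - ord("A") + key) % 26 + ord("A")) if c.isalpha() else c
        for c in text
    )

def best_guess(ciphertext):
    """Try all 26 shifts and keep the one with the most 'valid' words."""
    candidates = (caesar_shift(ciphertext, k) for k in range(26))
    return max(candidates, key=lambda s: sum(w in VALID_WORDS for w in s.split()))

print(best_guess("KHOOR ZRUOG"))  # -> HELLO WORLD
```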

This approach fell apart rather quickly once I realized that English words are basically made out of whims and don’t have rigid structural rules. I mean, swag is a recognized English word. C’mon.

It totally did not take me 2 days to realize that my initial approach was stupid.

THE REAL BEGINNING

Shaking off my previous failure like a complete champion, I decided that I needed to find out if it was even possible to do something like this (my thought process isn’t sorted, haha). I researched online and discovered that, hypothetically, I could just train an RNN (yes, it is what you’re thinking, RNN stands for right now now) to decode the cipher.

But just because something works in theory doesn’t mean it can actually be done, so I decided to check if anyone had ever done this before. If not, I could probably use that as a cop-out and not do the project. (I understand that argument wouldn’t stand in court, but hey, everybody has flaws.)

My criteria for finding Proof Of Concept were:

1. Should have decoded the Vigenère cipher using RNNs.
2. The model should have achieved an accuracy of at least 99.69%.

After Googling a bit, I found an article that mentioned how an RNN could be taught to crack the Vigenère cipher with 99.7% accuracy.

Damn it. Guess I gotta do it now.

But then I realized that I was way out of my depth and decided to scale back to teaching an RNN to decode the Caesar cipher.

It’s a couple of shifts; how hard could it be to build something like that? All I had to do was learn how RNNs work, get a dataset, and build a model. Easy peasy.

(Sometimes I stay up at night thinking about how dumb I can be)

But now that I knew what to do and that it was possible, I was ready to solve this problem.

Except I still didn’t know what RNNs were.

Or how to build the dataset for the model.

Or ho- you know what, this is depressing, let’s just move on to the next section.

tl;dr for this section is that I’m dumb and also Jon Snow because up till now, I know nothing.

THE THING THAT COMES AFTER THE BEGINNING?

After I started reading up on RNNs, it took me very little time to get completely confused about what exactly they were. There were also so many additional types of them that I quickly decided that I was hungry and that I needed a nap.

And then a few more after that.

But after 2 days (don’t judge, I get sleepy when I’m stressed), I finally decided to figure out what I needed. The problem was that most of the RNN tutorials online only talked about text generation examples and pretty much nothing else, so I was stuck.

After about 1,500 articles (it could’ve been 7, I already mentioned I suck at math), I started piecing together that what I needed was text prediction (did I also mention that I’m slow?), so I started looking at seq2seq as something that would be my deliverance. I figured that since it can be used for translating from one language to another, I could obviously use it to translate the ciphertext to the plaintext, right?

Wrong.

Well, not completely wrong, but seq2seq is usually used when the sizes of the input and the output are different, as they can be when a sentence in one language is translated into another. So, I gave up on seq2seq and moved on to brighter things.

Sidenote: F**k my life is baise ma vie in French. Don’t ask me why I know that.

After another day of research, I realized that since my input size and my output size would be the same because of the way the Caesar cipher works, all I needed was an LSTM, which is Long Short-Term Memory.
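To make that concrete, here’s a minimal sketch of the kind of character-level LSTM model I mean, in Keras. Every size here (sequence length, embedding width, number of units) is a placeholder I’m making up, not a tuned setting:

```python
from tensorflow.keras import Input, layers, models

SEQ_LEN = 30  # hypothetical fixed sentence length, in characters
VOCAB = 27    # 26 letters plus one id for the space character

model = models.Sequential([
    Input(shape=(SEQ_LEN,)),                    # ciphertext as integer char ids
    layers.Embedding(VOCAB, 32),                # char ids -> dense vectors
    layers.LSTM(128, return_sequences=True),    # one output step per input char
    layers.Dense(VOCAB, activation="softmax"),  # plaintext char at each step
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

The important part is return_sequences=True: because the ciphertext and the plaintext are the same length, the model just has to emit one predicted plaintext character for every input character.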

Once I figured that out, I was unstoppable.

I’m kidding, I was tired so I took a nap.

I’M ALL OUT OF HEADER IDEAS

Now, the problem was that I still needed a set of clean sentences, free of punctuation and special characters, that I could shift by 1–25 letters to build my training set.

Initially, I decided to download the 20 most popular books on Project Gutenberg and tried to strip them down to simple sentences.

That failed.

(At this point, we have established all my initial ideas suck.)

The failure was due to the fact that books have this lame thing called dialogue, and they’re also littered with names, something I thought would affect testing accuracy. (I had a reason for that, I just forgot.)

After a while of googling for datasets, I came across corpuses (that word sounds all kinds of wrong). A corpus is basically a large and structured set of texts.

Eureka!

(Simple sentences, and many of them? Dream come true; who needs a remote control helicopter anyway.)

Except most of them needed me to pay for them. Google had an ngram corpus of text scanned from Google Books, but it was so confusing to download and figure out that I let it be and looked for other ngram corpuses (by this point I’m 100% sure the plural of corpus is not corpuses).

I found an ngram corpus website that gave me 2-, 3-, 4-, and 5-gram datasets. A 5-gram basically means a sentence of 5 words, and the dataset had 10 million of them.

Life is good.

Now, all that was left was preprocessing the data, creating the ciphertexts for training, building the model and fine-tuning it.
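The ciphertext-creation step, at least, I could already picture. A rough sketch, assuming the cleaned 5-gram sentences end up uppercase with no punctuation (all names here are placeholders):

```python
def caesar_encode(plaintext, key):
    """Shift each letter forward by `key`, leaving spaces alone (uppercase assumed)."""
    return "".join(
        chr((ord(c) - ord("A") + key) % 26 + ord("A")) if c.isalpha() else c
        for c in plaintext
    )

def make_pairs(sentences, keys=range(1, 26)):
    """Yield one (ciphertext, plaintext) training pair per sentence per key."""
    for plain in sentences:
        for key in keys:
            yield caesar_encode(plain, key), plain
```

At 10 million sentences times 25 keys, that’s 250 million pairs, so in practice only a slice of the corpus would be needed.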

Whelp! That still sounds like so much work.

Oh well.

STILL GOT NOTHING

I finally decided it was time to start some real coding, so I opened a new notebook on Google Colab and loaded all the text documents from my Drive.

I decided to load the text documents using Pandas since I had never used it before and wanted to make my life harder, for some reason.

(Obviously, I was smart and thoughtful and didn’t account for the exhaustion and complete doneness I’d already built up from the RNN learning curve.)

The structure was simple: each line had a sentence, with each word separated by a \t and a number in front of each sentence.

I dropped the first column of numbers since it was useless and then decided to use a lambda function to convert all the text to uppercase as a first step.
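In theory, the whole thing should have been about four lines of Pandas. Here’s roughly what I was going for, with a placeholder filename (my actual version clearly didn’t go this smoothly):

```python
import pandas as pd

# hypothetical filename; each line is a number, then the words, all tab-separated
df = pd.read_csv("5grams.txt", sep="\t", header=None)

df = df.drop(columns=[0])                    # drop the leading number column
df = df.apply(lambda col: col.str.upper())   # the uppercase lambda, column by column
sentences = df.apply(" ".join, axis=1)       # glue the words back into sentences
```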

Except, the damn thing wouldn’t convert.

I kept trying, and Colab kept throwing random errors that changed again and again. I googled but couldn’t figure anything out.

I decided to give up.

The end.

Okay fine, I didn’t give up…yet.

After a couple more tries at resolving the issue, I decided to just use Sublime Text’s regex functions in Find and Replace to clean up the data and remove all the special characters, punctuation, and \t’s. (Although the first time I tried it, my computer heated up so much it crashed. Fun times.)
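For the curious, the same cleanup is a one-regex job. A rough Python equivalent of what the Find and Replace did (the pattern is the whole idea, the rest is placeholder plumbing):

```python
import re

def clean(line):
    """Turn tabs into spaces, then keep only letters and spaces."""
    line = line.replace("\t", " ")
    return re.sub(r"[^A-Za-z ]", "", line)
```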

Sidenote: Pandas is dumb. Not unlike actual pandas.

I also managed to convert everything to uppercase, which is clearly the superior case, so I was happy with my work.

Well, something’s done.

To be continued (hopefully), because it’s 4:57 AM here and I’m tired…