Writing with the machine

I made a thing!

Building this felt like playing with Lego, except instead of plastic bricks, I was snapping together conveniently-packaged blocks of human intellect and effort.

One block: a recurrent neural network, fruit of the deep learning boom, able to model and generate sequences of characters with spooky verisimilitude. Snap!

Another block: a powerfully extensible text editor. Snap!

Together: responsive, inline “autocomplete” powered by an RNN trained on a corpus of old sci-fi stories.

If I had to offer an extravagant analogy (and I do) I’d say it’s like writing with a deranged but very well-read parrot on your shoulder. Anytime you feel brave enough to ask for a suggestion, you press tab, and…

If you’d like to try it yourself, the code is now available, in two parts:

torch-rnn-server is a server that runs the neural network, accepts snippets of text, and returns “completions” of that text. In truth, it’s just a couple of tiny shims laid beneath Justin Johnson’s indispensable torch-rnn project.

rnn-writer is a package for the Atom text editor that knows how to talk to torch-rnn-server and present its completions to the user. I’m also providing an API for folks who want to try this but don’t feel up to the task of running a local server.
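To make that division of labor concrete, here is a minimal sketch of the client side: encode the text typed so far, ask the server for completions, and splice one in. The address, the `/generate` route, the `start_text` and `n` parameter names, and the `{"completions": [...]}` response shape are all assumptions for illustration, so check the torch-rnn-server README for the real interface. The actual rnn-writer package is written for Atom, not in Python.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed for illustration: the server address, route, and parameter names.
BASE_URL = "http://localhost:8080/generate"


def build_request_url(start_text, n=3, base_url=BASE_URL):
    """Encode the text typed so far as a query string for the completion server."""
    query = urlencode({"start_text": start_text, "n": n})
    return f"{base_url}?{query}"


def parse_completions(body):
    """Pull the list of completion strings out of a JSON response body.

    Assumes a response shaped like {"completions": ["...", "..."]}.
    """
    payload = json.loads(body)
    return [str(c) for c in payload.get("completions", [])]


def fetch_completions(start_text, n=3):
    """Ask a running server for n candidate continuations of start_text."""
    with urlopen(build_request_url(start_text, n)) as response:
        return parse_completions(response.read().decode("utf-8"))


def apply_completion(start_text, completion):
    """Append the chosen completion to the text the writer already has."""
    return start_text + completion
```

With a server running locally, `fetch_completions("The ship drifted", n=3)` would return up to three candidate continuations, and the editor's job is to show them one at a time until the writer accepts one or gives up.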

You’ll find instructions for both tools on their respective GitHub pages, and if you have difficulties with either, feel free to open an issue or drop me a line.

Mainly, I wanted to share those links, but as long as I’m here I’ll add a few more things: first a note on motivations, then an observation about the deep learning scene, and finally a link to the sci-fi corpus.

The vision

From my first tinkerings with the torch-rnn project, generating goofy/spooky text mimicry on the command line, I was struck—almost overwhelmed—by a vision of typing normally in a text editor and then summoning the help of the RNN with a keystroke. (When I say “help,” I mean: less Clippy, more séance.)

After fumbling around for a few weeks and learning five percent of two new programming languages, I had the blocks snapped together; the RNN trained; the vision realized. And then my first hour playing with it was totally deflating. Huh. Not as cool as I imagined it would be.

This is an unavoidable emotional waystation in any project, and possibly a crucial one.

As I’ve spent more time with rnn-writer, my opinion has, er, reinflated somewhat. I am just so compelled by the notion of a text editor that possesses a deep, nuanced model of… what? Everything ever written by you? By your favorite authors? Your nemesis? All staff writers at the New Yorker, present and past? Everyone on the internet? It’s provocative any way you slice it.

I should say clearly: I am absolutely 100% not talking about an editor that “writes for you,” whatever that means. The world doesn’t need any more dead-eyed robo-text.

The animating ideas here are augmentation; partnership; call and response.

The goal is not to make writing “easier”; it’s to make it harder.

The goal is not to make the resulting text “better”; it’s to make it different — weirder, with effects maybe not available by other means.

The tools I’m sharing here don’t achieve that goal; their effects are not yet sufficient compensation for the effort required to use them. But! I think they could get there! And if this project has any contribution to make beyond weird fun, I think it might be the simple trick of getting an RNN off the command line and into a text editor, where its output becomes something you can really work with.

Deep scenius

Like any tech-adjacent person, I’d been reading about deep learning for a couple of years, but it wasn’t until a long conversation earlier this year with an old friend (who is eye-poppingly excited about these techniques) that I felt motivated to dig in myself. And, I have to report: it really is a remarkable community at a remarkable moment. Tracking papers on arXiv, projects on GitHub, and threads on Twitter, you get the sense of a group of people nearly tripping over themselves to do the next thing, to push the state of the art forward.

That’s all buoyed by a strong (recent?) culture of clear explanation. My excited friend claims this has been as crucial to deep learning’s rise as the (more commonly-discussed) availability of fast GPUs and large datasets. Having benefited from that culture myself, it seems to me like a reasonable argument, and an important thing to recognize.

Here are a couple of resources I found especially useful:

For getting acquainted with RNNs, the canonical document is Andrej Karpathy’s essay, The Unreasonable Effectiveness of Recurrent Neural Networks. It’s a really remarkable example of technical communication: deep and detailed but friendly, even playful.

Google’s free deep learning course is really very good, and it provided a crucial foundation for me. Structured learning: who knew??

Ross Goodwin’s Adventures in Narrated Reality brings RNNs into a creative context and doesn’t skimp on technical details. I learned some key tricks from Ross’s piece.

149,326,361 characters

Most of the energy in the deep learning scene is focused on what I’d call “generic” problems, the solutions to which are very broadly useful to a lot of people: image recognition, speech recognition, sentence translation… you get the idea. Many of these problems have associated benchmark challenges, and if your model gets a better score than the reigning champ, you know you’ve done something worthwhile. These challenges all depend on standard datasets. And these—datasets—are—extremely boring.

So, a large part of the work (and fun) of applying the deep learning scenesters’ hard-won technical triumphs to weird/fun objectives is tracking down non-standard, non-boring datasets. For me, decisions about the collection and processing of the text corpus have been more consequential than decisions about the RNN’s design and subsequent training.

The corpus I’ve used most is derived from the Internet Archive’s Pulp Magazine Archive: 150MB of Galaxy and IF Magazine. It’s very noisy, with tons of OCR errors and plenty of advertisements mixed in with the sci-fi stories, but wow there is a lot of text, and the RNN seems to thrive on that. I lightly processed and normalized it all, and the combined corpus—now just a huge text file without a single solitary line break—is available on the Internet Archive.
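For anyone assembling their own corpus, here is a rough sketch of what "lightly processed and normalized" could look like. It is not the script used for this corpus; the specific steps (Unicode folding, rejoining words hyphenated across line breaks, collapsing whitespace) and the function name `normalize_corpus` are assumptions chosen to produce the same kind of output: one enormous line of text.

```python
import re
import unicodedata


def normalize_corpus(raw_text):
    """Flatten OCR'd text into one long line of single-spaced characters."""
    # Fold compatibility characters (ligatures such as "ﬁ") into plain equivalents.
    text = unicodedata.normalize("NFKC", raw_text)
    # Rejoin words hyphenated across a line break: "space-\nship" -> "spaceship".
    # This also swallows genuine hyphens that happen to fall at a line end.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse every run of whitespace, line breaks included, into one space.
    text = re.sub(r"\s+", " ", text)
    return text.strip()
```

Running it over each magazine's text and joining the results with spaces gives a single file with no line breaks, and `len()` of that string is the character count.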

So, in conclusion:

Snap. Snap. Snap!

May 2016, Berkeley