Hey all, Ernie here with a piece from contributor John Ohno, who last wrote a piece a year and a half ago about the home robot trend of the 1980s. Here’s a completely different take on automation.

Today in Tedium: AI is in the news a lot these days, and journalists, being writers, tend to be especially interested in computers that can write. Between OpenAI’s GPT-2 (the text-generating “transformer” whose creators are releasing it a chunk at a time out of fear that it could be used for evil), Botnik Studios (the comedy collective that inspired the “we forced a bot to watch 100 hours of seinfeld” meme), and National Novel Generation Month (henceforth NaNoGenMo, a yearly challenge to write a program that writes a novel during the month of November), when it comes to writing machines, there’s a lot to write about. But if you only read about writing machines in the news, you might not realize that the current batch is at the tail end of a tradition that is very old. Today’s Tedium talks procedurally generated text. - John @ Tedium

Six major events in the history of procedural text generation

1305: The publication of the first edition of Ramon Llull’s Ars Magna, whose later editions introduced combinatorics. (Image, above, is one of Llull’s wheels.)
1921: Tristan Tzara publishes “How to Make a Dadaist Poem,” describing the cut-up technique.
1983: The “travesty generator” is described in Scientific American. The Policeman’s Beard is Half Constructed, a book written by “artificial insanity” program Racter, is published.
2005: A paper written by SCIgen is accepted into the WMSCI conference.
2014: Eugene Goostman passes Royal Society Turing Test.
2019: OpenAI releases the complete GPT-2 model.

In their own small way, refrigerator poetry kits fit inside the history of procedural text. (rmkoske/Flickr)

The tangled prehistory of writing and writing machines

Depending on how loosely you want to define “machine” and “writing,” you can plausibly claim that writing machines are almost as old as writing. The earliest examples of Chinese writing we have were part of a divination method where random cracks on bone were treated as choosing from or eliminating part of a selection of pre-written text (a technique still used in composing computer-generated stories). Descriptions of forms of divination whereby random arrangements of shapes are treated as written text go back as far as we have records. Today we’d call this kind of thing “asemic writing” (asemic being a fancy term for “meaningless”), and yup, computers do that too.

But the first recognizably systematic procedure for creating text is probably that of the medieval mystic Ramon Llull. For the second edition of his book Ars Magna (first published in 1305), he introduced the use of diagrams and spinning concentric papercraft wheels as a means of combining letters, something he claimed could show all possible truths about a subject. While computer-based writing systems today tend to have more complicated rules about how often to combine letters, the basic concept of defining all possible combinations of some set of elements (a branch of mathematics that’s now called combinatorics) looms large over AI and procedural art in general.

The 20th century, though, is really when procedural literature comes into its own.
Llull’s combinatorics, which had echoed through mathematics for six hundred years, got combined with statistics, and in 1906 Andrey Markov published his first paper on what would later become known as the “Markov chain,” a still-popular method of making a whole sequence of events (such as an entire novel) out of observations about how often one kind of event follows another (such as how often the word “cows” comes after the word “two”). Markov chains would become useful in other domains (for instance, they became an important part of the “Monte Carlo method” used in the first computer simulations of hydrogen bombs), but they are most visible in the form of text generators: they are the source of the email “spam poetry” you probably receive daily (an attempt to weaken automatic spam-recognition software, which looks at the same statistics about text that Markov chains duplicate) and they are the basis of the nonsense-spewing “ebooks” bots on Twitter.

Fifteen years after Markov’s paper, the Dadaist art movement popularized another influential technique with the essay “HOW TO MAKE A DADAIST POEM”. This is known as the “cut-up” technique, because it involves cutting up text and rearranging it at random. It’s the basis for refrigerator poetry kits, but literary luminaries like T. S. Eliot, William S. Burroughs, and David Bowie used the method (on source text edgier than the magnetic poetry people dare use, such as negative reviews) to create some of their most groundbreaking, famous, and enduring work.

When computerized text generators use Markov chains, a lot of the appeal comes from the information lost by the model: the discontinuities and juxtapositions created by the fact that there’s more that matters in an essay than how often two words appear next to each other. Most uses of Markov chains in computerized text generation are therefore also, functionally, using the logic of the cut-up technique.
That said, there are computerized cut-up generators of various varieties, some mimicking particular patterns of paper cut-ups.
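
To make the “how often does one word follow another” idea concrete, here is a minimal sketch of a word-level Markov chain text generator in Python. It is only an illustration: the function names `build_chain` and `generate` and the tiny cow-themed corpus are made up for this example, and real generators usually train on far more text and often look back more than one word.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words observed right after it."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        # Repeats are kept on purpose: a word that follows more often
        # is proportionally more likely to be picked later.
        chain[current].append(following)
    return chain


def generate(chain, start, length, rng=None):
    """Walk the chain from `start`, picking each next word at random."""
    rng = rng or random.Random()
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: this word was never followed by anything
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


# Hypothetical toy corpus, chosen so that "cows" follows "two" twice.
corpus = "two cows ate grass and two cows slept and two dogs ate"
chain = build_chain(corpus)
sample = generate(chain, "two", 8, random.Random(0))
print(sample)
```

Every pair of neighboring words in the output appeared somewhere in the corpus, but nothing else about the original text is remembered, which is where the cut-up-style discontinuities come from.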


David Bowie used a computer program to automate a variation of the cut-up technique to produce the lyrics on his 1995 album Outside. Based on his description, I wrote a program to simulate it. He was no stranger to the technique, having used it since the ’70s.
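
For readers who want to try the basic idea, a bare-bones cut-up can be sketched in a few lines of Python. To be clear, this is not Bowie’s program or the simulation mentioned above; it is a generic illustration, and the `cut_up` function, the `chunk_size` parameter, and the sample review text are all invented for this example.

```python
import random


def cut_up(text, chunk_size=3, rng=None):
    """Cut text into runs of `chunk_size` words and shuffle the runs."""
    rng = rng or random.Random()
    words = text.split()
    # The "scissors": slice the word list into consecutive chunks.
    chunks = [words[i:i + chunk_size] for i in range(0, len(words), chunk_size)]
    # The "hat": draw the scraps back out in random order.
    rng.shuffle(chunks)
    return " ".join(word for chunk in chunks for word in chunk)


# Hypothetical source text standing in for a negative review.
source = "the critics said the record was a tuneless mess with no future"
result = cut_up(source, chunk_size=3, rng=random.Random(1))
print(result)
```

Smaller chunks give stranger results, while larger ones keep more of the original phrasing intact.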

A portrait of early technological pioneer Claude Shannon. (thierry ehrmann/Flickr)

Quantifying the strangeness of art

Claude Shannon (scientific pioneer, juggling unicyclist, and inventor of the machine that shuts itself off) was thinking about Markov chains in relation to literature when he came up with his concept of “information entropy.” When his paper on this subject was published in 1948, it launched the field of information theory, which now forms the basis of much of computing and telecommunications. The idea of information entropy is that rare combinations of things are more useful in predicting future events: the word “the” is not very useful for predicting what comes after it, because nouns come after it at roughly the same rate as nouns come after all sorts of other words, whereas in English “et” almost always comes before “cetera.” In his model, this means “the” is a very low-information word, while “et” is a very high-information word. In computer text generation, information theory gets used to reason about what might be interesting to a reader: a predictable text is boring, but one that is too strange can be hard to read.

Making art stranger by increasing the information in it is the goal of some of the major 20th century avant-garde art movements. Building upon Dada, Oulipo (a French “workshop of potential literature” formed in 1960) took procedural generation of text by humans to the next level, inventing a catalogue of “constraints,” games to play with text that either limit what can be written in awkward ways (such as never using the letter “e”) or change an existing work (such as replacing every noun with the seventh noun listed after it in the dictionary). Oulipo has been very influential on computer text generation, in part because they became active shortly after computerized generation of literature began.
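
The “the” versus “et” comparison can be put into numbers. The sketch below is one assumed way to do it, not Shannon’s own procedure: for each word, it measures in bits how uncertain the next word is, using the standard entropy formula over the words seen to follow it. The `follower_entropy` name and the sample sentence are made up for illustration.

```python
import math
from collections import Counter, defaultdict


def follower_entropy(text):
    """For each word, return the entropy (in bits) of the word that follows it."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        followers[current][following] += 1

    entropy = {}
    for word, counts in followers.items():
        total = sum(counts.values())
        # H = sum over followers of p * log2(1 / p)
        entropy[word] = sum(
            (n / total) * math.log2(total / n) for n in counts.values()
        )
    return entropy


# Hypothetical sample: "et" is always followed by "cetera",
# while "the" is followed by four different nouns.
sample_text = "the cat and the dog et cetera the bird and the fish et cetera"
scores = follower_entropy(sample_text)
print("the:", scores["the"], "bits of uncertainty about the next word")
print("et:", scores["et"], "bits of uncertainty about the next word")
```

A score of zero means the word tells you exactly what comes next, as “et” does here; the higher the score, the less the word helps you predict its successor.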
In 1952, the Manchester Mark I was programmed to write love letters, but the first computerized writing machine to get a TV spot was 1961’s SAGA II, which wrote screenplays for TV westerns. SAGA II has the same philosophy of design as 1976’s TaleSpin: what Judith van Stegeren and Mariët Theune (in their paper on techniques used in NaNoGenMo) term “simulation.” Simulationist text generation involves creating a set of rules for a virtual world, simulating how those rules play out, and then describing the state of the world: sort of like narrating as a robot plays a video game. Simulation has high “narrative coherence” (everything that happens makes sense) but tends to be quite dull, and to the extent that systems like SAGA II and TaleSpin are remembered today, it’s because bugs occasionally caused them to produce amusingly broken or nonsensical stories (what the authors of TaleSpin called “mis-spun tales”).
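
The simulationist recipe (define a world, apply rules, narrate each change) can be shown with a toy example. This Python sketch is purely hypothetical and makes no claim about how SAGA II or TaleSpin were actually built; the world state, the single search rule, and the `simulate` function are all assumptions made for illustration.

```python
import random


def simulate(rng=None, max_steps=10):
    """Run a tiny western: a sheriff searches for a robber, one move per turn."""
    rng = rng or random.Random()
    places = ["town", "canyon", "hideout"]
    # Hypothetical world state: where each character is, and whether the chase is over.
    world = {"sheriff": "town", "robber": "hideout", "robber_caught": False}
    story = []

    for _ in range(max_steps):
        # Rule 1: each turn the sheriff rides to some other place.
        destination = rng.choice([p for p in places if p != world["sheriff"]])
        world["sheriff"] = destination
        story.append(f"The sheriff rides to the {destination}.")

        # Rule 2: if the sheriff and the robber share a place, the robber is caught.
        if world["sheriff"] == world["robber"]:
            world["robber_caught"] = True
            story.append(f"The sheriff catches the robber at the {destination}.")
            break
    else:
        # The loop ran out of turns without a capture.
        story.append("The sheriff gives up the search.")

    return world, story


final_world, lines = simulate(random.Random(0))
print("\n".join(lines))
```

Every sentence is a direct report of a state change, so the story always makes sense, and it is also easy to see why the results tend to be dull.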

Computers go to Hollywood

The screenplay by SAGA II (above) has quite a different feel from Sunspring, another screenplay also staged by actors (below). This is because, while SAGA II explicitly models characters and their environment, the neural net that wrote Sunspring does not. Neural nets, like Markov chains, really only pay attention to frequency of co-occurrence, although modern neural nets can do this in a very nuanced way, able to weigh not just how the immediate next word is affected but how that affects words half a sentence away, and at the same time able to invent new words by making predictions at the level of individual letters. But because they are statistical, everything a neural net knows about the world comes down to associations in its training data: in the case of a neural net trained on text, how often certain words appear near each other.

This accounts for how dreamlike Sunspring feels. SAGA II is about people with clear goals, pursuing those goals and succeeding or failing. Characters in Sunspring come off as more complicated because they are not consistent: the neural net wasn’t able to model them well enough to give them motivations. They speak in a funny way because the neural net didn’t have enough data to know how people talk, and they make sudden shifts in subject matter because the neural net has a short attention span.

These sound like drawbacks, but Sunspring is (at least to my eyes) the more entertaining of the two films. The actors were able to turn the inconsistencies into subtext, and in turn, they allowed the audience to believe that there was a meaning behind the film, one that is necessarily more complex and interesting than the rudimentary contest SAGA II produced.