“There’s no reason why these simple shapes of stories can’t be fed into computers“

by which he probably meant:

“Once you have a sequence of numerical scores, it is easy to draw these lines using computers”

click the image -> video on YouTube.com

…but how to get the scores? When I watched the video, I just happened to be developing natural language processing tools for analyzing text. Swimming in that context, it was very natural to wonder:

“if we don’t have Kurt Vonnegut on hand to interpret stories through his own experience, how can we plot the shapes of stories directly from the text itself?“



The problem (of plotting the shapes of stories) is similar to sentiment analysis on the narrator’s viewpoint. But it was not clear if existing sentiment models were good enough to pick up on this signal across the length of a long story. Specifically, language is full of subtle contextual cues that can interact with each other. Sentiment models can struggle to balance the interactions amongst many positive vs. negative terms and return stable predictions over long-range context windows. And even once you have the algorithmic bits worked out, validating the result can be tricky when human interpretations differ.

This notebook implements a series of experiments using indico’s machine learning API’s to quickly test hypotheses. If you simply want to consume the story, keep on scrolling. If you want to tinker, maybe do some experiments of your own…this post was written as a Jupyter notebook for exactly that reason. Grab the notebook here on Github. And when you discover something interesting, please let us know, we love this stuff.