I've been playing around with creating N-grams from books, just to see how it's done, what the results look like, and how you can use these N-grams for visualizations. I started with the seven Harry Potter books to see what kind of N-grams would come out. After creating N-grams of sizes 1 through 10 and counting them, I ended up with around 5,500,000 distinct N-grams of various lengths and numbers of occurrences. My next step was to find an interesting way of visualizing these N-grams, and to get some inspiration I was looking through my 'to-read' list from getPocket. There I ran into someone, Mike Matola, who created posters by copying lines of film scripts to "compose portraits of music and film icons".
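To give a rough idea of what "creating and counting N-grams" means here, below is a minimal Python sketch. It is not the code I used for the books; the tokenizer (lowercased runs of letters and apostrophes) and the names `tokenize` and `count_ngrams` are assumptions made just for this illustration.

```python
from collections import Counter
import re


def tokenize(text):
    # Assumption: a word is a lowercased run of letters and apostrophes.
    # Punctuation is dropped, so N-grams can span sentence boundaries.
    return re.findall(r"[a-z']+", text.lower())


def count_ngrams(words, min_n=1, max_n=10):
    # Slide a window of every size from min_n to max_n over the word list
    # and count how often each word sequence occurs.
    counts = Counter()
    for n in range(min_n, max_n + 1):
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += 1
    return counts


sample = "The boy who lived. The boy who waited."
counts = count_ngrams(tokenize(sample), max_n=3)

# Show the three most frequent N-grams with their counts.
for ngram, occurrences in counts.most_common(3):
    print(" ".join(ngram), occurrences)
```

For a whole book series you would read the text from files and probably want to keep sentence boundaries intact, but the counting idea stays the same.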

So I set out to do the same thing, just not by hand, and using the most frequently occurring sentences to fill an image. In the next couple of days I'll update my site and show you how I created the N-grams from the input material, and how that input can be converted into these kinds of images. If you're interested in specific scenes, stencils, books or images, let me know. The generation is a bit time consuming but, luckily, completely automatic.
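Until that write-up is ready, here is a toy Python sketch of the general idea of filling a shape with text. It is only an assumption about how one could approach it, not the actual generator: the stencil is a hypothetical grid of characters where `#` marks a cell inside the shape, and `fill_mask` is a name invented for this example.

```python
from itertools import cycle


def fill_mask(mask, phrases):
    # mask: list of equal-length strings; '#' marks a cell inside the stencil.
    # phrases: strings, most frequent first; they are joined with spaces and
    # repeated as often as needed to fill every cell of the shape.
    stream = cycle(" ".join(phrases) + " ")
    return [
        "".join(next(stream) if cell == "#" else " " for cell in row)
        for row in mask
    ]


stencil = [
    "..####..",
    ".######.",
    "########",
    ".######.",
    "..####..",
]

for line in fill_mask(stencil, ["the boy who lived", "he who must not be named"]):
    print(line)
```

A real poster would work on image pixels and vary font size or colour, but the core loop of walking the shape and consuming text is the same.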
