Who do you write like?

Discover which famous writers match your style using automated text analysis

A person's writing style reflects a great deal about them. It can reveal much about their educational attainment, social class, reading habits, and patterns of thought. One's style is also an intensely personal thing: the inner voice made manifest on the page or screen. However, despite the seemingly ineffable qualities of style, great progress has been made in measuring writing quantitatively. This approach, known as stylometry, can identify the influence of one writer on another or reveal the author of unattributed work. In a previous post, I explored the stylistic similarity between hundreds of historical authors whose works are freely available on Project Gutenberg. You can see the resulting network graph on the right (click to see a larger view in the original post). Here I've translated my text analysis code to JavaScript so that you can see where you fit in this network. If you enter a sample of your writing in the form below, you can see how similar your style is to that of some of history's most famous writers!

Edit: Wait! Please consider using the version of this tool on my research site: MySocialBrain.org. It's just like the version here, but your data can be used for Science(!) rather than just my personal blog.

Cut and paste a writing sample into the text box below (the longer, the better).

Who wrote this text sample? Options: Me, Someone I know, A famous writer, Someone else.

What type of writing is this? Options: Autobiography/memoir, Children's literature, Comedy, Essay, Fantasy, Historical fiction, Horror, Journalism, Literary fiction, Mystery, Other fiction, Other nonfiction, Romance, Science fiction, Young adult fiction.

Click on the name of a writer to see which of their texts are freely available on Project Gutenberg. The magnitude of the (dis)similarities between your style and these authors' styles depends on a number of factors. In theory, the values range between -1 and 1, but you are unlikely to observe values that large even if you submit a text sample from one of the authors themselves (unless you submit a composite of all of their writing). Providing larger samples of text will, on average, increase the range of observable magnitudes because the resulting estimate of your style will be more reliable. Genre will likely also have a substantial effect on stylistic similarity. If and when enough responses accrue, I may post a follow-up on the effects of genre on user-submitted writing styles.

It's also worth emphasizing that each author's style is aggregated across all of their work. Lewis Carroll, for instance, shows up at the bottom of the graph with great consistency because, in addition to Alice in Wonderland, his books on mathematics and logic (which contain lots of exotic punctuation) are included. Chaucer suffers from a similar issue because the text of his Canterbury Tales is riddled with asterisks denoting the modern English equivalents of his Middle English.

Methods

To analyze your style, the code counted how often you used various "stop words". Stop words are grammatical, content-free words such as conjunctions and articles. It also counted how frequently you used various punctuation marks. The code then compared all of these counts to the same counts measured in advance from a number of famous writers. This comparison consisted of calculating the rank correlation between the (frequency-normalized and standardized) counts from your writing and those of the famous authors. Not all of the writers shown in the network graph were included in this analysis. To keep the results tractable, I focused on a few of the most famous, and also limited the list to writers who originally wrote in English to minimize the influence of translators. To be considered, the authors also had to be prolific, as described in my previous post. Thus many famous writers who wrote only a small number of works do not appear on the list. This limitation made famous authors easier to identify, but had the more important virtue of limiting the analysis to authors for whom we can estimate style robustly, without leaning too heavily on any single work.
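The steps above can be sketched roughly as follows. This is a minimal Python illustration, not the JavaScript that actually runs on this page. The short stop-word and punctuation lists, the function names, and the choice to standardize each feature as a z-score across the reference authors are all assumptions made for the example. Spearman's rho is used as the rank correlation, which is one plausible reading of the description rather than a confirmed detail.

```python
import re
from collections import Counter
from statistics import mean, pstdev

# Hypothetical feature set: the real lists are not given in this post.
STOP_WORDS = ["the", "and", "of", "a", "to", "in", "but", "that", "it", "or"]
PUNCTUATION = [",", ".", ";", ":", "!", "?"]
FEATURES = STOP_WORDS + PUNCTUATION


def feature_frequencies(text):
    """Count each feature and divide by the total number of tokens."""
    # Words are runs of letters/apostrophes; any other non-space
    # character is treated as its own token.
    tokens = re.findall(r"[a-z']+|[^\sa-z']", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty input
    return [counts[f] / total for f in FEATURES]


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return cov / (sxx * syy) ** 0.5 if sxx and syy else 0.0


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def style_similarity(sample, reference_texts):
    """Map each reference author to a rank correlation with the sample."""
    ref = {name: feature_frequencies(t) for name, t in reference_texts.items()}
    # Assumption: standardize each feature against the reference authors.
    columns = list(zip(*ref.values()))
    means = [mean(c) for c in columns]
    sds = [pstdev(c) or 1.0 for c in columns]  # guard against zero spread

    def standardize(freqs):
        return [(f - m) / s for f, m, s in zip(freqs, means, sds)]

    sample_z = standardize(feature_frequencies(sample))
    return {name: spearman(sample_z, standardize(f)) for name, f in ref.items()}
```

Calling `style_similarity(your_text, {"Author A": text_a, "Author B": text_b})` returns one value per author between -1 and 1. With feature lists this short and only a handful of reference texts the numbers are noisy, which is consistent with the point above that longer samples give more reliable estimates.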