'Nabokov's Favorite Word Is Mauve' Crunches The (Literary) Numbers

Let's acknowledge this at the top: It's a thin slice.

To gaze across the great swath of written English over the past few centuries — that teeming, jostling, elbow-throwing riot of characters and places and stories and ideas — only to isolate, with dispassionate precision, some stray, infinitesimal data point such as which author uses cliches like "missing the forest for the trees" the most, would be like ...

Well. You get it. More like missing the forest for the raspberry seed stuck to the underside of the 395th leaf on the 139th branch of the 223,825th tree.

But that's what statistician Ben Blatt's new book, Nabokov's Favorite Word is Mauve, sets out to do, thin slice by thin slice.

He loaded thousands of books — classics and contemporary best-sellers — into various databases and let his hard drive churn through them, seeking to determine, for example, if our favorite authors follow conventional writing advice about using cliches, adverbs and exclamation points (they mostly do); if men and women write differently (yep); if an algorithm can identify a writer from his or her prose style (it can); and which authors use the shortest first sentences (Toni Morrison, Margaret Atwood, Mark Twain) versus those who use the longest (Salman Rushdie, Michael Chabon, Edith Wharton).

I can hear thousands of monocles dropping into thousands of cups of Earl Grey from here. "But what of literature?" you sputter. "What does any of that technical folderol have to do," — here you start wiping your monocle on your nosegay — "with ART?"

Not much, is the answer. Blatt's book isn't terribly interested in the art of writing. What it's fascinated by — and is fascinating about — is the craft of writing.

Technique. Word choice. Sentence structure. Reading level. There's something cheeky in the way Blatt throws genre best-sellers into his statistical blender alongside literary lions and hits puree, looking for patterns of style shared by, say, James Joyce and James Patterson.

A Balm For Bookish Know-it-Alls

To say that you likely won't find much that's truly surprising in Nabokov's Favorite Word is Mauve isn't a critique. In fact, it's kind of the point. Reading it, you experience the feeling, again and again, of having some vague, squishy notion you've always sort of held about a given author getting ruthlessly distilled into a stark, cold, numerical fact.

Which is, if you're the kind of person who likes to get proven right (hi!), a hell of a lot of fun.

Now: It's a book of statistics, and statistics rest on distinct sets of assumptions that must get made before any number can start getting well and truly crunched. So if you're curious about Blatt's methodology, boy are you in luck. Every chapter begins with Blatt chattily sharing with the reader (as chattily as a book this eager to walk us through the formula used to calculate Flesch-Kincaid Grade Levels can be) every aspect of his thinking. How he defines "Great Books." What constitutes a long sentence. Which chapter-endings qualify as cliffhangers, and which merely ... abrupt.

He drags you into the weeds with him, but he's a personable writer, and he's brought along a picnic lunch, so you don't mind the bugs.

Herewith, some of my favorite of Blatt's findings in Nabokov's Favorite Word is Mauve:

MEN WRITE LIKE THIS, BUT WOMEN WRITE LIKE THIS

It turns out that (sit down for this next bit) authors who are women write equally about men and women, but men write overwhelmingly about men.

I know. I'm shaken, over here.

For every appearance of the word "she" in classics by male authors, Blatt found three uses of the word "he." In classics by women, the ratio was pretty much one-to-one.
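
A ratio like that is simple to compute once a book is plain text. The sketch below is only an illustration, not Blatt's actual code: the function name `he_she_ratio` and the crude regex tokenizer are assumptions made for this example.

```python
import re


def he_she_ratio(text):
    """Return uses of "he" per use of "she" in a text, or None if "she" never appears.

    Assumption: lowercase everything and split on runs of letters and
    apostrophes. The book does not have to have tokenized this way.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    he_count = tokens.count("he")
    she_count = tokens.count("she")
    if she_count == 0:
        return None  # ratio is undefined without any "she"
    return he_count / she_count


sample = "He said he knew. She said he lied."
print(he_she_ratio(sample))
```

A result near 3 would match what Blatt reports for classics by men; a result near 1 would match classics by women.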

Also: Male authors of classic literature are three times as likely to write that a female character "interrupted" as to write that a male character did. In contemporary popular and literary fiction, the ratio is smaller, but it's still there.

FAVORITE WORDS

Blatt looked for the specific words that authors use much more frequently than the rate at which those words generally occur in the rest of written English (i.e., compared to a huge sample of literary works, some 385 million words in total, written in English between 1810 and 2009 and assembled by linguists at Brigham Young University).

His criteria: A favorite word

Must occur in at least half of the author's books

Must be used at a rate of at least once per 100,000 words

Must not be so obscure that it's used less than once per million in the BYU sample of written English

Is not a proper noun
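
Those criteria translate fairly directly into a filter. What follows is a minimal sketch rather than Blatt's method: the function name, the input shapes (one list of lowercase tokens per book, plus a hypothetical dictionary of baseline rates per million words) and the ranking by author-rate-over-baseline-rate are all assumptions made for illustration.

```python
from collections import Counter


def favorite_words(books, baseline_per_million, proper_nouns=frozenset(), top_n=3):
    """Rank an author's candidate favorite words.

    books: list of token lists, one per book (assumed lowercase).
    baseline_per_million: word -> rate per million words in a reference
        corpus (a stand-in for the BYU sample; values here are invented).
    proper_nouns: words to exclude (assumed to be supplied by the caller).
    """
    total_words = sum(len(book) for book in books)
    if total_words == 0:
        return []
    counts = Counter(word for book in books for word in book)
    books_containing = Counter(word for book in books for word in set(book))

    scored = []
    for word, n in counts.items():
        author_rate = n / total_words * 1_000_000  # per million words
        base_rate = baseline_per_million.get(word, 0.0)
        if books_containing[word] * 2 < len(books):
            continue  # must occur in at least half of the author's books
        if author_rate < 10:
            continue  # at least once per 100,000 words (10 per million)
        if base_rate < 1:
            continue  # too obscure in the reference sample
        if word in proper_nouns:
            continue  # not a proper noun
        scored.append((author_rate / base_rate, word))

    # Highest ratio first; ties broken alphabetically (an arbitrary choice).
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in scored[:top_n]]


# Toy data, invented purely to exercise the filter.
books = [["mauve", "the", "sky"], ["the", "mauve", "pun"], ["the", "lolita"]]
baseline = {"the": 50000, "mauve": 2, "pun": 5, "sky": 100, "lolita": 0.5}
print(favorite_words(books, baseline, proper_nouns={"lolita"}))
```

With real data, the ratio in the last step is the kind of figure behind a claim like "44 times higher than the rate in the BYU sample."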

Here are some that jumped out at me.

Jane Austen: civility, fancying, imprudence (Story checks out, right?)

Dan Brown: grail, masonic, pyramid (I am sagely nodding, over here.)

Truman Capote: clutter, zoo, geranium

John Cheever: infirmary, venereal, erotic (Boy howdy, that's a whole Cheever short story, right there.)

Agatha Christie: inquest, alibi, frightful

F. Scott Fitzgerald: facetious, muddled, sanitarium

Ian Fleming: lavatory, trouser, spangled ("Pardon me, Blofeld; must dash to the lavatory, got something spangled on me trouser.")

Ernest Hemingway: concierge, astern, cognac (Yuuup.)

Toni Morrison: messed, navel, slop

Vladimir Nabokov: mauve, banal, pun (As Blatt points out, Nabokov had synesthesia, a condition that caused him to associate various colors with the sound and shape of letters and words. "Mauve" was his favorite: He used the word at a rate that's 44 times higher than the rate at which it occurs in the BYU sample of written English.)

Jodi Picoult: courtroom, diaper, diner

Ayn Rand: transcontinental, comrade, proletarian

J.K. Rowling: wand, wizard, potion (Well, duh.)

Amy Tan: gourd, peanut, noodles

Mark Twain: hearted, shucks, satan

Edith Wharton: nearness, daresay, compunction (Man I love me some Edith Friggin' Wharton.)

Virginia Woolf: flushing, blotting, mantelpiece (Chandler Bing: "Could they BE more Virginia Woolf?")

ADVERBS

You know: nearly, suddenly, sloppily, etc. Writing teachers tell you to avoid them, that they sap the energy from a sentence. Strong, clear writing is fueled by verbs and nouns, they say, not by adjectives and adverbs.

Turns out, the adverb thing holds up: When Blatt combined several lists of the "Great Books" of the 20th century, he came up with 37 which were generally considered great.

Compared with the same authors' other novels, the "Great Books" used significantly fewer adverbs. Of these authors' books that kept to a strict adverb rate (fewer than 50 per 10,000 words) 67% were considered "Great," whereas only 16% of their adverb-loaded books (containing more than 150 per 10,000 words) were ever considered "Great."

EXCLAMATION POINTS

Well I mean: I hate 'em, at least. My husband uses them like they're powdered sugar and his emails are lemon bars. But I hate 'em.

You know who doesn't hate 'em? Besides my husband, I mean? James Joyce. Dude loved them.

Blatt took a sample of 50 authors of classics and contemporary best-sellers, totaling 580 books. The authors who used the most exclamation points per 100,000 words were:

5. J.R.R. Tolkien (767)

4. E.B. White (782. Gasp; nobody tell Mr. Strunk.)

3. Sinclair Lewis (844. I guess it CAN happen here.)

2. Tom Wolfe (929)

1. James Joyce (1,105)

Elmore Leonard — bless him — used the fewest: Just 49 per 100,000 words.
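
The "per 100,000 words" figures are just a normalized count, which is what lets a short book be compared with a long one. Here is a rough sketch of that arithmetic; counting words by splitting on whitespace is an assumption, and the function name is invented for this example.

```python
def exclamations_per_100k(text):
    """Exclamation points per 100,000 words of text.

    Assumption: a "word" is any whitespace-separated chunk. The book may
    count words differently.
    """
    word_count = len(text.split())
    if word_count == 0:
        return 0.0
    return text.count("!") / word_count * 100_000


print(exclamations_per_100k("Yes! I said yes"))
```

The same shape of calculation, with a different thing counted in the numerator, covers the cliche rates too.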

IT'S RAINING CATS AND DOGS AND CLICHES

When it comes to use of cliches, there's another gender split.

In Blatt's list of 50 classic and best-selling authors (scroll down to the bottom of this post to see them all), those who use cliches most frequently? All men.

5. Chuck Palahniuk (129 per 100,000 words)

4. Salman Rushdie (131)

3. Kurt Vonnegut (140. All those "So it goes"es in Slaughterhouse-Five really hurt him here, I bet.)

2. Tom Wolfe (143)

1. James Patterson (160)

(In fairness to Patterson, Blatt includes cliches found in dialogue, and Patterson's characters aren't exactly going around coining new phrases with a Joycean fervor.)

The authors who used the fewest cliches? All women.

5. Veronica Roth (69)

4. Willa Cather (67)

3. Virginia Woolf (62)

2. Edith Wharton (62)

1. Jane Austen (A paltry 45 per 100,000 words, about 1/3 of the rate at which James "More Cliches Than You Can Shake A Stick At" Patterson busts them out.)

Now, again: It's a thin slice, looking at literature in this knowingly reductive way. It doesn't tell you everything, and of course it doesn't give you a true sense of the feeling you get when you read these authors for yourself.

But what it often succeeds in capturing, with astonishing clarity, is your feeling about these authors.

Case in point: The author who is most likely to mention the weather in the opening sentence?

Danielle Steel.

She does it in — precisely — 46 percent of her books.