In high school, writing term papers on the family PC, I’d often turn to Microsoft Word’s “readability statistics” feature to make sure I sounded smart enough. With a few clicks, Word assigned my papers a Flesch-Kincaid Grade Level: a number from one to twelve indicating how many years of education the average reader would need to have completed in order to decipher my language. I had no idea how Word made this calculation, but I noticed that it rewarded prolix sentences with a higher “grade.” So that’s what I wrote. I put my every word choice under close scrutiny. Soon my paragraphs buckled under the weight of clauses and polysyllables, but I, a ninth grader, was generating prose that only twelfth graders could read—which made me pretty hot shit, my thinking went.
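The mechanism behind that reward is not mysterious. The standard published Flesch-Kincaid Grade Level formula, assumed here to be what Word’s feature computes (the essay itself does not say), depends on just two ratios:

```latex
\text{grade level} = 0.39 \cdot \frac{\text{total words}}{\text{total sentences}}
                   + 11.8 \cdot \frac{\text{total syllables}}{\text{total words}}
                   - 15.59
```

Longer sentences raise the first term and polysyllabic words raise the second, which is consistent with the inflation described above.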

Those Flesch-Kincaid trials came back to me as I read “Nabokov’s Favorite Word Is Mauve: What the Numbers Reveal About the Classics, Bestsellers, and Our Own Writing,” by Ben Blatt, which looks at the canon as a statistical gold mine to be dredged for patterns, variances, and singularities. In “literary experiments” on diction, punctuation, cliffhangers, clichés, and other aspects of style and usage, Blatt uses data to probe the body of conventional wisdom that surrounds creative writing. What if those who allegedly loathe adverbs are actually completely, totally addicted to them? What if it’s quite O.K. to use intensifiers very often, because Jane Austen is rather fond of them? What if I like exclamation points! Blatt’s jacket bio cites “his fun approach to data journalism”—a bit of prolepsis, maybe, aimed at those of us who’d sooner watch paint dry than look at anything quantitatively—and his book is laden with charts, lists, and tables printed in a gentle purple. The lessons here are valuable because of their workmanlike cast, not in spite of it. Put aside the “fun approach” and “Mauve” makes some enticingly heretical observations: that every great writer is a technician, every novel a mere agglomeration of prose effects.

The book is built on agreeable miscellany, and parts of it are willfully trivial. On the face of it, there’s not much to be gleaned from the fact that James Joyce uses 1,105 exclamation points per hundred thousand words, or that J. R. R. Tolkien leans too often on “suddenly,” that most accursed of adverbs. Blatt’s findings are more absorbing when he ditches the bean-counter approach. American writers of Harry Potter fan fiction are actually more liable to use “brilliant” than their British counterparts, who employ the word with native agility. And, in a study of erotica written by New Yorkers, Blatt notes a preponderance of the following words: subway, popsicle, senator, butthole, museum, landlord, thrusted, Jacuzzi, sin, and shrugs. Most of these choices are intuitive, even laudable—but what explains those last three? I grasp that a New Yorker might lust for a senator with a popsicle in his butthole; a shrugging sinner in a hot tub doesn’t quite rate.

Blatt’s research on diction and gender is especially revelatory. Looking at a broad swath of twentieth-century lit, he tallies the verbs most often used to describe one gender over another. The results reveal rich deposits of sexism running through the language. Male characters are most likely to mutter, grin, shout, chuckle, and kill; women are doomed to shiver, weep, murmur, scream, and marry. Male authors are far likelier to write “she interrupted” than “he interrupted.” A grim typology begins to emerge. Men are raffish, jolly, murderous sorts, while women are delicate and meek, except when they deign to interrupt men, as they often do. There’s some sexual self-loathing across the board, too: when writers assign verbs to someone of the opposite gender, they most often reach for “kiss,” “exclaim,” “answer,” “love,” and “smile”; characters of the same gender “hear,” “wonder,” “lay,” “hate,” and “run.”

The high point of the book is Blatt’s effort “to test whether something like a literary fingerprint exists for famous writers.” It does, he finds—across their oeuvres, “authors do end up writing in a way that is both unique and consistent, just like an actual fingerprint is distinct and unchanging.” Even the way that writers deploy simple pairs of words—“and” and “the,” “these” and “then,” “what” and “but”—is often enough to identify them. The numbers bear out a romantic idea: that a writer is always ineluctably herself. Soon, Blatt zeroes in on writers’ “favorite” words—hence his title, indicating Nabokov’s predilection for “mauve.” The words must be used in half an author’s books, at least once per hundred thousand words; they can’t be proper nouns. His discoveries are startlingly apt. Almost without fail, the words evoke their authors’ affinities and manias. John Cheever favors “venereal”—a perfect encapsulation of his urbane midcentury erotics, tinged with morality. Isaac Asimov prefers “terminus,” a word ensconced in a swooping, stately futurism; Woolf has her “mantelpiece,” Wharton her “compunction.” (Melville’s “sperm” is somewhat misleading, perhaps, when separated from his whales.)
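The three eligibility rules for a “favorite” word are concrete enough to sketch in code. What follows is a rough illustration, not Blatt’s actual method: the function name `candidate_favorites`, the tokenizer, the reading of “once per hundred thousand words” as a rate across all of an author’s books combined, and the proper-noun test (a word that never appears in lowercase) are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical tokenizer: runs of letters, with an optional internal apostrophe.
TOKEN = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


def candidate_favorites(books, min_book_share=0.5, min_rate_per_100k=1.0):
    """Return the words that pass the three filters described in the text.

    books: a list of strings, one per book by a single author.
    """
    totals = Counter()            # word -> occurrences across all books
    books_containing = Counter()  # word -> number of books it appears in
    seen_lowercase = set()        # words seen at least once in lowercase
    total_words = 0

    for text in books:
        tokens = TOKEN.findall(text)
        total_words += len(tokens)
        lowered = [t.lower() for t in tokens]
        totals.update(lowered)
        books_containing.update(set(lowered))
        seen_lowercase.update(t for t in tokens if t.islower())

    result = []
    for word, count in totals.items():
        # Rule 1: used in at least half of the author's books.
        in_enough_books = books_containing[word] >= min_book_share * len(books)
        # Rule 2: at least once per hundred thousand words (assumed: overall rate).
        rate = count * 100_000 / total_words
        # Rule 3: not a proper noun. Crude proxy (an assumption): a word that
        # never shows up in lowercase is treated as a proper noun.
        not_proper = word in seen_lowercase
        if in_enough_books and rate >= min_rate_per_100k and not_proper:
            result.append(word)
    return sorted(result)
```

This only narrows the field to eligible words. The review does not say how a single favorite is chosen from among them, so the sketch leaves that step out.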

Cumulatively, these facts and figures make “Mauve” an effective craft book. By reminding us that literature is just strings of words and punctuation, Blatt has taken the whiff of the godhead out of it. Writers like to emphasize the psychology in their work, their strenuous labor toward depth and verisimilitude; they’re less inclined to talk about how few decent synonyms exist for “good.” The stats speak a cold truth: there are dozens of prosaic choices behind every artful sentence. Dwelling on this can inoculate writers against the preciousness of the workshop. “Mauve” has no truck with showing instead of telling, no druthers about sense of place or voice. Even in great books, it says, one word follows another, all of them slaves to grammar, sequence, and probability.

Then again, as the figures pile up, claustrophobia sets in. I caught myself wishing I worked in a less quantifiable medium. Ceramicists, for instance, never seem to answer to the Blatts of the world. Like the Flesch-Kincaid formula, which Blatt mentions in a passage about the verbal simplicity of children’s literature, “Mauve” and its number-crunching can feel joyless—the equivalent of an architecture critic who counts the number of bricks in a façade. At his weakest, Blatt sounds like a tour guide over a loudspeaker. “It’s by noting the role of each word and punctuation mark,” he writes, “that the greats are able to hone their writing”—an observation that at once understates the act of revision and ignores the ecstatic, almost compulsive sloppiness that makes some writers great.

“The written word and the world of numbers should not be kept apart,” Blatt writes, and I think he’s right; what’s frustrating is that no one has yet figured out how they might productively collaborate. Like last year’s “The Bestseller Code,” which described an algorithm that predicted the plots of popular novels, “Mauve” wagers that the “digital humanities,” as they’ve uneasily come to be known, can instruct audiences outside of the academy. The book’s finest moments prove that they can—but to what end? Blatt argues that his work is “not an attempt to ‘engineer’ art as much as a way to understand it”: “If you were a band in the 1960s you would want to know how the Beatles were recording their songs.” Maybe so, but is that really what this book professes to teach? Knowing the rate at which Ringo hits his snare drum does not a Beatle make.

Reading “Mauve,” I began to imagine two duelling schools of authorship, both motivated by statistics. In one, writers would cultivate their tics, inhabiting themselves so thoroughly that to encounter them on the page would be like finding their footprints in wet cement. In the other, writers would aspire to defy the data, styling their prose with such intricate, chameleonic grace that no statistician could betray their identity. A few decades ago, the advent of the word processor made it easier than ever to revise on the fly; it also made it easy to dwell on one sentence ad infinitum, gilding the lily where once one would’ve advanced to the next thought. The glut of data is another mixed blessing—past a certain point, writers would do better in a state of blissful ignorance. Otherwise, they might end up with work like my ninth-grade term papers, mannered and overwrought.

The Flesch-Kincaid Grade Level on this piece, by the way, is a mere 10.3.