I’ve gotten several requests to address a new research study called “Are two spaces better than one? The effect of spacing following periods and commas during reading”, written by Rebecca Johnson, Becky Bui, and Lindsay Schmitt.

Ap­par­ently de­fy­ing Bet­teridge’s Law, the study claims to show that two spaces af­ter a pe­riod are eas­ier to read than one. On its face, this also seems to con­tra­dict my long­stand­ing ad­vice to put only one space be­tween sen­tences.

Con­fi­den­tial to two-space re­searchers: you might con­sider mak­ing your pa­per avail­able for free, as it may be the last time that a topic of your re­search over­laps with a wide­spread in­ter­net obsession.

Be­cause the study costs $39.95 for a PDF, I’m cer­tain the so­cial-me­dia skep­tics rush­ing to claim vic­tory for two-spac­ing have nei­ther bought it nor read it. But I did both.

True, the re­searchers found that putting two spaces af­ter a pe­riod de­liv­ered a “small” but “sta­tis­ti­cally … de­tectable” im­prove­ment in read­ing speed—about 3%—but cu­ri­ously, only for those read­ers who al­ready type with two spaces. For ha­bit­ual one-spac­ers, there was no ben­e­fit at all.

Fur­ther­more, the re­searchers only tested sam­ples of a mono­spaced font on screen (roughly as shown be­low). They didn’t test pro­por­tional fonts, which they ac­knowl­edge are far more com­mon. Nor did they test the ef­fect of two-spac­ing on the printed page. The au­thors con­cede that any of these test-de­sign choices could’ve af­fected their findings.

In sum—a small dif­fer­ence, lim­ited to a cer­tain cat­e­gory of test sub­jects, with nu­mer­ous caveats at­tached. Not much to see here, I’m afraid.

Of course, I ac­cept the study. Sci­ence is real! I have a few quib­bles, noted be­low. But over­all, it pro­vides an ex­am­ple of why leg­i­bil­ity re­search is of­ten not as use­ful to prac­tic­ing ty­pog­ra­phers as we might hope. Ty­pog­ra­phy is broad and deep, and re­search stud­ies can only test nar­row propositions.

Most of all, though I will read any fu­ture stud­ies on this topic that may emerge—bud­get per­mit­ting—I agree with the re­searchers’ clos­ing thought: “we should prob­a­bly be ar­gu­ing pas­sion­ately about things that are more important.”

An ide­al­is­tic thought, but ar­gu­ing pas­sion­ately about dumb top­ics is the web’s rai­son d’être. In­ter­net ran­dos have been try­ing to rope me into ar­gu­ments about two-spac­ing for 10 years. I in­dulged them at the out­set. But not for a long time.

Why? Spoiler alert: I really don’t care how many spaces you put be­tween sen­tences. One? Two? Seven? π/4? Knock your­self out.

Ty­pog­ra­phy is not a sci­ence. Like lan­guage it­self, it has some struc­tural and prac­ti­cal con­ven­tions. If your goal is to per­suade read­ers, it’s wise to be aware of these con­ven­tions, be­cause re­ly­ing on them can help you. Con­versely, de­part­ing from these con­ven­tions may have un­in­tended consequences.

Even if the Ty­pog­ra­phy Ge­nie gave me one wish, I still wouldn’t waste it on fix­ing space be­tween sen­tences. I would cure the overuse of all caps, which is a truly nox­ious habit.

But in the end, it’s up to you. I’ve never held my­self out as the apex tastemaker nor the ty­pog­ra­phy po­lice. My project is to ed­u­cate writ­ers about these ty­po­graphic con­ven­tions, be­cause tra­di­tion­ally, these con­ven­tions aren’t taught along­side writ­ing. (Though they should be—in the dig­i­tal age, ty­po­graphic skill seems just as es­sen­tial as typ­ing skill.)

To that end, wher­ever there’s room for dis­cre­tion and choice, I say so. I want read­ers to de­velop their own ty­po­graphic judg­ment, not merely recre­ate mine, cargo-cult style. I de­ploy my sternest tone only for those ty­po­graphic con­ven­tions that are im­mov­ably en­trenched. In other words: let’s not waste en­ergy dis­put­ing the indisputable.

Ten years ago, I is­sued a stand­ing chal­lenge to the skep­tics: show me a book, news­pa­per, or mag­a­zine that uses two spaces be­tween sen­tences. No one ever has.

That’s why I start this book with one space be­tween sen­tences. Not be­cause it mat­ters much vi­su­ally. But rather be­cause the rule is so well set­tled. It’s a lit­mus test for ty­po­graphic skep­tics: if you can’t ac­cept that pro­fes­sional ty­pog­ra­phers al­ways use one space be­tween sen­tences, you’ll likely find the other rules a bore.

And some do. The cost of in­tro­duc­ing thou­sands to the plea­sures of good ty­pog­ra­phy is en­dur­ing a few heck­lers. I once gave a talk about ty­pog­ra­phy to a group of UCLA law pro­fes­sors. To­ward the end, one of them—known as the “em­pir­i­cal guy” on the fac­ulty—said “That’s all very in­ter­est­ing, Matthew, but why don’t ty­pog­ra­phers re­solve these mat­ters with em­pir­i­cal re­search? Surely it can’t be sub­jec­tive which font is best.”

“That’s a great idea, pro­fes­sor. By the way, have you re­searched what kind of law-re­view ar­ti­cle is most likely to get you tenure? How many words in the first sen­tence? Av­er­age num­ber of vow­els per paragraph?”

I didn’t say that; only thought it. Ob­vi­ously, it wasn’t a se­ri­ous ques­tion. This pro­fes­sor thought ty­pog­ra­phy was a silly topic, so he was dis­miss­ing it with the em­piri­cist’s ul­ti­mate put-down: you’re not be­ing em­pir­i­cal enough.

What I ac­tu­ally said to him went more like this—

A good sur­vey of leg­i­bil­ity re­search through the years can be found in Sofie Beier’s book Read­ing Let­ters: De­sign­ing for Leg­i­bil­ity (see bib­li­og­ra­phy).

Ty­pog­ra­phy—like lan­guage and every other form of hu­man ex­pres­sion—doesn’t oc­cupy a realm of strict ob­jec­tive truth.

That doesn’t mean ty­pog­ra­phers are hos­tile to the idea of re­search, or that leg­i­bil­ity can’t be tested. On the con­trary, many type­faces have emerged from forms of em­pir­i­cal re­search, for instance—

Retina was designed (originally for the Wall Street Journal) to stay legible on low-quality newsprint. Type designer Tobias Frere-Jones studied how ink spreads on newsprint, and cut notches into the letterforms to compensate. Though these “ink traps” look bizarre at large size, at small size they’re invisible. And instead of the letters looking blobby and gloppy, they just look correct.

Clearview, the new font approved for federal highway use, was designed by Don Meeker and James Montalbano. As a font for signage, it had to be legible for drivers at various speeds and light levels, and was tested under these conditions.

Microsoft’s Sitka font, designed by Matthew Carter, likewise emerged from legibility studies about on-screen reading in Windows.

The ex­am­ples above share an im­por­tant fea­ture. In each case, the type de­signer was asked to op­ti­mize leg­i­bil­ity in a spe­cific read­ing con­text. There­fore, it was pos­si­ble to pose re­search ques­tions nar­row enough to be turned into testable propo­si­tions. (Not in­signif­i­cantly, these projects also had a cus­tomer at­tached who could pay for test­ing.) In turn, these tests pro­duced re­sults that were suf­fi­ciently pre­cise to be­come con­crete de­sign guid­ance. In sum, em­pir­i­cal re­search was use­ful specif­i­cally be­cause the prob­lem do­main was narrow.

Let’s return to the law professor’s question: if we continued this process, could we discover the best font for everything? It looks hopeless. Research, by its nature, tests narrow questions. As I said in “What is good typography”, typography can’t be reduced to a math problem with one right answer. Likewise, it’s difficult to imagine a narrow research question about fonts whose results could be extrapolated to every possible context.

Still, even if em­pir­i­cal re­search can’t re­solve that many ty­po­graphic prob­lems, we needn’t de­clare ty­pog­ra­phy to be a do­main of pure whimsy. We can al­ways imag­ine ty­po­graphic so­lu­tions to a prob­lem that aren’t suit­able. In­deed, think­ing crit­i­cally about the in­tended read­ing con­text is al­ways help­ful “re­search” be­fore start­ing to de­sign a ty­po­graphic so­lu­tion. For in­stance, in pre­sen­ta­tions, I rec­om­mend turn­ing off the lights while work­ing on your slides, to bet­ter sim­u­late how your au­di­ence will see them. That’s not em­pir­i­cal. But as re­search, it’s bet­ter than nothing.

In that way, ty­pog­ra­phy func­tions much like writ­ten lan­guage it­self. We can—and should—use prag­matic con­sid­er­a­tions to nar­row down the space of pos­si­bil­i­ties. But when it’s time to choose from among those pos­si­bil­i­ties, there’s some art, hu­man­ity, and ex­pres­sive­ness to it. Just as no one can tell you the best open­ing sen­tence for your re­search pa­per, no one can tell you the best font for that pa­per, either.

Against that back­drop, if the main ques­tion we ask about a re­search study is what did it find?, the fol­low-up ques­tion ought to be what did it test?

First, let’s be clear about what this study didn’t test. Though the pa­per cites 48 sources—many psy­chol­o­gists, and a hand­ful of the afore­men­tioned in­ter­net crit­ics—it point­edly does not cite to any ty­pog­ra­phy au­thor­i­ties. Not me, not Erik Spiek­er­mann, not Robert Bringhurst, not Ellen Lup­ton, not Bryan Gar­ner, not the Chicago Man­ual of Style, not anyone.

But that seems right. Why? None of us are hold­ing out this rule as an em­pir­i­cal claim. Nor are we invit­ing ar­gu­ment. We’re merely re­port­ing that there is a long­stand­ing con­ven­tion grounded in pro­fes­sional prac­tice: one space. We’re not of­fer­ing a deeper jus­ti­fi­ca­tion for this rule, any more than a dic­tio­nary tries to jus­tify why cough, tough, dough, and through don’t rhyme. I’m not even claim­ing the rule has al­ways been thus (it hasn’t) or al­ways will be (ty­po­graphic prac­tice changes, al­though slowly). But in the here & now, the prac­tice—and there­fore the rule—is clear.

Hav­ing yielded this ground, I do think the study au­thors err by open­ing with the flawed premise “There has been a long de­bate on the topic of how many spaces should fol­low a pe­riod” and then sum­ma­riz­ing the “ar­gu­ments ... on both sides”. This is a du­bi­ous ful­crum for a sci­en­tific ar­gu­ment, be­cause it sets up a false equiv­a­lence. What would we think if a pa­per started “There has been a long de­bate about whether men landed on the moon …”? The fact that cer­tain peo­ple on Red­dit have long de­bated a topic does not mean there’s “a long de­bate” in the rea­son­able-per­son sense. It would seem fairer to ad­mit “The es­tab­lished prac­tice is one space, though a de­voted mi­nor­ity still ad­vo­cates for two. We wanted to find out: is there an em­pir­i­cal difference?”

Not sure what con­clu­sion I should draw about a pro­fes­sional as­so­ci­a­tion of sci­en­tists that re­jects ev­i­dence and au­thor­ity in other fields.

The er­ror of false equiv­a­lence also makes me won­der about the true mo­ti­va­tions for this study. The au­thors seem in­ter­ested in vin­di­cat­ing the Amer­i­can Psy­cho­log­i­cal As­so­ci­a­tion, a promi­nent con­trar­ian that stan­dard­ized on two spaces in its style man­ual some years ago. As the pa­per ex­plains, the APA orig­i­nally jus­ti­fied its de­ci­sion on the grounds of “ease of read­ing com­pre­hen­sion”, de­spite there be­ing “no di­rect em­pir­i­cal ev­i­dence in sup­port of these claims.” But the pa­per’s fi­nal para­graph be­gins “In sum, the cur­rent find­ings pro­vide em­pir­i­cal ev­i­dence for the change made to the APA Man­ual spec­i­fy­ing ... two spaces.” Why would any re­search psy­chol­o­gist spend time & bud­get chas­ing this down? Es­pe­cially since the APA al­ready proved it­self happy to fly in the face of pre­vail­ing habit. Mys­te­ri­ous. [Up­date: lead au­thor Re­becca John­son has con­firmed in an in­ter­view that pro­vid­ing em­pir­i­cal sup­port for the APA was a goal of the study.]

Any­how, back to the sci­ence. The study aimed to mea­sure the dif­fer­ence in read­ing ef­fi­ciency in para­graphs of text that dif­fered in the num­ber of spaces—two spaces or one—af­ter pe­ri­ods and, in­ter­est­ingly, com­mas. The sub­jects were 60 Skid­more Col­lege stu­dents “who were na­tive speak­ers of Amer­i­can Eng­lish and had nor­mal or cor­rected-to-nor­mal vi­sion.” In the study, the sub­jects first typed a para­graph of text, so the re­searchers could clas­sify each sub­ject as a ha­bit­ual one-spacer or two-spacer—also an in­ter­est­ing detail.

After that, the subjects were asked to read a series of 20 “experimental paragraphs” ranging from 71 to 166 words. A comprehension question was asked after each paragraph to confirm understanding. This gave a measure of reading speed and comprehension accuracy. Furthermore, as the subjects read the paragraphs, an eye-tracking device captured the movements of their right eye, providing further data about how the extra spaces affected reading.

So what did these “ex­per­i­men­tal para­graphs” look like? The pa­per tells us that the para­graphs were set in “14 point Courier New font” with “quadru­ple” line spac­ing, which I’ll take to mean quadru­ple the point size = 56 point. The au­thors don’t men­tion how wide the lines were, nor the color of the type, nor the back­ground (though these de­ci­sions would also af­fect readability).

So here’s my best guess of how that text looked, us­ing a para­graph of 89 words:

As Gregor Samsa awoke one morning from uneasy dreams, he found himself transformed in his bed into a gigantic insect. When he lifted his head a little, he could see his dome-like brown belly. The bed quilt could hardly keep in position and was about to slide off completely. His numerous legs, which were pitifully thin compared to the rest of his bulk, waved helplessly before his eyes. It was no dream. His room, a regular human bedroom, only rather too small, lay quiet between the four familiar walls.

The para­graphs were not printed, but rather dis­played on a “21 inch NEC Accu­Sync 120 mon­i­tor” run­ning Win­dows (ver­sion not spec­i­fied). This CRT dis­play, first made in 2002, sup­ports a max­i­mum 1600 × 1200 res­o­lu­tion. CRTs, how­ever, can be op­er­ated at mul­ti­ple res­o­lu­tions, and the pa­per doesn’t men­tion ex­actly which was used—an­other de­ci­sion that would af­fect read­abil­ity, since it would change the ap­par­ent size of the font. Still, pretty much what you’d ex­pect to find in the psy­chol­ogy lab of a lib­eral-arts col­lege—equip­ment handed down two up­grade cy­cles ago—though not what most read­ers are us­ing today.

To get a truer fla­vor of the ac­tual pixel dis­play, here’s the same text from above, dis­played by a Win­dows XP emulator:

At the out­set, I said I ac­cepted the study and its find­ings. Still, I can’t avoid point­ing out that this style of ty­pog­ra­phy is nei­ther com­mon nor re­al­is­tic—not the mono­spaced font, not the point size, and not the line spac­ing. (Cer­tainly, this sam­ple para­graph doesn’t com­port with the most ba­sic ad­vice about body text given here at Prac­ti­cal Ty­pog­ra­phy.) Maybe those dif­fer­ences help iso­late the is­sue of two spaces vs. one. Or maybe they con­found it, by mak­ing the sur­round­ing text more dif­fi­cult to read. Con­sider that in an ear­lier read­ing study, lead re­searcher Re­becca John­son used nor­mally spaced Calibri.

Am I be­ing un­fair? I don’t think so. Re­searchers in­clude these de­tails so that oth­ers can as­sess the cred­i­bil­ity of their meth­ods, and there­fore their find­ings. The two are in­ex­tri­ca­bly teth­ered. But no ty­pog­ra­pher would de­fend the leg­i­bil­ity of Win­dows text on a CRT dis­play from 2002. It was aw­ful then, and worse now. To­day, even an en­try-level smart­phone screen is eas­ier to read. And al­though no mono­spaced font is a mir­a­cle of leg­i­bil­ity, Courier New is one of the worst—I de­scribed it as “beastly ... spindly, lumpy, and just plain ugly.”

There’s no ev­i­dence that the re­searchers con­sulted a ty­pog­ra­pher on the de­sign of their study. I wish they had. Not be­cause ty­pog­ra­phers know best. Rather, be­cause that col­lab­o­ra­tion might’ve pro­duced test cases that led to more fruit­ful re­sults. As it stands, the re­searchers ended up test­ing the leg­i­bil­ity of type­writer habits. Given that a com­puter can dis­play any font, at any size, I would’ve pre­ferred that they use a wider ty­po­graphic va­ri­ety. Given how easy it would’ve been to pre­pare printed sam­ples, I would’ve pre­ferred that they not rely strictly on an an­cient CRT.

But as I said—I’m not the ty­pog­ra­phy po­lice. And I’m def­i­nitely not the ty­pog­ra­phy-re­search po­lice. Though I ac­cept the find­ings of the study, the ty­po­graphic con­di­tions seem overly—and un­nec­es­sar­ily—ar­ti­fi­cial. Yes, sci­ence is real. But that cuts both ways. We com­mit to fol­low the ev­i­dence wher­ever it leads. But some­times it doesn’t lead very far be­fore the trail goes cold.

To be fair to the au­thors, they don’t over­sell their find­ings. (Well, aside from the ex­pec­ta­tions set by their cho­sen ti­tle.) Ac­cord­ing to the study, there were “sta­tis­ti­cally re­li­able and ... de­tectable” im­prove­ments due to two spaces af­ter a pe­riod: cer­tain re­search sub­jects were able to read the pas­sages faster with­out los­ing comprehension.

But that con­clu­sion comes with some sig­nif­i­cant caveats:

The mea­sured im­prove­ment was in read­ing speed, not read­ing com­pre­hen­sion, and was “small in mag­ni­tude”—about 3%.

“Com­pre­hen­sion ac­cu­racy was high across all con­di­tions.” Mean­ing, read­ing com­pre­hen­sion was not bet­ter with two spaces vs. one.

“The passages ... were relatively short and may not have been long enough or difficult enough to detect subtle differences” between one space and two.

The au­thors ac­knowl­edge the prob­lem I men­tion above: that “the para­graphs ... were pre­sented in a mono­spaced” font and that “word proces­sors to­day uti­lize pro­por­tional fonts”.

They also ac­knowl­edge my broader point about the nat­u­ral­ism of the sam­ples, and that re­sults “may dif­fer when pre­sented in other font con­di­tions (or other writ­ing systems).”

But here’s the show­stop­per. Re­call that the re­searchers sep­a­rated sub­jects into two groups: one-spac­ers (i.e., those who or­di­nar­ily typed with one space) and two-spac­ers. The “small” im­prove­ment in read­ing speed was only de­tected among sub­jects “who al­ready type ac­cord­ing to this two-space con­ven­tion”. For one-spac­ers, there was no ben­e­fit at all.

For me, this dif­fer­ence be­tween the read­ing per­for­mance of one-spac­ers and two-spac­ers was the most in­ter­est­ing part of the study. Here’s the chart show­ing how each group per­formed (one-spac­ers on the left, two-spac­ers on the right):

No­tice four things about the chart—

Among the one-spacers, reading speed was basically the same regardless of punctuation spacing.

Among the two-spacers, the big improvement in reading speed came when reading text with the punctuation spacing that matched their own typing style—that is, two spaces after periods, and one space after commas.

No particular punctuation spacing improved or reduced reading speed for both groups.

And here’s the big one—two-spacers were faster readers than one-spacers regardless of punctuation spacing! This hoists a giant red flag that the outcome of this study was determined not by punctuation spacing, but other factors not directly tested.

No, I’m not pre­pared to the­o­rize why two-spac­ers seem to be gen­er­ally faster read­ers than one-spac­ers. I’ll leave that for the in­ter­net hordes to ar­gue for the next few years. It’ll be a nice change of pace.

Mean­while, my ad­vice will re­main the same: one space be­tween sen­tences. Inas­much as the study showed a ben­e­fit for only one sub­set of read­ers, I’ll de­clare Bet­teridge’s Law safe as well. Are two spaces bet­ter than one? No. I just saved you $39.95.

—Matthew But­t­er­ick

30 April 2018