Strike Out

Reading Unedited Text

by John Walker

July 2005

Prior to the advent of large-scale data networks, and apart from communication among family and friends, the vast majority of text read by people was edited—reviewed and corrected by a professional editor before it was published in a newspaper, magazine, or book. Direct, computer-mediated communication between individuals has produced a grand disintermediation of communication; the medium of text has escaped the bounds and the mediation of the traditional media. This, like most things, has its good and bad points.

On the positive side, the rôle of the editor as gatekeeper has been largely transcended; everybody has potential access to an on-line audience of enormous size, and there is no entrenched power which determines whose work will be widely read. In principle, only merit and the ability of a writer to make joyful noise, generating awareness of their work, determine how broad an audience it will reach. On the other hand, one of the functions editors have provided over the centuries is sending back the work of drooling morons to their own respective mailboxes and fixing up the prose of those insightful enough to have something interesting to say yet sufficiently incompetent to stumble over the language between idea and its expression in words on the page.

When we move beyond the age of the editor, into this brave new world of newsgroups, discussion boards, weblogs, and the like, we dispense not only with the constraint which filters the words of writers, but also with the quality control which ensures they're ready to meet the eyes of readers. Modern-day readers of unedited text are faced with a challenge which never faced their ancestors reading letters to the editor: any elbow-typing moron can foul the intellectual commons with their incoherent and/or inane ravings. How can one cope with this?

An approach which works for me is what I call “Strike Out” or “One strike and you're out”. When reading unedited text in any medium, when you encounter the very first misspelled word, ungrammatical construct, or gratuitous obscenity, simply let the text mentally fade to white, avert your eyes, and skip to the next item by another author. In effect, you're telling the authors of prose you read, “My time is important. If you can't take the time to spelling check and read over the messages you post, then why should the thousands of people to whom they are addressed have the slightest regard for what you have to say? If you are so unimaginative that the only way you can add emphasis to your words is to sprinkle crude, uncultured obscenities among them, why should I admit you to those whose opinions I value?”

Note: I do not, in any way, intend to impugn English-language postings by people with a different mother tongue. Acquiring a language, especially in adulthood, is tough, and unlimited slack is due to those who have done so and have the courage to use their language skills in fora where criticism can be swift and devastating. I find it quite easy to distinguish the errors made by those writing English as a second (or third, or fourth, etc.) language from the slapdash scribblings of marginally literate Anglophones, although this may be due in part to my own experience using (and abusing) a second language in my day-to-day life.

Of course, to an engineer like me, the test of any scheme is how well it works, and anything worth considering should be tested under the most demanding circumstances available, so I will draw the example for this document from a discussion on the “Slashdot” site which, by the evidence, is disproportionately frequented by morbidly obese hateful people with extremely limited social skills who cannot spell. I do not recommend you visit this site, but it is a formidable proving ground for anything involving bad writing. I chose, as the test case, a discussion which took place on July 6th, 2005, regarding an individual who was arrested for poaching Internet access from somebody with an unsecured wireless access point.

Let's start with a message which is on topic, well written, and thus doesn't strike out. This one we read to the end and go on to the next.

Re:Open doors (Score:5, Insightful) (#12992166) It's more like sitting on the sidewalk outside someone's house at night. Their porch light is on and you're reading a book by that light. One could say you're using the light they paid for without their permission. On the other hand, they're letting the light spill out into public land.

Now let's move on to some messages which struck out. I've shown the word on which the message struck out in red with a line through it, after which the message “just fades away”. The unread part of the message is shown in the background colour to indicate how much reading you saved by striking out the message, with additional strikes within the message highlighted to indicate what you avoided by ceasing to read at strike one. The formidable challenge posed by the humble apostrophe strikes out the following message in the second sentence.

Re:Open doors (Score:5, Interesting) (#12991772) That is an interesting point that you've brought up. It is completely opposite way of thought than how American's have previously thought about property. For example how many of you grew up and left doors unlocked to your house or car all the time. I for one never locked my car doors at home nor the front door to my house. It is your private property and you never expect anyone who wasn't welcome to break those boundries, but we have welcomed the Internet with it's complete opposite point of view. I wonder if this same ideal is why people don't bother securing wireless even when most have some grasp of the reprocutions of not securing their wireless.

Our next batter swings and misses on the first character (although I've shown the first three words to indicate things continue to go downhill from there), being one of those Slashdot denizens who is so eager to expose their thoughts before a global audience that there isn't time to depress the Shift key when socially constructed language conventions demand it.

Water, water everywhere but not a drop to drink (Score:4, Interesting) (#12991499) and i supose if you go and drink water from a public fountatin i should be arrested too for the fact the water is open to the public and not locked down. Sounds like they dont want to take fault for not fencing up a public oasis in the middle of no where because you know if it isnt yours its owned already by some one else more powerful and richer then you. Also what if the wifi is a public wifi by choice for the people to use? is it still stealing then?

Now we encounter one of those folks who is so intimidated by the apostrophe that they eschew it entirely, striking out on the second word.

Signal Strength (Score:5, Interesting) (#12991979) Ok lets just say for arguments sake that he wanders with his laptop to the opposite side of his house, far away from his own wireless access point. The computer sees the other access point has a stronger signal and latches on to it during a break in communication with his own access point. He is unaware of the change and continues with his business. Are the default settings for wireless access communication illegal? What would stop someone from plugging in a wireless access point boosting the signal strength and calling the police any time someone accidentally connects? I live in an apartment complex with about 7 other visible access points. I occasionally get bored and plug in a spare access point with no internet connection attached to see who accidentally locks on to me and loses their internet access.

Our next contributor to the discussion is one of those people who is so clever they've come up with the idea to use the vulgar Anglo-Saxon word for copulation as a term of emphasis, striking out thereby in the first sentence.

This Story Isn't About WiFi... (Score:5, Insightful) (#12991996) It is about the fact that the guy was a obscenity creep. Seriously- if he REALLY thought what he was doing was OK, why did he act all cagy and close the laptop/drive away every time the homeowner saw him? WiFi or not, this guy was acting strange in front of someone's home in such a way that I think it would probably freak most people out. The cops used the WiFi excuse just to bust the guy and I say jolly good show on them. I would feel very diferently if the guy simply said to the homeowner who he was and the fact that he was surfing on his net connection, but he didn't.

Now we have a call to action, exhorting readers to call the office of a public official whose title figures in no English dictionary of which I am aware.

Re:Open doors (Score:4, Informative) (#12994481) The prosecuter's office that is handling this case can be reached at 727-555-6221. I suggest we let them know that if you broadcast an SSID into the public airwaves and then grant DHCP leases across it you are authorizing access to your network.

Strikeout Statistics

How well does striking out messages work, in terms of how much slipshod writing (and the presumably sloppy thinking it transmits) you avoid reading? Taking the examples above, which were picked essentially at random from the discussion thread, we find that a total of 34 words were read before the respective messages struck out, from a total of 444 words in the original postings (not counting headers or identification information). Striking out the messages thus eliminated more than 90% of the text you'd otherwise have read in these messages.

Another way to look at the effectiveness of striking out messages is to examine the percentage of all messages which strike out. I went through the entire discussion thread, looking only at messages moderated at levels +4 and +5 (the highest), and ignoring any messages moderated as “Funny” (including them would doubtless have increased the strikeout totals, but my constitution is not up to reading missives deemed “funny” by Slashdot regulars). With these pre-filters, a total of 34 messages made it through without striking out, while 28 struck out. Thus, about 55% of messages survived the strikeout filter. Figuring that striking out a message reduces the amount of it you read by about 90%, one can estimate that adopting the strikeout rule cuts the amount of bad text you read in unedited venues by about half, which is not bad for a simple mental trick. However, one can imagine technological fixes which could go well beyond this.
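The arithmetic behind these estimates can be checked with a short Python sketch. The word and message counts are those quoted above; the variable names are purely illustrative, and the final step assumes (which the text does not state) that struck-out and surviving messages are of roughly equal length on average.

```python
# Figures quoted in the text above.
words_read = 34      # words read before the sample messages struck out
words_total = 444    # total words in the sample messages
survived = 34        # +4/+5 messages which did not strike out
struck_out = 28      # +4/+5 messages which struck out

# Fraction of a struck-out message you never have to read.
fraction_skipped = 1 - words_read / words_total

# Fraction of messages which survive the strikeout filter.
fraction_surviving = survived / (survived + struck_out)

# Assumption: messages are of similar average length, so the share of
# text avoided is (share of messages struck out) x (share of each skipped).
text_avoided = (1 - fraction_surviving) * fraction_skipped

print(f"skipped per struck-out message: {fraction_skipped:.1%}")
print(f"messages surviving the filter:  {fraction_surviving:.1%}")
print(f"estimated share of text avoided: {text_avoided:.1%}")
```

Under that assumption the rule spares you roughly 42% of the text at these moderation levels, which is in the neighbourhood of the “about half” estimated above.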

The Banish Button

One can imagine extending the strikeout concept so you never see further messages from an author you've struck out. Imagine if your favourite discussion board or blog comments displayed a Banish Button next to each posting, like the one I've affixed to the bottom right of the message below. As soon as the message strikes out, you simply press the button, and the message disappears along with all other messages from that poster, present and future. Try it on the message below. (For typographic consistency, I show the message like others which have struck out; in the actual implementation, the message would completely disappear, leaving only a placeholder in case you decided to give it a reprieve. And, yes, I'm aware the image swapping if you toggle the button several times doesn't always work with Internet Exploder; if you find this intolerable, switch to a better browser.)

Should charge the idiots who leave in unencrypted (Score:5, Insightful) (#12991486) If microsoft left xp disks at street corners unattended complete with legal cororate serial numbers would they be surprised if people were using them? Same idiocy here. Leave a network open and someone's going to get in. If you're lucky it's just for free internet.

Implementing this would simply require the bulletin board or blog software to maintain a list of banished posters for your login, and hide messages from them when preparing pages for your scrutiny. (Obviously, if you logged in anonymously, you wouldn't have access to this service [you might be able to banish messages during the current session, but the list of banished posters wouldn't be saved for future visits to the site]. But that's fine—most site operators want to encourage users to log in, anyway, and this would be another way to add value for those who do.)
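As a rough illustration of how little machinery this needs, here is a minimal Python sketch of a per-login banish list. Everything in it (the `Message` and `BanishList` names, the methods, the sample posters) is hypothetical and invented for this example; it is not taken from any actual bulletin board or blog software.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Message:
    author: str
    text: str


@dataclass
class BanishList:
    """Hypothetical per-login list of banished posters."""
    banished: Set[str] = field(default_factory=set)

    def banish(self, author: str) -> None:
        # Called when the reader presses the Banish Button on a message.
        self.banished.add(author)

    def reprieve(self, author: str) -> None:
        # Called if the reader relents; no error if the author wasn't banished.
        self.banished.discard(author)

    def visible(self, messages: List[Message]) -> List[Message]:
        # Applied by the server when preparing a page for this reader.
        return [m for m in messages if m.author not in self.banished]


# Usage: one banish hides every message from that poster, present and future.
thread = [
    Message("alice", "A well-written comment."),
    Message("bob", "i supose this strikes out"),
    Message("alice", "A follow-up."),
]
my_list = BanishList()
my_list.banish("bob")
shown = my_list.visible(thread)
print([m.author for m in shown])
```

For an anonymous visitor, the same object could simply live for the duration of the session and be discarded afterwards, rather than being saved with a login.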

If Web-wide identification systems such as Gravatar and OpenID catch on, it will be possible to banish posters globally—somebody you banish for an obscene or inane blog comment will disappear from all blogs you read which use the same identity system. Of course irritating people can make up multiple identities, but if the software provides them feedback which indicates how quickly and broadly they get banished every time they pop up under a new name, perhaps they will eventually be deterred by the knowledge that nobody is reading their ravings.

“Spelling Fascists”

I'm certain that my reward for all the five or six minutes of patient, unremunerated toil I've invested in describing how to cope with a Web bursting with unedited text will be castigation as a “spelling fascist”. So be it—it's a free network, at least for the moment. But let's consider some of the arguments (or “arguements”, as those advancing them may prefer to write) in favour of illiterate prose and see whether they hold water.

It's the message that counts, not the spelling.

Well, yes … but if you're asking me to ponder your judgement of the merits of, say, one operating system, desktop environment, application suite, or programming language against another, or the fine points of a political or legal argument grounded in a discussion of history, I'm inevitably going to weigh what you say according to my own perception of how authoritative an observer you are of the domain about which you're declaiming. Now all of these things: operating systems, desktops, applications, languages, history, law, and politics, are thousands of times more complicated, especially as regards their comparative merit to diverse individuals, than, say, the proper use of the possessive case in English or the distinction between “its” and “it's”. If the writer can't master the latter, how much weight should one give to their opinions of matters which require far more attention to detail?

The Internet is a dynamic medium; there's no time to carefully proofread text before posting it.

Consider what the folks who advance this argument are saying. “I'm in such a hurry that I can't be bothered to critically read what I've written before I dispatch it to be read by hundreds, thousands, or millions of other people. My time is so valuable, the five or ten minutes it would take to spelling and grammar check my posting, then read it over for coherence and edit it accordingly cannot be justified. Better all of my readers spend the time to figure out what I was trying to say than I spend a minute making it clear.” And these guys want us to read their scribblings?

Questions and Quibbles

Since posting this document, I've received more comments and questions than were occasioned by anything I've posted on my Web site in years. This wasn't entirely unexpected, as I was aware that people might read some of what I have to say as incendiary, but I was a bit surprised since I didn't in any way highlight the document on the site nor front-load it with controversy.

The experience has, to some extent, restored my faith in the maturity of the on-line community. First of all, without a single exception to date, the comments I've received were well written and reasoned, and even when they took extreme exception to what I had to say, argued on the issues as opposed to descending to ad hominem arguments, maledicta, or invective. The comments I've received so far have both highlighted things I'd thought about but didn't include in the first draft of this document and raised issues I hadn't considered which, after some head scratching, I'll address in the sections below.

Why is correct spelling so important? Why does bad spelling so offend you?

Shortly after posting “Strike Out”, I came across a superb discussion of this issue in an interview with Bernard Pivot in the July 2005 issue of the French magazine Historia, in which he begins when asked whether we're living in the final days of correct spelling (my translation): “No, because [spelling] remains a politeness one owes to our language, and a politeness one owes to those to whom one writes.” Yes! I think that's one of the reasons I'm offended by bad spelling: because it's fundamentally as impolite as insulting somebody, especially in an age where it is almost 100% avoidable at the cost of a few seconds of additional effort (running a spelling checker).

But consider that much of the Internet has always been observed to be an impolite medium: people say things in E-mail messages and newsgroup postings which they would never think of saying face to face, and almost any unmoderated/edited forum is likely to develop recurrent and unending flame wars. This, I think, may be a corollary of Heinlein's observation that “An armed society is a polite society.” If you insult somebody to their face, there's always the possibility you're going to get a punch in the nose or worse. When there is no possibility of immediate retaliation other than words, and even more when individuals lurk behind anonymity and pseudonyms and thus escape all accountability, you end up with a coarse and impolite society like the Internet. Perhaps the sloppy writing and easy resort to invective are consequences of an on-line society without accountability or recourse.

Aren't you concerned about missing a nugget of wisdom in a misspelled or ungrammatical message?

In almost every circumstance when you're reading unedited text, whatever you read has already been heavily filtered before you read it. I went back and ran the numbers for the Slashdot thread I used as the test case, and found that by reading at moderation level 4 and above, 95% of the posted messages were discarded by the moderation setting before I ever saw them. Now that filtering was done by people unknown to the readers, based on criteria known only to the anonymous moderators of the moment. Doubtless some valuable information was lost in that filtering process as well, so I'm not overly concerned about losing a bit more with my filter.

Even in completely unmoderated Usenet groups (which, I'll admit, I've found too painful to read since the mid-1990s), what you read is still filtered, randomly, by the mere fact that you can't possibly read it all. Unless somebody is totally obsessed and spends their whole life reading the newsgroup, they're going to miss some things simply because messages scrolled off when they weren't looking. Again, losing some content to a filter should be judged compared to how much is lost even without one.

The fact that on-line discussions tend to contain a lot of redundant content means that if you miss something, you're likely to run into it again in a subsequent posting (and again, and again…). I call this “the principle of abundance”: what you gain by reading unedited media is such an abundance of messages that regardless of how you filter them, as long as it's vaguely rational, you will eventually find the information you need because anything worth saying is said repeatedly; if you skip a poorly-written message with valuable content, odds are there's a better-phrased message with the same information further down the list.

Isn't it arrogant to insist on perfect spelling and grammar in messages you're reading for free?

The fundamental question is whether there is a correlation between the quality of writing and the value of the message content. I have found that in the unedited venues I read there is such a correlation and that it is quite high. Other people disagree, but maybe that's because they're reading other material; if they do not see the correlation I do, then they should not use the strike out rule, which is only a personal heuristic, not a proposal for an Internet standard!

One data point I find interesting in this regard is Jerry Pournelle's site, where he posts letters from readers on many of the same topics as Slashdot articles. Jerry selects the mail he posts based on interest to his audience (and the degree of filtering is quite high), but he does not edit letters for spelling or grammar: what he gets is what he posts. Well, it turns out that far fewer of these E-mail messages strike out than do messages on Slashdot, which to me says there is a correlation between content and good writing. On the other hand, on some of the law professor blogs, you hardly ever see a message strike out (even in the comments, not just the postings). You don't get to be a law professor without superb written communication skills, even if your ideas are complete gibberish, so here the strike out rule is of little use.

“Arrogance” is defined as having an inflated sense of self-worth or importance. Is it arrogant to value one's time and try to avoid wasting it by reading messages which don't contain information of sufficient value to justify the time spent reading them? If so, then colour me arrogant. Choosing not to read all of somebody's message is not an insult, affront, or attack upon the author, nor does it imply a claim of superiority; it's simply a judgement that the time spent reading the rest could be better spent reading something else, a decision made innumerable times by anybody reading a newspaper or magazine.

Your “Banish Button” is nothing more than a news reader kill file entry! Why make such a big thing about it?

Precisely! For more than two decades, programs for reading Usenet news groups have supported “kill files”, which permit hiding messages based upon their content. When somebody despoils a news group, one need only add the sender's identity to the kill file, and it's as if they never posted further messages, as far as that reader is concerned. News reader kill files are a “client side” implementation of filtering. The “Banish Button” I propose is simply a server side implementation of the same thing, with the added fillip that with the advent of portable cross-site identities, banishing a user in one forum will hide their messages in all others where they post with the same identity (precisely as a kill file entry for a user name will hide postings by that user in all Usenet news groups).

Revised August 13th, 2005