On 15 December 2006, Nate Anderson posted a piece on the online journal Ars Technica entitled "Are iPods shrinking the British vocabulary?" In it, Anderson reports on research by Lancaster University linguistics professor Tony McEnery. According to the Ars Technica report:

McEnery found that one-third of most teenage speech was made up of only 20 common words like "yeah," "no," and "but." This is problematic for teenagers seeking jobs in the corporate world, where at least some level of professionalism is required when communicating with others.





The report finds that "technology isolation syndrome" is part of the problem. Teenagers spend increasing amounts of time immersed in television, video games, and music from their iPods - activities where they listen rather than speak. As a result, they don't get much practice at communicating clearly with others, and they aren't exposed to a wide vocabulary.

This appears, at first blush, to be very striking news. Are iPods, television, and video games destroying young Britons' ability to communicate? Are teenagers' vocabularies shrinking?

Probably not.

A striking aspect of the Ars Technica report is that it does not link to McEnery's work at Lancaster University. Instead, it refers to a similarly alarmist report from BBC News.

The unnamed reporters from the BBC appear to have interviewed Professor McEnery, and to have consulted a press release from Lancaster University. According to the BBC report, teens have an average vocabulary of about 12,600 words, compared to about 21,400 words for young adults aged 25-34. Of particular interest to the BBC reporters is this claim attributed to McEnery: "[The words 'no' and 'but'] occur in the sequence 'but no' or 'no but' almost twice as frequently in teenage speech as it does in young adult or middle aged speech."

These collocations are of interest since they are used to parody British teen speech in the television program "Little Britain". The BBC reporters can thus provide 'scientific proof' that the parody reflects reality. They quote linguist McEnery as saying, "When things are funny it is because they ring true with people."

What, then, should we make of the claims presented by Ars Technica and the BBC? Let me take what I see as the three central claims, each in turn.

1. One third of most teenage speech is made up of twenty common words.



This claim is probably true. If so, however, it is utterly unremarkable. As early as 1935, George Kingsley Zipf noted that the most frequent word types in a natural language account for the majority of word tokens. (In corpus linguistics, type refers to the general category - say, the word the - while token refers to one instance of the type, such as a single occurrence of the in a text.) Zipf's law states that a word's frequency is roughly inversely proportional to its frequency rank: the most frequent word will occur about twice as often as the second most frequent, which will in turn occur about twice as often as the fourth most frequent, and so on. It is not surprising, then, that a small number of word types accounts for most of the tokens produced.

In fact, in collections of English speech and writing such as the Brown Corpus or the British National Corpus, the top twenty words usually account for about a third of all words in the corpus, depending, among other things, on how you define "word".
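To make the arithmetic concrete, here is a minimal Python sketch. It is my own illustration, not anything taken from McEnery's study or from the corpora named above. The first function assumes an idealized Zipf distribution, in which a word's frequency is exactly proportional to 1/rank, and the vocabulary size of 50,000 types is an arbitrary assumption. The second function measures the same share on any list of tokens you supply; the function names are hypothetical.

```python
from collections import Counter
from math import fsum


def zipf_top_k_share(k: int, vocabulary_size: int) -> float:
    """Share of tokens covered by the k most frequent types, assuming an
    idealized Zipf distribution (frequency proportional to 1/rank)."""
    top = fsum(1.0 / rank for rank in range(1, k + 1))
    total = fsum(1.0 / rank for rank in range(1, vocabulary_size + 1))
    return top / total


def observed_top_k_share(tokens: list[str], k: int) -> float:
    """Share of tokens in an actual sample covered by its k most frequent types."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    top = sum(count for _, count in counts.most_common(k))
    return top / len(tokens)


if __name__ == "__main__":
    # Assumed vocabulary of 50,000 types; the real figure depends on the corpus
    # and on how "word" is defined.
    print(f"Idealized share of the top 20 types: {zipf_top_k_share(20, 50_000):.2f}")
```

Under these assumptions the top twenty types cover a little under a third of all tokens, which is in line with the figure reported for teenagers. Real corpora only approximate Zipf's law, so the exact share will vary with the corpus and the tokenization.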

Linguists Mark Liberman, Geoff Pullum, and Arnold Zwicky have recently written quite a lot on their blog, Language Log, about the BBC's treatment of this non-story, and their posts include a response from Tony McEnery. I'll say no more about it here.

2. Teens have an average vocabulary of 12,600 words, compared to 21,400 words for young adults.



This claim is very difficult to evaluate, given the difficulty of defining vocabulary and words. What does it mean to have an item in one's vocabulary? Does recognizing it in context (called passive vocabulary) count? Or must one be able to speak or write it (called active vocabulary)? The Cambridge Encyclopedia of the English Language (Crystal 1995) refers to an apparently unpublished study in which three subjects (an office secretary, a business woman, and a university lecturer) estimated their active and passive vocabularies by marking the headwords in an unspecified dictionary that they used and recognized. The three subjects had estimated active vocabularies of 31,500, 63,000, and 56,250 words, and passive vocabularies of 38,300, 73,350, and 76,250 words, respectively. It seems clear that McEnery's methods must have differed from those used in this study.

It may seem easy enough simply to count the number of words each subject uses or recognizes. However, this assumes that words are discretely defined, which is not the case. A single lexeme (the minimal unit of a lexicon) may have multiple forms. The lexeme GO, for instance, has the forms go, goes, gone, and went. Does this count as one word, or four?
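A minimal Python sketch shows how much the answer depends on the counting unit. This is purely my own illustration: the tiny lemma table is hand-made and hypothetical, the function names are invented, and nothing here reflects how McEnery actually counted.

```python
# Hypothetical, hand-made lemma table for illustration only; a real study
# would need a full lemmatizer or dictionary.
LEMMAS = {"go": "GO", "goes": "GO", "gone": "GO", "went": "GO"}


def count_forms(tokens):
    """Count distinct word forms, ignoring case."""
    return len({token.lower() for token in tokens})


def count_lexemes(tokens, lemmas=LEMMAS):
    """Count distinct lexemes; forms missing from the table count as their own lexeme."""
    lexemes = set()
    for token in tokens:
        form = token.lower()
        lexemes.add(lemmas.get(form, form.upper()))
    return len(lexemes)


if __name__ == "__main__":
    sample = "I go she goes he went they have gone".split()
    print("distinct forms:", count_forms(sample))
    print("distinct lexemes:", count_lexemes(sample))
```

On this nine-word sample the first count treats go, goes, went, and gone as four words, while the second treats them as one. Scaled up to a whole corpus, that choice alone can move a vocabulary estimate by thousands of words.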

It's not clear how McEnery resolved these issues for the purpose of his research. The press release from Lancaster University describing McEnery's study, though, provides a very sensible analysis of the reported difference in vocabulary sizes. According to the statement, "The research clearly demonstrated that teenagers are still developing their oral communication skills, underlining the need to ensure that they are given appropriate support by schools in doing so." In other words, those who are currently undertaking secondary education know less than those who have completed high school or even college. This seems utterly commonsensical, though it will probably attract fewer readers than claims that contemporary young people are failing to live up to the standards achieved by their elders.

3. "Technology isolation syndrome," caused by over-use of television, video games, and iPods, is part of the problem.



This is the issue that first got linguistic anthropologists - or at least, some contributors to this blog - interested in the claims. Alexandre Enkerli suggested that the Ars Technica piece presented a reductionist form of linguistic determinism. Enkerli noted that coverage of the study seemed to reiterate tired 'kids these days' discourses without using the study "as an opportunity to see the actual connections between technological developments, social changes, and language change." It seems that press coverage not only played up the technology angle - it introduced it.

As with other elements of this story, "technology isolation syndrome" appears to originate not from any academic study, but from the 12 December BBC piece. The Lancaster University press release makes no mention of the supposed syndrome, and it is not mentioned in any academic studies I can find. A Google search for "technology isolation syndrome" finds fewer than 400 references, all apparently variants of the BBC piece.

None of these reports is very close to descriptions of the study by Lancaster University News or Tony McEnery. (The research itself was carried out for a private sponsor and is confidential.) In fact, it appears that only one source, the BBC, had any direct contact with McEnery or his research. The other sources, mostly technology-related blogs, relied on the BBC report. That report contains several quotes from McEnery, which reflect a desire to improve the teaching of speech in British schools.

As the story has moved to technology blogs, this focus on the teaching of spoken English has largely disappeared. Instead, one off-hand comment gets all the press: "This trend, known as technology isolation syndrome, could lead to problems in the classroom and then later in life."

Nowhere does McEnery mention television, video games, or iPods. The original study was, however, based on a corpus of speech (10,000,000 words) plus writing in blogs (100,000 words). This may be the slim foundation on which the edifice of "technology isolation" reporting rests.

According to McEnery, "[The] work itself was widely misrepresented in the press. I wrote a study looking at difference and, predictably, the press translated that into a discourse of deficiency." He directs interested parties to the Lancaster press release, which he says is "something closer to the spirit of the original report."

According to that press release:

New research by Professor Tony McEnery of the Department of Linguistics and English Language argues that it is important that we remember that teenagers are still developing their linguistic skills not merely in reading and writing, but also in oral communication. Schools need to focus on the development of speaking skills just as much as they need to focus upon the development of reading and writing.



...



Professor McEnery's research looked at the communication skills of 200 teenagers with an examination of 10,000,000 words of transcribed, naturally occurring speech from across the UK collated in a language database as well as 100,000 words of data gathered from blogs written by teenagers. The research clearly demonstrated that teenagers are still developing their oral communication skills, underlining the need to ensure that they are given appropriate support by schools in doing so.

This is sensible enough, with no trace of linguistic determinism, technophobia, or the fall of British society. On the other hand, it probably won't sell much advertising.

Crystal, D. 1995. The Cambridge encyclopedia of the English language. Cambridge: Cambridge University Press.



Zipf, G.K. 1935. The psycho-biology of language; an introduction to dynamic philology. Boston: Houghton Mifflin.

-Chad Nilep