Language defines what we are. Some linguists argue it’s the very essence of being human. It’s the key skill that sets us apart from animals. Yet while we use language routinely, on a daily basis, it’s something we don’t fully comprehend. It’s difficult to understand the mechanics of language consciously. Until recently, the sheer scale of language defied comprehensive analysis, but that wasn’t for want of evidence. As you can see, we’re surrounded by evidence of the use of language. There is a sea of words around me here, but of course, without suitable help, an analyst can drown in this sea of words, so we need to step out of the age of paper and ink.

The computer has changed everything. For the first time, we’re able to rapidly and reliably search through millions or even billions of words of data. At the same time, electronic publishing has made electronic language data, in the form of texts, available to us on a scale that’s quite unprecedented. We can gather those texts together into a body of data called a corpus, the plural of which is corpora, that we use to study language on a computer. Now, the development of such corpora is leading to a golden age in the study of language. For the first time, as these vast collections of data become available, we can easily study language across a range of languages and even back through time.

By entering the digital age, analysts are able to search for patterns that would probably defy analysis by hand and eye alone. Take, for example, the word “tendencies”. It’s usually associated with negative things. Now, some of you may not have known that. Some of you may have suspected it. The great thing about using corpus data is you can look into the data. If you didn’t know it, you’re shown it. If you suspected it, you can confirm your suspicions. Now, this revolution in the study of language has probably touched on your everyday life already. Dictionaries, grammars, spell checkers, grammar checkers, speech synthesis systems, even web search engines, to some extent, rely on these insights into language provided by corpus data.
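To give a rough sense of what looking into the data can mean in practice, here is a minimal sketch in Python that counts the words appearing next to a search word. The function name `collocates`, the `window` parameter, and the three example sentences are all invented for illustration. They are not real corpus data and not the tools used on this course, so the output says nothing about how the word actually behaves in English.

```python
import re
from collections import Counter


def collocates(text, node, window=3):
    """Count the words found within `window` tokens on either side of `node`."""
    # Very crude tokenisation: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            counts.update(left + right)
    return counts


# Invented toy text, for illustration only; NOT real corpus data.
toy_corpus = (
    "He showed violent tendencies as a child. "
    "The report noted suicidal tendencies in some patients. "
    "She has artistic tendencies."
)

# Look only at the words immediately before and after the search word.
result = collocates(toy_corpus, "tendencies", window=1)
for word, freq in result.most_common():
    print(word, freq)
```

With a real corpus of millions of words, the same kind of count is what lets you see which neighbouring words turn up again and again, and so check a suspicion against the evidence rather than against intuition.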

On this course, you’ll learn about the range of applications of corpus data in the study of language, both in linguistics and beyond it, in the social sciences for example. Importantly, you’ll also get a sense of what it’s like to study at Lancaster University. You’ll have lectures, practical tasks, readings, additional lectures, and discussions available to you each week. So I welcome you to join me on this journey into language. I think you’ll find it interesting. You’ll certainly find it empowering because, by the end of the course, you too will be able to carry out some of these analyses on your own.
