IF FORCED to pick my favourite part of the history of English, I’d be torn. There are so many to choose from. Would I pick the Great Vowel Shift, the mid-millennium change in pronunciation that largely explains English’s inconsistent spelling? Perhaps I’d turn to colonial times, when English vocabulary ballooned. I do like Noah Webster’s attempts to change American English spelling in the name of efficiency, too.

But my favourite must be the Norman invasion of 1066. When the Normans, who spoke a dialect of Old French, ruled over England, they changed the face of English. Over the ensuing two centuries, thousands of Old French words entered English. Because the ruling class spoke Old French, that set of vocabulary became synonymous with the elite. Everyone else used Old English. During this period, England's society was diglossic: one community, two language sets with distinct social spheres. Today, English-speakers pick and choose from the different word sets—Latinate (largely Old French borrowings) and Germanic (mostly Old English-derived words)—depending on the occasion. Although English is no longer in a diglossic relationship with another language, the Norman-era diglossia remains reflected in the way we choose and mix vocabulary. In informal chat, for example, we might go on to ask something, but in formal speech we’d proceed to inquire. There are hundreds of such pairs: match/correspond, mean/intend, see/perceive, speak/converse. Most of us choose one or the other without even thinking about the history behind the split. Germanic words are often described as earthier, simpler, and friendlier. Latinate vocabulary, on the other hand, is lofty and elite. It’s amazing that nine hundred years later, the social and political structure of 12th-century England still affects how we think about and use English.

English isn’t alone in having this sort of split personality. Halfway across the world, languages spoken in southern India underwent similar changes. Kannada, Malayalam, Tamil, and Telugu, the four major languages spoken there, are Dravidian languages. They are structurally unrelated to the languages of northern India, which are Indo-European. But Sanskrit, an Indo-European language of ancient India and the liturgical language of Hinduism, has held prestige all over the subcontinent for over two thousand years. Kannada, Telugu, and Malayalam—and to a lesser extent Tamil—have absorbed, and continue to absorb, thousands of Sanskrit words. (A relatively recent movement among Tamil-speakers aimed to expunge the Sanskrit borrowings.) Much of southern India, just like Norman England, was diglossic between Sanskrit (used ritually and formally by Hindu elites) and vernacular Dravidian languages. Today, that diglossia is gone, but Sanskrit-derived vocabulary still forms an upper crust, mostly pulled out for formal speech or writing.

Some writing, especially poetry, still slants toward native vocabulary. Two influential religious movements among Hindu Kannada-speakers, the 12th-century Lingāyat and the 16th-century Haridāsa movements, treasured simple Kannada poetry. These movements arose in part to spread religious teachings beyond Sanskrit-educated elites to the common people. Works written then are largely devoid of obvious Sanskrit borrowings. To many Kannada-speakers, those works are softer and folksier than stiffer Sanskrit-heavy works. But caste and class politics didn’t end then, of course. Sanskrit still holds sway in India today, officially one of the "scheduled" languages listed in the constitution. It sometimes seems like any Kannada newscaster or speechwriter worth his salt swears by a Sanskrit dictionary. Sanskrit borrowings are used all over the place in order to sound proper, even when it sounds strange. (My favourite example of strained usage is the upscaling of “toilet” to shauchālaya, “abode of cleanliness”.) In the most tortured formal writing, Sanskrit words might just be strung together with Kannada grammatical endings. This has the strange consequence of allowing speakers of unrelated languages like Hindi to take a stab at translating the text. (Hindi, as it happens, is also split between the Sanskrit-heavy shuddh, “pure”, Hindi, popular in government and academia, and colloquial Hindi, which makes greater use of Arabic and Persian borrowings.) There’s some sweet spot in the middle of both extremes. Good writers seem to get it best.

It has always fascinated me how the Sanskrit/Dravidian divide in Kannada is so strikingly similar to the Latinate/Germanic divide in English. In English, word choice is often used to judge someone's class or education. In Kannada, caste is also mixed in. Picking certain words over others can have social consequences, branding the speaker or writer according to his vocabulary. In both languages, older borrowings underwent sound and spelling changes, but newer borrowings keep the roots intact. (In English, these old pre-Norman borrowings are mainly religious terms, like "nun", "monk", or "priest".) “Native” terms are considered earthier and Sanskrit/Latin-derived borrowings are stuffier. But there are interesting differences, too. English didn’t descend from Latin, though they’re both Indo-European. Dravidian languages, in contrast, aren’t related to Sanskrit at all. In Kannada, Telugu and Malayalam, the alphabet had to expand dramatically to incorporate Sanskrit sounds like voicing and aspiration. The shift was so complete that each language's alphabet, while written completely distinctly, contains nearly all of the same sounds as the Sanskrit-descended Hindi.

Many languages have "high" and "low" layers of vocabulary. But in most other languages, the two sets are drawn from the same source. By contrast, contact between Old English and French, Dravidian languages and Sanskrit, Japanese and Chinese, Persian and Arabic, and other pairings around the world have created fascinatingly hybrid languages. These mixed lexicons are, for linguistic and social historians, akin to the layers of fossils that teach paleontologists and archaeologists so much about eras gone by.

Some people even think English is descended from Latin, or Kannada from Sanskrit. That’s frustrating not only because it’s wrong, but also because the reality is far more interesting.