Hash yer dothrae chek asshekh?

This is how you ask someone how they are in Dothraki, one of the languages David Peterson invented for the phenomenally successful Game of Thrones series. It is an idiom in Peterson’s constructed language meaning, roughly, “Do you ride well today?” It captures the importance of horse-riding to the imaginary warriors of the land of Essos in the series.

Invented, or constructed, languages are definitely coming into their own. When I was a young nerd, about the age of eleven or twelve, I used to make up languages. Not very good ones, as I knew almost nothing then about how languages work. I just invented words that I thought looked cool on the page (lots of xs and qs!), and used these in place of English words.

I wasn’t the only person who did this. J.R.R. Tolkien wrote an essay with the alluring title of A Secret Vice, all about his love of creating languages: their words, their sounds, their grammar and their history. The internet has created whole communities of ‘conlangers’, sharing their love of invented languages, as Arika Okrent documents in her book In the Land of Invented Languages.

For the teenage me, what was fascinating was how creating a language opened up worlds of the imagination and how it allowed me to create my own worlds. I guess it’s not surprising that I eventually ended up doing a PhD in Linguistics. Those early experiments with inventing languages made me want to understand how real languages work. So I stopped creating my own languages and, over the last three decades, researched how Gaelic, Kiowa, Hawaiian, Kiitharaka, and many other languages work.

A few years back, however, I was asked by a TV producer to create some languages for a TV series, Beowulf, and that reinvigorated my interest in something I hadn’t done since my early 20s. It also made me realise that thinking about how an invented language could work actually helps us to tackle some quite deep questions in both linguistics and the psychology of language.

To see this, let me invent a small language for you now.

This is how you say “Here's a cat.” in this language:

Huna lo.

and this is how you say “The cat is big.”

Huna shin mek.

This is how you say “Here's a kitten.”

Tehili lo.

Ok. Your turn. How do you say “The kitten is big.”?

Easy enough, right? It’s:

Tehili shin mek.

You’ve spotted a pattern, and generalized that pattern to a new meaning.

Source: David Adger

Ok, now we can get a little bit more complicated. To say that the cat's kitten is big, you say

Tehili ga huna shin mek.

That's it. Now let's see how well you've learned the language. If I tell you that the word for “tail” is loik, how do you think you'd say: “The cat's tail is big.”?

Well, if “The cat's kitten is big” is Tehili ga huna shin mek, you might guess,

Loik ga huna shin mek.

Well done (or Mizi mashi as they say in this language!). You've learned the words. You've also learned some of the grammar of the language: where to put the words. We're going to push that a little further, and I’ll show you how inventing a language like this can cast interesting facts about human languages into new light.

The fragments of the constructed language you've learned so far have come from seeing the patterns between sound (well, actually written words) and meaning. You learned that cat is huna and kitten is tehili by seeing them side by side in sentences meaning “Here's a cat.” and “Here's a kitten.” You learned that the possessive meaning between cat and kitten (or cat and tail) is signified by putting the word for what is possessed first, followed by the word ga, then the word for the possessor.

This is a little like how linguists begin to find out how a language that is new to them works. I've learned how many languages work in this way: by consulting with native speakers, finding out the basic words, seeing how the speaker expresses whole sentences, and figuring out what the patterns are that connect the words and the meanings. This technique allows you to discover how a language functions: what its sounds and words are, and how the words come together to make up the meanings of sentences.

Now, how do you think you'd say: “The cat's kitten's tail is big.”?

You'd probably guess that it would be:

Loik ga tehili ga huna shin mek.

The reasoning works a little like this: if the cat's kitten is tehili ga huna, and the kitten is the possessor of the tail, then the cat's kitten's tail should be loik ga tehili ga huna. Similarly, if the word for “tip” is mahia, then you should be able to say

Mahia ga loik ga tehili ga huna shin mek.

That's a pretty reasonable assumption. In fact, it's how many languages work. But say I tell you that there's a rule in my invented language: a maximum of two instances of ga is allowed. So you can say “The cat's kitten's tail is big.”, but you can't say “The cat's kitten's tail's tip is big.” My language imposes a numerical limit. Two is ok, but three is just not allowed.

Would you be surprised to know that we don't know of a single real language in the whole world that works like this? Languages just don't use specific numbers in their grammatical rules.
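To make the contrast concrete, here is a minimal Python sketch of the possessive pattern from the examples above. The function names possessive and is_big and the max_ga parameter are my own labels for illustration, not part of the invented language. With max_ga left as None the rule behaves the way real languages do, and with max_ga=2 it becomes the counting rule that, as far as we know, no real language uses.

```python
def possessive(nouns):
    """Build a possessive phrase: possessed thing first, then 'ga', then its possessor.

    nouns runs from the thing possessed to the final possessor,
    e.g. ["loik", "tehili", "huna"] for "the cat's kitten's tail".
    """
    head, *possessors = nouns
    if not possessors:
        return head
    # The possessor can itself be a possessive phrase, so the rule calls itself.
    return f"{head} ga {possessive(possessors)}"


def is_big(nouns, max_ga=None):
    """Say that the thing described by nouns is big.

    max_ga is a hypothetical numerical cap on the number of 'ga' words
    (None means no cap, as in real languages).
    """
    ga_count = len(nouns) - 1
    if max_ga is not None and ga_count > max_ga:
        raise ValueError(f"{ga_count} uses of 'ga', but only {max_ga} allowed")
    return f"{possessive(nouns).capitalize()} shin mek."


print(is_big(["loik", "tehili", "huna"]))
print(is_big(["mahia", "loik", "tehili", "huna"]))
```

Calling is_big(["mahia", "loik", "tehili", "huna"], max_ga=2) raises an error, which is the sketch's way of saying that the sentence is not allowed. Notice that the cap has to be bolted on as an extra check: the recursive rule itself has no place to keep a count.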

I’ve used this invented language to show you what real languages don’t ever do. I’ll come back in the next blog to how we can use invented languages to understand what real languages can’t do.

We can see the same property of language in other areas too. Think of the child's nursery rhyme about Jack's house. It starts off with This is the house that Jack built. In this sentence we're talking about a house, and we're saying something about it: Jack built it. The dramatic tension then builds up, and we meet ... the malt!

This is the malt that lay in the house that Jack built.

Now we're talking about malt (which is what you get when you soak grain, let it germinate, then quickly dry it with hot air). We're saying something about the malt: it lay in the house that Jack built. We've said something about the house (Jack built it) and something about the malt (it lay in the house). English allows us to combine all this into one sentence. If English were like my invented language, we'd stop. There would be a restriction that you can't do this more than twice, so the poor rat, who comes next in the story, would go hungry.

This is the rat that ate the malt that lay in the house that Jack built.

But English doesn't work like my invented language. In English, we can keep on doing this same grammatical trick, eventually ending up with the whole story, using one sentence.

This is the farmer sowing his corn,

That kept the cock that crow'd in the morn,

That waked the priest all shaven and shorn,

That married the man all tatter'd and torn,

That kissed the maiden all forlorn,

That milk'd the cow with the crumpled horn,

That tossed the dog,

That worried the cat,

That killed the rat,

That ate the malt

That lay in the house that Jack built.

This is actually a very strange, and very persistent, property of all human languages we know of: the rules of language can't count to specific numbers. If a rule can apply once, then either it stops there, or it can go on applying without limit.
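The nursery-rhyme trick can be sketched in the same spirit. This is only an illustration in Python, with a made-up function name (jack_sentence) and a simplified treatment of the rhyme as a list of noun and verb pairs. Each step wraps the sentence so far inside a new relative clause, and nothing in the rule keeps track of how many times this has happened.

```python
def jack_sentence(clauses):
    """Stack relative clauses on top of 'the house that Jack built'.

    clauses is a list of (noun, verb_phrase) pairs in the order the rhyme
    introduces them, e.g. [("malt", "lay in"), ("rat", "ate")].
    """
    phrase = "the house that Jack built"
    for noun, verb_phrase in clauses:
        # The new noun is described by what it did to everything built so far.
        phrase = f"the {noun} that {verb_phrase} {phrase}"
    return f"This is {phrase}."


steps = [("malt", "lay in"), ("rat", "ate"), ("cat", "killed")]
for n in range(len(steps) + 1):
    print(jack_sentence(steps[:n]))
```

The loop runs for as many pairs as it is given. To turn this into the "no more than twice" rule you would have to add a separate counter, which is exactly the kind of thing real grammars seem never to do.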

What makes this particularly intriguing is that there are other psychological abilities that are restricted to particular numbers. For example, humans (and other animals) can immediately determine the exact number of small amounts of things, up to 4 in fact. If you see a picture with either two or three dots, randomly distributed, you know immediately the exact number of dots without counting. In contrast, if you see a picture with five or six dots randomly distributed, you actually have to count them to know the exact number.

The ability to immediately know exact amounts of small numbers of things is called subitizing. Psychologists have shown that we do it in seeing, hearing and feeling things. We can immediately perceive the number if it's no more than 4, but not if it's over. In fact, even people who have certain kinds of brain damage that make counting impossible for them still have the ability to subitize: they know immediately how many objects they are perceiving, as long as it's no more than 4.

But languages don't do this. Some languages do restrict a rule so it can only apply once, but if it can apply more than once, it can apply an unlimited number of times.

This property makes language quite distinct from many other areas of our mental lives. It also raises an interesting question about how our minds generalize experience when it comes to language.

A child acquiring language will rarely hear more than two possessors, as I document in my forthcoming book Language Unlimited, following work by Avery Andrews. Why then do children not simply construct a rule based on what they experience? Why don’t at least some of them decide that the language they are learning limits the number of possessors to two, or three, like my invented language does?

Children's ability to subitize should provide them with a psychological ability to use as a limit. They hear a maximum of three possessors, so why don't they decide their language only allows three possessors? But children don’t do this, and no language we know of has such a limit.

Though our languages are unlimited, our minds, somewhat paradoxically, are tightly constrained in how they generalize from our experiences as we learn language as infants. This suggests that the human mind is structured in advance of experience, and that it generalizes from different experiences differently, an idea which goes against a prevailing view that we apply the same kinds of generalizing capacity to all of our experiences.