Mickey Mouse linguistics

by Pieter A.M. Seuren

This is my first blog. I am new to this medium, still finding my feet. My name is Pieter Seuren. You will find me at: www.mpi.nl/people/seuren-pieter. I invite you to share my thoughts on a variety of topics, but mainly on linguistics, since I am a linguist by nature, training and trade. But, as those who are familiar with my work will know, other subjects have kept me busy also, such as philosophy, logic, and, to no small extent, history. Privately, I have also been active, as far as time permitted, translating poetry from a variety of languages into English. I cannot promise anything, but there is a likelihood that any or all of those topics, and maybe others as well, will appear in this blog sooner or later. This first time, my text has become a bit longer than I had wanted. I will try to be more succinct in later deliveries.

Today I am writing about linguistics, the subject that I have dedicated my life to since I was a boy, sixty-odd years ago. The reason is that I am angry, an angry old linguist if you like. I am angry because linguistics, especially theoretical linguistics, has come up for grabs. The term linguistics has lost any definite meaning as it stands for a spectrum of often diametrically opposed views and so-called theories or ‘schools’, connected by ties that range from circumspect alliance to deadly hostility—mostly the latter. The only form of linguistics that has managed to stay outside the fray, and thus to keep up some degree of dignity and respectability, is typological linguistics, which collects data from as many different languages as possible and tries to find the widest possible generalizations as regards the form of their sentences and words, their lexicon and their phonology. Here at least, data are taken seriously, without bias or prejudice. But this type of study is less focussed on theory, as it aims for large (and very useful) corpora of observed facts, mostly leaving the theoretical interpretation to others. Despite typology, however, linguistics has acquired a bad name in academia (worse than philosophy, actually).

Outsiders are faced with a bewildering diversity of opinions regarding the notion of language. A few years ago I was at a conference somewhere in Scandinavia where a fully salaried professor of linguistics gave a plenary lecture in which she tried to tell us something to the effect that linguistics is like the study of how to arrange the desks and chairs and other furniture in an elementary school class room. She had been invited to give that plenary precisely because this idea was so widespread in Scandinavia at the time. I know linguistics professors who, believe it or not, preach that language is an extension of gesture and cannot be separated from it. They and many of their allies maintain that language emerges as an automatic consequence from the general properties of the human mind or brain on the basis of the statistics of input frequencies. For others, language is a largely isolated, ‘spare part’ or ‘organ’ of the mind or brain, developed through evolution as a specific aspect of human nature. Others again see any specific language as nothing but the product of social convention. For some, languages can differ arbitrarily and without limits, so that one should not be too surprised, for example, to find a language where yes/no questions are formed by inverting the word order of the corresponding assertive sentence. For others, all the different languages of the world are nothing but small variations within a highly specific set of universal precepts—even though no single empirically reliable and coherent such set of precepts has been produced to date: all we have is a small, loose collection of incidental putative universals—interesting in themselves, but without much clout as long as they remain as isolated as they are.

One could at least agree on some sensible common method to investigate matters of this nature, including an agreement to respect all well-established data, even when they show a particular theory untenable, but the parties involved have so far shown a distinct lack of enthusiasm in this regard. Apparently in linguistics, theoretical agreement is to be avoided like the plague, even if that means putting up with a whole lot of nonsense.

Fifty years ago we thought we had some unanimity on basic notions and distinctions. We thought, for example, that the distinction between competence and performance, introduced by Ferdinand de Saussure at the beginning of the century as the distinction between langue and parole, was here to stay. Views differed on many points, of course, but there was a large common ground accepted by all and based on solid argument, and a common honest desire to make progress. Give or take a few minor differences, competence in, or command of, a specific language or language variety L was the largely nondeclarative, long-term presence, in the collective minds of the members of a language community, of a system defining L and consisting of a lexicon, a grammar and a phonology. Performance, also called speech, was the use made of the system. Nowadays, one is free to say that there is no language system, only language use—a perverse form of neopositivism, which accepts only observable data and rejects any notion of a ‘formal’ or ‘abstract’ (i.e. non-observable) underlying system of rules or principles causing the observable data. (Among large groups of linguists, the words ‘abstract’ and ‘formal’ have become shibboleths for anything bad.)

To see how absurd this is, consider traffic. Nobody confuses actual traffic with the traffic system and its rules, and everyone understands the relation between the two. But some linguists hold that no distinction is to be made between forms of linguistic behaviour and a set of principles guiding that behaviour: linguistic behaviour is guided by frequency measures only—whatever that may mean. Applied to traffic, this would mean, presumably, that what guides actual traffic is the result of the statistics of collisions and casualties, which would supposedly tell us what form of behaviour minimizes the chances of an accident. As a matter of fact, that strategy would work well, as in the end the number of traffic participants would be so low that accidents would indeed be much less likely to happen. The same in language: miscommunication would be so frequent that language learners would soon stop using language altogether as a medium. In fact, of course, in speech as much as in traffic, there is overwhelming evidence for the existence of underlying formally precise systems of rules and principles guiding actual behaviour.

Why did this happen? One illusion must be removed at once: it did not happen on grounds of academic argument or new empirical findings. Far from it: it happened because of personal ambition and greed for power. Adroit marketing and clever acquisition of grant funds helped a lot, enough anyway to cover up an overall disregard for empirical criteria and lack of professional discipline. The technique is simple. You show, for example, that input frequency plays some role in the language learning process of a young child: frequently heard words will be among the first to be produced. No problem (but so what?). Then you extrapolate from that to language in general: a native language is learned merely on the basis of input frequency. The vast body of evidence showing that this is not so is simply ignored. Try that in the physical sciences!

Of course, serious linguists (and there are quite a few of them still around) will tell you that this is a totally illegitimate conclusion, but if you get enough grant money to let your PhD-students carry out statistical research on language and if, in the process, you keep telling these young people that by doing their work they are revealing the foundations of language and language acquisition, in opposition to all those wicked and stupid outsiders who claim that there is an ‘abstract’ system of rules and principles driving speakers’ utterances, you create a population of indoctrinated young men and women, who are largely ignorant of linguistics as an established discipline and will, for lack of competent knowledge, desperately cling to the beliefs instilled into them, as otherwise they might fail in their future careers. And by the time the group of people thus indoctrinated is large enough to have reached critical mass, your view becomes ‘respectable’ in some ill-defined and unjustified sense. You have the advantage, moreover, that this makes the study of language look easy, since all you have to do is count. Crucial observations showing that there is more to language than meets the count make it too difficult to study language, and we want to keep language simple, don’t we? Therefore, crucial counterevidence is simply disregarded, in the hope that it will go away, or at least that it will not rise above a critical threshold of public attention.

I call this Mickey Mouse linguistics, because it reduces natural human language—with all its fascinating complexity, its unexpected depths and mysteries, its never-ending surprises—to cartoon proportions. At best (there are much worse cases around), our cartoon linguists hold that sentences consist of a verb plus its arguments (subject, object) plus a few possible adverbial adjuncts or clauses. I will give a few examples, randomly picked from thousands of possible ones, to show that there is much, much more to language than this cartoonist’s view allows us to see. (The examples are from Chapter 5 of my forthcoming book From Whorf to Montague, Oxford, OUP, 2013.)

As a first example, take the following sentences:

(1) a. While John stood on the balcony, he watched the crowd.

b. While he stood on the balcony, John watched the crowd.

c. John watched the crowd, while he stood on the balcony.

d. He watched the crowd, while John stood on the balcony.

The remarkable thing is that in a, b and c the pronoun he can denote the same person as is denoted by the name John (its antecedent), while in d this is impossible. This is due to a constraint on sentence-internal anaphora resolution, which says, in principle, that a definite pronoun can be coreferential with an antecedent A in the same sentence (i) if A precedes the pronoun (cases a and c), or (ii) if A is in the main clause while the pronoun is in a subordinate clause (cases b and c). This means that if A follows the pronoun and is in a subordinate clause (case d), coreference is excluded. This constraint is probably universal. At least, I know of no counterexamples in any language. Anecdotal evidence comes from a little episode, a few years ago, when a young student from Bali happened to look at my laptop screen and saw these four sentences. Without my having put him on the track, he immediately jumped up at seeing sentence d. Apparently, therefore, this is something both innate and universal. The Balinese student had not been taught this, nor could he have learned it as a result of input frequency. And there is no reason to assume that other speakers are any different.
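The constraint is precise enough to be written down mechanically. The following Python sketch is purely my own toy illustration (the clause-structure features are supplied by hand, since parsing is beside the point here): it encodes the two clauses of the constraint and runs the four patterns of (1).

```python
# Toy encoding of the anaphora constraint stated above (my illustration,
# not from the original text): coreference between a pronoun and an
# antecedent A in the same sentence is allowed if (i) A precedes the
# pronoun, or (ii) A is in the main clause while the pronoun is in a
# subordinate clause.

def coreference_allowed(antecedent_precedes, antecedent_in_main, pronoun_in_subordinate):
    """Return True if sentence-internal coreference is permitted."""
    return antecedent_precedes or (antecedent_in_main and pronoun_in_subordinate)

# The four patterns (1a-d), with clause structure given by hand:
examples = {
    "a": dict(antecedent_precedes=True,  antecedent_in_main=False, pronoun_in_subordinate=False),
    "b": dict(antecedent_precedes=False, antecedent_in_main=True,  pronoun_in_subordinate=True),
    "c": dict(antecedent_precedes=True,  antecedent_in_main=True,  pronoun_in_subordinate=True),
    "d": dict(antecedent_precedes=False, antecedent_in_main=False, pronoun_in_subordinate=False),
}
for label, features in examples.items():
    print(label, coreference_allowed(**features))   # a, b, c allowed; d excluded
```

The point of the exercise is only that the constraint is a crisp, rule-like object, not a statistical tendency: the fourth pattern is excluded categorically, not merely dispreferred.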

Another example. Consider the following two sentences:

(2) a. I don’t think either Harry or John were late.

b. I don’t think either Harry or John was late.

Both are good English sentences but they have different meanings. Sentence a means that the speaker thinks that neither Harry nor John were late; sentence b means that it is not true that the speaker thinks that either Harry or John was late. Question: why the plural were in sentence a? The answer requires a formal analysis involving a system of ‘abstract’ rules, derivations and representations where, at some stage in the derivation, the formal conditions for plurality in English are satisfied. Input frequency is illusory as an explanation.

Third example. Consider the two sentences:

(3) a. Only last summer he made headlines (… and now he is dead).

b. Only last summer did he make headlines (… after many years of trying).

Questions: why does sentence b have, and does sentence a not have, auxiliary inversion? And how does this correspond with the semantic difference noted? The answer is unknown, but input frequency is just about the last place to look for one.

Next example. Consider the little dialogue (4), between a father and his young son who is crying because he has hurt himself (sentence b is pronounced with strong emphasis on the pronoun I and with rising final intonation):

(4) Father: Well-educated boys don’t cry.

Son: I didn’t educate me!

Questions: why no reflexive pronoun myself in the son’s reply, and why does the son’s reply with myself for me have the totally different meaning ‘I am not a self-educated man’, which is not appropriate in this context? The answer is fairly simple, but it involves a rule-governed, formal derivation procedure from an ‘abstract’ meaning representation to the surface structure of the sentence in question. It is because the son’s reply is (roughly) derived from (5a), which has the non-reflexive “x educated me”, whereas the same sentence but with myself is derived from (5b), with the reflexive “x educated x”:

(5) a. the x, such that x educated me, is not me

b. the x, such that x educated x, is not me

If anyone has a better explanation, I’ll be happy to hear. Again, there is no way input frequency could possibly account for this. Note that this is not an armchair example, but an observation from real, spoken language (which is what the input frequency fans seem to prefer).
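The contrast between (5a) and (5b) can be stated as a one-line rule: the object surfaces as a reflexive only if, in the underlying representation, it is the very same variable as the subject. The little Python sketch below is my own illustration of that rule (first person only, everything else stripped away; the function name is mine):

```python
# Toy sketch (my illustration) of the reflexivization rule implicit in (5):
# the object surfaces as a reflexive pronoun only when the underlying
# object is the same variable as the underlying subject.

def realize_object(subject_var, object_term):
    """Map the underlying object term to its surface pronoun (1st person)."""
    if object_term == subject_var:   # "x educated x" -> reflexive
        return "myself"
    return "me"                      # "x educated me" -> plain pronoun

# (5a): the x such that x educated me, is not me  -> "I didn't educate me!"
print(realize_object("x", "me"))     # me
# (5b): the x such that x educated x, is not me   -> "... educate myself"
print(realize_object("x", "x"))      # myself
```

On this picture the son’s choice of me over myself falls out of which underlying representation he starts from, which is exactly the kind of fact an input-frequency account has no handle on.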

Example five. Take the following two dependent clauses, one German, the other Dutch. Both share the meaning ‘… that it would have been raining today’:

(6) a. … daß es heute hätte regnen sollen/*werden.

b. … dat het vandaag had zullen regenen.

The German auxiliary verb werden is the semantic counterpart of the Dutch auxiliary verb zullen, both expressing futuricity (English will). But the use of werden in (6a) is definitely ungrammatical (the different auxiliary verb sollen (‘must’) has to be used, widening the semantic range of the clause), while zullen in the parallel construction in Dutch is perfectly OK. The constructions are exact parallels, except that the German verb cluster hätte regnen sollen is partly left- and partly right-branching, while its Dutch counterpart had zullen regenen is right-branching throughout. (The semantic hierarchy is: hätte/had—sollen/zullen—regnen/regenen; German is on the whole left-branching in its V-clusters, which would give regnen—sollen (gesollt)—hätte, but a special rule of ‘Oberfeldumstellung’ makes hätte induce right-branching, giving (6a). Dutch is right-branching throughout.)

Questions: why is werden ungrammatical in (6a) while zullen is fully grammatical in (6b), and why does the German verb cluster have a split branching directionality? Answer: these facts follow from (a) a theory of auxiliation, which shows that German werden has undergone auxiliation (with loss of infinitives and participles), whereas Dutch zullen has not, and (b) a theory of verb clustering showing how these verbs come to be united into one clustered verbal constituent with well-defined directionality. The point here is that no amount of input frequency rhetoric will provide the answer. The correct answer simply has to involve a precise, formal, and certainly ‘abstract’, system of rules and principles. Such an answer has been provided in my Semantic Syntax (Blackwell, Oxford, 1996); it is simple, elegant and in full agreement with the facts. The input frequency fans, however, prefer to simply ignore it.
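The branching facts just described are mechanical enough to be mimicked in a few lines of code. The Python sketch below is my own toy encoding (the function and its parameter names are illustrative assumptions, not anything from the literature): verbs are listed in semantic hierarchy order, highest first, and the function returns their surface order in the cluster.

```python
# Toy encoding (my illustration only) of the branching description above.
# Input: verbs in semantic hierarchy order, highest first.

def cluster(verbs, branching="right", oberfeld=False):
    """Return the surface order of a verb cluster.

    branching="right": highest verb first (Dutch).
    branching="left":  lowest verb first (German default); with
    oberfeld=True the finite auxiliary jumps to the front
    ('Oberfeldumstellung').
    """
    if branching == "right":
        return list(verbs)
    order = list(reversed(verbs))        # left-branching: lowest first
    if oberfeld:
        order = [verbs[0]] + [v for v in order if v != verbs[0]]
    return order

# Dutch (6b): right-branching throughout
print(" ".join(cluster(["had", "zullen", "regenen"])))          # had zullen regenen
# German (6a): left-branching plus Oberfeldumstellung
print(" ".join(cluster(["hätte", "sollen", "regnen"],
                       branching="left", oberfeld=True)))       # hätte regnen sollen
```

Trivial as the sketch is, it makes the point: the orders are the output of a small, exact rule system with a language-particular parameter setting, not a frequency gradient.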

Example six. Consider the sentence:

(7) The first Americans landed on the moon in 1969.

Questions: who are/were ‘the first Americans’? How come the sentence means ‘the first Americans to do so landed on the moon in 1969’? Answer: unknown; no-one, to my knowledge, has so far made a study of this particular phenomenon. But any adequate answer will have to involve a fairly abstract machinery relating the word first (and other similar words, like last or second) semantically to the action expressed in the verb phrase. Note, incidentally, that the negation of (7), The first Americans did not land on the moon in 1969, makes no sense or is at least awkward. Where are the input frequency fans?

Example seven is to do with peripheral scope ambiguity. Consider the following sentence pairs:

(8) a. I let John use my bicycle for two months.

b. For two months, I let John use my bicycle.

(9) a. I lent John my bicycle for two months.

b. For two months, I lent John my bicycle.

(10) a. The sheriff jailed Robin Hood for two years.

b. For two years, the sheriff jailed Robin Hood.

(11) a. The secretary had left at six.

b. At six, the secretary had left.

These four sentence pairs have in common that the first of each pair is semantically ambiguous in a way the second is not. (8a) can mean either that I allowed John to keep and use my bike for a period of two months, or that for two months I (repeatedly) allowed John to use my bike. (8b) can only mean the latter. Question: how come? The answer is complex, and of central importance to the theory of grammar. These are scope differences, as the semantic difference corresponds with the position of the adverbial phrases in question in the hierarchical semantic structures of the sentences in question. (Scope differences have never been studied seriously in any of the existing linguistic schools. Chomsky’s 1965 Aspects theory is unable to account for them, while it should, given its premises. The attempt to do so led to Generative Semantics, a movement quelled by Chomsky for reasons of personal vanity—the beginning of the rot in linguistics.) Sentence (8a) is to be analysed as either (12a) or (12b), where I have italicized “for two months” to emphasize its status as a scope-bearing operator:

(12) a. I let [for two months [John use my bicycle]]

b. for two months [I let [ John use my bicycle]]

In a, the peripheral adverbial phrase (PAP) “for two months” has lower scope (under “I let”); in b, it has highest scope (over the rest of the sentence). There is a principle, universal as far as we know, which says that PAPs taking highest scope may appear in the surface structure either in final or in initial position. English has the special rule that PAPs taking scope over embedded clauses or infinitivals ‘land’ in final position. Therefore, both (12a) and (12b) can result in (8a), which thus becomes ambiguous, but only (12b) can result in (8b), which is therefore not ambiguous in that way.
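Stated this way, the principle can be turned into a few lines of code. The Python sketch below is my own toy illustration (the sentence strings and function names are mine): it generates the possible surface forms from the two logical forms of (8) and records which readings each surface form ends up with. The final-position sentence comes out ambiguous; the initial-position one does not.

```python
# Toy sketch (my illustration) of the PAP placement principle: highest
# scope -> initial or final position; scope over the embedded infinitival
# -> final position only (the English rule).

CORE = "I let John use my bicycle"
PAP = "for two months"

def surfaces(scope):
    """Possible surface positions of the PAP for a given scope."""
    if scope == "high":                  # PAP over the whole sentence
        return [f"{PAP}, {CORE}", f"{CORE} {PAP}"]
    elif scope == "low":                 # PAP over the embedded infinitival
        return [f"{CORE} {PAP}"]         # final position only
    raise ValueError(scope)

# Collect, per surface string, the readings that can produce it:
readings = {}
for scope in ("high", "low"):
    for sentence in surfaces(scope):
        readings.setdefault(sentence, []).append(scope)

for sentence, scopes in readings.items():
    print(sentence, "->", scopes)
# 'I let John use my bicycle for two months' is reachable from both
# logical forms (ambiguous); the initial-position variant only from the
# high-scope one.
```

The explanatory pattern, then, is that the ambiguity of (8a) is not a quirk of the sentence but a consequence of two derivations converging on one surface string.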

This analysis, however, does not work for (9), (10) and (11), which have no embedded dependent clauses or infinitivals. This was one of the reasons, during the late 1960s, to introduce the notion of prelexical syntax. When the semantically complex verb lend, instead of simply having the semantic argument frame “a lend b to c”, is considered to contain the syntactic structure “a allow [c use/have b]”, and if it is assumed that “allow” and “use” are clustered, within the lexicon, into one complex predicate “allow to use”, then we have the clause embedding required for transferring the explanation of (8) to (9).

Moreover, given that this sort of clustering is found in the ‘open’ syntax of many languages, normally leading to the assignment of dative case to the subject c of the embedded clause, there is a rationale for the dative case assigned to c under the verb lend. Take a look, for example, at the syntax of the French verb faire (‘cause, make’), as in (13a), syntactically derived from (13b):

(13) a. Jean V[fera voir] la lettre à Pierre.

(Jean will-make see the letter to Pierre)

‘Jean will show the letter to Pierre.’

b. Jean fera [Pierre voir la lettre] ==> Jean V[fera-voir] Pierre la lettre ==> (13a)

(J. will-make [P. see the letter] ==> Jean V[will-make-see] Pierre the letter)

(I wrote an extensive paper on this in 1972, ‘Predicate Raising and dative in French and sundry languages’, but could not have it published, as the climate had already deteriorated as a result of Chomsky’s bitter personal campaign against Generative Semantics. The paper was circulated, at the time, by a student organization called LAUT or ‘Linguistic Agency University Trier’. I finally published it myself as chapter 7 in my book A View of Language, Oxford, OUP, 2001.)

The same analysis can be applied to (10), where the causative verb jail is then analysed as a prelexical “cause-to-be-in-jail”, allowing for different scope assignments to the PAP “for two years”, either over the whole sentence or just over “be in jail”.

(11a) is ambiguous between the reading ‘the secretary’s leaving had taken place at six’ and ‘at six, the secretary’s leaving had already taken place’. (11b) can only have the latter reading. Here, the structural space required for the resolution of the ambiguity is secured by the (well supported) assumption that the pluperfect tense is a composition of two hierarchically ordered past tenses (see Jim McCawley’s 1971 paper ‘Tense and time reference in English’). The same pattern is seen here: the PAP in final position gives an ambiguity, which the same PAP in initial position does not have.

Leaving out technical details, this shows how the systematic ambiguity of the (a)-sentences in (8)-(11), as opposed to the non-ambiguity of the corresponding (b)-sentences, can be brought under one unified, highly explanatory hypothesis. The input-frequency club has nothing to put in the place of this hypothesis: they just hope that it and the observations on which it is based will go away. Strangely, however, the same happens in the more formally oriented currents in linguistics. Though fully aware of this hypothesis, they simply ignore both the facts and the theory explaining them—a very saddening state of affairs, showing that, indeed, there is something fundamentally rotten in the discipline of linguistics.

How is this possible in an academic community? Only, I think, if it has become a social reality in that community that one can get away with that kind of behaviour and still be respectable, as what counts is just one’s position on the grant scale. A couple of years back, I confronted a colleague at some German university, who is a staunch input frequency fan, with some of the facts cited above and sent him relevant literature, at his request. The next time I met him, he nervously told me that he hadn’t yet had time to read the stuff, while his gaze went to the nearest door handle that would allow him to escape. After that, the subject was never brought up again. But the man is still a full member of his input frequency club, even though, presumably, he has meanwhile had plenty of time to read the stuff (which would keep him awake at night if he were a proper academic).

There are, actually, two ingrained bad habits here. One is that relevant existing literature is not taken into account. The other is that one has grown used to accepting wishy-washy quasi-theorizing as proper theorizing. The input frequency fans and pragmaticists suffer from both. The formalist linguists and semanticists suffer only from the first. Both bad habits can be stopped or reduced by imposing on referees of all kinds the holy duty of insisting that all relevant literature must be taken into account, especially the literature that threatens to pose an obstacle to the work under review. One should learn to be eager for counterevidence, not shun it. Engineers proposing a minimalist construction for a bridge or building do anything they can to find faults or weaknesses, on pain of heavy penalties. We linguists should do the same.