The underlying structures that are common to the world's languages bear an intriguing connection with early emerging forms of “core knowledge” (Spelke & Kinzler, 2007), which are frequently studied by infant researchers. In particular, grammatical systems often incorporate distinctions (e.g., the mass/count distinction) that reflect those made in core knowledge (e.g., the non‐verbal distinction between an object and a substance). Here, I argue that this connection occurs because non‐verbal core knowledge systematically biases processes of language evolution. This account potentially explains a wide range of cross‐linguistic grammatical phenomena that currently lack an adequate explanation. Second, I suggest that developmental researchers and cognitive scientists interested in (non‐verbal) knowledge representation can exploit this connection to language by using observations about cross‐linguistic grammatical tendencies to inspire hypotheses about core knowledge.

1 Introduction
Although pure linguistic universals are difficult if not impossible to find (Evans & Levinson, 2009), linguists agree that there exist important statistical trends concerning certain recurring grammatical patterns across languages (Christiansen & Chater, 2008; Dryer & Haspelmath, 2013). For example, while gender marking is not a linguistic universal, a survey of 257 languages (Corbett, 2013) found that 84 employed a sex‐based gender system, while another 28 employed a non‐sex‐based gender system (which generally marks for animacy). Similarly, while not all languages require that an agentive subject be mentioned first in a standard transitive sentence, a recent survey found that of 1,377 languages, 1,053 did exactly this (Dryer, 2013). Such statistically recurring features across languages span a wide range of feature types and include at least phonological, morphological, syntactic, and lexical regularities (Dryer & Haspelmath, 2013). The origin of these patterns is one of the central questions in cognitive science. Here, I offer a new sort of explanation for (at least some of) these recurring features. In particular, I argue that some morphological and syntactic regularities arise, in part, from deep‐seated and early emerging forms of non‐linguistic thought that systematically bias language use and learning. In this way, they shape language structure via evolutionary processes.
There are many forms of cross‐linguistic regularities that one might concentrate on. One type of regularity concerns language‐specific choices regarding which words to include in “open” lexical classes (for further discussion of the open vs. closed class distinction, see Bock, 1989; Bradley & Garrett, 1983; Cinque & Rizzi, 2008; Friederici, Opitz, & von Cramon, 2000; Gordon & Caramazza, 1985; Osterhout, 1997; Segui, Mehler, Frauenfelder, & Morton, 1982; Van Petten & Kutas, 1991).
Open word classes are defined as categories of lexical items that languages can easily add to or modify. Such open word classes consist of content words that often include nouns and, potentially (depending on the language), verbs, adjectives, and adverbs. While tracking and explaining regularities in vocabulary choice for items in such open lexical classes could be both informative and interesting, doing so also potentially presents difficulties. By definition, the precise content of such classes in any given language can easily change over short time spans and can therefore be quickly influenced by environmental and contextual variables. Thus, for example, English has recently coined new words like “buzzword” and “click bait” to refer to relatively recent technological phenomena. In contrast, the current paper focuses on a second type of cross‐linguistic regularity: functional regularities regarding the types of morphosyntactic structures that languages employ. For example, English, as well as many other languages from across a wide range of language families (Doetjes, 2012; Koopman, 2014; Kulkarni, Rothstein, & Treves, 2013), classifies all nouns as being either “count” (e.g., the English terms “rock,” “ball,” and “cloud”) or “mass” (e.g., the English terms “sand,” “water,” and “sky”). Which class a word belongs to, in turn, dictates how it will behave syntactically. For example, one can say “two rocks” but not “two sands” because only count nouns can take a plural marking in English. In contrast to language choices regarding which specific items to include in open word classes, the range of grammatically relevant categories that languages employ cannot be easily modified (Cinque & Rizzi, 2008). This potentially makes explaining their origins all the more difficult. How can one best explain such cross‐linguistic regularities in morphological and grammatical structures?
The current paper argues that pre‐verbal “core knowledge” may induce biases in language evolution, thereby making the corresponding grammatical forms more likely to appear cross‐linguistically. In what follows, I first provide an overview of the core knowledge framework in section 2. In section 3, I focus on two specific cases that help make the connection between grammatical structure and core knowledge clear: the mass/count distinction and numeral classifiers. For each case, I argue that cross‐linguistic grammatical trends reflect basic aspects of pre‐verbal core cognition. In section 3.1, I describe a theoretical framework that provides a plausible set of mechanisms explaining such reflections of core cognition in language. In section 3.2, I demonstrate the breadth of the current approach. Section 4 discusses the empirical predictions made by the model and illustrates how these predictions play out in two specific contexts. Section 4.1 concludes by considering how this proposal could lead to a research program that involves a fruitful exchange of information between experimental psychologists and language researchers.

2 Core knowledge
The “core cognition” approach is an attempt at explaining the origins of human cognition (Baillargeon & Carey, 2012; Spelke & Kinzler, 2007). It suggests that the human mind may be innately endowed with a small number of basic cognitive systems that provide human infants with a head start in learning. These include systems for reasoning about the behavior of physical objects (Baillargeon, 2001; Valenza, Leo, Gava, & Simion, 2006), numerical cognition (Coubart, Izard, Spelke, Marie, & Streri, 2014; Hyde & Spelke, 2011), reasoning about social actors (Spelke, Bernier, & Skerry, 2013; Spelke & Kinzler, 2007), and reasoning about basic geometrical properties (Dillon, Huang, & Spelke, 2013; Spelke, Lee, & Izard, 2010). In contrast to general learning theories (Hume, 1748/2007; Locke, 1689/1975; Rumelhart & McClelland, 1986), the core knowledge approach accepts that some domain‐specific representations and/or specialized learning mechanisms are present from birth. In contrast to massive modularity theories (Cosmides & Tooby, 1994), this view also emphasizes the role of general learning abilities that interact with core knowledge to flexibly produce certain types of skills and beliefs from experience. While the core cognition perspective explicitly endorses innate cognitive systems (or at the very least an innate propensity to acquire those systems), here I will remain agnostic regarding specific origins and instead use the term “core knowledge” to refer to systems (or pieces of knowledge within systems) that are inherently non‐verbal, embedded into automatic cognitive processes, and likely to be universally represented across cultures.
This paper focuses on some particularly clear instances of core cognition (which are thus theorized to be a subset of core cognition more generally). These instances have two corresponding symptoms: (a) they are grasped early on by pre‐verbal infants, sometimes as early as they can be tested, and (b) they structure automatic perceptual processes in adults. This working definition is consistent with recent proposals suggesting that certain core representations, such as that of physical objects or the containment/occlusion distinction, have these properties (Cheries, Mitroff, Wynn, & Scholl, 2009; Strickland & Scholl, 2015). Theoretically, these two symptoms could be produced by a variety of sources, including innate causes, non‐innate causes, or some mixture of these. For example, they could be the product of domain‐specific learning algorithms or may result from a complicated interaction between innate perceptual mechanisms and a general learning device (e.g., Mandler, 2012). What follows here may apply to any knowledge that is “core” only in the sense that it possesses these two characteristic properties. Crucially, however, the features listed above (i.e., early emergence and presence in perception) capture many of the important similarities across cognitive domains like numerical cognition, social cognition, and naïve physics, which have typically been treated as core cognitive systems (Spelke & Kinzler, 2007). On the other hand, these features do not describe many other examples of human knowledge. For example, consider the difference between, on the one hand, representations of physical objects, which emerge early in infancy (Spelke & Kinzler, 2007) and appear to be embedded in perceptual processes (Cheries et al., 2009), and, on the other hand, representations of day versus night, cloudy versus sunny, or rich versus poor.
Although these latter distinctions are omnipresent in virtually all human environments, these are not (to my knowledge) represented by pre‐verbal infants or the adult perceptual systems in any systematic way. One important characteristic of the variety of core knowledge being discussed here is its non‐verbal character. Mastery by pre‐verbal infants, who by definition lack a full human language, is a hallmark of such core knowledge. Given this fact, it is perhaps surprising that many of the world's language systems have much in common with core knowledge—incorporating close analogs to core distinctions into their underlying morphosyntactic structure. Below I explore this connection in much more detail by arguing that this overlap is not due to simple coincidence. Instead I suggest that the formation of grammatical categories is subtly biased by core knowledge. On this view, pre‐verbal core knowledge makes certain grammatical distinctions more salient, learnable, and memorable than other possible distinctions, which in turn causes languages to regularly incorporate them in their morphology and syntax. This view does not predict that all core distinctions are regularly imported into language or that all grammatical distinctions are necessarily a product of core knowledge. It does, however, predict that there should be a correlation between core knowledge structures and grammatical structures across languages. Thus, for example, if one were to compare a list of randomly selected core conceptual distinctions versus a list of matched non‐core conceptual distinctions, one would expect to find more grammatical categories based on the first set of conceptual distinctions than those based on the second set. Actually running a large‐scale study of this sort, while useful, would present a number of technical and methodological issues that would be far afield of the goals and scope of the current paper. 
Instead, the analytic strategy here is to present evidence that there is a substantial amount of overlap between core cognition and morphosyntactic structures while relying on the observation that, at the very least, many non‐core distinctions (e.g., day vs. night, rich vs. poor, cloudy vs. not cloudy) are not grammatically encoded on a wide scale across human languages.

3 Establishing the link between core knowledge and language
While much recent work has concentrated on demonstrating that language builds on and adds to the representational capacity of core knowledge (e.g., as in the number domain, see Carey, 2004, 2009; Feigenson & Carey, 2003), the current paper, in contrast, concentrates on the significant overlap between the grammatical structure of language and that of core knowledge. In particular, core knowledge concepts often appear to underlie the meanings of closed‐class vocabulary (e.g., prepositions, determiners, bound morphemes, numeral classifiers, etc.) and lexical categories (e.g., count vs. mass nouns, nouns vs. verbs, etc.; Carey, 2009), both of which are important elements of functional linguistic systems.

3.1 Example 1: The mass/count distinction
One example is that of the count/mass distinction. Count nouns frequently refer to discrete and countable entities (e.g., in English “chair,” “ball,” and “table”), while mass nouns typically (though not always) refer to conceptually undifferentiated masses or entities that cannot be counted (e.g., in English “water,” “sand,” and “air”; Kodera, 2011). This semantic distinction can also be encoded morphosyntactically. Thus, in English, which category a noun belongs to determines the range of determiners, quantifiers, and plural morphemes it can/must appear with. So, for example, English count nouns, but not mass nouns, can appear with a numeric quantifier. One can therefore say “two chairs/balls/tables,” but not “two waters/sands/airs.” Additionally, when specifying indefinitely large or small quantities, mass nouns and count nouns require different quantifiers.
One can say “many tables,” but not “many airs” or “many air.” One can only use this quantifier in the context of a mass noun by adding a measuring term, as in “many molecules of air.” On the other hand, one can felicitously say “much air,” but not “much tables” or “much table.” Thus, the distinction between mass and count nouns is one that English morphosyntax encodes via restrictions on the use of functional morphemes. Cross‐linguistically, many other languages distinguish between mass and count nouns (often via restrictions on the use of quantifiers and/or singular vs. plural marking). Examples come from a wide range of language families, including (at least) Afro‐Asiatic, Indo‐European, Uralic, Algic, Tupian, and Niger‐Congo. This list includes such languages as Dutch, Spanish, Serbian, German, Greek, French, Italian, Finnish, Armenian, Hebrew, Hindi, Marathi, Ojibwe, Innu‐aimun, Karitiana, and Dagaare (Gillon, 2010; Grimm, 2012; Kodera, 2011; Kulkarni et al., 2013; Mathieu, 2012; Müller, Storto, & Coutinho‐Silva, 2006). In other languages known as “classifier languages” (e.g., Mandarin, Japanese), the English‐like strategy is not available because one is obligated to introduce a numeral classifier (similar to measuring terms in English) in order to use a quantifier in the context of any noun. On the basis of this, some authors have argued that all nouns in classifier languages should be treated as mass nouns (Kulkarni et al., 2013), with numeral classifiers providing a trigger for count readings. Others have argued that even in classifier languages, the morphosyntactic system distinguishes count from mass nouns by restrictions on the types of classifiers that are allowed and restrictions on meaning shifts for certain lexical items (Doetjes, 2012; Kodera, 2011).
Regardless of which of these two theories ends up being correct, the distinction between countable and non‐countable entities is one that many languages appear to encode. Considerations along these lines have led some theoreticians to conclude that count nouns refer to discrete individuals while mass nouns refer “homogenously” (Doetjes, 2012), where homogenous reference implies the absence of distinct divisible parts. For example, given a piece of gold, a subpart of it will still be gold. This distinction is problematic, however, when one considers that certain terms can be mass in one language while being count in another (as in “furniture” vs. “les meubles” in English vs. French). Thus, those who claim that the count versus mass distinction can be cashed out purely in terms of reference to discrete countable entities and non‐countable entities must dissociate linguistic properties of meaning from properties of the referents themselves. According to Bunt (1985), mass nouns do not highlight or pick out any particular parts of the entities to which they refer, even if those entities might nevertheless have distinct divisible parts. And according to Lonning (1987; quoted in Doetjes, 2012), “… it is not critical that mass terms really refer homogenously […]. Rather what is of importance is whether they behave as if they did and what it means to behave in such a way.” In addition to being important for language, both the adult visual system and pre‐verbal infants make an analogous distinction between individuated objects and substance‐like non‐individuals, and thus respond differently to entities from each category. Interestingly, both infants and adults show impaired quantification and tracking abilities for substances relative to objects in tasks that require precise representation of the entities in a display (e.g., Huntley‐Fenner, Carey, & Solimando, 2002; vanMarle & Scholl, 2003; vanMarle & Wynn, 2011).
This somewhat mirrors the language system's willingness to directly quantify over count nouns using numerals, and relative reluctance to quantify over mass nouns (in the absence of a measurement term). For example, in a multiple object tracking task, vanMarle and Scholl (2003) showed that adults can visually track up to four continuously moving entities in a display (among a number of distractors) when the entities move as discrete objects. However, when the entities were apparently poured from one location to another, thus moving as substances, tracking ability was severely impaired. Studies have also shown that the quantificational abilities of pre‐verbal infants are impaired for substances relative to objects (Huntley‐Fenner et al., 2002; vanMarle & Wynn, 2011). For example, vanMarle and Wynn (2011) showed that while infants at 10 months of age are able to compare quantities of objects that differ by a 1:2 ratio (and thereby select the appropriate location to search for a preferred food item), they are only able to accurately compare quantities of substances that differ by a 1:4 ratio. More visual cues were needed for accurate comparison of substance quantities, and memory for quantities was also found to be worse for substances relative to objects. Thus, it appears that the syntactic/semantic structure of languages and core knowledge make a similar distinction between countable, discrete entities (realized as count nouns across languages) and non‐countable, substance‐like entities (realized as mass nouns across languages). The parallels between the core system and the grammatical system go even further than first meets the eye. In particular, a detailed analysis of the semantic distinction between count and mass nouns in English reveals that these grammatical categories do not always perfectly map onto objects and substances, respectively.
On the one hand, while count nouns virtually always refer to countable individuals (Barner & Snedeker, 2005), mass nouns are more flexible. While they often refer to substances (e.g., “water”), they also can refer to collections of countable individuals (e.g., “furniture”) that are mentally represented as such. Thus, adults and children will quantify over individuals referred to by object–mass nouns (i.e., mass nouns like “furniture,” which refer to collections of individuals) in a fashion that is more similar to how they quantify over count nouns than substance–mass nouns (Barner & Snedeker, 2005). So while count noun representations in English appear to be somewhat rigid, mass noun representations are less specific with regard to the types of entities that they can refer to. A recent study by Kulkarni et al. (2013) made this point in a different way by surveying 1,434 nouns across six languages (Armenian, Italian, English, Hebrew, Marathi, and Hindi). They found that in five of the six languages, about half of the nouns were consistently count nouns while the other half differed substantially across and within languages. On the other hand, very few nouns were consistently treated as pure mass nouns across the languages. One possible interpretation of these linguistic observations in conjunction with the infant and adult perception results (in which object representations appear to be more reliable and exploitable than substance representations) is that the core distinction is not a distinction between individual objects and substances but is instead a distinction between individual objects and “unspecified,” with the latter category often but not necessarily encompassing substances. The key point here is that one finds a surprising degree of similarity in the representations of object versus non‐object employed in non‐verbal cognition and the morphosyntactic distinction between count and mass nouns. 
3.2 Example 2: Animacy in numeral classifier systems
A second example of the overlap between the core knowledge system and linguistic systems comes from numeral classifier languages. In most European languages (as in English), one expresses precise quantities of objects by use of a numeral, a noun, and plural/singular morphology (as in “three boxes” or “one pencil”). However, languages like Japanese, Thai, and Vietnamese (among a long list of others; see Gil, 2013, for a quantitative overview) require the use of an additional element called a “numeral classifier” in order to express precise object quantities. Such numeral classifiers carry semantic information related to the noun class of the entity being counted. Consider sentences (1) and (2) below (from Yamamoto, 2005, p. 2):

(1) Japanese
    enpitsu san‐bon
    pencil three‐classifier (long objects)
    “three pencils”

(2) Thai
    ma si tua
    dog four classifier (animals)
    “four dogs”

In the Japanese example, the numeral classifier is ‐bon, while the Thai classifier is ‐tua. The ‐bon classifier signals that the type of object being counted (in this case pencil) is a long inanimate object. Thus, it is also used for objects like umbrellas, cigarettes, and carrots. On the other hand, ‐tua is a classifier for animals in Thai. Such classifier categories are not limited to animals and long inanimate objects. Instead, there is a diversity of classifier types both within and between languages (for an excellent summary, see Yamamoto, 2005).
For example, here is a (partial) list of Japanese classifiers: humans (‐ri), humans in formal settings (‐mei), humans for whom one wants to show respect (‐kata), animals (‐hiki), large animals (‐too), birds (‐wa), inanimate entities (‐tsu), concrete objects (‐ko), objects with salient 1D properties (‐hon), thin flexible objects (‐suji), objects with salient 2D properties (‐mai), objects that are not spatially independent (‐men), machines with specific functions (‐dai), large water vehicles (‐seki), and small water vehicles (‐soo). Such classifiers are partially overlapping. For example, large and small water vehicles are both types of concrete objects, and concrete objects are a type of inanimate entity. It is not the case that an object that fits into multiple categories will be marked with multiple classifiers. Instead, one can think of language classifier systems as having a taxonomic structure with the more general categories (e.g., that of an inanimate entity) dominating the more specific categories (e.g., large water vehicles). The chosen classifier for any given circumstance is then the lowest node on the taxonomic hierarchy that applies to the object in question (thus, for a large boat one would use ‐seki as opposed to ‐tsu). When looking at any particular classifier language, many of the categories that they reflect (especially the specific categories at the bottom of the taxonomy) are likely to include culturally specific phenomena related to things like technological advancements or the cultural values of the relevant language communities (Denny, 1976; Yamamoto, 2005). For example, Japanese has a classifier for air vehicles (‐ki), and it is hard to imagine that such a classifier could exist without culturally specific technological achievements (such as the innovation of the airplane). 
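The selection rule described above (use the classifier of the lowest taxonomic node that applies to the object) can be illustrated with a minimal sketch. It encodes only the nesting stated in the text (water vehicles under concrete objects under inanimate entities); the class name `ClassifierNode`, the function names, and the representation of an entity as a set of applicable category labels are hypothetical choices made for illustration, not part of Yamamoto's (2005) analysis.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class ClassifierNode:
    """One category in a toy classifier taxonomy (hypothetical structure)."""
    category: str
    classifier: str
    children: List["ClassifierNode"] = field(default_factory=list)


def build_toy_taxonomy() -> ClassifierNode:
    # Only the nesting stated in the text is encoded here:
    # water vehicles < concrete objects < inanimate entities.
    large_boat = ClassifierNode("large water vehicle", "-seki")
    small_boat = ClassifierNode("small water vehicle", "-soo")
    concrete = ClassifierNode("concrete object", "-ko", [large_boat, small_boat])
    return ClassifierNode("inanimate entity", "-tsu", [concrete])


def choose_classifier(root: ClassifierNode, applicable: Set[str]) -> Optional[str]:
    """Return the classifier of the lowest node whose category applies."""
    if root.category not in applicable:
        return None  # the entity falls outside this (partial) taxonomy
    node = root
    while True:
        # Descend to a child whose category also applies, if there is one.
        lower = next((c for c in node.children if c.category in applicable), None)
        if lower is None:
            return node.classifier
        node = lower


taxonomy = build_toy_taxonomy()
large_boat_categories = {"inanimate entity", "concrete object", "large water vehicle"}
print(choose_classifier(taxonomy, large_boat_categories))  # -seki
```

An entity for which only the general categories apply falls back to the more general classifier: a generic concrete object receives ‐ko, and an entity that is merely inanimate receives ‐tsu.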
But upon analyzing the patterns of classifier schemes across many languages, a pattern emerges in which the basic (i.e., more general) categories reflect conceptual divides present in core knowledge. For example, Adams and Conklin (1973) carried out a study examining 1,406 classifiers from 37 Asian languages. They concluded that the distinction between animate and inanimate objects was the most basic given that without exception every language they studied had general classifier categories for (at least certain types of) animate beings, although languages varied in which specific classes of animate beings they encode. While many languages include a simple distinction between animates and inanimates, which groups humans and animals in the same class (e.g., the Micronesian languages Gilbertese, Nauru, Ponapean, Sonsorol‐Tobi, Trukese), others (e.g., Dravidian languages) distinguish between human and non‐human entities. The non‐human category can then be further divided into animals and various types of inanimate entities. Yamamoto (2005) modeled these cross‐linguistic trends by way of a lexical‐semantic framework incorporating hierarchical taxonomies. Within this framework, a broad animate category is contrasted against a broad inanimate category. The animate category can then be subdivided into humans and animals. By postulating that languages differ on which specific nodes in the hierarchy they encode explicitly, the theory captures much of the observed cross‐linguistic variance. This trend in classifiers to systematically distinguish animates from inanimates mirrors the core knowledge distinction between animate and inanimate entities. Within the first 6 months of life, pre‐verbal infants use a variety of cues in order to classify objects as being animate versus inanimate (Molina, Van de Walle, Condry, & Spelke, 2004; Rakison & Poulin‐Dubois, 2001).
In turn they generate predictions and expectations about behavior accordingly—expecting animate creatures (but not inanimate objects) to behave rationally, to morally evaluate others, and to possess mental states like desires, goals, and beliefs (Gergely & Csibra, 2003; Hamlin, Wynn, & Bloom, 2007; Luo, 2011; Newman, Keil, Kuhlmeier, & Wynn, 2010). Moreover, just as in the case of the object/substance distinction, certain aspects of the representation of animacy appear to be embedded into the procedures of the adult visual system. Thus, the perception of animacy can arise in ways that people cannot control and that may even conflict with the judgments they make after careful reflection. One prominent example comes from the Heider and Simmel (1944) displays, in which simple animations create the illusion that basic geometric forms possess intentions even if the viewer is consciously aware that this is not the case. More recently, it has been shown that the visual system preferentially attends to animate over visually similar inanimate objects (New, Cosmides, & Tooby, 2007), and that the visual system automatically and irresistibly picks out animate “chasers” from a sea of seemingly inanimate objects in simple displays (Gao, McCarthy, & Scholl, 2010). Thus, the animate/inanimate distinction provides another example of how morphosyntactic systems incorporate distinctions that are also present in core knowledge. An important word of caution is necessary here, though. As alluded to earlier in this section, mapping between properties in the world and grammatical categories is not one‐to‐one, nor is the mapping between the relevant core knowledge distinctions and grammatical categories. These points can be illustrated by three types of example in the context of the mass/count distinction. 
First, cross‐linguistic differences in mass/count categorization illustrate that the properties of objects alone are not sufficient to determine which grammatical category a given noun will belong to. For example, when presented with a quantity of pasta, an English speaker can refer to this using a mass noun (“the pasta”), while a French speaker translating the English term would refer to the same pasta with a count noun (“les pâtes”). Thus, the reference entity alone is not sufficient to determine whether a noun will be mass or count. Second, there are mismatches between core knowledge categorization and grammatical categorization. For example, in English “furniture” is conceptualized by young children as being a collection of individual objects (Barner & Snedeker, 2005); it is nevertheless a mass noun. Third, there are cases in which core knowledge makes no categorization, but the linguistic system does (Chierchia, 2003). For example, in English, “knowledge” is treated as a mass noun, while “idea” is treated as a count noun despite the fact that the core physics system would not treat knowledge as a substance and treat ideas as objects. This shows that the mapping between count/mass and object/substance is not perfect, and that language may extend core distinctions in ways that go over and above their basic function. This phenomenon of language extending core knowledge is also exemplified in classifier languages. For example, the Thai classifier ‐tua (discussed above) was originally used for animals, but it has been extended in modern usage to also refer to some types of furniture and clothing (Yamamoto, 2005). Thus, there again appears to be an imperfect mapping from core knowledge over to language, with language often overextending core distinctions. The overall goal of this section is not to formally explain these “problem” cases.
It is instead merely to demonstrate that some interesting degree of overlap exists between core knowledge and morphosyntactic structures. Nevertheless, these problem cases are likely to be explainable within the current theoretical framework. The general idea adopted here is that there is a correlation, albeit not a one‐to‐one correlation, between real‐world properties and grammatical categories, which is mediated by core knowledge. Thus, given certain real‐world properties (e.g., a single bounded contour for an entity with a low viscosity composition) for a given entity E, there is a function that determines the likelihood that E will be categorized in a certain way by core knowledge (e.g., as being a physical object vs. a substance). Then a similar function may mediate between core knowledge and linguistic categories. Thus, when E is categorized as being an object, there is another function dictating the likelihood that the noun referring to E will be assigned to the count or mass category. Such a model predicts that there should be a correlation between real‐world properties, conceptualization, and grammatical categorization without any of the correlations being determinate. Such a prediction is compatible with the existing empirical evidence. In principle, the model can accommodate the three problem cases listed above. Cases of cross‐linguistic variability for the same entity (e.g., the English “pasta” being mass while the French “les pâtes” is count) can be explained by random noise in how the likelihood functions play out in various languages. If there is a 70% chance that an entity categorized by core knowledge as an object will be treated as a count noun, then by chance alone two languages could assign it to different grammatical categories. Additionally, there could be non‐random factors that intervene to produce cross‐linguistic differences.
For example, perhaps there are cross‐cultural differences in whether a culture tends to globally attend to a quantity of pasta (thus not attending to individual pieces of pasta) or locally attend to a quantity of pasta. The ability to attend globally versus locally to a given stimulus is a well‐established psychological phenomenon (e.g., Yamaguchi, Yamagata, & Kobayashi, 2000), and it is a plausible factor that could influence conceptual categorizations that would, in turn, influence grammatical categorization. With regard to cases in which there is a conflict between conceptual categories and grammatical categories (e.g., as in “furniture,” which is a mass noun but is conceptualized as a collection of individuals by young children), these can also be explained by appeal to random noise in how the likelihood functions described above play out in specific languages. Finally, in cases in which core cognition is silent but a word nevertheless receives a grammatical category (e.g., “idea” is count but core physics does not represent ideas), assignment to a linguistic category may be purely random or there may again be non‐random factors at play. For example, perhaps there is some deep similarity between the concept for “idea” and the concept for “ball,” such as being represented as an individual, such that both are likely to end up being categorized as a count noun. Ultimately finding the correct answers to these questions will require rigorous theory testing and experimentation. Whatever such a research agenda ends up uncovering, nothing in the current theoretical framework prevents a satisfactory explanation of these cases (and indeed the current framework may generate new and more explanatorily adequate accounts than those that currently exist).
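The two‐stage likelihood model described in this section can be made concrete with a small worked calculation. The sketch below is illustrative only: the 70% figure is the hypothetical value used in the text, while the other two probabilities, the function names, and the assumption that two languages assign categories independently of one another are placeholders introduced here, not empirical estimates or claims of the model.

```python
def p_count_noun(p_object, p_count_given_object, p_count_given_substance):
    """Marginal probability that the noun for an entity ends up count.

    Stage 1: core knowledge categorizes the entity as an object
             (probability p_object) or as a substance (1 - p_object).
    Stage 2: the noun is assigned to the count category with a
             probability that depends on the stage 1 outcome.
    """
    p_substance = 1.0 - p_object
    return p_object * p_count_given_object + p_substance * p_count_given_substance


def p_languages_disagree(p_count):
    """Probability that two languages assign different categories.

    Assumes (for illustration) that each language independently treats
    the noun as count with probability p_count.
    """
    return 2.0 * p_count * (1.0 - p_count)


# 0.7 is the illustrative figure from the text; the other two values are
# hypothetical placeholders, not empirical estimates.
P_OBJECT = 0.9
P_COUNT_GIVEN_OBJECT = 0.7
P_COUNT_GIVEN_SUBSTANCE = 0.2

marginal = p_count_noun(P_OBJECT, P_COUNT_GIVEN_OBJECT, P_COUNT_GIVEN_SUBSTANCE)
print(round(marginal, 2))  # 0.65
print(round(p_languages_disagree(P_COUNT_GIVEN_OBJECT), 2))  # 0.42
```

Under these assumptions, two languages would be expected to disagree about an object‐like entity roughly 42% of the time (2 × 0.7 × 0.3), which illustrates how cases like “pasta” versus “les pâtes” could arise from noise alone without undermining the overall correlation.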

4 Language as shaped by the brain: A role for core cognition Following Darwin (1874), recent scholars have suggested that it is useful to think of individual languages as organisms that have evolved in concert with human minds and social environments (Beckner et al., 2009; Bentz & Winter, 2013; Christiansen & Chater, 2008; Lupyan & Dale, 2010). According to this view, learning and processing biases will tend to become embedded in language structure as languages evolve to become easier to learn and use over time (Christiansen & Chater, 2008). This approach has the ability to explain many morphological and syntactic statistical regularities in terms of cognitive biases which may impact language use and acquisition in both adults (e.g., Bentz & Winter, 2013; Lupyan & Dale, 2010; Rosenbach & Jäger, 2008) and children (e.g., Kirby, 1999). Researchers in this area have discussed two complementary mechanisms that are likely to lead to cognitive influences on languages' morphosyntactic structures. The first is frequency of use (Kirby, 1999; Kroch, 1989). This view states that as certain linguistic forms become used more and more frequently, those forms tend to become fixed in the grammar or morphology of the language (Bybee & Thompson, 2000; Diessel, 2007). For example, prepositions and postpositions are often based on terms referring to body parts because, prior to being fixed as a preposition/postposition, these terms are frequently metaphorically used to refer to spatial locations, as with the English term “back” (Diessel, 2007; Heine, Claudi, & Hunnemeyer, 1991; Heine & Reh, 1984). One possibility is that core cognition encompasses information that both children and adults are generally interested in and motivated to talk about. Thus, terms or structures that refer to core categories are frequently mentioned and thereby become likely to undergo a process of grammaticalization. 
This proposal presupposes that many of the core categories that are highly salient for children continue to remain salient even into adulthood, a view that has received direct empirical support in work showing that core distinctions from infancy guide adult performance on perceptual tasks (e.g., Cheries et al., 2009; Strickland & Scholl, 2015) as well as tasks in higher level reasoning (Beier & Carey, 2014). The more specific view that core cognition may influence frequency of use, which in turn influences morphosyntax, has also received some recent empirical support. In a cross‐linguistic analysis, Strickland and Chemla (unpublished data) compared the lengths of prepositions referring to the mechanical relationships “in” and “on” to the lengths of prepositions referring to similar but non‐mechanical relationships (e.g., “behind” and “above”). As an illustration of the sense of “mechanical” being used here, consider the difference between containment and occlusion. Imagine that a ball is placed in a cup. This is a mechanical relationship in the sense that if the cup moves, then so does the ball. On the other hand, if one places the ball behind the cup, this is not a mechanical relationship because if one moves the cup, this will not necessarily affect the location of the ball (unless one creates a mechanical relationship and brings the two into contact). Mechanics has been purported to be a core system in the literature (e.g., Scholl & Leslie, 1999). In Strickland and Chemla (unpublished data), the authors hypothesized that due to their core status, prepositions referring to mechanical relationships would be more frequently used and therefore generally shorter than their non‐mechanical counterparts. Across 38 languages, this was indeed the case. 
Moreover, an analysis of Hebrew and Hungarian revealed that the morphemes referring to the relevant mechanical relationships from the study can be realized as suffixes or prefixes (in the respective languages), while the non‐mechanical terms cannot. These facts are consistent with the view that core cognition drives frequency of use, which in turn influences patterns of grammaticalization. To return to the examples above, it may be that some languages began frequently referring to the count/mass nature of nouns or whether the object is animate versus inanimate, and these distinctions gradually became encoded in the morphosyntax of languages for this reason. While the mechanism of frequency can explain similarity between morphosyntactic patterns and core cognition, it can also potentially explain divergence as when language appears to extend the use of core categories in seemingly arbitrary ways (as in the above case of the Tai classifier ‐tua that was originally used for animals, but it has been extended in modern usage to also refer to some types of furniture and clothing). In essence, once a morpheme or syntactic unit has made the switch from being frequent to being a fixed functional element in the language, then the language may force users to employ that element even in cases where it might not normally have been applicable. A second, complementary, potential influence on language evolution is that morphemes and syntactic structures may be more learnable when they are based on core knowledge categories. Due to their increased memorability/learnability, they serve as universal attractors in the language evolution process. This view would again operate on the plausible assumption that core categories are highly salient in both children and adults, and therefore syntactic units/morphemes connected to such categories would be easier to remember and therefore learn than those based on arbitrary categories. 
Such a view is consistent with the vast literature showing that salience heavily influences memorability (e.g., Bahrick, Gogate, & Ruiz, 2002; Fine & Minnery, 2009), but it would contain the further, empirically testable, assumption that morphosyntactic elements based on core cognition are more memorable even when not all of their referents fit into the core category. So, for example, one might predict that speakers of Mandarin, which does not overtly distinguish between mass and count nouns, would nevertheless more readily learn and remember even arbitrary assignments to these two categories (as in the case of “pasta” in English vs. “pâtes” in French) than they would assignments to many categories which are not considered to be part of core cognition. 4.1 Biological and cultural evolution The basic claim is that certain morphosyntactic structures based on core distinctions are likely to emerge cross‐linguistically because they have a specific set of properties (e.g., being salient, more likely to be used, more memorable, etc.). This does not rule out the possibility that other potential morphosyntactic structures that are not based on core knowledge could also have that same set of properties. For example, Japanese contains numeral quantifiers for large water vehicles (‐seki) and small water vehicles (‐soo). These are morphosyntactic categories that are clearly not based on core cognition. One can nevertheless explain their existence by positing that terms referring to these categories are, for the Japanese, just as salient and memorable as those that are based on core knowledge. The important aspect of the current theory is that morphosyntactic categories based on core knowledge are likely to have the relevant properties across a much wider set of languages. In essence, they are postulated to act as natural attractors in processes of language evolution. 
This view of language evolution is compatible with the possibility that cross‐linguistic regularities relating to core cognition could emerge both out of processes of cultural evolution (such as grammaticalization of frequently occurring terms) as well as via functional biological adaptations that are specific to language (Barrett, Frankenhuis, & Wilke, 2008; Christiansen & Chater, 2008). Consider (hypothetically) that languages tend to encode core distinctions in their morphosyntax because doing so contributes to their learnability. Such a possibility would of course be compatible with the idea that the cultural evolution of individual languages is influenced by this factor. The question then arises: “What aspects of our cognition make these particular linguistic forms more learnable?” A first possibility is that whatever mental mechanisms are at play, none are specific to language. So core distinctions would be just as beneficial to learning in language as they would be to learning, for example, the causes of non‐linguistic sounds. However, a second possibility is that in addition to core distinctions carrying a general learning benefit, there may also be genetically specified linguistic mechanisms which facilitate the acquisition of grammatical categories based on core cognition. To understand one way in which this could be possible, consider the Baldwin effect (Baldwin, 1896; Simpson, 1953; see also Christiansen & Chater, 2008, for a discussion) whereby characteristics that are initially acquired through interactions with the environment can become inherited. One well‐known example is the development of calluses on the sterna of ostriches (Waddington, 1942). According to this view, calluses initially developed where the sternum touches the ground while the ostrich lands. 
Natural selection then favored individuals who were capable of developing calluses more rapidly through their interactions with the environment, and eventually favored individuals who developed them in the embryo without any interactions with the environment. Pinker and Bloom (1990) suggest that such Baldwinian effects could be behind some language‐specific adaptations. Along these lines one could imagine that at some earlier stage in human history, protolanguages were regularly structured around core cognitive distinctions, thus creating a situation where natural selection would favor individuals who were capable of acquiring the relevant linguistic structures more quickly. Through such processes, language‐specific biological adaptations may have encoded a learning bias for grammatical structures based on core cognition. Currently, we must be agnostic with regard to which particular mechanisms explain why core cognition serves as an attractor in language evolution. Nevertheless, explanations at the level of cultural evolution and the level of biological evolution of language should not be considered mutually exclusive, and both are compatible with the current account. 4.2 Core cognition Just as it has been useful to consider the origins of certain grammatical regularities (for the purpose of describing the possible theoretical landscape), it is also worth considering two possibilities concerning the origins of core knowledge itself and the resulting theoretical implications these would have for the current proposal. The first possibility is that pre‐verbal core knowledge (as functionally defined here as “early emerging in infancy” and “embedded in perceptual processing”) is not innate nor is it the product of an innate domain‐specific learning system. This possibility logically requires that such knowledge is acquired by general learning mechanisms, and then goes on to shape language. 
From this perspective the contribution of the current proposal is the observation that a certain class of cognitive categories (i.e., those that emerge early in infancy and are present in perception) shape languages on a large scale. The challenge in this case is to explain why these categories appear to be more salient cross‐culturally and why they function differently than many other conceptual categories with regard to their impact on language (such as “small water vehicles”). A second possibility is that core knowledge is innate or results from domain‐specific learning mechanisms. This approach has the advantage of being able to easily explain the large‐scale impact of core knowledge on language: It does so because it is a universal part of human nature. It also has the advantage of being able to explain why core knowledge categories function differently than many other conceptual categories, namely by postulating that innate knowledge plays a special role in modulating how we attend to the perceptible world; see Cheries et al. (2009). Given the emphasis on innateness, it is natural to ask how this second alternative would relate to (and differ from) semantic bootstrapping proposals such as those endorsed by Pinker (1984), Macnamara (1982), and Grimshaw (1981). Semantic bootstrapping theories posit that the child approaches the problem of acquiring morphosyntactic structures with a head start that comes from his or her grasp of certain semantically salient notions. On this approach children are hypothesized to use unlearned mappings between syntax and semantics to identify abstract syntactic units. For example, on one possible version of this view, the child could possess the innate, but possibly probabilistic, hypothesis that if something is a discrete physical object, then it is (likely to be) referred to by a count noun while if something is a substance (or simply not a discrete physical object), then it is (likely to be) referred to by a mass noun. 
With respect to views like this, the contribution of the current approach is potentially twofold. First, it would specify the range of cognitive characteristics of the representations that are referred to in the left‐hand side of these conditional statements, something that was not known when bootstrapping theories were first introduced. Second, it would explain why syntactic and morphological categories bear semantic content like this at all: because linguistic systems have a tendency to evolve toward the employment of learnable and useable structures, and those based on core cognition meet this definition. In summary, morphosyntactic structures based on core distinctions are likely to emerge because they are generally more learnable and useable than similar but non‐core distinctions, and languages have a general tendency to evolve to incorporate such structures. Such a view is compatible with the possibility that cross‐linguistic regularities could emerge purely out of processes of cultural evolution or with a heterogeneous view that cross‐linguistic regularities are the product of a mixture of cultural evolution and biological evolution.

5 The breadth of the current approach Section 3 above discussed two detailed examples of overlap between core knowledge and grammatical categories in an attempt to show how the current account could work. One theoretical advantage of the core knowledge approach is that it does not just work for a few cherry‐picked examples, but potentially has a broad scope in accounting for regularities in morphology and syntax. I briefly discuss a few of these examples below in an attempt to convey the potential breadth of this perspective. In each case, I first describe the relevant grammatical phenomenon and then provide empirical evidence that the related conceptual distinction is part of core cognition. A first example of core cognition influencing morphosyntax is that of direct physical causality. Many languages allow for causative syntactic constructions for verbs that denote basic physical events (see Escamilla, 2012, for a discussion of causative patterns across 50 languages sampled from more than 30 language families; see also Dixon, 2000; Dixon & Aikhenvald, 2000; Nichols, 1993; Song, 1996). For example, in English one can say that “The ball rolled,” but one can also say that “John rolled the ball.” The latter sentence is considered a causative because it can be loosely translated as meaning that “John caused the ball to roll.” However, a verb like “cry” (which denotes a human activity with emotional character) cannot enter into such a construction. So while one can say, “The girl cried,” one cannot say, “John cried the girl” (to mean that John caused the girl to cry). The logic of when and under what circumstances languages allow for causative expressions is interesting and intricate (Pinker, 2007), but it seems clear that some underlying representation of causality is important for determining this syntactic construction. 
Potentially underpinning the presence of such causative constructions across languages, we find independent evidence that a basic understanding of physical causality is present from about 6 months of age in pre‐verbal infants (Cohen, Amsel, Redford, & Casasola, 1998; Kotovsky & Baillargeon, 2000; Leslie & Keeble, 1987; Newman, Choi, Wynn, & Scholl, 2008), and similarly that the adult visual system automatically represents basic physical causal interactions (Choi & Scholl, 2006; Michotte, 1946/1963; Rolfs, Dambacher, & Cavanagh, 2013). A second example of potential core influence on morphosyntax is that of biological gender (i.e., sex), which could broadly be considered part of the core “social cognition” domain. Across many languages, biological gender (e.g., masculine, feminine, and/or neuter) is explicitly marked. The essential property of gender is agreement. One can conclude that a language has a gender system if one finds different agreements that are ultimately dependent on nouns of different types (Corbett, 2013), as in French where determiners or adjectives take a different form depending on whether a noun is masculine or feminine. Languages vary with regard to whether they employ a gender system at all and by what type of gender system they can employ if they do have one. Some languages have no gender system at all, while others employ non‐sex‐based gender systems, which are almost all semantically based on some form of animacy (Corbett, 2013). Finally, one also finds languages that employ gender systems based on biological sex. This latter class of languages varies with regard to the semantic overlap between gender and the referent of the noun. Some languages such as Tamil show almost perfect overlap (i.e., virtually all nouns denoting human males are masculine and all masculine nouns denote human males), whereas others such as French only show partial or prototypical overlap. 
So while in French “woman” (“dame”), “girl” (“fille”), and “mother” (“mère”) are feminine, so are “table” (“table”), “liberty” (“liberté”), and “democracy” (“démocratie”) despite the fact that there is no intrinsic relationship between the entities that these latter nouns refer to and femininity (Corbett, 1991). A recent sample of 257 languages (Corbett, 2013) showed that 84 of these employed biological sex‐based gender systems in some form. Moreover, these 84 languages came from a diverse range of geographic regions and language families, suggesting that the use of sex‐based gender systems is indeed a popular cross‐linguistic phenomenon worthy of explanation. Again for sex‐based gender there is evidence that this meets the current working definition for core knowledge. Pre‐verbal infants from around the age of 3 months prefer to look at faces whose gender matches that of their primary caregiver (Quinn, Yahr, Kuhn, Slater, & Pascalis, 2002), and there appear to be dedicated visual mechanisms for identifying gender based on both minimal cues of bodily motion (Johnson, Gill, Reichman, & Tassinary, 2007; Mather & Murdoch, 1994) and facial cues which operate automatically and even in the near absence of attention (Reddy, Wilken, & Koch, 2004). A third example is that of event or thematic roles (Dowty, 1991; Fillmore, 1968; Gruber, 1965), which may have their roots in core knowledge. In linguistics, this theoretical construct helps capture the intuition that there is something semantically shared across the argument structure of many verbs within a given language (Strickland, Fisher, Knobe, & Keil, 2015; Wagner & Lakusta, 2009). Take, for example, sentences (3–5): (3) John broke the door. (4) John punched the door. (5) John painted the door. Although all of these sentences describe different actions (one involves breaking while another involves painting), there is some sense in which John and the door play a similar role in all cases. 
John is the one performing the action while the door is having the action performed on it. In general, the grammatical subject of the sentence denotes the “doer” (or AGENT) of the action in question while the grammatical object denotes the item that undergoes a change of state (the PATIENT), although there are important exceptions such as “experiencer” verbs like “fear.” For example, the rule holds for the following verbs just to name a few: “throw,” “launch,” “give,” “wash,” “smack,” “staple,” and “eat.” Theorists have developed a number of proposals regarding the relationship between thematic roles and their related semantic properties. A first type of theory posits a list of distinct thematic roles, with distinct inferential properties assigned to each role (e.g., Fillmore, 1968; Gruber, 1965). A second type of theory (Dowty, 1991) provides no detailed list of distinct thematic roles but only two broad prototypical roles (proto‐AGENT and proto‐PATIENT), each defined in terms of a prototype. According to both types of theory (i.e., inferential licensing and prototype views), the (proto‐)AGENT role is associated with intentionality, causation, and independent existence, while the (proto‐)PATIENT role is associated with undergoing a change of state and being causally affected by the event. Which particular theory ends up being correct is not of direct relevance for the current paper. What is of relevance is that the use of such thematic roles in creating mappings between semantics and syntax is a popular linguistic device across many languages (Bierwisch, 2006). Dryer (2013) examined 1,377 languages from across the world and found clear patterns in dominant word orders which were influenced by AGENT‐ and PATIENT‐like thematic roles. Of the 1,377 languages studied, he found that 1,188 had a dominant word order related to AGENTS and PATIENTS. 
For example, 565 languages could be classified as AGENT–PATIENT–VERB languages, while another 488 could be classified as AGENT–VERB–PATIENT languages. In such languages, word orders can be said to be rigid, in which case the dominant order is grammatically required, or they can have flexible word orders, in which case a dominant word order is more frequently used than a non‐dominant word order (as in the Philippine language Cebuano), but is not grammatically required. Finally, it is worth noting that of those 189 languages in the Dryer survey which do not incorporate dominant word orders, it may still be that they employ an AGENT/PATIENT distinction in assigning cases like nominative versus accusative (Fillmore, 1968), or they may lack any AGENT/PATIENT distinction at all in the morphosyntax of the language (e.g., as may be the case in Riau Indonesian as reported by Gil, 2001). Similarly to the mass/count distinction, the semantic basis for numeral classifiers, gender, and causality, both pre‐verbal infants and the adult visual system appear to spontaneously assign (non‐verbal) “event roles” which are analogous to the AGENT/PATIENT distinction in language. For example, Hafri, Papafragou, and Trueswell (2012) recently showed that participants are immediately capable of discriminating agents from patients in photographs. In their experiment, observers briefly viewed an image depicting a simple action involving a boy acting on a girl or vice versa (e.g., a girl pushing a boy vs. a boy pushing a girl). In the most striking condition, images appeared on‐screen for 37 ms and were then followed by a visual mask. Observers were then shown a sentence (e.g., “The boy pushed the girl”) and had to indicate if this sentence was consistent or inconsistent with the image. The authors found that even at 37 ms, participants showed above chance performance. 
Some relevant work on pre‐verbal infants has also been done which suggests that infants spontaneously assign event roles in events depicting physical causality. For example, Leslie and Keeble (1987) used a logic of role reversal, reasoning that if infants were habituated to visual events in which a specific object acted as the agent of a causal event (for example) and were then shown a test event in which that object was a patient, they might dishabituate more than in a case in which there was no role reversal. In their study, they habituated infants to causal displays in which object A caused object B to move, and at test the infants were shown either displays in which the objects maintained their causal roles (with A continuing to be the object causing the launch and B the object which is launched) or in which the roles were reversed. Infants indeed dishabituated more in the latter condition compared to the former, and the same result was not obtained in a non‐causal control. While the above evidence shows an ability of pre‐verbal infants to assign thematic‐role like attributes to inanimate actors in causal events, more recent work suggests that infants are also capable of making such assignments in social settings. Hamlin et al. (2007) have carried out a series of studies showing that 6‐month‐old infants prefer characters who perform a positive (helping) action toward a second actor when presented with a choice between the positive character and a neutral character who was not involved with the event. However, when shown a character who performed a negative (harmful) action (again toward a second actor), infants prefer to play with the neutral character instead of the “mean” one. Follow‐up studies (Hamlin, Wynn, Bloom, & Mahajan, 2011) have shown that children's preferences do not extend to just any character who was involved in a positive or negative event. 
Instead, their preferences appear to be specifically attuned to the roles that the characters play in such events. Thus, children will not positively evaluate the character being helped in a helping event in the same way that they positively evaluate the helper. Similarly, they will not negatively evaluate the victim of a negative action in the same way that they will negatively evaluate the perpetrator. Thus, the infants across these studies appear to be keeping track of who the agents and the patients of the events are, and these attributions of non‐verbal correlates of “thematic roles” appear to be influencing their social evaluations. Finally, some other recent work has shown that pre‐verbal infants are capable of linking intentionality to causal agents. Thus, when 10‐month‐old infants are shown a display in which a beanbag arrives on scene after apparently being thrown over an obstacle, they look for a shorter amount of time if an occluding surface is removed to reveal a plausible causal agent that could have intentionally brought about the event in question (e.g., a hand) than if an impossible, non‐intentional causal agent is revealed (e.g., a toy truck; Saxe, Tenenbaum, & Carey, 2005). In summary, the current approach potentially has wide breadth and lends itself to an account not only of the mass/count distinction or the hierarchical structure of numerical classifiers, but also can potentially explain an array of other cross‐linguistic grammatical regularities such as gender marking, causative constructions, and argument selection via thematic roles. In each case, we observe striking overlap between core knowledge and cross‐linguistic morphosyntactic structures. The factors mentioned in Section 3.1 provide plausible mechanisms by which core cognition may have exerted an influence over language morphology and syntax. 
Given that the relevant distinctions from core cognition are likely to be salient in both children and adults, one would expect that speakers are likely to frequently convey information about them in discourse (prior to their being rigidly encoded in the grammar of a language), and for such distinctions to be more memorable than other logically possible morphosyntactic distinctions that a language might make. Given that languages have a tendency to grammaticalize frequently used structures and evolve to be more memorable and learnable, one would expect such structures to be statistically likely cross‐linguistically. Thus, on this account, non‐verbal core knowledge substantially biases grammars by making certain structures more likely to occur. This claim is strengthened by computational modeling techniques showing that even slight inductive biases in the language learning process can yield robust cross‐linguistic patterns (Briscoe, 2000; Kirby, Dowman, & Griffiths, 2007).
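The amplification claim can be illustrated with a toy iterated‐learning chain. This is a minimal sketch written for this discussion, not a reimplementation of the cited models: the two grammar types (h1, based on a core distinction, and h0, arbitrary), the 0.55 prior, the noise level, the two‐utterance sample, and the assumption that each learner adopts the most probable grammar are all hypothetical choices.

```python
from math import comb, log


def p_learn_h1(teacher_is_h1: bool, prior_h1: float,
               n: int, noise: float) -> float:
    """Probability that a learner adopts h1 after hearing n utterances.

    Each utterance signals the teacher's grammar with probability
    1 - noise and the other grammar otherwise. The learner picks the
    grammar with the higher posterior probability (an assumption).
    """
    p_datum_h1 = (1 - noise) if teacher_is_h1 else noise
    total = 0.0
    for k in range(n + 1):  # k = utterances that look like h1
        log_odds = (log(prior_h1 / (1 - prior_h1))
                    + (2 * k - n) * log((1 - noise) / noise))
        if log_odds > 0:  # posterior favors h1
            total += comb(n, k) * p_datum_h1**k * (1 - p_datum_h1)**(n - k)
    return total


def stationary_share_h1(prior_h1: float, n: int, noise: float) -> float:
    """Long-run share of h1 languages in the two-state transmission chain."""
    to_h1 = p_learn_h1(False, prior_h1, n, noise)       # h0 teacher -> h1 learner
    to_h0 = 1 - p_learn_h1(True, prior_h1, n, noise)    # h1 teacher -> h0 learner
    return to_h1 / (to_h1 + to_h0)


# A 55/45 learner bias ends up as a much larger share of languages.
print(round(stationary_share_h1(prior_h1=0.55, n=2, noise=0.3), 2))  # 0.85
```

With these hypothetical settings, a learner bias only slightly above chance corresponds to a clear majority of languages in the long run, which is the qualitative point at issue; the exact figure depends entirely on the assumed parameters.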

6 Predictions of and tests for the current model The assumptions of this model should be tested by integrating typological and experimental data. The current view suggests that the formation of grammatical categories is subtly biased by core knowledge. This view is not necessarily committed to all core distinctions being regularly imported into language or to all grammatical distinctions being a product of core knowledge. It does, however, predict a correlation between core knowledge structures and typologically frequent grammatical structures. For the theory to be falsifiable, this requires setting some threshold for a given structure to count as “typologically frequent,” since there are a wide range of frequencies that a core knowledge‐based grammatical structure could occur at while still supporting the theory. For example, gender systems based on biological gender appeared in roughly 30% of the languages studied in Corbett (2013), while 86% of the languages studied by Dryer (2013) had word orders influenced by the AGENT/PATIENT thematic roles. Although 86% and 30% are very different with respect to frequency, both could be considered to support the theory. Setting the relevant threshold could be accomplished by identifying grammatical categories which are clearly not based on core cognition (e.g., the Japanese “‐seki,” which is the numeral classifier for large water vehicles) and examining the frequency at which grammatical categories based on these conceptual categories recur typologically. One might imagine that on average grammatical categories based on water vehicles (like “‐seki”) and other non‐core concepts only occur in, say, 1% of languages. If this were the case, then the current theory could be falsified by showing that, on average, grammatical distinctions based on core categories only occurred at the same rate or less frequently. An average rate higher than 1% would provide support for the theory. 
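The proposed comparison can be sketched as a short calculation. The counts are those reported in the text for Corbett (2013) and Dryer (2013); the 1% non‐core baseline is the purely hypothetical value used in the example above, and the function name is illustrative.

```python
from statistics import mean

# Typological rates for two core-based structures, from the counts in the text.
CORE_RATES = {
    "sex-based gender (Corbett, 2013)": 84 / 257,
    "AGENT/PATIENT word order (Dryer, 2013)": 1188 / 1377,
}

# Hypothetical average rate for non-core categories such as "-seki".
HYPOTHETICAL_NON_CORE_BASELINE = 0.01


def supports_theory(core_rates: dict, baseline: float) -> bool:
    """True if core-based structures recur, on average, above the baseline."""
    return mean(list(core_rates.values())) > baseline


for label, rate in CORE_RATES.items():
    print(f"{label}: {rate:.0%}")
print(supports_theory(CORE_RATES, HYPOTHETICAL_NON_CORE_BASELINE))  # True
```

A real test would replace both the two‐item list and the baseline with independently compiled lists of core and non‐core distinctions.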
A research program along these lines would then consist of creating two independent lists of types of conceptual distinctions (core vs. not core) and checking the rates at which grammatical categories based on each distinction recur typologically. In addition to testing the theory directly, it could also be used as a source of empirical hypotheses regarding specific diachronic or language learning phenomena. For example, similarly to existing artificial language learning experiments showing that languages with consistent orders are easier to learn than those with inconsistent orders (Christiansen & Devlin, 1997), artificial language experiments could be designed to test whether languages with word orders (or morphological markings) based on core thematic roles are more learnable than those whose word orders are based on other, non‐core knowledge based attributes. In addition to being a source of hypotheses concerning specific aspects of language, the current proposal also makes other concrete predictions that should be of relevance to developmental psychologists and perception researchers. One of the key insights of the current account is that by examining cross‐linguistic grammatical regularities with clear semantic content that have been noted by linguists, psychologists should expect to see that the non‐verbal correlates of these grammatical forms show signs of being “core knowledge.” One could exploit this link in the opposite direction to that described above. Thus, instead of making hypotheses on the basis of core knowledge to predict linguistic facts, one could look to grammatical patterns across languages, and then ask (in cases where this is not yet known) whether corresponding representations (a) appear early in infancy and/or (b) are part of the structure of perceptual processes. Below I concentrate on two specific cases to which this framework could be applied in order to illustrate how these predictions play out in some concrete examples. 
These are meant only to illustrate how the theory's predictions can be applied in practice; they should by no means be taken as an exhaustive list of all predicted outcomes.

6.1 Events as core knowledge?

Consider first the noun/verb distinction. This broad distinction between grammatical classes is one that is extraordinarily common across the world's languages (Langacker, 1987). These grammatical classes appear to be prototypically underpinned by certain conceptual characteristics. For example, although exceptions can be found, nouns prototypically refer to entities or regions that persist in certain domains (e.g., objects persist in space). Verbs, on the other hand, prototypically refer to events or states. On the current view, if this observation is correct, then just as different types of object representations are part of core knowledge, so too should event representations be. Developmental psychologists and researchers interested in perception could exploit this insight to ask whether event representations are early emerging in infancy and are embedded in automatic perceptual processes. Indeed, there are hints in the literature already that this may be the case. For example, pre‐verbal infants at 6 months of age are capable of individuating and quantifying the number of discrete human actions (Sharon & Wynn, 1998; Wynn, 1996). Moreover, Baldwin, Baird, Saylor, and Clark (2001) showed that pre‐verbal infants who have been habituated to events depicting a completed goal show greater levels of dishabituation (as evidenced by increased looking time) when shown a subsequent event that is paused just prior to goal completion (i.e., prior to an event boundary) than when shown a subsequent event that is paused at the moment of goal completion. Taken together, these findings tentatively suggest that infants represent individuated events, and that their visual attention is guided by online processing of event structure.
Based on the theoretical framework being presented here, given that event representation is both grammaticalized across languages and early emerging in infancy, one should also expect event representation to be an important computational unit in the visual system. Wood (2007) has done some groundbreaking work in this area by developing a change detection method for investigating memory capacity for events. The lesson from this work is that event variables play a crucial computational role in the organization of adult visual memory that mirrors the organizing role that objects play in visual memory (Luck & Vogel, 1997). Future studies could go on to ask whether one finds visual processes that are dedicated to the detection of events in a bottom‐up fashion and that subsequently influence processes of visual attention or tracking (as was the case in the object vs. substance tracking literature mentioned above). Some studies have touched on this idea. For example, while the adult visual system seems to be better at detecting small interruptions or breaks in the video at event boundaries (Newtson & Engquist, 1976), people are worse at detecting visual probes appearing at event boundaries (Huff, Papenmeier, & Zacks, 2012).

6.2 Telicity as core knowledge?

A second class of (potential) core event representation introduced in the linguistics literature concerns telicity. According to this tradition, which has its historical roots in book six of Aristotle's Metaphysics, verbs describing dynamic events can be classified into two broad grammatical categories: telic and atelic (Bach, 1986; Dowty, 1979; Garey, 1957; Parsons, 1990; Vendler, 1957; Verkuyl, 1989). Verbs that refer to atelic events do not logically require an endpoint (and thus could logically continue indefinitely) and are composed of homogeneous subparts (e.g., “to swim,” “to think”).
Verbs denoting telic events, on the other hand, logically entail a culmination point (e.g., “to decide,” “to make something”). The interface with syntax in English is evidenced by the existence of a series of syntactic tests that are capable of discriminating between these categories. For example, the “how long did it take” test distinguishes atelic from telic events (Vendler, 1969):

(6) **How long did it take for John to think? (atelic)
(7) How long did it take for Ron to close the door? (telic)
(8) How long did it take for John to decide? (telic)

Although each of these sentences has a nearly identical syntactic structure (complicated only slightly by the extra noun phrase in (7)), sentences (7) and (8) are grammatically acceptable whereas sentence (6) is not, because only telic verbs are allowed to appear with the “how long did it take” construction. Thus, by the metric of this test, telic verbs function differently from atelic verbs. Cross‐linguistically, morphosyntactic sensitivity to telicity appears to be a statistically common strategy (Wagner, 2009; Wilbur, 2008), but languages differ in important ways in how telicity is expressed. In languages such as English, its expression is fairly covert and is dependent on subtle syntactic patterns. However, telicity is morphologically marked in certain Slavic languages like Russian and in other spoken languages (Comrie, 1976; Filip, 2004), as well as across a range of sign languages (Wilbur, 2008). On the current theoretical model, the telic/atelic distinction is a good candidate for being an important element of core knowledge for non‐linguistically represented events, given that this distinction has a clear semantic underpinning and appears to be a cross‐linguistic morphosyntactic regularity. This proposal is broadly compatible with theoretical work in computational linguistics claiming that telicity in language is based on a more basic cognitive capacity to understand events (Narayanan, 1997).
While there has been quite a bit of empirical work in psychology examining how people process telicity in verbal contexts (see Folli & Harley, 2006, for a review), little empirical work has explicitly addressed the question of whether the telic/atelic divide is a basic part of non‐linguistic core knowledge. In one of the few related studies, Wagner (2009) first familiarized 11‐month‐old infants to displays in which a toy bunny rabbit repeatedly moved toward one of two objects (e.g., a box on the left‐hand side of the display). Importantly, the rabbit moved in one of three manners: hopping, gliding, or scooting. These manners were chosen based on their naturalness and complexity, with hopping being the most natural and least complex, scooting the least natural and most complex, and gliding intermediate. After familiarization, the child then saw the rabbit in a virtually identical display but in which the location of the two objects had been switched. At test, the bunny rabbit either moved toward the same goal–object in its new location (“old‐goal/new‐location”) or moved toward a different goal–object which was in the same location as the original goal–object from the familiarization trials (“old‐location/new‐goal”). Wagner found that for the most natural and second most natural manners, infants showed increased looking times in the old‐location/new‐goal trials relative to the old‐goal/new‐location trials. This suggests that on these trials, infants encoded the natural endpoint of the goal‐directed action and were sensitive to changes therein. However, for the most difficult manner of motion (i.e., scooting), infants failed to show a difference in looking time between the old‐location/new‐goal trials and the old‐goal/new‐location trials. The author interpreted these results as being compatible with the possibility that infants analyzed the hopping and gliding events as being telic while they analyzed the scooting events as being atelic.
However, this is not the only possible interpretation. Alternatively, it may be that infants encoded actors' goals when the event was not too complex or overwhelming, but that when processing demands were too high (as was potentially the case in the scooting events), the ability to encode goals was overwhelmed. On this second interpretation, there is no need to appeal to an infant ability to represent the telic/atelic distinction. A related infant study has shown that 12‐ and 18‐month‐old infants imitate actions differently depending on how they represent the goal of an actor (Carpenter, Call, & Tomasello, 2005). In this study, infants watched an adult perform an action like making a toy mouse hop across a mat. In the “house” condition, the event ended with the adult making the mouse enter a house, while in the “no house” condition there was no house at the final stopping point for the mouse. In the house condition, infants imitated the final goal of the action by placing the mouse in the house without imitating the hopping motion. In the no house condition, on the other hand, infants instead imitated the hopping motion. The authors interpreted this as showing that infants inferred that the goal was either to put the mouse in the house (in the house condition) or to produce a specific kind of motion (in the no house condition), and that they then imitated what they believed to be the adult's goal. However, these results could also be interpreted as suggesting that infants have the capacity to differentiate telic actions, which have a specific endpoint (as in the house condition), from atelic actions, which contain no logically entailed endpoint (as in the no house condition). A habituation or looking time paradigm might further explore this question. For example, one could habituate infants to a series of atelic events and then at test display either a telic or an atelic event; one might expect more dishabituation to the telic event, which belongs to a novel event category, than to the atelic event.
One important consideration in such a paradigm, however, would be to ensure that any positive result could not be explained by uninteresting low‐level differences in motion patterns but is instead best explained by appeal to abstract event categories.

Some work in adult sign language is also compatible with, and suggestive of, the hypothesis that telicity is an important element of core cognition. Wilbur (2008) has shown that across many sign languages there are visual regularities in the ways that signs are employed to refer to atelic events (i.e., processes) and telic events (i.e., achievements and accomplishments). Atelic processes are regularly referred to by signs that involve homogeneous, repeated motion, while telic events are referred to by signs that demarcate a clear endpoint by abrupt changes in hand aperture, orientation, velocity, or location. The presence of these cross‐linguistic regularities suggests the existence of pre‐linguistic connections between visual form and abstract representations of telicity. One possible explanation (although not the only one) is that there exist visual routines for detecting such event categories. Such mechanisms could create an unlearned propensity to associate certain types of visual forms (e.g., repeated motion) with certain abstract event categories (e.g., atelic events) in the creation of sign languages. Consistent with this proposal, Strickland, Fisher, et al. (2015), Strickland, Geraci, et al. (2015) recently showed that native English speakers who lack significant signing experience are nevertheless able to correctly extract telicity from the visual patterns of entirely unfamiliar signs.
Thus, when shown a sign (from Italian Sign Language, Turkish Sign Language, or Sign Language of the Netherlands) meaning “to run,” participants were more likely to guess that this sign meant “to think” (which is also atelic) than “to decide.” However, when shown a sign meaning “to leave,” they were more likely to guess that this meant “to confirm” (which is also telic) than “to think.” These results suggest that there exists a set of unlearned associations between event categories (i.e., telic vs. atelic) and visual patterns. In order to follow up on this possibility, perception researchers could also inquire as to whether event telicity is automatically encoded in non‐linguistic processing. One could look at potential effects on memory or visual attention. Perhaps, for example, memory “bleeds” beyond the boundaries of processes but respects the boundaries of achievements and accomplishments. Or perhaps there are attentional switching costs incurred by a change in event category, such that a visual stimulus is processed more rapidly in a sequence in which the same event type is repeated (e.g., atelic/atelic) than in a sequence in which different event types are shown in succession (e.g., atelic/telic). Similarly, there may be effects on the subjective experience of time; for example, participants might overestimate the time that an atelic event lasts compared to a telic event.
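To illustrate how the attentional switching‐cost prediction might be quantified, the following Python sketch computes the difference in mean response time between trials that switch event category and trials that repeat it. It is a minimal sketch only: the trial format, the function name `switching_cost`, and all response times are assumptions invented for this example, and no such data are reported here.

```python
from statistics import mean


def switching_cost(trials):
    """Mean RT on category-switch trials minus mean RT on repeat trials.

    `trials` is an ordered list of (event_type, rt_ms) pairs, where
    event_type is "telic" or "atelic". The first trial has no
    predecessor, so it contributes only as context for the second.
    """
    repeat, switch = [], []
    for (prev_type, _), (cur_type, rt) in zip(trials, trials[1:]):
        (repeat if cur_type == prev_type else switch).append(rt)
    if not repeat or not switch:
        raise ValueError("need at least one repeat and one switch trial")
    return mean(switch) - mean(repeat)


# Invented response times in milliseconds, for illustration only.
example = [
    ("atelic", 510), ("atelic", 480), ("telic", 540),
    ("telic", 490), ("atelic", 550), ("atelic", 470),
]
cost = switching_cost(example)
print(f"switch cost: {cost} ms")
```

A positive value would be in the direction predicted above (slower processing after a change in event category), although an actual study would need appropriate controls for low‐level stimulus differences.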

7 General discussion

Here, I have presented a new framework that potentially helps explain patterns of cross‐linguistic grammatical regularities. According to this view, non‐verbal core cognition, which plays an important role in infant cognition and adult visual cognition, biases the process of language evolution by influencing the use and acquisition of morphosyntactic forms in both adults and children. There are (at least) two potential mechanisms by which this happens. The first is that core cognition makes certain syntactic structures or morphemes more likely to be used, and therefore more likely to be grammaticalized. The second is that core cognition makes certain syntactic structures or morphemes easier to learn, and therefore these structures function as attractors in the language evolution process. This proposal adds to the growing body of research which suggests that languages adapt to non‐linguistic aspects of the human mind. Given the heavy emphasis that the current paradigm places on cross‐linguistic information, more collaboration between psychologists and linguists employing big data methods (e.g., World Atlas of Language Structures; Syntactic Structures of the World's Languages) could be an important avenue for future research. In such research, psychologists could look to linguistics in order to formulate hypotheses about core knowledge, but they could also contribute to the process by formulating hypotheses about likely core knowledge structures that could then be charted cross‐linguistically. Such research could provide important information regarding both the scope and the limits of the putative language evolution mechanisms at issue here. The rich connections between core cognition and language make this a prime candidate for an exciting body of research that would span many of the subfields within cognitive science and potentially help us glean a better understanding of human nature.

Acknowledgments

I thank the following people for their comments and insightful feedback: Frank Keil, Laurie Santos, Brian Scholl, Joshua Knobe, Philippe Schlenker, Emmanuel Chemla, Benjamin Spector, Jonathan Philipps, Matt Fisher, Brendan Ritchie, Alexis Wellwood, Alon Hafri, Bridget Copely, David Nicolas, Veronique Izard, Christopher Vogel, Brandon Liverence, Hilda Koopman, Laura Wagner, Dan Sperber, and Jonathan Bobaljik. I would also like to thank Bodo Winter, Barbara Spellman, Rick Dale, and two anonymous reviewers for their efforts in improving this paper during various stages of the review process. The research was supported by the European Research Council under the European Union's Seventh Framework Programme (FP/2007‐2013)/ERC Grant Agreement no. 313610, ANR‐10‐IDEX‐0001‐02 PSL*, ANR‐10‐LABX‐0087 IEC, and a Fyssen Foundation Postdoctoral grant.