Human infants, like immature members of any species, must be highly selective in sampling information from their environment to learn efficiently. Failure to be selective would waste precious computational resources on material that is already known (too simple) or unknowable (too complex). In two experiments with 7- and 8-month-olds, we measure infants’ visual attention to sequences of events varying in complexity, as determined by an ideal learner model. Infants’ probability of looking away was greatest on stimulus items whose complexity (negative log probability) according to the model was either very low or very high. These results suggest a principle of infant attention that may have broad applicability: infants implicitly seek to maintain intermediate rates of information absorption and avoid wasting cognitive resources on overly simple or overly complex events.

Funding: CK and STP were supported by Graduate Research Fellowships from the National Science Foundation (www.nsf.gov). The research was supported by grants from the National Institutes of Health (HD-37082, www.nih.gov) and the J. S. McDonnell Foundation (220020096, www.jsmf.org) to RNA. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Introduction

Human infants face two daunting problems as they begin to learn about their surroundings. First, they enter the postnatal world with only rudimentary mechanisms–provided by their evolutionary heritage–for interpreting environmental information. Second, the potential information available in the environment is both voluminous and complex. These two problems led William James to coin his famous phrase about “the blooming, buzzing confusion” that confronts the newborn [1]. Nonetheless, infants show remarkable feats of learning, beginning in the last trimester of fetal life, continuing through the perinatal period, and accelerating through infancy and early childhood [2]–[5]. Infants are able to extract the statistical properties of their environment in a diverse array of learning tasks and domains, including sounds, words, people, shapes, and objects [6]–[11]. But how is it that infants are able to learn efficiently in such a complex environment? One solution is to have a small set of innate biases; for example, seeking to look at and listen to biologically significant stimuli such as faces and speech. However, innate biases alone cannot be the solution for the vast majority of stimuli from which infants must learn. Given the slow time-course of evolution, we also need general purpose learning mechanisms to deal with a changing environment and with classes of stimuli that could not plausibly be processed by a small set of specialized mechanisms.

Here, we focus on this general-purpose learning mechanism by avoiding the use of special stimuli and asking whether infants deploy a sensible (and likely implicit) strategy for allocating attention to arbitrary, neutral stimuli. Our goal is to determine whether infants are biased to gather information from the environment in a principled way that serves as a key component of an efficient learning mechanism [12], [13]. Specifically, we provide evidence that infants avoid spending time examining stimuli that are either too simple (highly predictable) or too complex (highly unexpected) according to their implicit beliefs about the probabilistic structure of events in the world. Rather, infants allocate their greatest amount of attention to events of intermediate surprisingness–events that are likely to have just enough complexity so that they are interesting, but not so much that they cannot be understood. This approach builds on a longstanding tradition in developmental psychology, as exemplified by Piaget [13]. He argued that when children are confronted with a new piece of information, they initially attempt to incorporate it within their existing knowledge structures through a process of assimilation. When this is not possible, children either fail to learn new structures (and move on to sample other information) or they adapt by creating new knowledge structures, a process he called accommodation.

Piaget had no objective measure of assimilation or accommodation; they remained hypothetical constructs. However, in subsequent research, a proxy for these theoretical constructs centered on the relative duration of visual attention to objects or events varying in complexity or familiarity. Many researchers have speculated about what underlying mental operations are indexed by infants’ looking times or attentional patterns [14] (for review, see Aslin 2007 [15]). The generally accepted view is that looking times reflect some combination of (a) stimulus-driven attention, (b) memory of past stimuli, and (c) comparison between the current and the past stimuli. If infants are presented with an already familiar stimulus, they prefer it over a novel stimulus, but quickly tire of it after a brief period of re-familiarization (habituation), and subsequently show preferences for novel stimuli. Similarly, if repeatedly exposed to an initially novel stimulus, infant looking times decline and then recover to the presentation of another novel (i.e., completely unfamiliar) stimulus. Theoretical accounts for these familiarity and novelty preferences all share a common theme: As infants attempt to encode various features of a visual stimulus, the efficiency or depth of this encoding process determines their subsequent preferences. Familiarity preferences arise when infants have not yet completed encoding the familiar stimulus into memory, or when the novel stimulus is too dissimilar from the infants’ existing mental representations to be readily encoded [16]–[22].

However, these theories lacked an objective measure of the relevant independent variable–an event’s complexity or relationship to existing representations. Instead, researchers overwhelmingly relied on qualitative judgments of stimulus complexity to select materials to test infants’ visual preferences. These qualitative judgments relied on inferences about infants’ existing mental representations, to which researchers had no direct access. With no reasonable way of modeling infants’ existing representations, it was impossible to quantitatively measure the complexity of the information conveyed by a particular stimulus. Thus, researchers had only post hoc estimates of stimulus complexity–those obtained by measuring the very patterns of visual preferences that the theories were designed to predict. Two exceptions are Civan, Teller & Palmer 2005 [23] and Kaldy & Blaser 2006 [24] in that both papers quantified the perceptual salience of visual stimuli in order to effectively demonstrate its importance in eliciting infants’ preferences for novel versus familiar stimuli.

We overcome these problems by formalizing a notion of stimulus complexity and behaviorally testing the relationship between complexity and infants’ probability of looking away at each successive point in a sequence of events. We assume that at each point in the experiment–and in everyday life–infants have used observed data to form probabilistic expectations about what events are likely and unlikely to be observed next [25], [26]. We model these expectations using an idealized observer model of our experimental stimuli. We then measure complexity as the negative log probability of an event according to this idealized model. This measure quantifies each event’s information content [27]. (It has also been called surprisal [28], since it may be interpreted as representing the “surprise” of seeing the outcome.) We show that infants preferentially look away from events that are either very simple (high probability) or very complex (low probability), according to the idealized model. Intuitively, high probability events convey little information, so infants’ attentional resources are best spent elsewhere. Low probability events may indicate that the observed stimuli are unlearnable, unstructured, or difficult to use predictively in the future. With a base-2 logarithm, negative log probability also quantifies the number of bits of information an ideal observer would require to encode that event in memory. Thus, infants may avoid stimuli that require encoding too much information or information that could only be extracted by prolonged attention to rare events, thereby incurring a higher processing cost than shifting attention to less complex events.
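To make the complexity measure concrete, the following is a minimal sketch of how per-event surprisal could be computed. It is not the model used in the experiments: it assumes, purely for illustration, an observer that tracks event counts under a symmetric Dirichlet prior, and the names `surprisal_sequence`, `n_outcomes`, and `alpha` are hypothetical. Each event's surprisal is the negative base-2 log of its predicted probability given only the events observed before it.

```python
import math
from collections import Counter


def surprisal_sequence(events, n_outcomes, alpha=1.0):
    """Return the surprisal (in bits) of each event given the preceding events.

    Assumption (illustrative only): the observer predicts the next event with
    a Dirichlet-multinomial posterior predictive, using a symmetric prior with
    pseudo-count `alpha` for each of `n_outcomes` possible event types.
    """
    counts = Counter()  # how often each event type has been seen so far
    seen = 0            # total number of events seen so far
    surprisals = []
    for event in events:
        # Predicted probability of this event before it is observed.
        p = (counts[event] + alpha) / (seen + alpha * n_outcomes)
        # Negative log probability: low for expected events, high for rare ones.
        surprisals.append(-math.log2(p))
        # Update the observer's counts with the event just observed.
        counts[event] += 1
        seen += 1
    return surprisals


if __name__ == "__main__":
    # Three repetitions of one event followed by a previously unseen event.
    for event, bits in zip("AAAB", surprisal_sequence("AAAB", n_outcomes=3)):
        print(event, round(bits, 3))
```

Under these assumptions, repeated events become progressively less surprising, while a first occurrence of a new event after several repetitions carries the most information. The hypothesis described above corresponds to look-away probability being highest at both extremes of this measure.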