Abstract

What is the relationship between top-down and bottom-up attention? Are both types of attention tightly interconnected, or are they independent? We investigated this by testing a large representative sample of the Dutch population on two attentional tasks: a visual search task gauging the efficiency of top-down attention and a singleton capture task gauging bottom-up attention. On both tasks we found typical performance: participants displayed a significant search slope on the search task and significant slowing caused by the unique, but irrelevant, object on the capture task. Moreover, the high levels of significance we observed indicate that the current set-up provided very high signal-to-noise ratios, and thus enough power to accurately unveil existing effects. Importantly, in this robust investigation we did not observe any correlation in performance between tasks. The use of Bayesian statistics strongly confirmed that performance on both tasks was uncorrelated. We argue that the current results suggest that there are two attentional systems that operate independently. We hypothesize that this may have implications beyond our understanding of attention. For instance, it may be that attention and consciousness are intertwined differently for top-down attention than for bottom-up attention.

Introduction

The senses are continuously bombarded with a multitude of sensory impressions. A key challenge is to select which impressions are relevant and which inputs should be ignored. This process of selecting a subset of the input, and ignoring the rest, is referred to as attention (Broadbent, 1958; Desimone & Duncan, 1995; Neisser, 1967; Treisman, 1960). Within such a conceptual scheme, a central question of debate has been whether the moment of selection is early or late (Broadbent, 1958; Deutsch & Deutsch, 1963).

Note that described like this, attention seems to be a unitary phenomenon. It seems that there is one selection mechanism, and this selection mechanism filters all incoming input.

However, currently, two types of attention are commonly distinguished in the literature: bottom-up and top-down attention, or stimulus-driven and goal-oriented attention (Carrasco, 2011; Corbetta & Shulman, 2002; Desimone & Duncan, 1995; Kastner & Ungerleider, 2000). Top-down attention refers to the voluntary allocation of attention to certain features, objects, or regions in space. For instance, a subject can decide to attend to a small region of space in the upper-left corner or to all red items. Both cases are examples of top-down attention, the first of top-down spatial attention, the latter of top-down feature attention (Beauchamp, Cox, & Deyoe, 1997; Bressler, Tang, Sylvester, Shulman, & Corbetta, 2008; Giesbrecht, Woldorff, Song, & Mangun, 2003). On the other hand, attention is not only voluntarily directed. Salient stimuli can attract attention, even though the subject had no intention to attend to these stimuli (Schreij, Owens, & Theeuwes, 2008; Theeuwes, 1991, 1992). For instance, if a subject is engaged in a conversation, but a loud bang occurs, this bang may attract attention. Or, in the visual domain, someone may be looking for red items, but an unexpected, sudden appearance of a nonred object may inadvertently draw the attention of the subject.

The similarity in top-down and bottom-up deployments of attention is that, although the reason for attentional deployment is different, the effects are largely the same. In both cases, the attended objects receive preferential processing. In both cases, this leads to an increased neural response, which has functional consequences, such as better memory storage (Buschman & Miller, 2007; Ciaramelli, Grady, & Moscovitch, 2008; Reynolds & Chelazzi, 2004).

However, there are also important differences between both types of attention. Top-down attention is also referred to as endogenous or sustained attention, and bottom-up attention is commonly typified as exogenous or transient attention (Carrasco, 2011). This difference in nomenclature is employed for a good reason: Top-down attention is called endogenous because, unlike bottom-up attention (which is automatic/involuntary), it is under clear voluntary control. Importantly, top-down attention is called sustained, since subjects typically direct their top-down attention at objects, features, or regions in space for sustained periods of time, whereas bottom-up attention is transiently captured. Moreover, top-down attention seems to take longer to deploy than bottom-up attention, approximately 300 and 100–120 ms, respectively (Cheal, Lyon, & Hubbard, 1991; Hein, Rolke, & Ulrich, 2006; Ling & Carrasco, 2006; Liu, Stevens, & Carrasco, 2007; Muller & Rabbitt, 1989; Nakayama & Mackeben, 1989; Remington, Johnston, & Yantis, 1992).

Furthermore, although some of the effects of top-down and bottom-up attention are similar, there are also important differences. Yeshurun and Carrasco (1998) had subjects detect a texture-defined target with a specific orientation on a background of orthogonal orientation. In such a task, performance does not always peak when the target is presented foveally but, depending on the spatial scale of the target, at certain eccentric locations (Gurnsey, Pearson, & Day, 1996; Joffe & Scialfa, 1995; Kehrer, 1989; Morikawa, 2000). It appears that this is caused by the fovea being most sensitive to high spatial frequencies, whereas eccentric parts of the retina are more sensitive to lower spatial frequencies (Kehrer, 1989). Interestingly, Yeshurun and Carrasco (1998) employed an exogenous cue to direct bottom-up attention to the target location. This seemed to always increase the perceived spatial resolution of the target, causing a detrimental effect on task performance when the target location was too near a foveal location. Importantly, in a similar setup, top-down attention only increased spatial resolution when this was beneficial for the task at hand (Yeshurun, Montagna, & Carrasco, 2008). This then is a clear example where bottom-up attention rigidly causes a certain effect (increased spatial resolution), whereas top-down attention may be more flexible (only increasing spatial resolution when it is beneficial). The differential influence of top-down and bottom-up attention is also observed in temporal order discrimination. Bottom-up attention seems to impair it; top-down attention seems to enhance it (Hein et al., 2006).

Differential effects are also found in the detection of second-order texture contrasts. Both types of attention enhance second-order contrast sensitivity, but the effects of bottom-up attention are driven by second-order spatial frequency content, whereas the effects of top-down attention are independent of this (Barbot, Landy, & Carrasco, 2012).

With regard to the interaction between attention and working memory, top-down and bottom-up attention also seem to play different roles. Top-down attention seems to leave intact the meridian effect, i.e., the phenomenon that performance drops when the attended location is separated from the memory target by more vertical or horizontal crossings (Botta, Santangelo, Raffone, Lupianez, & Belardinelli, 2010), whereas bottom-up attention cancels the meridian effect (Rizzolatti, Riggio, Dascola, & Umilta, 1987).

Another example is that top-down attention can be flexibly employed depending on differential cue validity, whereas bottom-up attention seems to lack this flexibility and is always employed to the same extent irrespective of cue validity (Giordano, McElree, & Carrasco, 2009).

A final noteworthy example of a differential effect of bottom-up and top-down attention is observed in the so-called inhibition of return phenomenon (IOR; Posner & Cohen, 1984), where attention first facilitates processing at a location, followed by inhibited processing at this same location. Importantly, IOR seems only to occur when bottom-up attention is involved (Peelen, Heslenfeld, & Theeuwes, 2004).

Importantly, note that it may also be that within top-down (and perhaps also bottom-up) attention, different subdivisions can be made. Top-down attention can be directed at a location, or at specific features. For instance, one may attend to the center of the screen, or one may attend to any red item. In the latter case different disconnected areas in space may be selected. This not only has spatial consequences, but also seems to affect temporal qualities: Spatial attention may be employed faster than featural attention, i.e., it seems to take 150–300 ms to employ spatial attention and 300–500 ms to employ featural attention (Liu et al., 2007).

So there is ample evidence that bottom-up and top-down attention can have differential effects. However, this does not rule out that both types of attention share certain key properties, or are even caused by the same underlying mechanism. For example, jumping is different from running in many respects, but nonetheless both are caused by the same underlying system.

Thus, despite all the differential effects, it is still unclear whether or not bottom-up and top-down attention are caused by two independent systems. An indication that top-down and bottom-up attention are caused by different mechanisms is that bottom-up, but not top-down, attention is already present in the simplest species, such as fruit flies (Van Swinderen, 2007; Van Swinderen et al., 2009). This then would suggest that bottom-up attention is a more primitive form of attention, and top-down attention a newer form.

However, there are also indications that, in humans at least, both systems are integrated. Neglect normally comes about through lesions to cortical areas, specifically the right parietal lobes (Smania et al., 1998; Vallar, 1993, 1998; Vallar & Perani, 1986). If both systems are independent, then it could be expected that in neglect, top-down attention is affected, but bottom-up attention is not. However, it seems that both types of attention are strongly reduced in the neglected field.

Furthermore, studies into eye movements also suggest a strong degree of integration of both types of attention. When attention is influenced by both bottom-up and top-down factors, one could expect a horse-race, if both types of attention are independent. So, if top-down attention wants to direct the eyes to location A, and bottom-up attention is attracted to location B, it could be expected that the eyes go to location A on some trials, and to location B on others. However, this is not what happens. In such a case, the eyes typically go to a location somewhere between A and B (Godijn & Theeuwes, 2002). This finding is congruent with the notion that there is only one attentional system, or at least that the attentional systems operate in an integrated manner.

Importantly then, the evidence so far is not conclusive: Even if there were only one attentional system, it could still respond differently depending on the cause of its deployment (urgent or less urgent action, for instance). Furthermore, bottom-up attention may be a phylogenetically older system, but the brain as a whole must remain an integrated organ. Whether over the course of evolution newer systems are independent from older systems is (also) determined by selection pressure and not only phylogenetic order. Also, the evidence regarding neglect patients is not unambiguous: some bottom-up capture of attention does seem to break through the neglect (Vuilleumier & Schwartz, 2001). Finally, the integrated eye movement output could point to an integrated attentional system, but the integration could also occur later (for instance in the superior colliculus), after two independent systems have given their respective inputs. Alternatively, since the temporal dynamics may differ between both types of attention, it could be that the bottom-up system affects the trajectory of the eye movement first, and the top-down system affects it later, creating an eye movement affected by both influences.

Thus, a key question remains whether the human brain essentially has one attention center, controlling both top-down and bottom-up attention, or whether there are essentially two attention systems, one controlling top-down and one controlling bottom-up attention. When we consider the psychometric properties of these two attentional systems, we can distinguish between three different types of models.

It is possible that one mechanism produces both top-down and bottom-up attention (see Figure 1A). Alternatively, it could be that there are two mechanisms underlying both types of attention, but that both mechanisms strongly influence each other (see Figure 1B; Van Der Maas et al., 2006). In both situations, one would expect that performance on a top-down attention task correlates with performance on a bottom-up attention task.

Finally, it is possible that not only are both types of attention produced by two different systems, but that both mechanisms also operate independently.

In the first two scenarios, top-down attention (measured with a visual search task) and bottom-up attention (measured with a singleton capture task) should be strongly correlated, and performance on both tasks should therefore also be strongly correlated. So, people who are efficient in deploying top-down attention should be either less or more susceptible to irrelevant distractors. The main point is that if the two types of attention are interdependent, this should be reflected by a correlation between how efficiently top-down attention can be deployed and how easily bottom-up attention is captured. For instance, it could be that people who are more efficient in guiding their attention are more in control of their attentional deployments in general, and thus are less distractible (i.e., are less susceptible to salient distractors). In that case we would expect to find a negative correlation between search efficiency and amount of capture. It seems more far-fetched to find a positive correlation between search efficiency and distractibility, but this cannot be excluded beforehand. For instance, perhaps people who are better at deploying top-down attention are just better at deploying attention generally, and thus also better at deploying bottom-up attention. In that case one would expect a positive correlation between search efficiency and susceptibility to bottom-up distractors. Importantly, if the two types of attention are interdependent, some type of correlation between search efficiency and distractibility is expected. Only in the third scenario, where top-down attention and bottom-up attention operate independently, is it expected that performance on the top-down attention task is uncorrelated with performance on the bottom-up task. So the predictions simply boil down to this: If the first or second scenario is correct, we expect to find a correlation between performance on both tasks. However, if the third scenario is correct, we expect to find no such correlation.

In the current study, we investigated whether performance on a top-down attention task (a visual search task) and on a bottom-up attention task (a singleton capture task) is correlated in a large, representative sample (for the cohort of 20 to 25 years of age) of the Dutch population. We measured top-down attention and bottom-up attention using two different tasks. The use of two different tasks avoids spurious correlations based on the similarity of testing. For instance, if in both cases we used a multiple-object tracking task, and then manipulated which type of attention is employed, there would be a danger of finding correlations just because some people are better at multiple-object tracking than others, and this ability may then affect both types of attentional deployment in this specific setting. To test top-down attentional control, we employed a conjunction visual search task (participants search for a rotated T among rotated Ls), which required the deployment of top-down attention in order to find the target (Wolfe & Horowitz, 2004). The dependent measure in this task is the search slope: How fast does attention move from item to item? The search slope is thought to reflect the efficiency of top-down attention, since attention is quickly steered around based on top-down goals. Note that the intercept is not considered to be a reliable measure of top-down attention, since the intercept indicates how fast someone is when no attentional shifts have yet occurred. Note furthermore that although the search slope is considered to reflect top-down attention in general (Wolfe & Horowitz, 2004), this may reflect several different aspects of top-down attention, such as how quickly attention can be deployed, how quickly it can be disengaged, and how fast items within an attentional window can be compared to the target template.
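As an illustration of how a search slope can be obtained, the following minimal sketch fits an ordinary least-squares line to one participant's mean response times across set sizes. The function name `search_slope`, the set sizes, and the response times are hypothetical values chosen for illustration; they are not taken from the present study, and the exact fitting procedure used here is not specified in this section.

```python
from statistics import mean

def search_slope(set_sizes, rts):
    """Least-squares slope (ms/item) and intercept (ms) of RT against set size."""
    mx, my = mean(set_sizes), mean(rts)
    sxx = sum((x - mx) ** 2 for x in set_sizes)
    sxy = sum((x - mx) * (y - my) for x, y in zip(set_sizes, rts))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical participant: mean correct RT (ms) at three hypothetical set sizes.
slope, intercept = search_slope([4, 8, 12], [700.0, 900.0, 1100.0])
print(f"slope = {slope:.1f} ms/item, intercept = {intercept:.1f} ms")
```

In this sketch the slope is the per-item cost that serves as the efficiency measure, whereas the intercept captures the response time extrapolated to a set size of zero.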

A singleton capture paradigm was used to test bottom-up capture (Theeuwes, 1992). In this task participants searched for a uniquely shaped item (in our case, a diamond among circles), while on some trials an irrelevant singleton was present (in our case, one of the circles had a unique color). The dependent measure in this case was how much slower subjects were when the irrelevant singleton was present. This is thought to reflect how much bottom-up attention is drawn to the irrelevant singleton (Hickey, McDonald, & Theeuwes, 2006; Theeuwes & Godijn, 2002). Again, the drawing of bottom-up attention may consist of more than one process. It may reflect how often bottom-up attention is drawn, how long it is engaged, and how long it takes to redeploy it. Note that although in the singleton capture paradigm, subjects also have a search goal, the irrelevant singleton is never the goal. Therefore, we argue that this paradigm is an especially reliable measure of bottom-up attention. Not only is it not beneficial to attend to the irrelevant singleton (which is normally considered to be enough to avoid the deployment of top-down attention, e.g., Hein et al., 2006; Jonides & Yantis, 1988; Yantis & Jonides, 1984), it is even detrimental, making it virtually certain that subjects will not voluntarily (or in other words in a top-down manner) attend to the irrelevant singleton.
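The capture measure described above can be sketched as a simple difference score: the mean response time on trials with the irrelevant singleton minus the mean response time on trials without it. The function name `capture_cost` and the response times below are hypothetical and serve only to make the computation concrete.

```python
from statistics import mean

def capture_cost(rt_present, rt_absent):
    """Slowing (ms) caused by the irrelevant singleton: present minus absent."""
    return mean(rt_present) - mean(rt_absent)

# Hypothetical correct-trial RTs (ms) for one participant.
cost = capture_cost([650.0, 670.0, 690.0], [620.0, 640.0, 630.0])
print(f"capture cost = {cost:.1f} ms")
```

A larger positive value indicates more slowing in the presence of the irrelevant singleton, which is interpreted as stronger bottom-up capture.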

Since we are dealing with a very large sample of subjects (N = 936), and we have two clear competing hypotheses, we apply Bayesian statistics to be able to also evaluate the likelihood of there being no correlation between task performances. Importantly then, this allows us to find evidence for or against the existence of a correlation between both measures.
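To make concrete how a Bayesian analysis can quantify evidence for the absence of a correlation, the sketch below uses the BIC approximation to the Bayes factor for a Pearson correlation. This is a swapped-in, generic approximation chosen for its simplicity; this section does not state which Bayesian test was actually applied, and the correlation values passed to the hypothetical function `bf01_bic` are illustrative rather than observed. Only the sample size (N = 936) is taken from the text.

```python
import math

def bf01_bic(r, n):
    """BIC-approximated Bayes factor for H0 (no correlation) over H1.

    Uses BIC1 - BIC0 = n * ln(1 - r^2) + ln(n), so BF01 = exp((BIC1 - BIC0) / 2).
    This is an approximation, assumed here for illustration only.
    """
    log_bf01 = 0.5 * (math.log(n) + n * math.log(1.0 - r * r))
    return math.exp(log_bf01)

n = 936                       # sample size reported in the text
bf_null = bf01_bic(0.0, n)    # hypothetical observed r of exactly 0
bf_alt = bf01_bic(0.2, n)     # hypothetical observed r of 0.2
print(f"BF01 at r = 0.0: {bf_null:.1f}")
print(f"BF01 at r = 0.2: {bf_alt:.2e}")
```

Values of BF01 above 1 favor the absence of a correlation and values below 1 favor its presence. With a sample of this size, an observed correlation near zero yields substantial support for the null under this approximation, which is the kind of evidence a conventional significance test cannot provide.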

The current set-up provides two important strengths: It uses a large and representative sample of the Dutch population (in terms of gender and educational level), and it uses two well-tested, independent tasks that serve as a benchmark for experiment validity (does each task produce a normal pattern of results?).