In social interactions, it is highly salient to us where other people are looking. The ability to recover this information is critical to typical social development, helping us to coordinate our attention and behavior with others and understand their intentions and mental states []. The depth and direction in which another individual is fixating are specified jointly by their head position, eye deviation, and binocular vergence []. It has hitherto been unknown, however, whether this dynamic visual information about others’ focus of attention affects how we ourselves see the world. Here we show that the perceived depth and movement of physical objects in our environment are influenced by others’ tracking behavior. This effect occurred even in the presence of conflicting size cues to object location and generalized to the context of apparent motion displays [] and judgments about causal interactions between moving objects []. Perceived object trajectory was modulated primarily by the object-level motion of the tracking agent (e.g., the head), with less-pronounced effects of eye motion and low-level motion. Interestingly, comparable perceptual effects were induced by non-face objects that displayed similar tracking behavior, indicating a mechanism of distal coupling between the motion of the target and an appropriately moving inducer. These results demonstrate that social information can have a fundamental effect on our vision, such that the visual reality constructed in each brain is determined in part by what others see.

Gaze cues also affected the perception of causal interactions between moving objects. In the cross-bounce illusion, two dots moving across the screen can be perceived as either crossing past one another or bouncing off one another (see Movie S5 and Figure 4C). This instance of bistable motion perception has been exploited to investigate the contribution of contextual cues to the visual perception of object interactions. For instance, the dots vividly appear to bounce off one another when a ricochet sound is heard at the point of contact between the dots []. We modified this paradigm by replacing the auditory cues with an on-screen avatar that followed a particular trajectory of dot motion throughout the animation, demonstrated in Movie S5. The perceived trajectory of the dots (“cross” versus “bounce”) was bistable in the absence of gaze cues but was biased toward the particular trajectory implied by the gaze cues when they were present (Figures 4D and 4E; experiment 6). Similar to experiment 3, the effect of avatar cues on the perception of target trajectory appeared to relate most prominently to distal coupling between the gross motion of the avatar and the target rather than being attributable to, for example, local background motion around the target (Figures 4D and 4E).

Gaze cues also affected the perception of apparent motion, demonstrated in Movie S4. Perception of intermittently occluded moving objects (e.g., a cat racing behind a picket fence) and motion in successive image displays (e.g., movies) relies on inferences about the correspondence between visual objects over successive moments in time []. In a typical apparent motion display (the “bistable quartet”), the location of two dots on-screen differs between two alternating frames. This produces a vivid sense of the dots moving location between frames, despite the fact that the direction of putative dot motion ought to be ambiguous, as the changed positions of the dots could have resulted from horizontal, vertical, or diagonal motion. Indeed, perception is typically bistable, such that the dots are seen as moving either vertically or horizontally. The perceived direction of motion in the bistable quartet can be biased toward either vertical or horizontal motion by altering the aspect ratio of the competing paths of apparent motion; for instance, a greater vertical separation between the dots promotes perception of horizontal dot motion between frames. In the present experiment, an avatar with gaze focused on one of the dots implied a particular correspondence between the dot locations across frames (i.e., on the assumption that the avatar is looking at the same dot in each frame) and thus implied a particular direction of dot motion (illustrated in Movie S4 and Figure 4A; experiment 5). We measured the relationship between the perceived direction of apparent motion and the aspect ratio of dot locations and found that the presence of the avatar produced a strong and consistent shift in the psychometric function accordant with the trajectory of dot motion implied by gaze cues (effect size = 0.52 [95% CI: 0.40, 0.68]; Figure 4B). At the individual level, all subjects differed between the gaze vertical and gaze horizontal conditions in the expected direction.

(E) The effect of gaze cues, quantified as the difference in mean responses between when the eyes followed a bounce trajectory compared to when they followed a crossing trajectory, was significant for all but the HEO condition. Error bars indicate 95% confidence intervals.

(D) Participants reported whether the dots appeared to bounce off one another or cross past one another. The cues provided by the avatar differed across a series of conditions (see STAR Methods for details; see Figure S1 in the Supplemental Information for illustrations of each condition). B, baseline; FC, full cues; HO, head only; EO, eyes only; NFO, non-face object; FC-L, full cues with local motion removed; NF-L, non-face object with local motion removed; HEO, head-eyes opposed; MH, multiple heads; CAT, catch trials. In catch trials, the avatar provides no cues to target trajectory, but “bouncing” and “crossing” trajectories are indicated by dot color. Data are represented as mean ± 1 SEM.

(C) In experiment 6, the avatar’s gaze either followed the target dot to the opposite side of the screen (implying that the dots crossed) or back to the same side of the screen (implying that the dots bounced off one another). See Movie S5. The images shown are frames 25%, 50%, and 75% into the animation.

(B) In experiment 5, participants indicated whether the spheres were moving vertically or horizontally between frames. The aspect ratio of the sphere positions differed across trials such that there was either greater horizontal separation between the sphere positions (positive log aspect ratio) or greater vertical separation between sphere positions (negative log aspect ratio). This allowed us to quantify the effect of gaze cues on the apparent motion of the target spheres. In the baseline condition, the spheres were shown in the absence of gaze cues; specifically, the avatar was stationary across frames, and its eyes were occluded by sunglasses, similar to that illustrated in Figure 2A, while the position of the spheres differed between frames as usual.

(A) Apparent motion displays that incorporate gaze cues to the direction of apparent motion. The display alternated between the two upper frames or the two lower frames, corresponding to vertical and horizontal shifts in gaze direction, respectively. When gaze cues are absent, the spheres appear to move either vertically or horizontally between frames, and the perception of motion direction tends to be bistable. See Movie S4.

In a further experiment designed to assess the contribution of low-level motion to the perceptual effects reported here, we compared the effects of animating convex and concave non-face objects to track the target (experiment 4). For a given direction of object-level tracking (i.e., when the inducer tracked a clockwise or counterclockwise trajectory of target motion), the low-level motion of the convex and concave inducers (i.e., the local motion of the object surface) occurred in opposite directions (Figure 3; Movie S3). Despite this difference in low-level motion, convex and concave inducers produced the same pattern of changes in perceived target trajectory across clockwise and counterclockwise conditions (Figure 3; concave inducer, effect size = 13.63% [95% CI: 9.34, 16.36]; convex inducer, effect size = 20.94% [95% CI: 16.19, 28.03]). This provides strong evidence that the perceptual effects we report are not accounted for by lower-level motion processes but rather are best explained by the tracking behavior of the inducer.
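The effect sizes above are reported with 95% confidence intervals, but the text does not specify here how those intervals were obtained. One common choice for this kind of per-subject effect measure is a subject-level percentile bootstrap; the sketch below is a minimal version under that assumption (the function names and data layout are illustrative, not taken from the paper's analysis code).

```python
import numpy as np

rng = np.random.default_rng(0)

def half_difference(cw_midpoints, ccw_midpoints):
    """Half-difference between mean psychometric midpoints for the
    counterclockwise and clockwise inducer conditions."""
    return 0.5 * (np.mean(ccw_midpoints) - np.mean(cw_midpoints))

def bootstrap_ci(cw, ccw, n_boot=10_000, alpha=0.05):
    """Percentile-bootstrap CI for the half-difference effect size,
    resampling subjects with replacement (paired across conditions)."""
    cw, ccw = np.asarray(cw), np.asarray(ccw)
    n = len(cw)
    stats = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, n)  # resampled subject indices
        stats[i] = half_difference(cw[idx], ccw[idx])
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi
```

The same machinery yields a CI for the difference between two conditions (e.g., convex versus concave) by bootstrapping the difference of their effect sizes.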

(D) The effect of the inducer on perceived target trajectory across conditions. Data are represented as the half-difference between the midpoints of the psychometric functions for clockwise and counterclockwise conditions, with 95% confidence intervals. The effect size in the convex non-face object condition was significantly greater than in the concave non-face object condition (∗; difference = 7.31%, 95% CI: 1.42, 15.70), but the effect of the inducer was significantly different from zero in both conditions (and, most importantly, occurred in the same direction in both conditions).

(C) Data for the convex and concave stimulus conditions (n = 5 for each). The effect of the inducer on perceived target trajectory occurred in the same direction across these two conditions, despite the opposite directions of low-level motion.

(B) Animation stills of the convex and concave non-face stimuli used in experiment 4. Animated examples of these stimuli are shown side by side for comparison in Movie S3.

(A) Bird’s-eye view schematic of the stimuli shown in Movie S3. When concave and convex objects are rotated to track the movement of a target, the object-level motion occurs in the same direction for these different objects, while the low-level motion (i.e., local motion of the object’s textured surface) seen behind the target occurs in opposite directions.

When a non-face object was rotated to track the target object in the same manner as the face stimulus (illustrated in Figure 2E), there was a similar effect on the perceived trajectory of the target (effect size = 16.28% [95% CI: 13.32, 20.11]). This suggests that the key mechanism behind the perceptual effects is the tracking behavior of the inducer. Previous work has shown that visual perception of object motion is driven in part by an implicit familiarity with the natural interaction of three-dimensional objects, in particular the mechanical effect of friction between objects on their consequent rotation []. In the present study, however, the influence of tracking cues on perceived target trajectory still occurred when the background immediately around the target object was removed (effect size = 8.79% [95% CI: 4.42, 12.40]). This indicates that the illusion is not produced simply by motion immediately around the target (such as might underlie a friction effect) but rather by more distal coupling between the target and an inducer that is tracking the target’s motion.

Further experiments determined the features of the face stimulus that induced the illusion of target depth (Figure 2; experiment 3). Head rotation (with the eyes occluded) produced a strong illusion of depth, similar in magnitude to the full stimulus (effect size = 18.19% [95% confidence interval (CI): 11.60, 24.06]). Eye movement (within a motionless head) had a notably smaller but still statistically significant effect on the perceived depth of the target (effect size = 3.23% [95% CI: 0.74, 6.22]). When the head and the eyes followed incongruent trajectories (i.e., the head followed a clockwise path while the eyes followed a counterclockwise path, or vice versa), the perceived trajectory of the target was shifted in the direction consistent with the head (effect size = 10.93% [95% CI: 6.50, 15.69]). This effect appeared smaller in magnitude than when the eye and head cues were congruent, but the comparison between these conditions was not statistically significant at the 95% level. Together, these results implicate head motion as a strong cue to target trajectory, with a lesser, but still tangible, role for eye movement.

(G) The effect of gaze cues on perceived target trajectory across conditions. FC, full cues (experiment 1); HO, head only; EO, eyes only; HEO, head-eyes opposed; NFO, non-face object; LMR, local motion removed. Asterisk (∗) indicates a significant difference between conditions at the 95% level. Data are represented as the half-difference between the midpoints of the psychometric functions in clockwise and counterclockwise conditions, with 95% confidence intervals.

(A) Animation stills for the full cues condition (containing both head and eye cues to gaze direction) and baseline condition (containing neither head nor eye cues to gaze direction) used in experiments 1 and 2.

To quantify the extent to which gaze cues contribute to perceived depth, we designed a psychophysical task that pitted gaze cues against object size cues. The focus of gaze followed an elliptical trajectory of fixed depth in either the clockwise or counterclockwise direction. The size cues of the target object either followed a straight line (like that shown in Movie S1) or followed an elliptical trajectory of variable depth in a direction that was either congruent or incongruent with the focus of gaze. See Movie S2 for an illustration of how changes in the size of the object disambiguate the direction of motion along the elliptical trajectory. The presence of gaze cues produced a statistically significant shift in the relationship between size cues and perceived target trajectory, in the direction consistent with the trajectory of gaze (illustrated in Figures 1B–1D; experiments 1 and 2). The magnitude of this shift corresponded to 6%–15% of the depth of the gaze trajectory. This is a strong effect, as size cues of equivalent magnitude, when presented in the absence of gaze cues, produce a robust sense of depth in the veridical direction (Figures 1B–1D, baseline condition).
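The effect size in this task is the half-difference between the midpoints of logistic psychometric functions fit to the two gaze conditions (see the legend for Figure 1B, which states that logistic functions were fit by minimizing squared error). A minimal sketch of that computation, assuming the data are response proportions at each size-cue depth level (the function and variable names are illustrative, not from the paper's code):

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, x0, k):
    """Proportion of 'clockwise' responses vs. signed size-cue depth."""
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

def fit_midpoint(depths, p_clockwise):
    """Least-squares logistic fit; x0 is the psychometric midpoint."""
    (x0, _k), _cov = curve_fit(logistic, depths, p_clockwise, p0=[0.0, 0.1])
    return x0

def gaze_effect(depths, p_gaze_cw, p_gaze_ccw):
    """Half the midpoint difference between the two gaze conditions.
    A positive value means gaze shifted perception toward its own
    direction of travel."""
    return 0.5 * (fit_midpoint(depths, p_gaze_ccw)
                  - fit_midpoint(depths, p_gaze_cw))
```

Halving the midpoint difference averages the shifts that the clockwise and counterclockwise gaze trajectories each produce relative to a common (baseline) midpoint.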

We first tested for an effect of tracking movements on perceived target trajectory in depth. We created animated stimuli in which the focus of an avatar’s gaze follows an elliptical trajectory in front of its face (demonstrated in Movie S1 and illustrated schematically in Figure 1A). The focus of gaze is indicated by head rotation, eye deviation, and binocular vergence. In contrast, the “veridical” movement of the target object in the basic condition, indicated by size cues, is in a straight line perpendicular to the observer’s direction of view (i.e., with no change in depth relative to the viewer throughout the animation; Movie S1). Many observers experience a compelling visual illusion when watching this animation, wherein the target object appears to move in an elliptical trajectory more consistent with the focus of the avatar’s gaze.
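The stimulus geometry described above can be summarized compactly: the focus of gaze traces an ellipse in the lateral (x) and depth (z) plane, while the target's size cue follows perspective scaling with depth. The sketch below is illustrative only; the parameter names and the simple pinhole scaling model are assumptions, not taken from the paper's stimulus code.

```python
import numpy as np

def elliptical_trajectory(n_frames, width, depth, clockwise=True):
    """Lateral (x) and depth (z) coordinates along an elliptical path.
    depth=0 degenerates to the straight-line, constant-depth path."""
    t = np.linspace(0.0, 2.0 * np.pi, n_frames, endpoint=False)
    if not clockwise:
        t = -t  # reverse direction of travel
    x = (width / 2.0) * np.cos(t)
    z = (depth / 2.0) * np.sin(t)  # signed depth about the path's mean
    return x, z

def size_cue(z, base_size=1.0, viewing_distance=60.0):
    """Pinhole perspective scaling: nearer points (negative z) project
    larger, which is the size cue that disambiguates depth."""
    return base_size * viewing_distance / (viewing_distance + z)
```

Note that when depth is zero, the size cue is constant across frames, so the direction of travel in depth is unspecified by size; this is exactly the ambiguity that gaze cues appear to resolve in the basic condition.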

(D) Between-subjects replication of (B), demonstrating that the illusion occurs in a wider pool of subjects. Effect size = 6.35% (95% CI: 2.12, 11.14). The smaller effect size in this sample may reflect the different degrees of task engagement typical of the sample in experiment 1 (experienced psychophysical observers) and the sample in experiment 2 (undergraduate students). At the individual level, 18 out of 23 subjects differed between the clockwise and counterclockwise conditions in the expected direction. A binomial test confirmed that this is a significantly greater proportion than expected by chance, p < 0.05 (two-tailed).
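The binomial test reported here is straightforward to reproduce with the standard library. For the symmetric null of p = 0.5, the two-tailed p-value is twice the upper-tail probability (a minimal sketch; the helper name is illustrative):

```python
from math import comb

def binom_two_tailed(k, n):
    """Exact two-tailed binomial test against the null p = 0.5:
    double the probability of k or more successes out of n
    (assumes k >= n / 2, as here)."""
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2.0 * upper_tail)

# 18 of 23 subjects shifted in the direction implied by gaze cues
p_value = binom_two_tailed(18, 23)
```

The doubling shortcut is exact only for p = 0.5, where the binomial distribution is symmetric; for other nulls a dedicated routine (e.g., `scipy.stats.binomtest`) is preferable.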

(C) Within-subjects replication of (B), demonstrating that the illusion persists after extensive experience with the stimuli. Effect size = 15.13% (95% CI: 9.68, 22.13). At the individual level, all subjects differed between the clockwise and counterclockwise conditions in the expected direction at time 1 and time 2.

(B) Quantifying the effect of gaze cues on perceived object trajectory. The path of the target object (as indicated by size cues) differed across trials between a straight line and a set of different elliptical paths that were up to 50% of the depth of the gaze trajectory, in either the congruent (positive) or incongruent (negative) direction of travel (see Movie S2). Participants indicated whether the target object was traveling in a clockwise or counterclockwise direction of elliptical motion. Logistic functions were fit to the data by minimizing the sum of squared errors. There was a shift toward more counterclockwise responses compared to baseline when the eyes were following a counterclockwise trajectory and a shift toward more clockwise responses compared to baseline when the eyes were following a clockwise trajectory. To quantify the magnitude of this effect, we calculated the difference between the midpoints of the psychometric function when gaze was clockwise compared to when gaze was counterclockwise and halved this value to obtain an average effect of gaze trajectory on perceived target depth. Effect size = 12.02% (95% CI: 6.13, 17.90).

(A) A schematic view of the stimulus shown in Movie S1. The focal point of the avatar’s gaze follows the path of the black ellipse in either the clockwise or counterclockwise direction. The target object follows the path of the straight red line (in Movie S1), perpendicular to the observer’s direction of view. The target object does not change in depth relative to the viewer (as indicated by size cues), but despite this, it is typically perceived as moving in an elliptical path more consistent with the trajectory of gaze.

Discussion

People tend to look at and track objects in their immediate environment, and the present results suggest that the visual system is sensitive to this regularity in the physical world. In general, the focus of our (own) attention is drawn toward salient aspects of the environment, such as faces, regions with high luminance contrast, and regions that are likely to contain information relevant to our current objectives []. Correspondingly, our perception of others’ direction of gaze is affected by a priori expectations about the features of the environment that are most likely to draw their attention []. For example, we exhibit a bias toward seeing others’ gaze as being directed toward us, which becomes increasingly apparent under conditions of greater sensory uncertainty []. In this respect, our expectations about what others are likely to be looking at affect our perception of where they are looking. Surprisingly, the present results suggest that the flow of information also goes in the reverse direction, such that where people are looking gives us information about the position and interaction of environmental objects.

The variations that we performed on the main experimental conditions provided insight into the mechanism behind the effect of others’ tracking behavior on low-level visual perception. First, the effects of tracking behavior on the perception of target depth (experiments 1–4) and causal interactions (experiment 6) were still present when local background motion around the target (produced by the avatar in the basic experimental conditions) was absent. Second, in experiment 4, we pitted the direction of low-level motion produced by the inducer against the direction of its object-level rotation and found that the direction of perceptual effects was consistent with the latter. Third, a non-face avatar that was rotated to track the target object (in the same manner as the face was rotated in the basic condition) could also induce the illusory effects in these experiments. Together, these results suggest that the mechanism behind these perceptual effects relates to distal coupling between the target and an inducer that is tracking the target’s motion.

While the tendency of the non-face avatar to produce the same perceptual effects as face stimuli indicates that face-specific cues (e.g., eye gaze) are not necessary, there was a social quality to the non-face stimulus in that its motion was actively tracking the position of the target. It has been famously demonstrated, using animations of simple geometric shapes, that we have a tendency to attribute agency to non-biological objects based on their motion (e.g., “the triangle is fighting the square”) []. Thus, on the one hand, the perceptual effects reported in the present study might stem from a mechanism that evolved initially for social purposes but that can be recruited by non-human objects that display agent-like behavior; on the other hand, these effects might reflect a more general perceptual mechanism related to coherency of motion between distal objects, one that nonetheless has a particular relevance to social situations (which are perhaps the most obvious exemplar of this type of tracking behavior in nature).

The coherency between the motion of the inducer and that of the target suggests that perceptual grouping [] between these distinct visual elements may be a mechanism behind the effect of the inducer on perceived target trajectory. This points toward an interesting take on gaze perception: a face and the target it tracks may together form a kind of global or Gestalt percept, such that the perception of “object tracking” may be a valid avenue for future research. Moreover, as pointed out by an anonymous reviewer, these results suggest a novel function of motion-based grouping in enabling social cues to modulate our perception of visual scenes. It is worth noting that in our experiments, the perceived motion of the target object did not simply “take on” the motion that the inducer itself was exhibiting. Typically, the inducer simply rotated in a fixed position, while the target was perceived as following a trajectory in space (e.g., moving laterally and in depth to trace out an elliptical trajectory, in experiments 1–4). Thus, the target took on the trajectory of motion implied by the inducer on the assumption that the latter was tracking the former.

Previous research has shown that contextual information can affect the perception of causal interactions between moving objects in a manner that is mediated by perceptual grouping—in particular, that the perceived kinematic interaction between two objects is influenced by the motion of other objects in the scene, where these contextual objects are grouped with the interacting objects via factors like proximity, connectedness, and common motion []. The results of experiment 6 build upon this research by demonstrating that the social context (i.e., an avatar tracking one of the target objects) can influence perception of causality.

It is well established that the direction of others’ gaze can cue the spatial location of our own visual attention (i.e., “gaze following”) []. Thus, another mechanism that might contribute to the results of the present study is attentional orienting; that is, shifts in the direction of others’ gaze might guide our own attention in a way that influences our perception of the causal interactions or temporal continuity of moving objects.

What do these results suggest about brain function? Even among primates, humans are a uniquely social species, including in how we have evolved to use gaze signals to transmit information between individuals []. This entails sophisticated perceptual mechanisms that integrate sensory information concerning others’ head orientation, eye deviation, and binocular vergence []. Converging data from single-cell recording, lesion, and functional neuroimaging studies implicate the anterior superior temporal sulcus (STS) region of the temporal cortex in the encoding of others’ direction of gaze and head position []. Gaze-related motion is less well studied, but functional neuroimaging in humans has implicated both anterior and more posterior regions of STS in the perception of eye and head motion []. An existing research agenda addresses how these regions function in a network with parietal and frontal areas to enact visual orienting and higher-level social-cognitive processes dependent on others’ gaze. A hypothesis suggested by the present findings is that the neural system underpinning gaze perception and sensitivity to others’ tracking motion plays a further role in modulating the neural representation of object depth and motion, emphasizing the integrative nature of sensory processing across the brain.