Language shapes the way we see the world. For example, the metaphors used to describe a concept like crime can shape the way people reason about it; native speakers of different languages tend to conceptualize time in ways consistent with their language; and when an object (say, a chair) is assigned the feminine grammatical gender in one language and the masculine gender in another, speakers of the former language actually think of that object as more feminine than speakers of the latter.

But new research shows that the language we speak quite literally affects the way we see the world. By tracking people’s eye movements as they watched scenes unfold, researchers found that speakers fixated more on the parts of a scene that their language requires them to encode when communicating than speakers of another language did.

The experiment included German and Korean speakers. One way these two languages differ is in how they refer to spatial relationships between objects. In German (as in English), there’s a word for containment (in, which means the same as it does in English), which contrasts with the word used for one object supporting another (in German, auf, analogous to on in English). In Korean, the choice of spatial term isn’t dictated by whether one object contains or supports the other; instead, different terms (verbs, in Korean, rather than prepositions) are used depending on how tightly the objects fit together. For example, putting a cap on a pen is a tight fit, which Korean describes with the word kkita. Putting an apple in a bowl, by contrast, is a loose fit, so netha would be used instead. (The authors note that netha tends to be used for loose containment and notha for loose support, but the line between the two is more blurred than it is in English or German.)

In German, then, the most relevant part of a spatial relationship (for communication purposes) is whether one object contains or supports the other. In Korean, the most relevant part of a relationship is the tightness of fit. The researchers predicted that German and Korean speakers would each habitually pay closer attention to the parts of a scene that their own language requires them to communicate.

In the experiment, participants watched videos of objects coming into contact with each other (screenshots are below), while the researchers tracked their eye movements. Participants always saw a pair (one video followed by a second) and rated how similar the two videos were to each other. Importantly, participants were not told which dimension their similarity ratings should be based on; that was for them to decide on their own.

Consistent with the conventions of their language, Korean speakers based their similarity ratings on tightness of fit. For example, the videos in the second and third rows above (both showing tight fits, and therefore typically described using kkita) were rated as more similar than those in the first and second rows, or those in the third and fourth rows (each of those pairs includes one kkita video and one netha or notha video). German speakers, on the other hand, based their ratings on containment vs. support. For them, the first and second rows (both described by auf) or the third and fourth (both described by in) were more similar than the second and third (auf vs. in). Again, it’s worth stressing that participants were not told to use their language to determine similarity; they were simply asked to rate how similar the pairs were to each other, and their language practices guided them in this task.
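
The logic of those pairings can be laid out as a small sketch. The feature labels for the four rows are inferred from the description above (rows one and two show support, rows three and four containment; rows two and three are tight fits), and the function names are made up for illustration. This is not the researchers' coding scheme, and it treats the netha/notha boundary as sharper than the authors say it is.

```python
# Hypothetical encoding of the four screenshot rows, inferred from the text.
VIDEOS = {
    1: {"relation": "support", "fit": "loose"},
    2: {"relation": "support", "fit": "tight"},
    3: {"relation": "containment", "fit": "tight"},
    4: {"relation": "containment", "fit": "loose"},
}


def german_term(video):
    """German picks the term by relation: 'auf' (support) or 'in' (containment)."""
    return "auf" if video["relation"] == "support" else "in"


def korean_term(video):
    """Korean picks the term mainly by fit: 'kkita' for any tight fit.

    For loose fits this simplified sketch uses 'netha' for containment
    and 'notha' for support.
    """
    if video["fit"] == "tight":
        return "kkita"
    return "netha" if video["relation"] == "containment" else "notha"


def same_category(a, b, term_for):
    """True if rows a and b would be described with the same term."""
    return term_for(VIDEOS[a]) == term_for(VIDEOS[b])


if __name__ == "__main__":
    for a, b in [(1, 2), (2, 3), (3, 4)]:
        print(
            f"rows {a} and {b}:",
            "German same term =", same_category(a, b, german_term),
            "| Korean same term =", same_category(a, b, korean_term),
        )
```

In this sketch, rows two and three share a term only in Korean, while rows one and two (and three and four) share a term only in German, which matches the pattern of similarity ratings described above.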

The really novel part of this study, though, is in the eye-tracking. The researchers found that German speakers spent more time than Korean speakers looking at the base figure (the bowl, block, or tray that the second object would sit on or in). This is probably because that object carries the most information for someone who needs to determine whether the relationship will be one of support (on) or containment (in), which is what German speakers habitually have to encode. Korean speakers looked less at the base figure and more at the object that came to rest on or in it, particularly at the area where the two objects met. That area, again, holds the most information for speakers of a language that requires communicating the tightness of fit.

Even though participants were not watching these videos in order to communicate about them, their viewing patterns still reflected the tendencies of their languages. They have years of experience needing to pay attention to containment vs. support or tightness vs. looseness, so they now approach the world with a predisposition to look for those same characteristics that their language encodes.

This finding may not have huge practical consequences. People’s vision isn’t impaired by what their language encodes or doesn’t. But the study does show that our attention can be influenced by our language. Visual attention is a pretty low-level process, in the sense that it’s constant and so much of it happens without conscious awareness. That, I think, is why this study is so cool — even when people are watching simple videos of objects, their language shapes the way they approach the situation. Just imagine what our language does for us when we actually go out and navigate the world.

Cover photo by MabelAmber. CC.