Summary: Eyetracking data show that users are easily distracted when watching video on websites, especially when the video shows a talking head and is optimized for broadcast rather than online viewing.

As broadband connectivity has grown, websites have increased their use of video clips. Unfortunately, many of these videos are produced for television broadcast and are thus unsuitable for the online environment.

In 1997, I wrote an analysis of TV vs. computers that still holds: broadcast TV is a medium for relaxation, where the "user" sits back and becomes immersed in whatever the program directors decided to air. In fact, TV users are usually called "viewers," emphasizing their passive mode of engagement. In contrast, computer users sit forward and drive their own experience through a continuous set of choices and clicks.

Because of this fundamental difference in user experience, broadcast video feels boring on the Web. There's nothing to do, no choices, no user control.

Eyetracking Study of Web Video

We conducted an eyetracking study that recorded where users looked on a wide variety of Web pages. Although it's too soon to present detailed results, early data offer striking information about how users behave while watching video clips that were produced for TV and posted on websites.

The following figure shows a "heatmap" of fixations — where the users' eyes rested — during a video on a news-oriented website. Red indicates the most-watched areas, while blue indicates the least.

Distribution of eye fixations while a user watched 24 seconds of a video clip on cnn.com.

The figure shows data from a 24-second segment in which the camera was held steady — that is, the same elements were continuously shown in the same screen locations.
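For readers who want to see how a heatmap like this can be built, here is a minimal Python sketch. It is not the tool used in our study. It assumes each fixation is exported as a hypothetical (x, y, duration in milliseconds) tuple in screen pixels, and it simply sums viewing time per grid cell; the function names and the 50-pixel cell size are illustrative choices.

```python
from collections import defaultdict


def fixation_heatmap(fixations, cell_size=50):
    """Sum fixation durations per grid cell.

    fixations: iterable of (x, y, duration_ms) tuples in screen pixels
               (a hypothetical export format, not a real tracker's API).
    cell_size: width and height of each square grid cell, in pixels.
    Returns a dict mapping (column, row) to total duration in ms.
    """
    grid = defaultdict(float)
    for x, y, duration_ms in fixations:
        cell = (int(x // cell_size), int(y // cell_size))
        grid[cell] += duration_ms
    return dict(grid)


def hottest_cells(grid, top_n=3):
    """Return the top_n cells with the most viewing time, hottest first."""
    return sorted(grid.items(), key=lambda item: item[1], reverse=True)[:top_n]
```

A rendering step would then map each cell's total to a color, with the largest totals drawn in red and the smallest in blue.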

Here's the full gaze replay video (in the gaze replay, the moving blue dot indicates where the user is looking):

Gaze replay video: where the website visitor looked as the clip played.

This sample segment was excerpted from a four-minute video that contained other camera positions, including a split-screen layout that let viewers simultaneously see the studio anchor and the interviewee in the field. As expected, heatmaps recorded during such segments differed from those captured in the relatively static segment. During the split-screen segment, for example, the heatmap of fixations included large red blobs over both people's faces.

In our 24-second segment, the interviewee's face also attracted much attention. That's not surprising: we've long known that faces are attractors. Also expected were the eye fixations over the caption, which shows the man's name and affiliation.

It's more interesting to notice how much attention was diverted elsewhere in the image, including the road sign behind the interviewee. There's even a brief glance at an object over his shoulder that looks like a trash can.

Most interesting of all is the tremendous attention spent outside the video itself on things such as alternative headlines and video controls.
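One way to put a number on attention spent outside the video is to split total fixation time by whether each fixation lands inside the player's rectangle. The sketch below is only an illustration under assumptions: the (x, y, duration in milliseconds) fixation tuples and the (left, top, width, height) rectangle are hypothetical inputs, not data from our study.

```python
def attention_share(fixations, video_rect):
    """Split total fixation time into inside-video and outside-video shares.

    fixations:  iterable of (x, y, duration_ms) tuples (hypothetical format).
    video_rect: (left, top, width, height) of the video player, in pixels.
    Returns (inside_share, outside_share) as fractions of total fixation time,
    or (0.0, 0.0) if there are no fixations.
    """
    left, top, width, height = video_rect
    inside = outside = 0.0
    for x, y, duration_ms in fixations:
        in_video = left <= x < left + width and top <= y < top + height
        if in_video:
            inside += duration_ms
        else:
            outside += duration_ms
    total = inside + outside
    if total == 0:
        return 0.0, 0.0
    return inside / total, outside / total
```

A high outside share during a static segment would be consistent with the pattern described above, where headlines and controls pull the eye away from the clip.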

The eyetracking data clearly show that a talking head is boring, even for 24 seconds. On the Web, 24 seconds is a long time — too long for users to keep their attention on something monotonous.

Video Guidelines

We've just begun research on the usability of online video and other multimedia elements on websites. While I'll surely have many more guidelines later, for now the main guideline for producing website video is to keep it short. Typically, Web videos should be less than a minute long.

A related guideline is to avoid using video if the content doesn't take advantage of the medium's dynamic nature. This doesn't mean incessant use of pans, zooms, and fades to add artificial movement. It does mean that it's better to use video for things that move or otherwise work better on film than they would as a combination of photos and text.

Finally, recognize that Web users are easily distracted, and keep distracting elements out of the frame of your shots. If there's a road sign in the video, for example, users will try to read it and will thus miss some of the main content.

Since the Web's beginning, I've warned against repurposing content made for other media. The initial problem was that companies simply put up advertising brochures as websites. Later, newspapers and other content sites failed to follow the guidelines for writing for the Web and used headlines that were optimized for print. Now, as technology evolves, we're seeing the same phenomenon for yet another media type: you can't recycle video and expect to create a good online user experience.

The Web is its own medium. We seem doomed to learn this lesson again and again.

See Also

A more recent take: "The Talking-Head Video 2.0: Findings from Eyetracking Research"

Other findings from our eyetracking research.