The Reading Process

To evaluate whether reading can be dramatically sped up while maintaining comprehension, it is important to understand how reading normally occurs. In this section, we review the visual and mental processes that are involved in silent reading when it proceeds as it typically does in educated adults, at a rate of about 200 to 400 wpm. Throughout this discussion, it is important to keep in mind that reading is based on language; it is not a purely visual process. Speech is the primary form of language, and all human societies have a spoken language. (Groups of people who are deaf have developed languages that use the visual modality, but the primary languages employed by people with normal hearing all use the auditory modality.) We begin learning our spoken language as babies, and this process does not require explicit instruction. Reading and writing, which are relatively recent cultural inventions that are used in only some societies, normally require explicit instruction. That instruction begins around the age of 6 in many societies, although there are variations across cultures. It takes many years for a child to become a proficient reader. At first, children often read aloud, converting a printed text into the more familiar spoken form. Children gradually improve in their ability to read silently. Even in cultures in which reading is highly valued and widely taught, some children and adults never become good readers. These observations indicate that speech is the basic form of language; reading and writing are an “optional accessory that must be painstakingly bolted on” (Pinker, 1997, p. ix). Even though reading can be considered “an unnatural act” (Gough & Hillinger, 1980), many adults in modern societies perform this act quite skillfully. The body of research we now discuss has shed light on how they manage to do so.

A logical starting point for reviewing the mental and visual processes involved in reading is to consider what the eye takes in and what the cognitive system must then process: the elements of the writing system. Writing normally takes the form of visible marks on a surface, whether it is a clay tablet, a sheet of paper, a computer monitor, or another digital screen. In most languages, written words are composed of smaller visual units that can be combined in various ways. The basic written symbols of English and other alphabetic writing systems are letters, which approximately represent the sounds of the language, the phonemes (Fig. 1). For example, the word bag is represented by using a letter that represents the “b” sound, a letter that represents the “a” sound, and a letter that represents the “g” sound. The letters are arranged left to right along a horizontal line in the case of English, but other writing systems use other arrangements. For example, Hebrew arranges its symbols horizontally from right to left.

In some writing systems that are not alphabetic, the individual symbols that are arranged along a line represent units of meaning, morphemes, rather than units of sound. For example, the Chinese character 人 stands for “person.” Some Chinese characters contain components that provide a hint about the character’s pronunciation. However, these hints are not always present and, when present, are not always consistent or helpful. Most modern Chinese words contain more than one unit of meaning and must be expressed by a sequence of characters. For example, when the characters meaning “ground” (地) and “board” (板) are written together, they convey the meaning “floor” (地板). In Chinese, there are no spaces between any characters, so the only way a reader knows which two characters go together in a word is through experience. For the most part, however, this does not cause a problem for skilled readers.

The symbols of modern writing systems require no shading, no color other than that needed to distinguish the writing from the background, and no meaningful distinctions between lighter and darker lines or between wider and narrower lines. Once a symbol has been learned, it can be recognized in many printed forms. This use of abstract codes rather than visual templates (McConkie & Zola, 1979) helps to explain why we are able to recognize that a word has the same meaning regardless of the font or case in which it is printed (see Fig. 2). Despite the fact that letters are recognized by their abstract forms rather than visual templates, good visual acuity is required to pick up the critical differences between the marks that distinguish visually similar letters, for example, the difference between h and n. If this difference is not perceived, one could mistake hot for not.

Visual processing and eye movements

Given that writing is composed of fine lines and marks, the acuity (visual-resolution) limits of vision are an important constraint on the reading process. The premise behind some speed-reading courses is that it is possible to use peripheral vision to simultaneously read large segments of a page, perhaps even a whole page, instead of one word at a time (Brozo & Johns, 1986). However, such a process is not biologically or psychologically possible. One indication that it is not possible is that visual acuity is limited and that these limitations are what cause readers to make eye movements. Acuity is much higher in the fovea (from the center of vision—the fixation location—to 1° of visual angle away from it in any direction) than in the parafovea (1°–5° away from the center of vision) or periphery (areas more than 5° away from the center of vision; Balota & Rayner, 1991; Fig. 3). To get a sense of how small the foveal viewing area is, note that it is roughly equivalent to the width of your thumb held at arm’s length from your eye.
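The region boundaries and the thumb comparison above can be made concrete with a little trigonometry. The following is a minimal sketch, not part of the original article: the function names are hypothetical, and the thumb width (about 2 cm) and arm length (about 57 cm) are assumed values chosen only for illustration.

```python
import math

# Eccentricity boundaries in degrees of visual angle from fixation,
# taken from the definitions in the text.
FOVEA_LIMIT_DEG = 1.0
PARAFOVEA_LIMIT_DEG = 5.0


def visual_angle_deg(size: float, distance: float) -> float:
    """Visual angle subtended by an object (size and distance in the same units)."""
    return math.degrees(2 * math.atan(size / (2 * distance)))


def visual_region(eccentricity_deg: float) -> str:
    """Classify a distance from the fixation location as fovea, parafovea, or periphery."""
    ecc = abs(eccentricity_deg)
    if ecc <= FOVEA_LIMIT_DEG:
        return "fovea"
    if ecc <= PARAFOVEA_LIMIT_DEG:
        return "parafovea"
    return "periphery"


# ASSUMPTION: a thumb roughly 2 cm wide held roughly 57 cm from the eye.
thumb_angle = visual_angle_deg(size=2.0, distance=57.0)
print(f"thumb subtends about {thumb_angle:.1f} degrees")
print(visual_region(0.5), visual_region(3.0), visual_region(8.0))
```

Under these assumed measurements the thumb subtends roughly 2° in total, which matches a fovea that extends 1° from fixation in each direction.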

Saccades (quick, ballistic eye movements) allow readers to move the fovea to the word they wish to process with the highest efficiency. Therefore, the oculomotor (eye movement) system controls the sequence and timing of the visual system’s access to the text. Decisions about how long to fixate a word and when to move the eyes to the next word are to a large extent under the control of cognitive processes (Rayner, Liversedge, White, & Vergilino-Perez, 2003; Reingold, Reichle, Glaholt, & Sheridan, 2012; see Rayner & Reingold, 2015). They are not preprogrammed before one begins to read a text in the way that a metronome is set by a musician before he or she begins to practice a piece of music. This moment-by-moment control helps to ensure that the next word enters the system through foveal (high-resolution) vision with the optimal timing. In the following sections, we detail how the visual and oculomotor systems support and constrain the reading process.

The visual system

As mentioned, acuity is highest in the fovea, and this is the area in which the majority of word recognition occurs. One reason that acuity is higher in the fovea than in other areas of the visual field is related to the distribution of two types of neural receptors that respond to light: rods and cones. Cones are sensitive to color and detail and are more effective in bright light, while rods are sensitive only to brightness (e.g., shades of gray) and motion and are mostly sensitive (i.e., useful) in dimly lit rooms or at night. Cones are concentrated in the fovea and decrease in density with increasing distance from fixation. Rods are least concentrated in the fovea and increase in density with increasing distance from fixation (Fig. 4). Because cones are more sensitive to detail, this means that acuity is higher in the fovea, where there are more cones, than in nonfoveal areas.

Another reason why acuity is higher in the fovea than in other areas of the visual field has to do with the way information is transmitted from rods and cones (located in the retina, a membrane lining the back of the inside of the eyeball) to the brain. Information from rods is pooled (averaged across a group of rods) before being relayed to the brain, while information from individual cones is relayed directly, without being combined with information from other cones (Fig. 5). The consequence of this organization is that even minute variations in the pattern of light hitting the fovea will be preserved by the cones. If one cone receives bright light and an adjacent cone receives very dim light, the brain will perceive a light/dark boundary. In contrast, minute variations in the pattern of light hitting nonfoveal areas (where cones are sparse and there are mostly rods) will be obscured. If one rod receives bright light and an adjacent rod receives very dim light, the brain will perceive a gray blob.

These facts about rods and cones have some important implications for reading. As we have discussed, all text, regardless of writing system, is composed of combinations of lines, normally dark on a white background. Therefore, fine discrimination between dark and light areas is essential to recognizing the visual elements of writing. If the light pattern coming from a word hits the fovea, the cones will easily recognize such fine detail and relay the pattern, with high fidelity, to the brain. However, if the word hits nonfoveal areas and is sensed primarily by rods, it will be relayed to the brain as an average and will appear fuzzy. This will make it difficult to discern the exact identities of the symbols (see the beginning and the end of the sentence represented in Fig. 3). In fact, when people are asked to report the identity of a word that is presented so briefly that they cannot make an eye movement, accuracy is high in the fovea but drops off dramatically outside of it, with performance reaching chance level around the middle of the parafovea (approximately 3° of visual angle away from fixation; Bouma, 1973; see also Bouma, 1978; Rayner & Morrison, 1981). These facts cast doubt on suggestions from speed-reading proponents that people can read more effectively by using peripheral vision, taking in an entire line or even an entire page at a time.
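The contrast between direct relay and pooled relay can be illustrated with a toy calculation. This is a deliberately simplified sketch, not a physiological model: the function names and the pool size of two receptors are assumptions made only for illustration, and light levels are coded as 1.0 (bright) and 0.0 (dim).

```python
from statistics import mean


def cone_relay(light: list[float]) -> list[float]:
    """Each cone passes its own signal on unchanged, preserving fine detail."""
    return list(light)


def rod_relay(light: list[float], pool_size: int = 2) -> list[float]:
    """Each group of rods passes on only the group's average signal."""
    pooled = []
    for start in range(0, len(light), pool_size):
        group = light[start:start + pool_size]
        # Every receptor in the group contributes to, and is replaced by, one average.
        pooled.extend([mean(group)] * len(group))
    return pooled


# Alternating bright and dim light, like the fine strokes of a letter.
pattern = [1.0, 0.0, 1.0, 0.0]
print(cone_relay(pattern))  # boundaries between bright and dim survive
print(rod_relay(pattern))   # every position becomes the same intermediate gray
```

In this toy case the direct relay keeps the alternating pattern, whereas pooling collapses it into a uniform intermediate value, the numerical counterpart of the "gray blob" described above.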

Eye movements

For over a century, researchers have been monitoring eye movements in order to study the cognitive processes underlying reading. The technology has evolved over the years so that eye tracking can now be achieved with a high-speed video camera connected to a computer and can be used to study readers of any age. In general, these technologies work by computing the location of the eye up to one thousand times per second, allowing the researcher to know, with precision to the millisecond, which word and where in the word the reader is looking at a particular time. This information can then be separated into times when the eyes remain in the same location (i.e., during fixations) and times when they move between locations (i.e., during saccades). In this section, we review what has been discovered in studies of experienced adult readers, generally college students, who are reading silently in their first language.

As mentioned earlier, the reason readers make saccades is to move their fovea to the next word. The eyes are relatively stable during fixations, which last approximately 250 ms for experienced adult readers. In general, no new visual information is obtained during saccades (Matin, 1974), but cognitive processing continues during this time (Irwin, 1998). This is important, because some speed-reading technology developers have claimed that saccades waste time. However, because cognitive processing continues during saccades, this time is not wasted. We can conclude, however, that fixations are the reader’s opportunity to obtain new visual information from the text.

Although the average fixation lasts about 250 ms, there is considerable variability in how long an individual fixation lasts. These variations reflect such things as the legibility of the text (e.g., light/dark contrast, filled-in or removed spaces between words), linguistic difficulty (e.g., word frequency, predictability, ambiguity), properties of the reader (e.g., age or reading skill), and task goals (e.g., reading, proofreading, skimming). One reason fixations last as long as they do is that eye movements are motor responses that require time to plan and execute. For example, even in the simple task of moving the eyes to a new stimulus that appears either to the left or to the right of the eye’s current location, the reaction time is in the range of 100 to 1,000 ms, depending on the stimuli and experimental conditions (see Gilchrist, 2011). The reaction time in reading is in the shorter end of this range, 150 to 200 ms, because the eyes generally move in one direction (Becker & Jürgens, 1979; Rayner, Slowiaczek, Clifton, & Bertera, 1983). Additionally, a number of processes are being conducted at once: Processing of the fixated word, planning to move the eyes forward, and processing of the upcoming word using parafoveal information all overlap in time. This means that saccade latencies in reading can be shorter than in simple saccadic reaction-time tasks because some of the cognitive processing that leads to the decision of when to move the eyes can occur before the fixation on a particular word even begins.

On average, forward saccades when reading English last about 20 to 35 ms and span the distance of 7 letters. Saccade durations, like fixation durations, are variable. But the variation is mostly determined by the distance traveled rather than by the cognitive and linguistic variables that affect fixation durations (Rayner, 1998, 2009).

Saccades usually move from one word to the next word. However, about 30% of the time, readers move past the next word to the following one. These skips are more likely to happen when the word is very short, extremely frequent, and/or highly predictable from the prior context. The word the has these characteristics, and it is skipped about 50% of the time or more (see Angele & Rayner, 2013). Importantly, just because a word is skipped does not mean that it was not processed at all. All major theories of reading posit that word skipping is based on at least partial recognition of the word from information obtained in parafoveal vision and/or expectations about the word’s identity. In fact, if readers are given passages to read in which words that most people skip over are omitted, comprehension suffers rather dramatically (Fisher & Shebilske, 1985). This shows that readers are actually processing many or most of the words they skip over, along with the words they fixate. It also suggests that every reader is unique in terms of the timing and sequence of words he or she needs to directly look at in order to read efficiently. What works for some people, such as the initial readers in Fisher and Shebilske’s study, who were able to choose which words they fixated, does not work for others, such as the readers who got the modified text. The implication is that speed-reading devices that control the timing of word presentation may not be ready to use “out of the box” but instead may need to be tailored to each individual user based on how that person would naturally process the text.

Not all saccades move forward to the next word in the text (Fig. 6). A small proportion of eye movements result in refixations on the same word. Refixations are most common for long words, about 7 or more letters long in English, for which the end part of the word may not fall within the word-identification span (described below). About 10% to 15% of the time, skilled readers make regressions, moving backward in the text to a previous word. Regressions are different from return sweeps, eye movements that go from the end of one line of text to the beginning of the next. Although return sweeps and regressions are both right-to-left movements in writing systems that go from left to right, such as English, return sweeps continue to move forward with respect to the progression of the text, whereas regressions move backward.

Because return sweeps tend to be long saccades, there is some error in where they land, sometimes requiring an additional fixation to correct (Just & Carpenter, 1980). In general, though, these corrective saccades take half the time that normal saccades take and do not disrupt the reading process too much (in fact, readers almost never notice them). Some color-based technologies for presenting text have recently been developed that aim to make it easier to make return sweeps. However, as we will discuss in more detail later, return sweeps and other aspects of oculomotor control are generally not the difficult part of reading. Faulty language processing generally causes problems in eye movement programming, not the other way around.

Regressions are more important than return sweeps with respect to understanding reading because they constitute a deviation of the reader’s eye movements from the normal progression of the text. Although some regressions are made to correct for oculomotor error (e.g., the eye’s landing too far past the intended word), many regressions are made to correct a failure in comprehension (e.g., when the reader has misinterpreted the sentence). This is important in the context of speed-reading technologies that use RSVP because these technologies do not allow people to reread the text to correct misunderstandings in an intelligent way that is informed by the reader’s understanding of the text. Given that most backward eye movements are made in order to repair a failure in comprehension, readers would maintain misinterpretations if they forced themselves to keep moving forward and would comprehend the text less well (Schotter, Tran, & Rayner, 2014).
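The timing figures quoted in this section can be combined into a rough, back-of-the-envelope estimate of reading rate. The sketch below is illustrative arithmetic only, not a model from the literature. It assumes an average of 6 characters per word including the following space (a value not given in the text), and it treats each regression simply as a fixation that makes no forward progress, ignoring refixations and the distance traveled backward.

```python
def estimate_wpm(
    fixation_ms: float = 250.0,        # average fixation duration (from the text)
    saccade_ms: float = 30.0,          # within the 20 to 35 ms range quoted in the text
    letters_per_saccade: float = 7.0,  # average forward saccade length (from the text)
    regression_rate: float = 0.125,    # midpoint of the 10% to 15% range in the text
    chars_per_word: float = 6.0,       # ASSUMPTION: mean word length plus one space
) -> float:
    """Rough words-per-minute estimate from average eye-movement parameters."""
    # One fixation plus one saccade makes up a single cycle.
    cycles_per_minute = 60_000 / (fixation_ms + saccade_ms)
    # SIMPLIFICATION: regressions contribute no forward progress.
    forward_letters_per_cycle = letters_per_saccade * (1 - regression_rate)
    return cycles_per_minute * forward_letters_per_cycle / chars_per_word


wpm = estimate_wpm()
print(f"estimated reading rate: about {wpm:.0f} wpm")
```

With these defaults the estimate comes out at roughly 219 wpm, inside the typical 200 to 400 wpm range for educated adults. The point of the exercise is that ordinary reading rates follow fairly directly from fixation and saccade timing, which leaves little slack for large speed gains without changing what is actually fixated.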