Wearable displays are becoming increasingly important, but the accessibility, visual comfort, and quality of current generation devices are limited. We study optocomputational display modes and show their potential to improve experiences for users across ages and with common refractive errors. With the presented studies and technologies, we lay the foundations of next generation computational near-eye displays that can be used by everyone.

From the desktop to the laptop to the mobile device, personal computing platforms evolve over time. Moving forward, wearable computing is widely expected to be integral to consumer electronics and beyond. The primary interface between a wearable computer and a user is often a near-eye display. However, current generation near-eye displays suffer from multiple limitations: they are unable to provide fully natural visual cues and comfortable viewing experiences for all users. At their core, many of the issues with near-eye displays are caused by limitations in conventional optics. Current displays cannot reproduce the changes in focus that accompany natural vision, and they cannot support users with uncorrected refractive errors. With two prototype near-eye displays, we show how these issues can be overcome using display modes that adapt to the user via computational optics. By using focus-tunable lenses, mechanically actuated displays, and mobile gaze-tracking technology, these displays can be tailored to correct common refractive errors and provide natural focus cues by dynamically updating the system based on where a user looks in a virtual scene. Indeed, the opportunities afforded by recent advances in computational optics open up the possibility of creating a computing platform in which some users may experience better quality vision in the virtual world than in the real one.

Emerging virtual reality (VR) and augmented reality (AR) systems have applications that span entertainment, education, communication, training, behavioral therapy, and basic vision research. In these systems, a user primarily interacts with the virtual environment through a near-eye display. Since the invention of the stereoscope almost 180 years ago (1), significant developments have been made in display electronics and computer graphics (2), but the optical design of stereoscopic near-eye displays remains almost unchanged from the Victorian age. In front of each eye, a small physical display is placed behind a magnifying lens, creating a virtual image at some fixed distance from the viewer (Fig. 1A). Small differences in the images displayed to the two eyes can create a vivid perception of depth, called stereopsis.

Fig. 1. (A) A typical near-eye display uses a fixed focus lens to show a magnified virtual image of a microdisplay to each eye (the eyes cannot accommodate at the very near microdisplay’s physical distance). The focal length of the lens, f, and the distance to the microdisplay, d′, determine the distance of the virtual image, d. Adaptive focus can be implemented using either a focus-tunable lens (green arrows) or a fixed focus lens and a mechanically actuated display (red arrows), so that the virtual image can be moved to different distances. (B) A benchtop setup designed to incorporate adaptive focus via focus-tunable lenses and an autorefractor to record accommodation. A translation stage adjusts intereye separation, and NIR/visible light beam splitters allow for simultaneous stimulus presentation and accommodation measurement. (C) Histogram of user ages from our main studies. (D and E) The system from B was used to test whether common refractive errors could quickly be measured and corrected for in an adaptive focus display. Average (D) sharpness ratings and (E) fusibility for Maltese cross targets are shown for each of four distances: 1–4 D. The x axis is reversed to show nearer distances to the left. Targets were shown for 4 s. Red data points indicate users who did not wear refractive correction, and orange data points indicate users for whom correction was implemented on site by the tunable lenses. Values of -1, 0, and 1 correspond to responses of blurry, medium, and sharp, respectively. Error bars indicate SE across users.

However, this simple optical design lacks a critical aspect of 3D vision in the natural environment: changes in stereoscopic depth are also associated with changes in focus. When viewing a near-eye display, users’ eyes change their vergence angle to fixate objects at a range of stereoscopic depths, but to focus on the virtual image, the crystalline lenses of the eyes must accommodate to a single fixed distance (Fig. 2A). For users with normal vision, this asymmetry creates an unnatural condition known as the vergence–accommodation conflict (3, 4). Symptoms associated with this conflict include double vision (diplopia), compromised visual clarity, visual discomfort, and fatigue (3, 5). Moreover, a lack of accurate focus also removes a cue that is important for depth perception (6, 7).

Fig. 2. (A) The use of a fixed focus lens in conventional near-eye displays means that the magnified virtual image appears at a constant distance (orange planes). However, by presenting different images to the two eyes, objects can be simulated at arbitrary stereoscopic distances. To experience clear and single vision in VR, the user’s eyes have to rotate to verge at the correct stereoscopic distance (red lines), but the eyes must maintain accommodation at the virtual image distance (gray areas). (B) In a dynamic focus display, the virtual image distance (green planes) is constantly updated to match the stereoscopic distance of the target. Thus, the vergence and accommodation distances can be matched.
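The size of the vergence–accommodation conflict can be made concrete with a little arithmetic. The following is a minimal sketch rather than anything taken from the study: it expresses the conflict as the difference, in diopters, between the stereoscopic target distance and the fixed virtual image distance, and it reports the corresponding vergence angle. The interpupillary distance of 63 mm and the function names are assumptions chosen for illustration; the 1.3-m image distance and the 1–4 D target distances are the values used in the studies described here.

```python
import math


def vergence_angle_deg(distance_m: float, ipd_m: float = 0.063) -> float:
    """Angle between the two lines of sight when fixating straight ahead
    at distance_m. ipd_m is an assumed interpupillary distance."""
    return math.degrees(2.0 * math.atan((ipd_m / 2.0) / distance_m))


def conflict_diopters(stereo_distance_m: float, image_distance_m: float) -> float:
    """Mismatch between the vergence distance and the accommodation
    distance, expressed in diopters (1/m)."""
    return abs(1.0 / stereo_distance_m - 1.0 / image_distance_m)


if __name__ == "__main__":
    image_distance_m = 1.3  # fixed virtual image distance of the conventional mode
    for target_d in (1.0, 2.0, 3.0, 4.0):  # stereoscopic target distances in D
        target_m = 1.0 / target_d
        print(
            f"target {target_d:.0f} D ({target_m:.2f} m): "
            f"vergence {vergence_angle_deg(target_m):.2f} deg, "
            f"conflict {conflict_diopters(target_m, image_distance_m):.2f} D"
        )
```

In a dynamic focus display, the second argument tracks the first, so the computed conflict is zero at every target distance.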

The vergence–accommodation conflict is clearly an important problem to solve for users with normal vision. However, how many people actually have normal vision? Correctable visual impairments caused by refractive errors, such as myopia (near-sightedness) and hyperopia (far-sightedness), affect approximately one-half of the US population (8). Additionally, essentially all people in middle age and beyond are affected by presbyopia, a decreased ability to accommodate (9). For people with these common visual impairments, the use of near-eye displays is further restricted by the fact that it is not always possible to wear optical correction.

Here, we first describe a near-eye display system with focus-tunable optics, that is, lenses that change their focal power in real time. This system can provide correction for common refractive errors, removing the need for glasses in VR. Next, we show that the same system can also mitigate the vergence–accommodation conflict by dynamically providing near-correct focus cues at a wide range of distances. However, our study reveals that this conflict should be addressed differently depending on the age of the user. Finally, we design and assess a system that integrates a stereoscopic eye tracker to update the virtual image distance in a gaze-contingent manner, closely resembling natural viewing conditions. Compared with other focus-supporting display designs (10–18) (details are in SI Appendix), these adaptive technologies can be implemented in near-eye systems with readily available optoelectronic components and offer uncompromised image resolution and quality. Our results show how computational optics can increase the accessibility of VR/AR and improve the experience for all users.

Results

Near-Eye Display Systems with Adaptive Focus. In our first display system, a focus-tunable liquid lens is placed between each eye and a high-resolution microdisplay. The focus-tunable lenses allow for adaptive focus, that is, real-time control of the distance to the virtual image of the display (Fig. 1A, green arrows). The lenses are driven by the same computer that controls the displayed images, allowing for precise temporal synchronization between the virtual image distance and the onscreen content. Thus, the distance can be adjusted to match the requirements of a particular user or particular application. Details are in SI Appendix, and related systems are described in refs. 19–21. This system was table-mounted to allow for online measurements of the accommodative response of the eyes via an autorefractor (Fig. 1B), similar to the objective measurements in ref. 14, but the compact liquid lenses can fit within conventional-type head-mounted casings for VR systems. Adaptive focus can also be achieved by combining fixed focus lenses and a mechanically adjustable display (Fig. 1A, red arrows) (11). This approach is used for our second display system, which has the advantage of having a much larger field of view, and it will be discussed later. To assess how adaptive focus can be integrated into VR systems so as to optimize the display for the broadest set of users, we conducted a series of studies examining ocular responses and visual perception in VR. Our main user population was composed of adults with a wide range of ages (n = 153, age range = 21–64 y old) (Fig. 1C) and different refractive errors (79 wore glasses and 19 wore contact lenses).
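The two routes to adaptive focus can be illustrated with the thin-lens relation between the focal length f, the display distance d′, and the virtual image distance d. The sketch below is a simplified model and not the control code of either prototype: it assumes an ideal thin lens located at the eye (eye relief is ignored) and a display placed inside the focal length, so that 1/d = 1/d′ − 1/f. The function names and the numerical values (a 40-mm lens and a 38.8-mm display distance) are hypothetical.

```python
def virtual_image_diopters(f_m: float, d_prime_m: float) -> float:
    """Virtual image distance in diopters (1/d) for a display at d_prime_m
    behind an ideal thin lens of focal length f_m (display inside f)."""
    return 1.0 / d_prime_m - 1.0 / f_m


def lens_power_for(target_diopters: float, d_prime_m: float) -> float:
    """Focus-tunable lens route: lens power (1/f, in D) that places the
    virtual image at target_diopters while the display stays at d_prime_m."""
    return 1.0 / d_prime_m - target_diopters


def display_distance_for(target_diopters: float, f_m: float) -> float:
    """Actuated display route: lens-to-display distance d' (in m) that places
    the virtual image at target_diopters for a fixed focal length f_m."""
    return 1.0 / (1.0 / f_m + target_diopters)


if __name__ == "__main__":
    f_m = 0.040          # hypothetical fixed focal length (40 mm)
    d_prime_m = 0.0388   # hypothetical fixed display distance (38.8 mm)
    for target in (1.0 / 1.3, 1.0, 2.0, 3.0, 4.0):
        print(
            f"image at {target:.2f} D: "
            f"tunable lens power {lens_power_for(target, d_prime_m):.2f} D, "
            f"or display at {display_distance_for(target, f_m) * 1000:.2f} mm"
        )
```

Under these assumptions, moving the virtual image from 1.3 m to 0.25 m requires only a few diopters of change in lens power, or a few millimeters of display travel.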

Correcting Myopia and Hyperopia in VR. Before examining the vergence–accommodation conflict, we first tested whether a simple procedure can measure a user’s refractive error and correct it natively in a VR system with adaptive focus. Refractive errors, such as myopia and hyperopia, are extremely common (22) and result when the eye’s lens does not produce a sharp image on the retina for objects at particular distances. Although these impairments can often be corrected with contact lenses or surgery, many people wear eyeglasses. Current generation VR/AR systems require the user to wear their glasses beneath the near-eye display system. Although wearing glasses is technically possible with some systems, user reviews often cite problems with fit and comfort, which are likely to increase as the form factor of near-eye displays decreases.

Users (n = 70, ages 21–64 y old) were first tested using a recently developed portable device that uses a smartphone application to interactively determine a user’s refractive error without clinician intervention, including the spherical lens power required for clear vision (NETRA; EyeNetra, Inc.) (23). After testing, each user performed several tasks in VR without wearing his/her glasses. Stimuli were presented under two conditions: uncorrected (the display’s virtual image distance was 1.3 m) and corrected (the virtual image was adjusted to appear at 1.3 m after the correction was applied). Note that the tunable lenses do not correct astigmatism. We assessed the sharpness and fusibility of a Maltese cross under both conditions. The conditions were randomly interleaved along with four different stereoscopic target distances: 1–4 diopters (D; 1.0, 0.5, 0.33, and 0.25 m, respectively). Users were then asked (i) how sharp the target was (blurry, medium, or sharp) and (ii) whether the target was fused (i.e., not diplopic).

As expected, the corrected condition substantially increased the perceived sharpness of targets at all distances (Fig. 1D). This condition also increased users’ ability to fuse targets (Fig. 1E). Logistic regressions indicated significant main effects for both condition and distance. The odds ratios for correction were 4.05 [95% confidence interval (ci) = 3.25–5.05] and 1.54 (ci = 1.20–1.98) for sharpness and fusibility, respectively. The distance odds ratios were 0.77 and 0.21, respectively (all ps ≤ 0.01), indicating reductions in both sharpness and fusibility for nearer distances. Importantly, the VR-corrected sharpness and fusibility were comparable with those reported by people wearing their typical correction, who participated in the next study (called the conventional condition). Comparing responses between these two groups of users reveals that, across all distances, the average sharpness values for the corrected and conventional conditions were 0.60 and 0.63, respectively. The percentages fused were 68 and 74%, respectively. This result suggests that fast, user-driven vision testing can provide users with glasses-free vision in VR that is comparable with the vision that they have with their own correction.

We also assessed overall preference between the two conditions (corrected and uncorrected) in a less structured session. A target moved sinusoidally in depth within a complex virtual scene, and the user could freely toggle between conditions to select the one that was more comfortable to view; 80% of users preferred the corrected condition, which is significantly above chance (binomial probability distribution; p ≪ 0.001). Those who preferred the uncorrected condition may have had inaccurate corrections or modest changes in clarity that were not noticeable in the virtual scene (SI Appendix has additional discussion). Future work can incorporate the refractive testing directly into the system by also using the focus-tunable lenses to determine the spherical lens power that results in the sharpest perceived image and then store this information for future sessions.
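To illustrate how a measured spherical error could be folded into the virtual image distance, the following sketch shifts the optical distance of the image by the user's spherical prescription. It is an assumption-laden simplification, not the procedure used in the study: it treats the correction as a pure diopter offset applied at the eye, ignores vertex distance and astigmatism, and uses the usual sign convention in which myopic prescriptions are negative. The function names and example prescriptions are hypothetical.

```python
from typing import Optional


def corrected_image_diopters(target_diopters: float, sphere_diopters: float) -> float:
    """Optical distance (in D) at which to place the virtual image so that an
    eye with spherical error sphere_diopters (negative for myopia) sees it as
    a corrected eye would see an image at target_diopters."""
    return target_diopters - sphere_diopters


def diopters_to_meters(diopters: float) -> Optional[float]:
    """Convert a positive diopter value to meters. Returns None at or beyond
    optical infinity (<= 0 D), where no finite real distance exists."""
    return 1.0 / diopters if diopters > 0.0 else None


if __name__ == "__main__":
    target = 1.0 / 1.3  # intended apparent image distance of 1.3 m
    for sphere in (-2.0, -0.5, 0.0, 1.5):  # hypothetical prescriptions in D
        placed = corrected_image_diopters(target, sphere)
        meters = diopters_to_meters(placed)
        where = f"{meters:.2f} m" if meters is not None else "beyond optical infinity"
        print(f"sphere {sphere:+.2f} D -> image placed at {placed:+.2f} D ({where})")
```

For a hypothetical −2 D myope, the image is brought to roughly 0.36 m so that it is as sharp as a 1.3-m image seen through glasses. For a hyperope, the required value can be negative, which a tunable lens can provide but which corresponds to no finite physical distance.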

Driving the Eyes’ Natural Accommodative Response Using Dynamic Focus. Even in the absence of an uncorrected refractive error, near-eye displays suffer from the same limitations as any conventional stereoscopic display: they do not accurately simulate changes in optical distance when objects move in depth (Fig. 2A). To fixate and fuse stereoscopic targets at different distances, the eyes rotate in opposite directions to place the target on both foveas; this response is called vergence (red lines in Fig. 2A). However, to focus the displayed targets sharply on the retinas, the eyes must always accommodate to the virtual display distance (gray areas in Fig. 2A). In natural vision, the vergence and accommodation distances are the same, and thus, these two responses are neurally coupled. The discrepancy created by conventional near-eye displays (the vergence–accommodation conflict) can, in principle, be eliminated with an adaptive focus display by producing dynamic focus: constantly updating the virtual distance of a target to match its stereoscopic distance (Fig. 2B) (19, 20).

Using the autorefractor integrated in our system (Fig. 1B), we examined how the eyes’ accommodative responses differ between conventional and dynamic focus conditions and in particular, whether dynamic focus can drive normal accommodation by restoring correct focus cues. Users (n = 64, ages 22–63 y old) viewed a Maltese cross that moved sinusoidally in depth between 0.5 and 4 D at 0.125 Hz (mean = 2.25 D, amplitude = 1.75 D), while the accommodative distance of the eyes was continuously measured. Users who wore glasses were tested as described previously with the NETRA, and their correction was incorporated. In the conventional condition, the virtual image distance was fixed at 1.3 m; in the dynamic condition, the virtual image was matched to the stereoscopic distance of the target. Because of dropped data points from the autorefractor, we were able to analyze 24 trials from the dynamic condition, which we compare with 59 trials for the conventional condition taken from across all test groups.

The results are shown in Fig. 3 A and B. Despite the fixed accommodative distance in the conventional condition, on average, there was a small accommodative response (orange line in Fig. 3A) (mean gain = 0.29) to the stimulus. This response is likely because of the cross-coupling between vergence and accommodative responses (24). However, the dynamic display mode (green line in Fig. 3B) elicited a significantly greater accommodative gain (mean = 0.77; partially paired one-tailed Wilcoxon tests, p < 0.001), which closely resembles natural viewing conditions (25). These results show that it is indeed possible to drive natural accommodation in VR with a dynamic focus display (SI Appendix has supporting analysis).

Fig. 3. (A and B) Accommodative responses were recorded under conventional and dynamic display modes while users watched a target move sinusoidally in depth. The stimulus was shown for 4.5 cycles, and the response gain was calculated as the relative amplitude between the response and stimulus for 3 cycles directly after a 0.5-cycle buffer. The stimulus position (red), each individual response (gray), and the average response (orange indicates conventional focus and green indicates dynamic focus in all panels) are shown with the mean subtracted for each user. Phase is not considered because of manual starts for measurement. (C) The accommodative gains plotted against the user’s age show a clear downward trend with age and a higher response in the dynamic condition. Inset shows means and SEs of the gains for users grouped into younger and older cohorts relative to 45 y old. (D and E) Average (D) sharpness ratings and (E) fusibility were recorded for Maltese cross targets at each of four fixed distances: 1–4 D. The x axis is reversed to show nearer distances to the left. Error bars indicate SE.

The ability to accommodate degrades with age (i.e., presbyopia) (26). Thus, we examined how the age of our users affected their response gain. For both conditions, accommodative gain was significantly negatively correlated with age (Fig. 3C) (conventional r = −0.34, dynamic r = −0.73, ps < 0.01). This correlation is illustrated further in Fig. 3C, Inset, in which average gains are shown for users grouped by age (≤45 and >45 y old). Although the gains are much greater for the dynamic condition than for the conventional condition among the younger age group, the older group had similar gains for the two conditions. From these results, we predicted that accurate focus cues in near-eye displays would mostly benefit younger users and in fact, may be detrimental to the visual perception of older users in VR. We examine this question below.
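The gain computation described for Fig. 3 (the response amplitude relative to the stimulus amplitude over 3 cycles following a 0.5-cycle buffer, with phase ignored) can be sketched as follows. This is one possible implementation, not necessarily the analysis used in the study: it assumes a uniformly sampled trace with no dropped samples and estimates the response amplitude by projecting the trace onto a sine and a cosine at the stimulus frequency. The function name, the sample rate, and the synthetic trace are hypothetical; the 0.125-Hz frequency and 1.75-D amplitude are the stimulus values reported above.

```python
import math


def accommodative_gain(response_d, sample_rate_hz, freq_hz=0.125,
                       stim_amplitude_d=1.75, buffer_cycles=0.5, n_cycles=3):
    """Ratio of response amplitude to stimulus amplitude at freq_hz.

    response_d is a uniformly sampled accommodation trace in diopters that
    starts at stimulus onset. The first buffer_cycles are skipped and the
    next n_cycles are analyzed. The estimate is independent of phase."""
    samples_per_cycle = sample_rate_hz / freq_hz
    start = round(buffer_cycles * samples_per_cycle)
    count = round(n_cycles * samples_per_cycle)
    window = response_d[start:start + count]
    if len(window) < count:
        raise ValueError("trace is shorter than buffer plus analysis window")
    w = 2.0 * math.pi * freq_hz
    a = b = 0.0
    for k, y in enumerate(window):
        t = (start + k) / sample_rate_hz
        a += y * math.sin(w * t)  # in-phase component
        b += y * math.cos(w * t)  # quadrature component
    # Over a whole number of cycles the constant offset projects to zero,
    # so no explicit mean subtraction is needed.
    amplitude_d = 2.0 * math.hypot(a, b) / count
    return amplitude_d / stim_amplitude_d


if __name__ == "__main__":
    rate_hz = 10.0  # hypothetical autorefractor sample rate
    n = int(4.5 * rate_hz / 0.125)  # 4.5 stimulus cycles
    # Synthetic response: mean 2.25 D, 60% of the stimulus amplitude,
    # with an arbitrary phase lag.
    trace = [2.25 + 0.6 * 1.75 * math.sin(2 * math.pi * 0.125 * k / rate_hz - 0.8)
             for k in range(n)]
    print(f"estimated gain: {accommodative_gain(trace, rate_hz):.3f}")
```

Because the sine and cosine components are combined into a single amplitude, an unknown start time only changes the estimated phase and leaves the gain unaffected, which is consistent with phase being disregarded in the analysis. Traces with dropped samples would need a least-squares fit instead of this simple projection.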

Optimizing Optics for Younger and Older Users. A substantial amount of research supports the idea that mitigating the vergence–accommodation conflict in stereoscopic displays improves both perception and comfort, and this observation has been a major motivation for the development of displays that support multiple focus distances (3, 5, 7, 12–15, 27). However, the fact that accommodative gain universally deteriorates with age suggests that the effects of the vergence–accommodation conflict may differ for people of different ages (28–30) and even that multifocus or dynamic display modes may be undesirable for older users. Because presbyopes do not accommodate to a wide range of distances, these individuals essentially always have this conflict in their day to day lives. Additionally, presbyopes cannot focus to near distances, and therefore, using dynamic focus to place the virtual image of the display nearby would likely decrease image quality.

To test this hypothesis, we assessed sharpness and fusibility with conventional and dynamic focus in younger (≤45 y old, n = 51) and older (>45 y old, n = 13) users. For the younger group, sharpness was slightly reduced for closer targets in both conditions. However, for the older group, perceived sharpness was high for all distances in the conventional condition and fell steeply at near distances in the dynamic condition (Fig. 3D). A logistic regression using age, condition, and distance showed significant main effects of distance and condition. The distance odds ratio was 0.56 (ci = 0.46–0.69), and the ratio for the dynamic condition was 0.60 (ci = 0.48–0.75; ps < 0.001), indicating reductions in sharpness at nearer distances and in the dynamic condition. However, the effect of condition was modified by an interaction with age, indicating that sharpness in the older group was reduced significantly more by dynamic mode (odds ratio = 0.70, ci = 0.56–0.87, p < 0.01). Indeed, for targets 2 D (50 cm) and closer, older users tended to indicate that the dynamic condition was blurry and that the conventional condition was sharp.

The fusibility results for the two age groups were more similar: dynamic focus facilitated fusion at closer distances (Fig. 3E). Significant main effects of condition (odds ratio of 1.75, ci = 1.23–2.49) and distance (odds ratio of 0.27, ci = 0.18–0.39) were modified by an interaction (odds ratio of 1.69, ci = 1.27–2.25, all ps < 0.01). The interaction indicated that the improvement in fusibility associated with dynamic focus increased at nearer distances. Although dynamic focus provides better fusion for young users, in practice, a more conventional display mode may be preferable for presbyopes. The ideal mode for presbyopes will depend on the relative weight given to sharpness and fusion in determining the quality of a VR experience. In addition, a comfortable focus distance for all images in the conventional condition obviates the need to wear traditional presbyopic correction at all.

We also tested overall preferences while users viewed a target moving in a virtual scene. Interestingly, in both the younger and older groups, only about one-third of the users expressed a preference for the dynamic condition (35% of younger users and 31% of older users). This result was initially surprising given the substantial increase in fusion experienced by younger users in the dynamic mode. One potential explanation is that the target in the dynamic condition may have been modestly less sharp (Fig. 3D) and that people strongly prefer sharpness over diplopia. However, two previous studies have reported overall perceptual and comfort improvements using dynamic focus displays (19, 20). To understand this difference, we considered the fact that our preference test involved a complex virtual scene. Although users were instructed to maintain fixation on the target, if they did look around the scene even momentarily, the dynamic focus (yoked to the target) would induce a potentially disorienting, dynamic vergence–accommodation conflict. That is, unless the dynamic focus is adjusted to the actual distance of fixation, it will likely degrade visual comfort and perception. To address this issue, we built and tested a second system that enabled us to track user gaze and update the virtual distance accordingly.
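One way to obtain the actual distance of fixation from binocular gaze data is to triangulate the two lines of sight. The sketch below is purely illustrative, since this section does not specify how the second system estimates fixation distance: it assumes that a tracker reports a horizontal gaze angle per eye (positive toward the user's right, measured from straight ahead), that the eyes are separated by an assumed interpupillary distance of 63 mm, and that the display supports a hypothetical focus range of 0–4 D. All function names are invented for the example.

```python
import math


def fixation_diopters(theta_left_rad: float, theta_right_rad: float,
                      ipd_m: float = 0.063) -> float:
    """Depth of the fixation point in diopters, from horizontal gaze angles.

    With the eyes at x = -ipd/2 and x = +ipd/2, the two gaze rays cross at
    depth z = ipd / (tan(theta_left) - tan(theta_right)), so 1/z is the
    difference of tangents divided by ipd. Parallel or diverging rays are
    treated as fixation at optical infinity (0 D)."""
    spread = math.tan(theta_left_rad) - math.tan(theta_right_rad)
    if spread <= 0.0:
        return 0.0
    return spread / ipd_m


def clamp_to_display_range(diopters: float, min_d: float = 0.0,
                           max_d: float = 4.0) -> float:
    """Limit the requested virtual image distance to the (hypothetical)
    range that the adaptive optics can reach."""
    return max(min_d, min(max_d, diopters))


if __name__ == "__main__":
    ipd_m = 0.063
    for distance_m in (2.0, 1.0, 0.5, 0.25, 0.1):
        # Simulated symmetric fixation straight ahead at distance_m.
        angle = math.atan((ipd_m / 2.0) / distance_m)
        estimate = fixation_diopters(angle, -angle, ipd_m)
        command = clamp_to_display_range(estimate)
        print(f"fixation at {distance_m:.2f} m -> estimate {estimate:.2f} D, "
              f"image set to {command:.2f} D")
```

In practice, vergence estimates from eye trackers are noisy, so a real system would likely filter the estimate over time or combine it with scene depth at the gaze point before moving the virtual image; neither step is shown here.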