Intel and Microsoft are pushing hard into what they consider the next evolution of computing--mixed reality--but until now, neither company had elaborated in detail what exactly we should expect in terms of specifications from the incoming wave of mainstream-level VR HMDs. At WinHEC, though, Microsoft detailed a range of features for the HMDs themselves.

Note that all of these mainstream-level HMDs will be made by Intel’s and Microsoft’s hardware partners--for now, the usual PC suspects such as Acer, Asus, Dell, HP, and Lenovo. Further, the headsets we’re talking about here will be tethered and will require a PC. (For the most part.)

The Headsets: A Range Of Specifications

This one slide spells it out rather succinctly:

On the lowest end of the spectrum, the new HMDs will offer 1200x1080 resolution per eye. That’s striking, because that’s the same per-eye resolution as the two titans of the industry, the Oculus Rift and the HTC Vive. On the high end, Microsoft expects its hardware partners to offer up to 1440x1440, which would outpace the Rift and Vive.

Note, however, that the low-end mainstream HMDs will offer just a 60Hz refresh rate at that resolution (versus 90Hz for Rift and Vive), which could prove problematic for the user experience. At the high end, though, these new HMDs should meet or exceed the 90Hz of the Rift and Vive.

| HMD | HTC Vive | Oculus Rift | Mainstream HMD (Low End) | Mainstream HMD (High End) |
|---|---|---|---|---|
| Display Type | Dual low-persistence Samsung AMOLED | Dual low-persistence Samsung AMOLED | LCD | OLED |
| Resolution | 1200x1080 per eye | 1200x1080 per eye | 1200x1080 per eye | 1440x1440 per eye |
| Refresh Rate | 90Hz | 90Hz | 60Hz | 90Hz or 120Hz |
| Audio | Microphone, jack for external headphones | Microphone, integrated supra-aural 3D spatial audio headphones (removable) | Audio out jack (for headphones), Mic-in jack | Integrated headphones, integrated mic array |
| Ports | HDMI 1.4, USB 3.0 x 2 | Proprietary headset connector (HDMI/USB 3.0) | HDMI 1.4, USB 2.0 | HDMI 2.0 or DisplayPort, USB 3.0 |
| Cable | 5m (plus 1m from Link Box to PC) | 4m | Multiple cables | Bonded single cable |
| Accessories | Wand controllers | Touch controllers (optional, sold separately) | Game controller | 3DoF or 6DoF controller |
| Price | $800 | $600 ($800 w/Touch controllers) | Starts at $300 | Unknown |
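To put those resolution and refresh rate figures in perspective, here is a rough back-of-the-envelope calculation of how many pixels per second each display has to be fed. This is only a sketch: the function name is ours, and it counts raw panel pixels for both eyes, ignoring any supersampling or distortion correction a real renderer would add.

```python
# Per-eye resolution and refresh rate, taken from the spec comparison above.
HEADSETS = {
    "HTC Vive": (1200, 1080, 90),
    "Oculus Rift": (1200, 1080, 90),
    "Mainstream (low end)": (1200, 1080, 60),
    "Mainstream (high end, 90Hz)": (1440, 1440, 90),
    "Mainstream (high end, 120Hz)": (1440, 1440, 120),
}


def pixels_per_second(width, height, refresh_hz, eyes=2):
    """Raw panel pixels that must be delivered each second (hypothetical helper)."""
    return width * height * eyes * refresh_hz


for name, (w, h, hz) in HEADSETS.items():
    print(f"{name}: {pixels_per_second(w, h, hz) / 1e6:.1f} Mpx/s")
```

By this measure, the low-end mainstream HMD needs roughly two thirds of the pixel throughput of a Rift or Vive, whereas a 1440x1440 panel at 120Hz needs more than double.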

Curiously, Microsoft made no mention of the types of lenses we could expect, but lower-end panels will be LCD displays; you won’t get the same OLED display as the Rift and Vive unless you slide to the high end.

Another important note is that the lower-end mainstream HMDs will be equipped with only a game controller. Until recently, when it finally debuted its (excellent) Touch controllers, that was one of the Rift’s primary weaknesses as a system, especially compared to the Vive. Again, though, higher-end HMDs will offer a 3DoF or 6DoF controller. Microsoft did not spell out what exactly that means, though; after all, Daydream’s controller is capable of just 3DoF, which, though better than a game controller, pales in comparison to 6DoF.

Further, there’s going to be a huge difference between a 6DoF controller that an OEM whips up in a few months and the years-in-the-making Touch controllers and Vive wands. In any case, based on the presentation, we’re confident that any controllers paired with these mainstream HMDs will be tracked.

For audio, expect the higher-end HMDs to sport a mic array (which Microsoft “strongly” recommends) and integrated headphones. (The HoloLens has such headphones, and although some people have criticized them for being too quiet, they do produce impressive spatial sound without requiring any actual ear cups.) This is one point where Microsoft seems to think cost will be an issue. The above is certainly ideal but will likely be pricey, yet the lower-end options--an audio jack for separate headphones and what would amount to a basic boom mic--will present design problems.

Also note that across the range of options listed in the Microsoft slide, connectivity is not wireless; there will be HMDs with multiple-cable setups and single-cable implementations, but all will be tethered. Again, the Rift and Vive are no different in that regard.

Curiously, there is no mention at all of field of view. Further, the HMDs will of course be equipped with sensors, but Microsoft did not mention any specifics beyond calling them “IMUs” (inertial measurement units).

Let’s Talk About Input Methods

Although we touched on input methods above, Microsoft broke out that topic and elaborated on it a bit. The company has conceptualized multiple “interaction models” and tied those to various actions and devices.

Hey look, another handy slide:

It’s important to bear in mind at this juncture that Microsoft and Intel have in mind that all of these HMDs of varying feature sets will be used to do all sorts of different computing tasks, from casual computing (such as web browsing) to communications (Skype) to serious productivity (Excel and PowerPoint floating in front of you) to passive entertainment (watching movies) to active entertainment (such as gaming).

Simply put, they believe that this mixed reality business is the future of personal computing, and that people will use the HMDs for everything. Therefore, it follows that there will be a variety of input methods and devices.

Microsoft was clear: You can’t have too much input diversity. Even so, you have to be sure that all of those input methods are standardized and consistent for all developers and users, which is why Microsoft has spent so much time nailing these down. Note also that this platform offers seamless input switching, which is to say that you should be able to alternate freely between all forms of input easily without doing a thing.

Arguably the most basic, intuitive input method is gaze. You’ll find gaze selection even in lower-end mobile VR, although there are certainly higher- and lower-tech ways to implement it. You should be able to use gaze to zip through menus, navigate, select, and so on, and Microsoft has it pegged to game pad controllers.

As we’ve seen with the Rift and Vive, the next logical evolution of VR input involves motion-tracked controllers. These can bring your hands (or rather, a facsimile of your hands) into the VR environment, and with the addition of touch pads, joysticks, and buttons on those controllers, you get all sorts of marvelous capabilities. (Cast a spell, stab some zombies, shoot robots, etc.)

"Non-spatial pointing" is a fancy way of saying "mouse and keyboard." This one is tricky. It’s true that for mouse input, if you’re inside an occluded HMD, you’ll have no problems. Few people actually look down at their mice when they’re using them. However, keyboard input will almost certainly prove to be more problematic; you need to see your keyboard. Microsoft can get around that in a couple of different ways; one of those is by using a camera and inside-out tracking (which we’ll discuss further down the page).

Speech, though, according to Microsoft, is “the most native way to communicate to interact with the virtual world.” The company has an internal mantra that goes: “If you can see it, you can say it.” For example, if you see an item that you want to select, you should just be able to select it by saying “select.” (This will require gaze, too, you’ll note.)
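The combination of gaze and "select" described above, along with the idea of switching freely between input devices, can be sketched in a few lines. To be clear, this is our own illustration and not Microsoft's API; the class and method names are hypothetical.

```python
class GazeSelector:
    """Tracks what the user is looking at and applies 'select' from any input."""

    def __init__(self):
        self.gaze_target = None
        self.selected = []

    def on_gaze(self, target):
        # Called whenever the gaze lands on a new item (or on nothing: None).
        self.gaze_target = target

    def on_input(self, source, action):
        # Every device goes through the same code path, so moving from a
        # game pad to speech requires no explicit mode change.
        if action == "select" and self.gaze_target is not None:
            self.selected.append((self.gaze_target, source))
            return True
        return False


selector = GazeSelector()
selector.on_gaze("photo")
selector.on_input("speech", "select")   # "If you can see it, you can say it"
selector.on_gaze("menu")
selector.on_input("gamepad", "select")  # same action, different device
print(selector.selected)
```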

Inside-Out Tracking: The Linchpin

All other features being more or less equal, the number of degrees of freedom your HMD offers is of enormous importance. Cheaper VR, like Google Daydream and even the Samsung Gear VR, offers only 3DoF. That means you can look around by rotating your head about the X, Y, and Z axes, but you can’t move around in the virtual space or look at things from different angles. To do that, you need 6DoF, and these new mainstream HMDs will all offer it.
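The difference is easiest to see in terms of the data a tracker hands to an application. The following is a minimal sketch under our own assumptions (the names, units, and axis conventions are illustrative, not taken from any SDK): a 3DoF tracker reports rotation only, so the viewpoint stays pinned in place, whereas a 6DoF tracker also reports translation.

```python
from dataclasses import dataclass


@dataclass
class HeadState:
    """What the user's head is really doing (hypothetical representation)."""
    pitch: float  # degrees, rotation about X
    yaw: float    # degrees, rotation about Y
    roll: float   # degrees, rotation about Z
    x: float      # meters
    y: float      # meters
    z: float      # meters


def reported_pose(head, six_dof):
    """Return the (rotation, position) a tracker would pass to the application."""
    rotation = (head.pitch, head.yaw, head.roll)
    if six_dof:
        return rotation, (head.x, head.y, head.z)
    # A 3DoF tracker has no positional data, so the viewpoint never moves.
    return rotation, (0.0, 0.0, 0.0)


# The user turns 30 degrees and leans 0.25 m forward to peek around an object.
lean = HeadState(pitch=0.0, yaw=30.0, roll=0.0, x=0.0, y=0.0, z=-0.25)
print(reported_pose(lean, six_dof=False))  # rotation only; the lean is lost
print(reported_pose(lean, six_dof=True))   # rotation and translation
```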

Microsoft believes in 6DoF; it considers the tracking a major part of the user experience, and it knows that if you offer a substandard UX, people generally will dump your product. Specifically, in VR, they will vomit first and then dump your product. Thus was significant emphasis put on inside-out, 6DoF tracking on these HMDs.

That inside-out tracking bit is also of enormous import; you can achieve 6DoF tracking by way of external sensors or cameras (hey there, Rift and Vive) or by placing fiducial markers all around a space, but both have limitations. In the case of the former, if you move your head too far one way or another, the HMD loses tracking with the cameras/sensors. (This is why Oculus requires a third Constellation tracker to enable room-scale VR on the Rift.) With markers, if you move to a new room, you have to fill it with new markers; obviously, this limits where you can enjoy your VR experiences.

But with inside-out tracking, the HMD is equipped with technology that scans and “understands” the world as you move about in it. As we’ve said in the past, that's not just room scale tracking; it’s world scale.

This is what Microsoft is offering to its partners. And it requires no setup on the part of the user.

For these HMDs, inside-out tracking requires two components: a camera and an IMU. The camera is just a camera; it’s aimed out at the world from the HMD, and it “observes” (Microsoft’s term) the world. (Of note, Microsoft uses the same SLAM Scan technology as Dacuda for this part of the process. It continuously maps your environment even as it tracks your position within it.) The IMU tracks the motion of your head. The camera’s frame data and IMU data are combined and fed to the Windows platform with sensor fusion.

From that you get the pose data, which is fed to the application, and it’s rendered on the display. (For this very last part, Microsoft said that it has “optimizations on the rendering part as well” that allegedly smooth the resulting image that your eyes see.)
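Microsoft did not describe how its sensor fusion works, so the following is strictly an illustration of the general idea, not the company's method. It uses a complementary filter, a common textbook technique, on a single rotation axis: the IMU's angular rate is integrated for a fast estimate that drifts over time, and each camera observation nudges that estimate back toward a drift-free value. All names and the weighting value are our assumptions, and angle wraparound is ignored for brevity.

```python
def fuse_yaw(prev_yaw_deg, gyro_rate_dps, dt_s, camera_yaw_deg=None, camera_weight=0.02):
    """One step of a complementary filter for a single rotation axis."""
    # Dead-reckon from the IMU: integrate the angular rate over the timestep.
    predicted = prev_yaw_deg + gyro_rate_dps * dt_s
    if camera_yaw_deg is None:
        # No camera observation this step; rely on the IMU alone.
        return predicted
    # Blend the IMU prediction with the camera's drift-free estimate.
    return (1.0 - camera_weight) * predicted + camera_weight * camera_yaw_deg


def simulate(seconds=10.0, rate_hz=100, gyro_bias_dps=1.0, use_camera=True):
    """Head held perfectly still while the gyro reports a small constant bias."""
    dt = 1.0 / rate_hz
    yaw = 0.0
    for _ in range(int(seconds * rate_hz)):
        camera = 0.0 if use_camera else None  # camera sees the true yaw: 0
        yaw = fuse_yaw(yaw, gyro_bias_dps, dt, camera)
    return yaw


print(f"IMU only:     {simulate(use_camera=False):.2f} degrees of drift")
print(f"IMU + camera: {simulate(use_camera=True):.2f} degrees of drift")
```

With the IMU alone, the simulated error grows steadily for as long as the bias persists; with the camera correction, it stays bounded at a fraction of a degree. A production tracker would fuse all six degrees of freedom with something far more sophisticated, but the division of labor between the two sensors is the same.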

An Untethered Future?

For all the talk of inside-out tracking, though, Microsoft strangely did not list any of that hardware in its range of HMD specifications. That seems odd, doesn’t it? Further, although inside-out tracking is clearly a big deal for Microsoft here, we noticed that one of the early slides in the presentation, which had a mock-up of a possible standard VR setup, showed a standalone tracker:

However, in the presentation, Microsoft made no mention of any standalone tracking. Clearly, the company is focused on inside-out tracking, instead.

In addition to the above discussion on inside-out tracking, another slide in the presentation stated: “Our vision: inside-out tracking for everyone.” Granted, having inside-out tracking on an HMD does not necessarily mean that it would be untethered--and indeed, it may be superior to a standalone outside-in tracker even for tethered headsets--but you certainly can’t get untethered, world-scale XR without inside-out tracking.

On that point, again, Microsoft is clear. In the slide above, note that part of that “inside-out tracking for everyone” ethos is “world scale” tracking. That means not just untethered HMDs, but truly mobile ones.

So why are all these HMDs tethered? It’s likely because Microsoft and Intel are still experimenting to see what will work and what consumers will want in the XR market. Tethered VR and MR experiences could certainly be part of that (they should certainly cost less than completely untethered HMDs).

For that matter, the XR market could end up following the same paradigm we see in the PC market: PC-tethered HMDs are like desktop PCs--not portable, but powerful--whereas untethered, mobile HMDs will be like laptops--eminently portable, indispensable to many, and typically less powerful than their desktop-bound counterparts.

In any case, we continue to learn more about what Microsoft and Intel are trying to accomplish in the XR world. Now, we finally have some insight into some of the HMD hardware we can expect from PC makers. We should see some of these devices realized at CES 2017 next month.