A little while ago in my role as Microsoft Emerging Experiences MVP I was given the opportunity to experience the Microsoft HoloLens during a Holo Academy session in Redmond.

We attended with a group of tech-savvy MVPs, so naturally we started analyzing the device and experience with that mindset.

Photo and recording devices were not allowed, so you’ll have to make do with text and this picture 🙂



Since others have posted reports on their HoloLens experiences (like here & here), I thought I’d try to report some views that focus more on the technical aspects.

What follows are my personal observations, opinions, speculations and views on the hardware that was presented to us.

These may or may not describe the actual hardware that will be available at some point in the future.

They are in no way official specs, but nevertheless may be interesting to others, so here goes.

If you don’t know what a HoloLens is you may want to look here.

The device itself:

The device is completely tetherless; all computing and batteries are embedded in the helmet itself, with no cable(s) connected to a computer.

All of the device’s weight rests on a band that fits snugly around the head; none of it rests on the nose, making it more comfortable to wear than current-gen VR headsets, for example.

The lenses can not only be adjusted up and down but can also freely slide towards and away from the face; group members wearing glasses could easily keep them on while wearing a HoloLens.

The projection/display:

This is not using standard LCD or OLED displays but something called waveguide technology, more on that here.

By extending our arms forward and spreading thumb/pinky we estimated the Field of View to be slightly less than 40 degrees (horizontally).
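For reference, the rough trigonometry behind such an estimate: an object of width w at distance d subtends an angle of 2·atan(w / 2d). The sketch below illustrates the method; the width and distance values are hypothetical, just chosen to land near our estimate, and are not actual measurements:

```python
import math

def fov_degrees(width_m: float, distance_m: float) -> float:
    """Angle (in degrees) subtended by an object of the given width
    when viewed from the given distance."""
    return math.degrees(2 * math.atan(width_m / (2 * distance_m)))

# Hypothetical values: a hologram frame roughly 0.42 m wide seen at
# arm's length (~0.6 m) subtends just under 40 degrees.
print(round(fov_degrees(0.42, 0.6), 1))  # -> 38.6
```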

Although it has been reported that at some events the interpupillary distance (IPD) of the user had to be measured, in our case we weren’t measured.

I’m not sure if the device is now capable of automatically measuring/adjusting this, or whether an average IPD setting was used for our demos.
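To illustrate why IPD matters for a stereo display: each eye sees a point offset by a disparity proportional to the IPD, so a setting that is off by a few millimetres shifts where the brain fuses the two images and thus where holograms appear to sit in depth. A toy calculation with made-up numbers (not actual HoloLens optics):

```python
def pixel_disparity(ipd_m: float, focal_px: float, depth_m: float) -> float:
    """Stereo disparity (in pixels) of a point at the given depth:
    disparity = IPD * focal_length / depth."""
    return ipd_m * focal_px / depth_m

# Hypothetical numbers: 63 mm IPD, 1000 px focal length, object at 2 m.
print(round(pixel_disparity(0.063, 1000, 2.0), 1))  # -> 31.5
```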

Virtual objects felt mostly opaque; only when the real-world background was very bright could you see that the projection was slightly transparent.

Resolution looked pretty high; individual pixels only became apparent when objects were very, very small in view.

We couldn’t check frame rate but things appeared very smooth.

How the limited Field of View feels in practice:

Since you can see through the glasses of the HoloLens and your peripheral vision is not blocked, you never get the claustrophobic feeling that a limited FoV in VR can give you.

Only experiences where virtual objects are clipped by the edges of the FoV will make you aware of it.

Since virtual objects are truly anchored to the real world this seems to trigger something special in your brain.

Even if objects are clipped by the FoV this doesn’t seem too bothersome since you have a natural tendency to move around them and step backwards if needed.

Your brain perceives it as a ghostly appearance that is maybe not real but is truly there in the real world, even if it sometimes is partly clipped.

I find it hard to describe, and to be honest my expectations going into the demo were skeptical.

Additional hardware components:

We noticed 2 lenses on each side, placed above the temples, pointed slightly forward and backward; they appear to be miniaturized depth sensors (similar to the Kinect) used for tracking purposes.

Note that during stage demos Microsoft used a Kinect v2 strapped to a witness camera to show external footage of the mixed-reality experience; this confirms the use of depth sensor(s) for tracking.

We noticed 2 sensors on the front, in between the eyebrows.

Most probably a color camera, as the photo app was reported to take snapshots from this sensor location.

Possibly also an ambient light sensor although we couldn’t confirm this.

Tracking:

The device tracks full positional and rotational information.

Tracking was impressively robust.

There appears to be hardly any drift: even after walking six meters away, coming back to the same spot, and walking around for several minutes, virtual objects stayed anchored to the same spot in the real world, just as you would expect from a real object.

We expect it to be based on something similar to Kinect Fusion, where it continuously builds a rough 3D model of the surroundings to track the head transformations.
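As a drastically simplified illustration of that idea: tracking boils down to finding the camera motion that best aligns each new depth frame with the accumulated model. The sketch below assumes known point correspondences and translation-only motion, which real systems like Kinect Fusion do not (they solve for a full 6-DoF pose, typically with ICP); it only shows the core alignment principle:

```python
def centroid(pts):
    """Centroid of a list of 2-D points."""
    n = len(pts)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)

def estimate_translation(model_pts, frame_pts):
    """With known correspondences and translation-only motion, the shift
    that aligns the new frame with the accumulated model is simply the
    difference of the two point clouds' centroids."""
    mx, my = centroid(model_pts)
    fx, fy = centroid(frame_pts)
    return (mx - fx, my - fy)

# Hypothetical 2-D "depth points": the frame is the model shifted by (-0.5, 0),
# so the estimated camera translation to re-align it is (0.5, 0).
model = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
frame = [(x - 0.5, y) for (x, y) in model]
print(estimate_translation(model, frame))  # -> (0.5, 0.0)
```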

In our demos the environment was scanned upon initialization (depicted by a cool shading effect for a few seconds).

Other demos are known to work with pre-scanned environments where things can already be placed within this known environment.

Tracking could be broken intentionally by occluding 3 or 4 of the depth sensors.

When tracking is lost, it seems the IMU (accelerometer, gyro, magnetometer) takes over to still provide rotational data.

The IMU is probably also used for high speed (rotational) tracking in between depth frames.
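One common way to combine a fast gyro with slower, drift-free tracking is a complementary filter: integrate the gyro every frame, and gently pull the estimate toward the depth-based pose whenever one is available. This is speculation about the general approach, not a known HoloLens internal; a minimal single-axis sketch:

```python
def fuse_yaw(imu_yaw_rate, dt, depth_yaw, prev_yaw, alpha=0.98):
    """Complementary filter for one rotation axis: integrate the gyro for
    fast updates, and blend in the (slower, drift-free) depth-tracking
    estimate when one is available. If tracking is lost (depth_yaw is
    None), run on the IMU alone and accept slow drift."""
    predicted = prev_yaw + imu_yaw_rate * dt   # fast gyro integration
    if depth_yaw is None:                      # tracking lost: IMU only
        return predicted
    return alpha * predicted + (1 - alpha) * depth_yaw

# Tracking lost: pure gyro integration from 5 deg at 10 deg/s over 0.5 s.
print(fuse_yaw(10.0, 0.5, None, 5.0))  # -> 10.0
```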

Occasionally, with big changes/occlusions in the surroundings or after lost tracking, it would be slightly jittery for a second; after that it would be rock solid again and remain so.

The “Project X-Ray” demo (which unfortunately wasn’t available for us to try) also tracks the forearm and hand to attach a virtual gun, so quite possibly the Kinect’s body-tracking functionality can be used.

Occluding virtual/real world:

It appears the depth sensors build a 3D mesh of the surroundings upon initialization, and continuously add to this when new areas become visible (for example, when looking around).

This 3D mesh is used for occlusion, for example if a person is present during this initialization virtual objects can be placed in front or behind the person.
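Conceptually this is a per-pixel depth test: a virtual pixel is drawn only where it is closer to the viewer than the real-world surface recorded in the mesh; everywhere else the real world shows through. A toy one-dimensional scanline version of that idea (purely illustrative, not how HoloLens renders):

```python
def composite(real_depths, virtual_depths, virtual_colors, background="."):
    """Per-pixel occlusion test along one scanline: keep a virtual pixel
    only where it is closer than the real-world depth at that pixel;
    elsewhere the real world (drawn as the background character) wins."""
    out = []
    for rd, vd, c in zip(real_depths, virtual_depths, virtual_colors):
        out.append(c if vd < rd else background)
    return "".join(out)

# Hypothetical scanline: a real wall at 2.0 m, a hologram at 1.5 m whose
# middle section dips behind the wall (3.0 m) and gets occluded.
real    = [2.0, 2.0, 2.0, 2.0]
virtual = [1.5, 3.0, 3.0, 1.5]
print(composite(real, virtual, "HHHH"))  # -> "H..H"
```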

If the person moves this would break the illusion as the occluder would still remain the same as initialized.

It was unclear if this occluder geometry is updated over time when running the demo for longer periods.

I believe the Kinect Fusion algorithms can adjust for objects that have disappeared after initialization.

Note that algorithm settings may have been primarily tuned for lower computational needs and battery consumption and this may even vary between applications.

With 5 people in a living-room-sized area (and 5 more such groups in the bigger open-space room), I’m still amazed tracking was so robust.

I can only imagine what the full room would look like in infrared, knowing how much active IR projection a depth sensor does.

Hand controls:

The primary interaction with the device is through hand gestures.

I’m unsure if a controller can be paired with the device at this point.

The main gesture is an “Air Tap” gesture, simply tapping your index finger and thumb together, which acts like a mouse click.

Click and drag functionality exists.

Another gesture we used was making a fist with the palm facing upwards and then opening the fingers, used to go back to the main menu.
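A detector for a tap-like gesture can be as simple as thresholding the thumb-to-index distance, with a bit of hysteresis so the state doesn’t flicker at the boundary. The sketch below is purely illustrative of that idea (the thresholds and the per-frame distance input are invented, not how HoloLens actually implements it):

```python
def detect_air_taps(distances_mm, press_mm=15.0, release_mm=25.0):
    """Hypothetical air-tap detector over a stream of per-frame
    thumb-to-index distances: register a 'press' when the fingers close
    below press_mm, and emit one completed tap (frame index) when they
    reopen past release_mm. The gap between the two thresholds is the
    hysteresis that prevents flicker near the boundary."""
    taps = []
    pressed = False
    for i, d in enumerate(distances_mm):
        if not pressed and d < press_mm:
            pressed = True                 # fingers closed: press begins
        elif pressed and d > release_mm:
            pressed = False
            taps.append(i)                 # fingers reopened: one tap done
    return taps

# Fingers close at frame 2 and reopen at frame 4: one tap detected.
print(detect_air_taps([40, 30, 12, 10, 30, 40]))  # -> [4]
```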

There may be other gestures that simply weren’t used for our particular demo applications.

Gesture detection seemed to work across a wide variety of body poses: I deliberately tried triggering it with my hand extended to the front, across the body, to the side, etc., and detection kept working.

I did miss some finer control like buttons or thumbsticks; it may be interesting to try and pair a controller in the future (like the ones from HTC or Oculus, for example).

Voice control:

The device has built-in microphones and can be controlled by voice commands, just like the Xbox One console with Kinect, for example.

Personally I didn’t try this in depth but heard from others this didn’t always work 100%, which may have been due to the noise in the room (with 30+ people talking).

Software:

As mentioned in earlier Microsoft presentations the device runs a version of Windows 10 (don’t expect a regular desktop though, as this doesn’t make sense).

There were several familiar 2D applications (like Photos and the Edge browser), these could be pinned anywhere in the world and would remain anchored to that spot.

One of the engineers loaded some of the demos onto my personal HoloLens device from a desktop machine over a USB cable; this took a few seconds to copy.

Many of the demos we saw appeared to have been built using the Unity game engine.

I got several confirmations that Unity is a strong focus of integration.

There may be other ways to deploy Universal Apps and/or scenes from other game engines like Unreal Engine 4 to the device, but I can’t confirm this.

Overall experience: