How close are we to holographic video for virtual reality?

In a previous article, I argued that a light field camera system would be ideal for VR, because it allows for uncompromised experiences, including positional tracking. Six months on, how close are we to a holographic system? (Six months doesn’t seem like a long time, but there has been a lot of movement in the space… also, I’m excited and impatient.)

Well… there have been some new camera systems — Ozo, Jump, Jaunt’s One, among others — but every VR camera system I’ve seen is substantially the same: a big ball o’ cameras (BBOC). Nothing remotely holographic.

But wait, you say, there are ‘light field’ cameras, right? Haven’t we heard announcements, press releases, etc? Where’s this holographic future we talked about, if we’re already getting light-field baby steps? It’s complicated. Let’s try to simplify.

[Note: This is a pretty long and in-depth article, and it gets fairly dense at times. The TL;DR version: Current cameras aren’t really capturing light fields, and we won’t get ‘holographic’ playback until we have many, many more cameras in the arrays. But it’s not hopeless, and building on current systems will eventually give us great VR, AR, and even footage for holographic displays. Back to the regular programming….]

A rose by any other… Oh wait, that’s not even a rose

I think the term ‘light field’ has gotten muddled. I’d like to avoid doing the same thing to the term ‘holographic’, so to be clear: I’m using it somewhat metaphorically. In fact, I asked Linda Law, a holographer with decades of experience, about the possibility of holographic light field capture. She laid it out in pretty stark terms: we need a lot, lot more cameras.

“The amount of data that goes into an analog hologram is huge. It’s a vast amount of data, way more than we can do digitally,” Law says. “That has to be sampled and compressed down, and that’s the limitation of digital holography right now, that we have to do so much compression of it.” (For those interested in learning more, her site and upcoming course cover the history and future practice of holography.)

But how much data does a hologram require, exactly? It depends on the size and resolution of the hologram… but a rough approximation: for a 2-square-meter surface, you would need about 500 gigapixels (or ‘gigarays’) of raw light field data, taking up about 1.5 terabytes per frame at 3 bytes per ray. At 60 frames per second (you wanted light field video, right?) we’re talking about more than 300 petabytes/hour: 500 billion rays * 3 bytes/ray for 8-bit color * 60 frames/second * 3600 seconds/hour comes to roughly 324 petabytes… which equals a whole lotta hard drives.
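For anyone who wants to check that multiplication, here is a minimal Python sketch. The inputs are the rough figures quoted above; the constant and function names are mine, and I’m assuming decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes) and no compression at all.

```python
# Back-of-the-envelope data rate for raw light field video.
# All inputs are the article's rough figures, not measurements.
RAYS_PER_FRAME = 500 * 10**9   # ~500 gigarays for a 2-square-meter surface
BYTES_PER_RAY = 3              # 8-bit color, three channels
FRAMES_PER_SECOND = 60
SECONDS_PER_HOUR = 3600

TERABYTE = 10**12              # assumption: decimal units
PETABYTE = 10**15


def bytes_per_frame(rays=RAYS_PER_FRAME, bytes_per_ray=BYTES_PER_RAY):
    """Uncompressed size of a single light field frame, in bytes."""
    return rays * bytes_per_ray


def bytes_per_hour(rays=RAYS_PER_FRAME, bytes_per_ray=BYTES_PER_RAY,
                   fps=FRAMES_PER_SECOND):
    """Uncompressed size of one hour of light field video, in bytes."""
    return bytes_per_frame(rays, bytes_per_ray) * fps * SECONDS_PER_HOUR


frame_tb = bytes_per_frame() / TERABYTE
hour_pb = bytes_per_hour() / PETABYTE
print(f"{frame_tb:.1f} TB per frame, {hour_pb:.0f} PB per hour")
```

With these inputs it works out to 1.5 TB per frame and about 324 PB per hour. Changing `rays` or `fps` shows how quickly the total moves: halving the frame rate only gets you down to about 162 PB per hour.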

Now, an important aside about holography (note the lack of quotes): a proper hologram doesn’t require a headset, because it’s literally recreating light fields. Everything else we’re talking about is trying to digitally simulate them, or use data as raw material to synthesize views. (Think about that for a moment: your credit cards all have little stickers on them that literally create light fields. The world is an awesome place.)

It seems like a fully ‘holographic’ system is a long ways off. But how close are we in human, perceptual terms? In other words, can we get close with a BBOC, and what would that look like? Let’s try to develop an intuition.

Review: what are light fields?

A ‘light field’ is really just all the light that passes through an area or volume. (The nerdier term for it is ‘plenoptic function,’ but don’t throw that around at cocktail parties unless you like getting wedgies.) Physicists have been talking about light fields since at least 1846, but Lytro really popularized the idea by developing the world’s first consumer light field camera back in 2012. So it’s an old idea that’s only recently relevant.

And that idea is actually pretty simple, once you really understand it. It turns out you’re surrounded at all times by a huge field of light, innumerable photons zipping around in all directions. Like the Force, you’re constantly surrounded by light fields, and they pervade everything. (Except there’s no ‘dark side’ to the plenoptic function: no photons, no light field. Sorry, Vader.)

A light field is technically five-dimensional: the three spatial dimensions (say, x, y, z), plus two angular dimensions (phi, theta). For any point in space (x1, y1, z1), there are light rays zipping through it in all directions (every combination of phi and theta). And as long as the light rays are moving through empty space, they move in straight lines and carry the same light the whole way, so one of the three spatial dimensions is redundant: it’s enough to record the rays crossing a 2D surface instead of sampling a whole 3D volume. So holographic light fields are 2D spatial + 2D angular = 4D total.
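To make the 5D-to-4D reduction a bit more concrete, here is a small Python sketch. It uses the two-plane parameterization, which is one common way to index rays in free space; that choice is my assumption, not something the cameras discussed here necessarily use, and the function names and plane positions are made up for illustration. A ray is identified by where it crosses two parallel planes, giving four numbers (u, v, s, t). Two different 5D samples that lie on the same ray collapse to the same four numbers.

```python
import math


def direction_from_angles(theta, phi):
    """Unit direction from a polar angle theta (measured from +z) and azimuth phi."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def ray_to_4d(origin, direction, z_uv=0.0, z_st=1.0):
    """Map a 5D sample (3D point + direction) to 4D coordinates (u, v, s, t).

    (u, v) is where the ray crosses the plane z = z_uv and (s, t) is where
    it crosses z = z_st. The plane positions are arbitrary, hypothetical choices.
    """
    ox, oy, oz = origin
    dx, dy, dz = direction
    if dz == 0:
        raise ValueError("ray is parallel to the planes and never crosses them")
    a = (z_uv - oz) / dz   # distance along the ray to the first plane
    b = (z_st - oz) / dz   # distance along the ray to the second plane
    return (ox + a * dx, oy + a * dy, ox + b * dx, oy + b * dy)


# Two different points in space, both on the same ray.
d = direction_from_angles(math.radians(20), math.radians(35))
p1 = (0.3, -0.2, 0.5)
p2 = tuple(p + 2.0 * c for p, c in zip(p1, d))  # 2 units further along the ray

coords1 = ray_to_4d(p1, d)
coords2 = ray_to_4d(p2, d)
same_ray = all(math.isclose(x, y, abs_tol=1e-9) for x, y in zip(coords1, coords2))
print(same_ray)
```

The point of the demo is that `p1` and `p2` differ in all three spatial coordinates, yet they index the same entry in the 4D light field. The sketch only holds in empty space: once something blocks or scatters the ray between the two points, the extra dimension matters again.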

Here’s the less-technical, VR-specific version of it: a light field camera would capture a window into a VR world. The camera defines the outside of the ‘bounding box’ that a VR viewer can move around inside while having an uncompromised, 360 degree stereo view. I experienced light field rendering and video at Otoy; my writeup is here. (It was awesome.)

So, if you can capture and recreate a light field with high enough fidelity, the result is ‘holographic.’ And that’s useful if you’re operating in VR, because you can allow a VR viewer to have positional tracking — to move his head side to side, forward and back, look straight up or down or even tilt his head without the illusion (or stereo view) breaking. Even for seated VR, that amount of freedom helps with immersion: any time you move around without positional tracking, the entire virtual universe is glued to your head and moves around with you. Thanks to Otoy, I know this isn’t theoretical; the positionally-tracked experience is truly better, and I highly recommend it. Even seated in an office chair, the difference was obvious… palpable.

Let me tell you about our additional cat-skinning technologies

So we already established that the data requirements for fully holographic capture are completely bonkers. This is why BBOCs are doing something a little… different. Once you’ve got cameras pointed outward in all directions, you’ve got seams and stitching issues to contend with. But assuming you crack that, viewers also want positional tracking (and stereo).

(Cruel trick: if you’re demoing VR video content without positional tracking to someone, have them jump forward while standing. Their rapid acceleration, and the fact that the world stays stationary, is extremely weird… on second thought, never do this.)

The Otoy demo shows how much better VR can be with positional tracking — so does CG content played back in a game engine. It’s only live action VR that has this inherent limitation. So it’s a really, really important selling point for content, to be at least as good as computer-generated content. It’s also why NextVR, Jump, and presumably Jaunt are all doing 3D modeling and painting the resulting models with pixel data: to get you some amount of ‘look around’, so their footage isn’t obviously inferior to CG content. This is… maybe a half-measure toward holographic imaging. Okay, not quite half — more like 0.2%, actually. Let me explain more… after the break. [Record scratch.]