With Google Glass just around the corner, augmented reality is finally becoming a realistic part of our foreseeable future. The idea of AR, that we could dynamically edit information into (or out of) a live video feed as it’s being displayed, has never been held back by software or hardware power, but by the simple fact that the concept requires both a camera and a display. Even the tech industry, with its almost religious belief in the unlimited potential of early adoption, saw that the world would never accept a lifestyle upgrade that requires people to keep an iPad held in front of their faces. It is progress on the usability side that has brought real potential to the technology’s dreams of consumer relevance, and now it’s finally time for software developers to take advantage of that potential.

MIT Media Lab has released a video to go along with its recent publications detailing a new AR interface technology that it claims could put “a user interface on any surface.” In the video below, we see the researchers control an MP3 player and a set of speakers using an interface dynamically layered atop an iPad’s video feed.

The demonstration still relies on ye olde augmented reality techniques, however, requiring us to hold our iPads out like portable X-ray viewers. From a practical standpoint, this makes the technology fundamentally a gimmick. Most people already get annoyed at having to touch a tiny button on a 2D interface, so it’s simply impossible to imagine large-scale adoption of a solution that forces us to properly align two devices in 3D space just to change the speaker volume. Taken purely as a display of software power, however, and with a faithful belief that this code can be migrated from iOS onto more realistic platforms, the idea begins to show promise.

The technology works by wirelessly networking devices through a simple home router, and running software that keeps the various connected objects in some level of spatial “awareness” of one another. This means that to support an interface, an object must be fitted with a small computer chip and a wireless card, so don’t expect to be using your phone to pop up the lever on your toaster any time soon. Presumably, the idea is that more and more home and public devices will ship with such controllers already built in.

What we need is an end-user solution like Google Glass but with a much larger footprint in the field of view, probably something retractable. It would also need to be much better at interpreting its wearer’s hand waving than current Kinect-like software solutions. Or would its makers take a page from Sony’s Move and have us wearing colored thimbles? Regardless, sufficiently widespread adoption of such a Glass 2.0 solution, and of the object-end controllers seen in this demonstration, could lead to some truly interesting new ways of dealing with technology. What if your buddy’s stereo used precisely the same button layout as your own, and at the same time, he could use his preferred control scheme to fiddle with yours? What if your passwords, or your bank PIN, never actually appeared on-screen, even as stars? What if we could make virtually every device cheaper by replacing buttons with easily machine-read visual patterns, literally painting buttons onto devices?

It’s worth noting the distinction between this technology and another form of augmented reality interface, the projected interface. Projected interfaces have actually been on the market for a while (reports vary on their usefulness), but recent developments seem to have moved the technology forward immensely. MIT Media Lab itself has waded into the fray with its LuminAR project, but competitors like Microsoft are just as keen to grab this tiger by the tail. Since a projected interface gives up the privacy, the separate and simultaneous use, and the other advantages inherent to the premise of MIT’s AR interface, the two burgeoning technologies do not seem to be in particularly direct competition.

The interface-everywhere zeitgeist highlights the increasingly conflicted relationship between display and viewer: do we want greater usability and convenience, or do we want greater resolution and picture fidelity? As relatively low-fi displays like e-ink gain traction in everyday life, the role of the monitor will look more and more like that of the television. Why consume Facebook the same way as Game of Thrones? Does a wall post require such detail? And if a low-res display clamped against your temple can put a friend’s latest tweet next to their face as you speak to them, we might begin to wonder why we ever believed that a huge desktop screen was a good way to handle our ever-growing pile of digital chores in the first place.

The technology is clearly still in its infancy: yet another solution designed for a world that doesn’t exist yet. It’s another chicken-and-egg conundrum. Why should a bar pay to put a controller in its jukebox when so few customers could use it, and why would we buy an AR interface like Google Glass without a wealth of controllers already in place to use? The only future I see for AR interfaces springs from the widespread adoption of Google Glass and its competitors. The success of this sort of tech will likely hinge on whether we can put a display in front of a majority of consumer eyeballs.

If we can do that, Minority Report ought not to be far behind.
