Is Google Glass an Augmented Reality Device?
No. But it’s close. (See the bottom of this post for a little addendum.)
Augmented Reality, as a field, has been threatened with co-option of its name in the past. There may even have been some angst a few years ago about using the term to describe what most people call Mobile Augmented Reality on smartphones.
AR is, in its true form and ideal implementation, the seamless visual fusion of virtual objects and data with the real environment, by way of overlay through optics that can simulate all of the visual characteristics by which we perceive physical objects in the real world. That’s the ideal. It’s okay to refer to less-than-ideal analogues as AR devices because that ideal doesn’t exist yet. But they’re stand-ins until the necessary hardware exists.
Is a mobile phone a legitimate Augmented Reality device? Yes. One looks through the “magic window” of the screen and that becomes the user’s active Field of View (FOV). It lacks depth and the full FOV of the human eye, but one can hold it directly between one’s eyes and the subject area at which one is looking. Something like the Nintendo 3DS goes one better, since it adds stereoscopy to the experience, but it’s still far from ideal.
I know I’m belaboring this point, but for those who have tried the Oculus Rift, let me make an analogy: Imagine that you’re really standing in the yard of the villa that inspired the Rift’s Tuscany demo scene, but with no fountain there. You’re wearing VR goggles with a selectively transparent display element and lenses that don’t distort anything that’s behind that see-through display. When you aren’t looking at the fountain’s position, the goggles don’t display any of the scene, but the computer to which they are attached has the scene’s model geometry in its memory, and the model is an accurate one-to-one representation of the real space. So here’s what we’re using:
- GPS (for initial rough positioning, so the system knows that the villa model and accompanying data are what it should be using)
- Head-tracking data from sensors like the MPU-6050 found in the Rift, or the MPU-9150 in Glass (it’s the same chip with the addition of a third-party magnetometer built into the package… incidentally a chip for which I wrote a sloppy but ground-breaking hack). This is mostly to make the next step easier. Because inertial and magnetic sensors are inevitably subject to at least some error (accumulated integration error for the inertial sensors, magnetic field distortions for the magnetometer), especially when trying to measure linear translation as opposed to orientation, this is not really how you want to determine where the user is looking. But having a good guess reduces, by an order of magnitude, the number of possible perspectives against which you need to try to match data from the visual sensors (a rough sketch of that pruning step follows this list).
- Visual data from cameras (stereo cameras, depth cameras using code like Kinect Fusion, single cameras with really slick SLAM algorithms like PTAMM or 13th Lab’s PointCloud™ SDK… interpreted by the CPU, or a dedicated vision processor… whatever… doesn’t matter) to precisely register the position of the virtual field of view with what’s actually in front of you.
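To make that "reduces by an order of magnitude" point concrete, here is a minimal sketch of using the inertial/magnetic orientation estimate and a rough GPS fix as a prior to prune the candidate camera poses before the expensive visual matching runs. The function names, thresholds, and data layout are all mine, purely for illustration; nothing here is from Glass, the Rift, or any particular SLAM library.

```python
import numpy as np

def angular_difference(R_a, R_b):
    """Angle (radians) between two rotation matrices."""
    R_rel = R_a.T @ R_b
    # The trace of a rotation matrix is 1 + 2*cos(theta).
    cos_theta = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    return np.arccos(cos_theta)

def prune_candidate_poses(candidates, imu_rotation, gps_position,
                          max_angle_rad=0.35, max_dist_m=10.0):
    """Keep only stored camera poses close to the IMU/GPS prior.

    candidates: list of (rotation_matrix, position_vector) pairs, e.g. the
    keyframe poses in a prebuilt map of the villa. Returns the subset that
    is worth running expensive feature matching against.
    """
    kept = []
    for R, t in candidates:
        if (angular_difference(R, imu_rotation) <= max_angle_rad and
                np.linalg.norm(t - gps_position) <= max_dist_m):
            kept.append((R, t))
    return kept

# Example: identity orientation, two candidates, one facing the other way.
imu_R = np.eye(3)
gps_p = np.zeros(3)
flipped = np.diag([-1.0, 1.0, -1.0])  # 180 degrees about the y axis
candidates = [(np.eye(3), np.array([1.0, 0.0, 2.0])),
              (flipped, np.array([0.5, 0.0, 1.0]))]
print(len(prune_candidate_poses(candidates, imu_R, gps_p)))  # -> 1
```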
So the system knows where you are, the direction in which you’re looking, and precisely what your field of view is. The optics have the capability of displaying virtual objects with a real sense of depth, like the Rift, but don’t block out your view of reality except where displaying virtual objects. You look where the fountain should be, the system draws it with the correct focal depth where it should be, and boom, your perception of the reality that is the yard around you has been augmented with a virtual fountain that looks like it’s really there… until somebody walks between you and said fountain. The system needs to be capable of perceiving that an object has passed into the portion of your FOV where the fountain exists, and that that object exists at a closer depth than the one to which the fountain is registered. With that data, it needs to apply a stereoscopic occlusion mask over the fountain, in the shape of the outer contours of the occluding object. Now the person between you and the fountain is visible through the person-shaped hole punched in the rendering of the fountain. Because the focal depth of the remaining visible portion of the fountain is correct, and your occlusion mask is perfect, the person appears to walk in front of the fountain. Oh yeah, don’t forget to make the lighting of the fountain match the lighting of the real place. And also don’t forget to capture the shadow of that person occluding it and remap it if it would fall on the fountain. And if that other person is wearing the same AR system as you and is tuned to the same channel… your system had better show you the virtual splash when they throw that virtual rock into it. Never mind the reflection of the scene in the water 0_0.
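Here is a toy version of that occlusion-masking step, assuming you already have a per-pixel depth map of the real scene (from the depth camera or stereo pair) and the depth buffer from rendering the fountain; the array names and the RGBA layout are my own assumptions, and a real system would do this per eye, on the GPU.

```python
import numpy as np

def occlusion_mask(real_depth, virtual_depth, virtual_rgba):
    """Hide virtual pixels that lie behind real geometry.

    real_depth:    HxW depth map of the real scene (meters), np.inf where
                   there is no reading.
    virtual_depth: HxW depth buffer of the rendered virtual object (meters),
                   np.inf where nothing virtual was drawn.
    virtual_rgba:  HxWx4 rendered image of the virtual object.
    Returns the rendered image with alpha zeroed wherever a real object
    (the person walking by) is closer than the virtual fountain.
    """
    occluded = real_depth < virtual_depth      # person in front of fountain
    out = virtual_rgba.copy()
    out[occluded, 3] = 0                       # punch the person-shaped hole
    return out

# Tiny demo: a 2x2 frame where the fountain sits 5 m away and a person
# steps in front of the left column at 2 m.
real = np.array([[2.0, np.inf],
                 [2.0, np.inf]])
virt = np.full((2, 2), 5.0)
rgba = np.ones((2, 2, 4))
print(occlusion_mask(real, virt, rgba)[..., 3])  # left column alpha -> 0
```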
Anyhow…
Proceed to populate your virtual environment with virtual objects. Don data-gloves or spatially tracked controllers, and whatever haptic feedback systems you have access to, and reach out and interact with those virtual objects. Or use a gestural mouse for a less seamless experience. Or use that depth camera on your head and be content to limit your interactions to those where your hands are visible to it.
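A correspondingly crude sketch of that interaction side, assuming the glove or depth camera gives you a single tracked fingertip point and the virtual object exposes a bounding sphere (the function and the numbers are invented for illustration):

```python
import numpy as np

def hand_touches_object(hand_position, object_center, object_radius):
    """True if the tracked hand point is inside the object's bounding sphere."""
    return np.linalg.norm(np.asarray(hand_position) -
                          np.asarray(object_center)) <= object_radius

# e.g. a fingertip reported by a data glove or head-mounted depth camera,
# tested against a virtual rock sitting at the fountain's rim
print(hand_touches_object([0.4, 1.1, 0.8], [0.5, 1.0, 0.9], 0.25))  # -> True
```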
And THAT is Augmented Reality. And it sure as hell ain’t easy.
So, back to our original question: Is Google Glass an Augmented Reality device?
Well, what’ve we got? We’ve got GPS. We’ve got an inertial measurement unit with a magnetometer. We’ve got a camera and a host processor capable of running some SLAM analysis on what we’re seeing. We have network connectivity with which to reference an online database of virtual objects and their precise coordinates in the real world. What we don’t have is the display. Google Glass is as much an Augmented Reality device as your phone is… IF you take your phone and hold it out, up, and to the right of your head and then glance over at it to see virtual objects overlaid on the 2D image of what is actually right in front of you. Or Google Glass is as much an Augmented Reality device as the GPS display you have suction-cupped to the windshield of your car beneath your rear-view mirror (which isn’t an Augmented Reality device). It’s not even close to being as much of one as the badass HUDs that are projected onto the windshields in front of the drivers of some newer vehicles. I wouldn’t say that that’s really Augmented Reality, but at least it’s a real see-through overlay. I would say that the automotive HUD exhibited by MVS California a couple of years ago IS a real Augmented Reality system.
So no, Google Glass is not an Augmented Reality device. But a lot of the ingredients are there, and there will be lots of apps that can display useful contextual data up and off to the side of what you’re looking at. But that isn’t Augmented Reality. It’s something useful, it’s something in the same family, and it’s something that should be of interest to everybody who is interested in Augmented Reality, but it isn’t Augmented Reality. Some people think that Glass is a bad thing because the current focus is on the capture of images and video using the onboard camera, and that that is going to creep out the public and give a bad name to head-worn computers. I’m hoping that that focus will have evolved by the time the consumer version launches. Where I think Glass is of great importance to Augmented Reality is that it is set to be the first mass-produced consumer electronics device that places all of the necessary non-display components of a basic AR headset on people’s heads. The only thing missing is the correct display modality.
Addendum:
So I just had a conversation with Steve Feiner while on a conference call to prepare for a panel that will include both of us at Augmented World Expo in Santa Clara next month. He made the argument that with a rooted device (not limited to the Mirror API), the eyepiece slightly repositioned, and the addition of a bigger battery (no problem; I sometimes carry an 18Ah backup battery with me anyhow), then sure, the Glass hardware could be used as a legit AR device. Stereoscopy is not a prerequisite for AR. But geometric registration of graphics with the scene is a prerequisite. So, arguably, the Glass hardware is capable of being used as an AR device… just not a very good one. So really, you shouldn’t want Glass to be used as an AR device. But it will be a great contextual data display. And I suspect that the supported programmability of Glass will grow far beyond the Mirror API in short order. Keep in mind that there was no App Store on the first iPhone for a long time, and that developers were limited to web apps. I think that this is just Google’s attempt to curate and guide the experience for users and developers who aren’t hardware, interface, and kernel experimenters. It is a technology that will augment the human experience, but not with Augmented Reality. Maybe that will come with Glass Mk II, or from another company in the meantime. We’ll see.