Spatial Tracking for Handhelds

Overview

Why Spatial Displays? The Superiority of Dynamic Peepholes
Overview of Low Cost Tracking Alternatives
Internal Non-Visual Sensory Fusion
Internal Non-Visual Sensory Fusion - LookingAtYou
Internal Visual Localization and Mapping - SLAM and Tango
External Depth Image Based Tracking

Why Spatial Displays? The Superiority of Dynamic Peepholes

In our CHI '14 paper, we described a user study comparing spatial navigation using the peephole metaphor with multitouch: “The results surpassed our expectations in various ways. On average, participants were more than 35% faster with the spatial approach, even though all of them were conversant with Pinch-Drag-Flick and used the spatial technique for the first time. This finding was further supported by the questionnaires, where participants rated the spatial approach at least as good as or even better than the touch-based counterpart.”

Overview of Low Cost Tracking Alternatives

Ever since we showed our interactivity at CHI '14 and at the IML Dresden Open Lab Days after ITS 2014, curious visitors of all kinds have looked at our OptiTrack system (priced at more than €10,000) and wondered: “Is that really necessary?”

I have been pointed to various projects that people believed could replace the OptiTrack system. So I decided to write this article to give an overview of those suggestions and my thoughts on them.

Internal Non-Visual Sensory Fusion

One of our reviewers pointed out that the Samsung S3 had a tilt-to-zoom feature using the gyroscope. But if you think about it, we need 6 DoF tracking and this is only 1 DoF. However, if you have the acceleration and orientation of a device, you can theoretically determine its position over time by integrating the acceleration twice.

This Google Tech Talk (start at 23:08) on fusing data from the built-in sensors and integrating accelerometer data will introduce you to the main problems you face when integrating noisy data. I am not aware of anyone who has solved this yet.
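To get a feeling for why integrating noisy accelerometer data is so hard, here is a minimal sketch in Python. It is not taken from the talk; the function names, the sampling rate, and the bias and noise values are all assumptions I picked for illustration. The simulated device lies perfectly still, so any position the integration reports is pure error.

```python
import random


def integrate_position(accel_samples, dt):
    """Integrate acceleration twice (simple Euler steps) to get a 1D position."""
    velocity = 0.0
    position = 0.0
    for a in accel_samples:
        velocity += a * dt         # first integration: acceleration -> velocity
        position += velocity * dt  # second integration: velocity -> position
    return position


def simulate_drift(duration_s, rate_hz=100, bias=0.05, noise_std=0.1, seed=0):
    """Position error (in meters) for a device that never moves.

    The true acceleration is zero, so the sensor reports only a constant
    bias plus Gaussian noise (both in m/s^2). All values are hypothetical.
    """
    rng = random.Random(seed)
    dt = 1.0 / rate_hz
    n = int(duration_s * rate_hz)
    samples = [bias + rng.gauss(0.0, noise_std) for _ in range(n)]
    return integrate_position(samples, dt)


if __name__ == "__main__":
    for seconds in (1, 5, 10):
        print(f"{seconds:>2} s at rest -> drift of {simulate_drift(seconds):.3f} m")
```

Because the error passes through two integrations, a constant bias grows with the square of time: with the assumed bias of 0.05 m/s², the bias alone accounts for roughly 2.5 m of drift after ten seconds. That is far beyond what a peephole display could tolerate.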

Internal Non-Visual Sensory Fusion - LookingAtYou

Microsoft Research got very close to a solution. They built an image viewing application that adds face tracking as an additional input channel for sensory fusion. Still, looking at the video, we can identify two main problems: inaccuracy and latency. These two issues will influence the user experience tremendously. In other words: you lose that 35% speed advantage when you use any of the above solutions.

Internal Visual Localization and Mapping - SLAM and Tango

I believe that mobile 6DoF tracking will originate from computer vision research, more precisely the kind of CV research that is driven by the billion-dollar research area of robotics. If you think about it, Spatial Displays are by far not the only devices that need to be aware of their own location and rotation within a reference frame. This is also a very fundamental question in robotics: robots constantly need to know what their environment looks like and where they are located within it. This computational problem is called SLAM (simultaneous localization and mapping). There have been amazing advances in that field, otherwise we would not have seen self-driving cars.

The LSD-SLAM algorithm developed by the Computer Vision Group at TUM (thanks to Robert for the hint) even solves the SLAM problem using a single camera and runs on a smartphone. However, I wonder how its accuracy compares to that of an OptiTrack system (send me an email if you know).

However, the following project is by far the most promising: Project Tango by ATAP (Google). Since experts in computer vision and robotics are working on it, I am really excited about this project. And yet again, I wonder what the accuracy of the spatial localization will be like. I am convinced it takes quite a high accuracy to make the dynamic peephole metaphor work. It would be interesting to find out exactly how much, wouldn’t it?

Update 5th December: I’ve been whitelisted for the Tango Dev Kit! Hopefully I get to try it soon!

External Depth Image Based Tracking

For a long time we had to rely on external tracking systems for reliable spatial tracking. Now, with Tango, it seems that spatially aware mobile devices will become available to the masses in the near future. Still, it will take some time until those devices gain a noticeable share of the market, and it remains unclear how aware these devices will be of one another. This mutual awareness is mandatory for mapping multiple Spatial Displays to the same information space. So in the meantime, if you need a less costly OptiTrack alternative, I suggest you look at stationary depth cameras for tracking. In my previous lab in Magdeburg, Spindler et al. successfully used a Kinect to track the 3D location of a spatial display (see journal article as PDF). Another, more openly available solution is HuddleLamp.

However, HuddleLamp currently only provides 2D tracking. Maybe someone with a background in computer vision is able to fork HuddleLamp on GitHub and extend it to 6DoF tracking. Since gyroscopes are available in most handheld devices, the only unknown dimension left is the height of the device above the ground plane. However, if we remember the apparent shape of the device on the ground plane, we can calculate its height from the change in the size of that shape, taking into account the rotation already known from the gyroscope.
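The height idea can be sketched in a few lines of Python. This is only my back-of-the-envelope version, not part of HuddleLamp: the function name and parameters are hypothetical, and I assume an overhead camera that looks straight down, behaves like a pinhole camera, and hangs at a known height above the ground plane. Under those assumptions the apparent length of a device edge is inversely proportional to its distance from the camera, and a tilt shortens the edge by roughly the cosine of the tilt angle.

```python
import math


def estimate_height(ref_length_px, observed_length_px, camera_height_m, tilt_rad=0.0):
    """Estimate how far a device has been lifted above the ground plane.

    ref_length_px:      apparent length of a device edge while it lay flat
                        on the ground plane (the remembered shape)
    observed_length_px: apparent length of the same edge now
    camera_height_m:    camera height above the ground plane (assumed known)
    tilt_rad:           tilt of that edge out of the image plane, as
                        reported by the gyroscope
    """
    if ref_length_px <= 0 or observed_length_px <= 0:
        raise ValueError("lengths must be positive")
    cos_tilt = math.cos(tilt_rad)
    if cos_tilt <= 1e-6:
        raise ValueError("edge is seen edge-on, its length cannot be measured")

    # Undo the foreshortening caused by the tilt (weak-perspective approximation).
    untilted_length_px = observed_length_px / cos_tilt

    # Pinhole model: apparent length ~ 1 / distance, so
    # ref / untilted = (camera_height - height) / camera_height.
    return camera_height_m * (1.0 - ref_length_px / untilted_length_px)
```

For example, if an edge appears twice as long as it did on the ground plane, the device is halfway between the ground plane and the camera. In practice the accuracy would depend heavily on how precisely the tracker can measure the outline of the device, which is exactly the open question.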

Martin Schuessler