Touch

Tapping, dragging, swiping: most users are comfortable with an array of touch gestures. However, traditional Apps have mostly focused on controlling an object in one or two dimensions, e.g. scrolling up/down (Y-axis), swiping left/right (X-axis), or dragging around the screen (X & Y).

In AR Apps, we generally want our users to manipulate objects in three dimensions, but with only two dimensions on the screen, conveying intent is difficult.

To illustrate this point, let’s say our user wants to move a virtual ball projected onto their table, so they drag up 👆 on the screen. Should the ball move vertically up into the sky, or back into the distance?

Is up up or is up back?

The problem is that the simple 2D gesture doesn’t provide enough information to make a precise manipulation in 3D. There are multiple solutions to this problem, and we’ll explore the merits of some below.

Option 1: Reduce the Dimensions

The simplest option is to reduce the number of dimensions being manipulated so that the manipulation can be represented by a lower-dimensional gesture. This sounds more complicated than it is: we simply define a 2D plane/surface along which the object can move, and effectively one direction in which it cannot.

Take a furniture App like Ikea Place, for example. In the App, users place furniture on the floor and can move it across that surface, but cannot change its distance from the floor. The manipulation has been reduced to a 2D plane parallel to the floor.

With movement restricted to a plane, taps and other gestures can now be projected through the camera onto points in 3D space.

Casting a 2D screen point to a 2D plane in 3D space
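To make this concrete, here is a minimal ARKit/SceneKit sketch of the projection. The `selectedNode` and `floorAnchor` properties are placeholders for whatever selection and plane-detection logic your App uses; `unprojectPoint(_:ontoPlane:)` is ARSCNView’s built-in helper for exactly this cast.

```swift
import ARKit
import SceneKit
import UIKit

class DragOnPlaneController: NSObject {
    let sceneView: ARSCNView
    var selectedNode: SCNNode?         // object being dragged (hypothetical)
    var floorAnchor: ARPlaneAnchor?    // detected floor plane (hypothetical)

    init(sceneView: ARSCNView) {
        self.sceneView = sceneView
        super.init()
        let pan = UIPanGestureRecognizer(target: self, action: #selector(handlePan))
        sceneView.addGestureRecognizer(pan)
    }

    @objc func handlePan(_ gesture: UIPanGestureRecognizer) {
        guard let node = selectedNode, let floor = floorAnchor else { return }

        // Cast the 2D screen point through the camera onto the floor plane;
        // unprojectPoint(_:ontoPlane:) returns the intersection in world space.
        let screenPoint = gesture.location(in: sceneView)
        if let worldPosition = sceneView.unprojectPoint(screenPoint,
                                                        ontoPlane: floor.transform) {
            // Movement is constrained to the plane: the object slides across
            // the floor but its height never changes.
            node.simdWorldPosition = worldPosition
        }
    }
}
```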

Key Takeaway: Consider whether your object manipulation actually needs all three dimensions. Most Apps can likely simplify it to two, since in the real world we expect objects to be grounded on flat surfaces by gravity.

Option 2: Multi-touch Gestures

Whilst multi-touch gestures are less familiar to users (the exception being pinch), they do provide a means by which the user can give more information about their intent and thus perform more complex manipulations.

For example, performing a drag with one finger may move an object parallel to the floor, whilst the same gesture with two fingers could be used to move it along the Up-Axis.
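As a rough sketch, a single UIPanGestureRecognizer can branch on the number of touches. The `selectedNode` and `dragScale` values are illustrative assumptions, and the screen-to-world mapping here is deliberately crude; a real App would combine this with the plane projection from Option 1.

```swift
import ARKit
import SceneKit
import UIKit

class MultiTouchMoveController: NSObject {
    let sceneView: ARSCNView
    var selectedNode: SCNNode?       // object being manipulated (hypothetical)
    let dragScale: Float = 0.001     // screen points -> metres; needs tuning

    init(sceneView: ARSCNView) {
        self.sceneView = sceneView
        super.init()
        let pan = UIPanGestureRecognizer(target: self, action: #selector(handlePan))
        pan.maximumNumberOfTouches = 2
        sceneView.addGestureRecognizer(pan)
    }

    @objc func handlePan(_ gesture: UIPanGestureRecognizer) {
        guard let node = selectedNode else { return }
        let translation = gesture.translation(in: sceneView)

        switch gesture.numberOfTouches {
        case 1:
            // One finger: slide parallel to the floor (X/Z plane).
            node.simdWorldPosition.x += Float(translation.x) * dragScale
            node.simdWorldPosition.z += Float(translation.y) * dragScale
        case 2:
            // Two fingers: dragging up/down moves the object along the Up-Axis (Y).
            node.simdWorldPosition.y -= Float(translation.y) * dragScale
        default:
            break
        }
        // Reset so each callback applies only the incremental delta.
        gesture.setTranslation(.zero, in: sceneView)
    }
}
```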

Alternatively, multi-touch gestures can be used to control different types of manipulation. As well as moving a 3D object, the user may want to rotate it. This could again be achieved with a single finger for movement and two fingers for rotation or scaling.

Two finger rotation gesture
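UIKit already ships with two-finger recognizers for these manipulations. Here is a small sketch wiring UIRotationGestureRecognizer to spin an assumed `selectedNode` around the Up-Axis; UIPinchGestureRecognizer could be wired up to scale in the same way.

```swift
import SceneKit
import UIKit

class RotationController: NSObject {
    var selectedNode: SCNNode?   // object being manipulated (hypothetical)

    func attach(to view: SCNView) {
        // UIKit's built-in two-finger rotation recognizer.
        let rotation = UIRotationGestureRecognizer(target: self,
                                                   action: #selector(handleRotation))
        view.addGestureRecognizer(rotation)
    }

    @objc func handleRotation(_ gesture: UIRotationGestureRecognizer) {
        guard let node = selectedNode else { return }
        // Spin around the Up-Axis (Y) by the incremental rotation,
        // then reset so deltas don't accumulate between callbacks.
        node.simdEulerAngles.y -= Float(gesture.rotation)
        gesture.rotation = 0
    }
}
```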

Key Takeaway: Multi-touch gestures can be used to convey additional user intent, but they are not common. If you include them in your App, good onboarding and instructions will be essential.

Option 3: Use a Virtual Gizmo

If complex 3D manipulations are required but multi-touch gestures are too complicated, then additional onscreen controls can be added to let the user toggle the manipulation they want to apply. This approach is used in 3D modeling packages and game engines, where interactive UI components called Transformation Gizmos are attached to the targeted object.

A Transformation Gizmo is a collection of handles that the user can drag to perform a transformation along a specific dimension while locking transformation along the others. For example, pulling up/down on a gizmo’s green arrow will typically only allow the object to move along the Up-Axis.

Different types of gizmos can be used for different types of manipulations. Typically, there will be ones for movement, rotation, and scale, which the user can toggle between.
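A bare-bones sketch of the pattern: hit-test the touch to find which handle was grabbed, then constrain the drag to that handle’s axis. The handle names and the screen-to-axis mapping below are simplified assumptions; a production gizmo would project the drag onto the axis’s on-screen direction.

```swift
import SceneKit
import UIKit

class GizmoController: NSObject {
    let sceneView: SCNView
    var targetNode: SCNNode?         // object the gizmo is attached to (hypothetical)
    var activeAxis: simd_float3?     // axis of the handle currently grabbed
    let dragScale: Float = 0.001     // screen points -> metres; needs tuning

    init(sceneView: SCNView) {
        self.sceneView = sceneView
        super.init()
        let pan = UIPanGestureRecognizer(target: self, action: #selector(handlePan))
        sceneView.addGestureRecognizer(pan)
    }

    @objc func handlePan(_ gesture: UIPanGestureRecognizer) {
        guard let node = targetNode else { return }

        if gesture.state == .began {
            // Hit-test to find which handle was grabbed. Assumes the gizmo's
            // arrow nodes were named "gizmo.x", "gizmo.y", "gizmo.z" when built.
            let point = gesture.location(in: sceneView)
            let hitNode = sceneView.hitTest(point, options: nil).first?.node
            switch hitNode?.name ?? "" {
            case "gizmo.x": activeAxis = simd_float3(1, 0, 0)
            case "gizmo.y": activeAxis = simd_float3(0, 1, 0)
            case "gizmo.z": activeAxis = simd_float3(0, 0, 1)
            default:        activeAxis = nil
            }
        }

        guard let axis = activeAxis else { return }
        // Constrain the drag to the chosen axis; the other two stay locked.
        // (Crude screen-to-axis mapping, for illustration only.)
        let translation = gesture.translation(in: sceneView)
        let amount = Float(translation.x - translation.y) * dragScale
        node.simdWorldPosition += axis * amount
        gesture.setTranslation(.zero, in: sceneView)

        if gesture.state == .ended { activeAxis = nil }
    }
}
```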

Key Takeaway: Gizmos are the standard approach when precision is vital. If your App requires intricate 3D manipulation, this is probably the right choice.

Option 4: Use a Real Gizmo

Head-mounted Displays for VR/AR often come with a controller that tracks 3D movement and rotation, and so can be used to perform 3D manipulations directly. Whilst iPhones, Pixels and Galaxys don’t have external controllers, the devices themselves track their own position and rotation, and can be used in a similar way.

Let’s say our user wants to manipulate a 3D object. They could tap to select it, binding its position and rotation to those of the device. When the user then moves or rotates the device, the object moves and rotates with it. Once happy, the user taps again to deselect, releasing the object at its new fixed location.

Lock and move
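In SceneKit terms, a minimal sketch of this lock-and-move behaviour: on selection, cache the object’s transform relative to the camera, then reapply it every frame. The tap handling and `selectedNode` wiring are assumed, and the controller must be set as the view’s delegate for the render-loop callback to fire.

```swift
import ARKit
import SceneKit

class LockAndMoveController: NSObject, ARSCNViewDelegate {
    var selectedNode: SCNNode?               // object being carried (hypothetical)
    var relativeTransform: simd_float4x4?    // object's pose in camera space

    // Called on tap: toggle between grabbing and releasing the object.
    // `pov` is the camera node, e.g. sceneView.pointOfView.
    func toggleSelection(of node: SCNNode, camera pov: SCNNode) {
        if selectedNode == nil {
            // Bind: store the object's pose relative to the device/camera.
            relativeTransform = pov.simdWorldTransform.inverse * node.simdWorldTransform
            selectedNode = node
        } else {
            // Release: leave the object at its new fixed location.
            selectedNode = nil
            relativeTransform = nil
        }
    }

    // SceneKit render loop: reapply the cached relative pose every frame,
    // so the object moves and rotates with the device.
    func renderer(_ renderer: SCNSceneRenderer, updateAtTime time: TimeInterval) {
        guard let node = selectedNode,
              let relative = relativeTransform,
              let pov = renderer.pointOfView else { return }
        node.simdWorldTransform = pov.simdWorldTransform * relative
    }
}
```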