VR is all about immersion, and the ability to track the user’s position in space is a key element of it. However, to date this has only been available in desktop and console VR, even though modern smartphones already incorporate the essential technology to make it possible in mobile VR too. This blog explains how to achieve inside-out tracking in mobile VR using only Unity and AR SDKs with today’s handsets.

Note that this particular method of implementing inside-out tracking is not an officially supported Unity feature, nor is it on our immediate roadmap. We learned that Roberto from ARM was doing something cool with some of our integrated platforms and wanted to share it with you.

If you have ever tried a room-scale VR console or desktop game then you will understand how badly I wished to implement mobile inside-out VR tracking. The problem was that there were no SDKs or phones to try it with. At the beginning of the year, I saw the first opportunity at CES, when the news about the new ASUS release supporting AR functionality and Daydream became public. This ended up being an accidental leak because, as it turned out, the ASUS release was not available until June. Only then could I create my first inside-out mobile VR tracking project in Unity for Daydream, using the early AR SDK for Android. When I got it working, it was amazing to walk in the real world and see how my camera in VR also moved around virtual objects. It felt so natural, and it is something you need to experience yourself.

The second chance I had to implement inside-out tracking came when Google released the ARCore SDK. On the same day, Unity released a version supporting it. I was so excited I couldn’t wait! So, that weekend I built my second inside-out mobile VR tracking project in Unity, this time for the Samsung Gear VR using the Google ARCore SDK on a Samsung Galaxy S8. This mobile device has an Arm Mali-G71 MP20 GPU capable of delivering high image quality in VR by using 8x MSAA while running consistently at 60 FPS.

This blog is intended to share my experience in developing inside-out mobile VR tracking apps and making it available to Unity developers. The Unity integration with ARCore SDKs is not yet prepared to do inside-out mobile VR tracking out of the box (or it wasn’t intended to do it), so I hope I will save you some time and pain with this blog.

I hope you will experience the same satisfaction I had when you implement your own Unity mobile VR project with inside-out tracking. I will explain step by step how to do it with the ARCore SDK.

Mobile inside-out VR tracking using the Google ARCore SDK in Unity

I won’t go through all the steps you need to follow to get Unity working. I assume you have Unity 2017.2.0b9 or later, and have the entire environment prepared to build Android apps. Additionally, you’ll need a Samsung Galaxy S8. Unfortunately, so far you can try inside-out VR tracking based on Google ARCore only on this phone and on the Google Pixel and Pixel XL.

The first step is to download the Unity package of the Google ARCore SDK for Unity (arcore-unity-sdk-preview.unitypackage) and import it into your project. A simple project will be enough; just a sphere, a cylinder and a cube on a plane.

You will also need to download the Google ARCore service. It is an APK file (arcore-preview.apk), and you need to install it on your device.

At this point you should have a folder in your project called “GoogleARCore” containing a session configuration asset, an example, the prefabs, and the SDK.

We can now start integrating ARCore in our sample. Drag and drop the ARCore Device prefab that you will find in the Prefabs folder into the scene hierarchy. This prefab includes a First-Person Camera. My initial thought was to keep this camera, which automatically converts to the VR camera when you tick the “Virtual Reality Supported” box in Player Settings. I understood later that this is a bad decision, because this is the camera used for AR; that is, the camera used to render the phone camera input together with the virtual objects we add to the “real world scene”. I have identified three big inconveniences so far:

You need to manually comment out the line that calls _SetupVideoOverlay() in the SessionComponent script, because if you instead untick the “Enable AR Background” option in the session settings asset (see Fig. 3) then the camera pose tracking doesn’t work at all.

You can’t apply any scale factor you may need to use to map the real world to your virtual world. You can’t always use a 1:1 map.

After selecting the Single-pass Stereo Rendering option, I got the left eye rendered correctly but not-so-good rendering in the right eye. Single-pass Stereo Rendering is something we need to use, to reduce the load on the CPU and accommodate the additional load that ARCore tracking brings.

So, we will use our own camera. As we are working on a VR project, place the camera as a child of a game object (GO), so we can change the camera coordinates according to the tracking pose data from the ARCore subsystem. It is important to note here that the ARCore subsystem provides both the camera position and orientation, but I decided to use only the camera position and let the VR subsystem work as expected. The head orientation tracking that the VR subsystem provides is in sync with the timewarp process, and we don’t want to disrupt this sync.

The next step is to configure the ARCore session to exclusively use what we need for tracking. Click on the ARCore Device GO and you will see in the Inspector the scripts attached to it as in the picture below:

Double click on Default SessionConfig to open the configuration options and untick the “Plane Finding” and “Point Cloud” options as we don’t need them since they add a substantial load on the CPU. We need to leave “Enable AR Background” (passthrough mode) ticked in options otherwise the AR Session component won’t work and we won’t get any camera pose tracking.

The next step is to add our own ARCore controller. Create a new GO called ARCoreController and attach to it the script HelloARController.cs, which we will borrow from the GoogleARCore/HelloARExample/Scripts folder. I renamed the script to ARTrackingController and removed some items we don’t need. My ARCoreController looks like the picture below. I have also attached to it a script to calculate the FPS.

The Update function of the ARTrackingController script will look like the one below:

    public void Update ()
    {
        _QuitOnConnectionErrors();

        if (Frame.TrackingState != FrameTrackingState.Tracking)
        {
            trackingStarted = false; // if tracking lost or not initialized
            m_camPoseText.text = "Lost tracking, wait ...";
            const int LOST_TRACKING_SLEEP_TIMEOUT = 15;
            Screen.sleepTimeout = LOST_TRACKING_SLEEP_TIMEOUT;
            return;
        }
        else
        {
            m_camPoseText.text = "";
        }

        Screen.sleepTimeout = SleepTimeout.NeverSleep;
        Vector3 currentARPosition = Frame.Pose.position;

        if (!trackingStarted)
        {
            trackingStarted = true;
            m_prevARPosePosition = Frame.Pose.position;
        }

        //Remember the previous position so we can apply deltas
        Vector3 deltaPosition = currentARPosition - m_prevARPosePosition;
        m_prevARPosePosition = currentARPosition;

        if (m_CameraParent != null)
        {
            Vector3 scaledTranslation = new Vector3 (m_XZScaleFactor * deltaPosition.x, m_YScaleFactor * deltaPosition.y, m_XZScaleFactor * deltaPosition.z);
            m_CameraParent.transform.Translate (scaledTranslation);

            if (m_showPoseData)
            {
                m_camPoseText.text = "Pose = " + currentARPosition + "\n" + GetComponent<FPSARCoreScript> ().FPSstring + "\n" + m_CameraParent.transform.position;
            }
        }
    }

I removed everything except the check for connection errors and the check for the right tracking state. I have replaced the original class members with the ones below:

    public Text m_camPoseText;
    public GameObject m_CameraParent;
    public float m_XZScaleFactor = 10;
    public float m_YScaleFactor = 2;
    public bool m_showPoseData = true;
    private bool trackingStarted = false;
    private Vector3 m_prevARPosePosition;

You then need to populate the public members in the Inspector. The m_camPoseText member is used to show on-screen data for debugging: an error message when tracking is lost, and otherwise the phone camera position obtained from the Frame together with the virtual camera position after applying the scale factors.
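The Update function reads an FPSstring value from a component called FPSARCoreScript, which is not listed in this post. As a rough idea of what such a script might compute, here is a minimal sketch in plain C#. The class name FpsCounter, the Tick method and the "FPS = " text format are my own assumptions for illustration, not the original script.

```csharp
using System.Globalization;

// Hypothetical stand-in for the FPS logic; the original FPSARCoreScript is not shown.
public class FpsCounter
{
    private readonly float m_updateInterval; // seconds between refreshes of FPSstring
    private float m_elapsed;                 // time accumulated in the current window
    private int m_frames;                    // frames counted in the current window

    // Text to display; holds a placeholder until the first window completes.
    public string FPSstring { get; private set; }

    public FpsCounter(float updateIntervalSeconds)
    {
        m_updateInterval = updateIntervalSeconds;
        FPSstring = "FPS = --";
    }

    // Call once per rendered frame with that frame's duration in seconds.
    public void Tick(float deltaSeconds)
    {
        m_elapsed += deltaSeconds;
        m_frames++;

        if (m_elapsed >= m_updateInterval)
        {
            // Average over the whole window rather than a single frame,
            // so the on-screen number does not flicker.
            float fps = m_frames / m_elapsed;
            FPSstring = "FPS = " + fps.ToString("F1", CultureInfo.InvariantCulture);
            m_elapsed = 0f;
            m_frames = 0;
        }
    }
}
```

In Unity, a MonoBehaviour named FPSARCoreScript could hold one of these counters, call Tick(Time.unscaledDeltaTime) from its own Update, and expose the resulting string as FPSstring. Averaging over a window of about half a second keeps the text readable inside the headset.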

As I mentioned before, you will rarely be able to map your real environment one to one to the virtual scene, and this is the reason I have introduced a couple of scaling factors: one for movement on the XZ plane and one for movement along the Y axis (up-down).

The scale factor depends on the virtual size (vSize) we want to walk through and the actual space we can use in the real world. If the average step length is 0.762 m and we know we have room in the real world to do only nSteps, then a first approximation to the XZ scale factor will be:

scaleFactorXZ = vSize / (nSteps x 0.762 m)
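As a worked example of the formula above, here is a small helper in plain C#. The class and method names (TrackingScale, XZScaleFactor) and the room sizes in the usage note are hypothetical; only the formula and the 0.762 m step length come from the text.

```csharp
using System;

// Hypothetical helper that evaluates the first-approximation formula from the text.
public static class TrackingScale
{
    // Average step length used in the text, in metres.
    public const float AverageStepMetres = 0.762f;

    // scaleFactorXZ = vSize / (nSteps x 0.762 m):
    // the virtual distance to cover divided by the real distance available.
    public static float XZScaleFactor(float vSizeMetres, int nSteps)
    {
        if (nSteps <= 0)
        {
            throw new ArgumentOutOfRangeException("nSteps", "nSteps must be positive.");
        }
        return vSizeMetres / (nSteps * AverageStepMetres);
    }
}
```

For instance, if the virtual room is 30 m across and you only have space for 4 steps, TrackingScale.XZScaleFactor(30f, 4) gives 30 / 3.048, or roughly 9.84. That is in the same range as the default of 10 used for m_XZScaleFactor above. Treat the result as a starting point and tune it by feel, since large factors make small tracking jitter more visible.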

I kept the _QuitOnConnectionErrors() class method and only changed the message output to use the Text component m_camPoseText.

    private void _QuitOnConnectionErrors()
    {
        // Do not update if ARCore is not tracking.
        if (Session.ConnectionState == SessionConnectionState.DeviceNotSupported)
        {
            m_camPoseText.text = "This device does not support ARCore.";
            Application.Quit();
        }
        else if (Session.ConnectionState == SessionConnectionState.UserRejectedNeededPermission)
        {
            m_camPoseText.text = "Camera permission is needed to run this application.";
            Application.Quit();
        }
        else if (Session.ConnectionState == SessionConnectionState.ConnectToServiceFailed)
        {
            m_camPoseText.text = "ARCore encountered a problem connecting. Please start the app again.";
            Application.Quit();
        }
    }

Once all this is working, your hierarchy (besides your geometry) should look like the picture below:

Because the camera in my project collides with some chess pieces in a chess room (this is an old demo I use every time I need to show something quickly), I have added a CharacterController component to it.

At this point we are almost ready. We just need to set up the player settings. Besides the standard settings we commonly use for Android, Google recommends:

Other Settings -> Multithreaded Rendering: Off

Other Settings -> Minimum API Level: Android 7.0 or higher

Other Settings -> Target API Level: Android 7.0 or 7.1

XR Settings -> ARCore Supported: On

Below you can see a capture of my XR Settings. It is important to set the Single-pass option to reduce the number of draw calls we issue to the driver (almost halved).

If you build your project following the above described steps, you should get the mobile VR inside-out tracking working. For my project the picture below was my rendering result. The first line of text shows the phone camera position in the world supplied by Frame.Pose. The second line shows the FPS, and the third line shows the position of the VR camera in the virtual world.

Although the scene is not very complex in terms of geometry, the chess pieces are rendered with reflections based on local cubemaps, and there are collisions between the camera and the chess pieces and between the chess pieces and the chess room. I am using 8x MSAA to achieve high image quality. Additionally, the ARCore tracking subsystem is running. With all this, the Samsung Galaxy S8 CPU and Arm Mali-G71 MP20 GPU render the scene at a steady 60 FPS.

Conclusions

At this point, I hope you have been able to follow this blog and build your own mobile VR Unity project with inside-out tracking and above all, experience walking around a virtual object while doing the same in the real world. You will hopefully agree with me that it feels very natural and adds even more sense of immersion to the VR experience.

Just a few words about the quality of the tracking. I haven’t performed rigorous measurements, and these are only my first impressions after some tests and the feedback of colleagues who have tried my apps. I have tried both implementations indoors and outdoors, and they were quite stable in both scenarios. The loop closing was also very good, with no noticeable difference when coming back to the initial spot. When using Google ARCore I was able to walk out of the room and the tracking still worked correctly. Nevertheless, formal tests need to be performed to determine the tracking error and stability.

Up to now we have been bound to a chair, moving the virtual camera by means of some interface and able to control only the camera orientation with our head. Now, however, we are in total control of the camera in the same way we control our eyes and body. We are able to move the virtual camera by replicating our movements in the real world. The consequences of this new “6DoF power” are really important. Soon, we should be able to play new types of games on our mobile phones that up to now have only been possible in the console and desktop space. Other potential applications of mobile inside-out VR tracking in training and education will soon be possible as well, with just a mobile phone and a VR headset.

As always, I really appreciate your feedback on these blogs, and I would welcome any comments on your own inside-out mobile VR experience.

About the Author

After a decade working in nuclear physics, Roberto discovered his real passion for 3D graphics in 1995 and has been working in leading companies ever since. In 2012 Roberto joined Arm and has been working closely with the ecosystem in developing optimized rendering techniques for mobile devices. He also regularly publishes graphics-related blogs and delivers talks and workshops at game-related events.