Creating a Custom Box Entity

We can create our own Entity subclasses with custom shapes and sizes by conforming to the HasModel and HasAnchoring protocols. Additionally, the HasCollision protocol enables interactions with the entity: ray casting (more on this later), gesture handling (scale, translate, rotate), and so on.

The following code shows how to create a custom box entity:

import RealityKit
import UIKit

class CustomBox: Entity, HasModel, HasAnchoring, HasCollision {

    required init(color: UIColor) {
        super.init()
        self.components[ModelComponent.self] = ModelComponent(
            mesh: .generateBox(size: 0.1),
            materials: [SimpleMaterial(color: color, isMetallic: false)]
        )
    }

    convenience init(color: UIColor, position: SIMD3<Float>) {
        self.init(color: color)
        self.position = position
    }

    required init() {
        fatalError("init() has not been implemented")
    }
}

There’s also a convenience initializer that allows us to specify the position of the entity in the scene with respect to the camera:

let box = CustomBox(color: .yellow)
// or
let box = CustomBox(color: .yellow, position: [-0.6, -1, -2])

self.scene.anchors.append(box) // self is arView

[Image: A box placed at a certain distance from the camera]

Now we’ve added an entity to our AR scene, but we can’t perform any interactions with it yet! To do that we’ll need to add gestures, which we’ll explore next.

Entity Gestures and Child Entities

RealityKit provides us with a bunch of built-in gesture interactions. Specifically, it allows scaling, rotating, and translating the entities in the AR Scene. To enable gestures on an entity, we need to ensure that it conforms to the HasCollision protocol (which we did in the previous section).

Also, we need to “install” the relevant gestures (scale, translate, rotate, or all) on the entity in the following way:

let box = CustomBox(color: .yellow, position: [-0.6, -1, -2])
self.installGestures(.all, for: box)
box.generateCollisionShapes(recursive: true)
self.scene.anchors.append(box)

The generateCollisionShapes function creates the shape of the entity’s collision component with the same dimensions as its model component. The collision component is what makes the entity responsive to interactions.

To install multiple gestures, we invoke the method with the list of gestures in an array, as shown below:

arView.installGestures([.rotation, .scale], for: box)

With this, our entity is ready to be interacted with in the AR scene.

Adding an entity to another entity

We can also add child entities to the current entity and position them relative to it. Let’s extend our current case by adding a 3D text mesh on top of the box, as shown below:

let mesh = MeshResource.generateText(
    "RealityKit",
    extrusionDepth: 0.1,
    font: .systemFont(ofSize: 2),
    containerFrame: .zero,
    alignment: .left,
    lineBreakMode: .byTruncatingTail)

let material = SimpleMaterial(color: .red, isMetallic: false)
let entity = ModelEntity(mesh: mesh, materials: [material])
entity.scale = SIMD3<Float>(0.03, 0.03, 0.1)

box.addChild(entity)

entity.setPosition(SIMD3<Float>(0, 0.05, 0), relativeTo: box)

Here’s a glimpse of our RealityKit application, with the text placed above the box.

As a note, the world’s environment has an impact on the lighting of the entities. The same box that looks pale yellow in the above illustration would look brighter in different surroundings.

Now that we’ve added interactivity to the entities and created a 3D text mesh, let’s move on to the last segment of RealityKit — ray casting.

Ray Casting

Ray casting, much like hit testing, helps us find a 3D point in an AR scene from a 2D screen point. It converts the 2D points on your touch screen into real 3D coordinates, using ray intersection to find a point on a real-world surface.

Though hitTest is available in RealityKit for compatibility reasons, ray casting is the preferred method, as it continuously refines the results for tracked surfaces in the scene.
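For instance, a tracked ray cast keeps delivering updated results as ARKit refines its understanding of the scene. Here’s a minimal sketch, assuming an existing arView and using the screen center as the query point:

// A tracked ray cast re-evaluates the query as ARKit refines the scene.
let screenCenter = CGPoint(x: arView.bounds.midX, y: arView.bounds.midY)
if let query = arView.makeRaycastQuery(from: screenCenter,
                                       allowing: .estimatedPlane,
                                       alignment: .any) {
    let tracked = arView.session.trackedRaycast(query) { results in
        guard let latest = results.first else { return }
        // Reposition the anchored content with the refined transform.
        print(latest.worldTransform)
    }
    // Keep a reference to `tracked`; call tracked?.stopTracking() when done.
}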

We’ll extend the above application so that touch gestures on the ARView in SwiftUI are converted into 3D points, at which we’ll eventually position the entities.

Currently, SwiftUI’s TapGesture doesn’t report where in the view the tap occurred. So we’ll fall back on the UIKit framework to find the 2D location of the tap gesture.

In the following code, we’ve set up our UITapGestureRecognizer in the ARView , as shown below:
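Here’s a minimal sketch of that setup. The helper names (setupGestures, findEntities) follow the description below, and the placement reuses our CustomBox; the full project wires in the 3D text and overlayText as well:

import ARKit
import RealityKit
import UIKit

extension ARView {

    // Registers a tap recognizer that forwards touches to handleTap.
    func setupGestures() {
        let tap = UITapGestureRecognizer(target: self, action: #selector(handleTap(_:)))
        self.addGestureRecognizer(tap)
    }

    // Finds entities near the given 2D screen point.
    func findEntities(at point: CGPoint) -> [Entity] {
        return self.entities(at: point)
    }

    @objc func handleTap(_ recognizer: UITapGestureRecognizer) {
        let point = recognizer.location(in: self)

        // Skip placement if the tap landed on an existing entity.
        guard findEntities(at: point).isEmpty else { return }

        // Convert the 2D screen point into a ray cast against estimated planes.
        guard let query = self.makeRaycastQuery(from: point,
                                                allowing: .estimatedPlane,
                                                alignment: .any),
              let result = self.session.raycast(query).first else { return }

        // Anchor a new box (with its 3D text label) at the hit location.
        let anchor = AnchorEntity(raycastResult: result)
        let box = CustomBox(color: .yellow)
        self.installGestures(.all, for: box)
        box.generateCollisionShapes(recursive: true)
        anchor.addChild(box)
        self.scene.anchors.append(anchor)
    }
}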

Take note of the following:

- The findEntities function helps us find nearby entities in 3D space based on the 2D screen point.
- The setupGestures method will be invoked on our ARView instance.
- makeRaycastQuery creates an ARRaycastQuery, in which we pass the point from the screen. Optionally, you can pass the center point of the screen if you intend to always add entities at the center. You also specify the plane type (exact or estimated) and the orientation (horizontal, vertical, or any).
- The results returned from ray casting are used to create an AnchorEntity, on which we add our box entity with the text.
- overlayText is what we’ll receive from the user input as the label for the 3D text (more on this later).

Before we jump into PencilKit for handling digit input, let’s modify the ARViewContainer that loads the ARView with the changes we’ve made so far.

Configuring ARView with SwiftUI Coordinator

In the following code, the Coordinator class is added to the ARViewContainer in order to allow data to flow from the PencilKitView to the ARView .
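Here’s a minimal sketch of that container; the overlayText binding and the wiring shown are illustrative:

import RealityKit
import SwiftUI

struct ARViewContainer: UIViewRepresentable {

    // The predicted digit from PencilKit, used as the label for the 3D text.
    @Binding var overlayText: String

    func makeCoordinator() -> Coordinator {
        Coordinator(self)
    }

    func makeUIView(context: Context) -> ARView {
        let arView = ARView(frame: .zero)
        arView.setupGestures()
        return arView
    }

    func updateUIView(_ uiView: ARView, context: Context) {
        // Keep the coordinator's copy of the container (and its overlayText) fresh.
        context.coordinator.parent = self
    }

    class Coordinator {
        var parent: ARViewContainer

        init(_ parent: ARViewContainer) {
            self.parent = parent
        }
    }
}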

The overlayText is picked up by the ARView scene from the Coordinator class. Next up, PencilKit meets the Vision framework.

Handling Input with PencilKit

PencilKit is the new drawing framework introduced in iOS 13. In our app, we’ll let the user draw digits on PencilKit’s canvas and classify those handwritten digits by running the drawing through an MNIST Core ML model with the Vision framework.

The following code sets up the PencilKit view ( PKCanvasView ) in SwiftUI:

import PencilKit
import SwiftUI

struct PKCanvasRepresentation: UIViewRepresentable {

    let canvasView = PKCanvasView()

    func makeUIView(context: Context) -> PKCanvasView {
        canvasView.tool = PKInkingTool(.pen, color: .secondarySystemBackground, width: 40)
        return canvasView
    }

    func updateUIView(_ uiView: PKCanvasView, context: Context) {
    }
}
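To get from a drawing to a digit, we can render the canvas contents into an image and run it through the MNIST model with Vision. The helper below is a rough sketch; MNISTClassifier stands in for whatever the bundled Core ML model class is named in your project:

import PencilKit
import UIKit
import Vision

// Hypothetical helper: renders the drawing and classifies it with the MNIST model.
func classifyDigit(from canvasView: PKCanvasView,
                   completion: @escaping (String) -> Void) {
    // Render the PencilKit drawing into a CGImage for Vision.
    let image = canvasView.drawing.image(from: canvasView.drawing.bounds, scale: 1.0)
    guard let cgImage = image.cgImage,
          let model = try? VNCoreMLModel(for: MNISTClassifier().model) else { return }

    let request = VNCoreMLRequest(model: model) { request, _ in
        // The top classification result is the predicted digit.
        guard let observations = request.results as? [VNClassificationObservation],
              let best = observations.first else { return }
        completion(best.identifier)
    }
    try? VNImageRequestHandler(cgImage: cgImage, options: [:]).perform([request])
}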

ContentView

Now it’s time to merge the ARView and PKCanvasView in our ContentView . By default, SwiftUI views occupy the maximum space available to them. Hence, each of these views takes up roughly half of the screen.

The code for the ContentView.swift file is presented below:
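A minimal sketch of that layout, assuming the overlayText state, the ARViewContainer from above, and the hypothetical classifyDigit helper (the button is styled with MyButtonStyle, defined next):

import PencilKit
import SwiftUI

struct ContentView: View {

    // The predicted digit, shared with the AR scene.
    @State private var overlayText = ""

    let canvas = PKCanvasRepresentation()

    var body: some View {
        VStack {
            // Each view claims roughly half of the available space.
            ARViewContainer(overlayText: $overlayText)
            canvas
            Button("Extract Digit") {
                // Classify the current drawing and hand the result to the AR scene.
                classifyDigit(from: self.canvas.canvasView) { digit in
                    self.overlayText = digit
                }
            }
            .buttonStyle(MyButtonStyle())
        }
    }
}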

The following code does the styling for the SwiftUI button:

struct MyButtonStyle: ButtonStyle {

    var color: Color = .green

    public func makeBody(configuration: MyButtonStyle.Configuration) -> some View {
        configuration.label
            .foregroundColor(.white)
            .padding(15)
            .background(RoundedRectangle(cornerRadius: 5).fill(color))
            .compositingGroup()
            .shadow(color: .black, radius: 3)
            .opacity(configuration.isPressed ? 0.5 : 1.0)
            .scaleEffect(configuration.isPressed ? 0.8 : 1.0)
    }
}

Finally, our app is ready! An illustration of a working RealityKit + PencilKit iOS application is given below:

[Image: An output from an iPad]

Once the digit is extracted from the PencilKit drawing, all we do is perform a ray cast from the point where the ARView is touched on the screen and create an entity on the plane. Currently, the entities don’t support collision and can be dragged in and out of each other. We’ll be handling collisions and more interactions in a subsequent tutorial, so stay tuned!

Conclusion

RealityKit is here to abstract a lot of boilerplate code to allow developers to focus on building more immersive AR experiences. It’s fully written in Swift and has come as a replacement for SceneKit.

Here, we took a good look at the RealityKit entities and components and saw how to set up a coaching overlay. Furthermore, we created our own custom entity and child entities. Subsequently, we dug into the 3D gestures currently supported in RealityKit and integrated them with our entities, and then explored ray casting. Finally, we integrated PencilKit for handling user input and used the Vision framework to predict hand-drawn digits.

The full source code along with the MNIST Core ML model is available in this GitHub Repository.

Moving on from here, we’ll explore the other interesting functionalities available in RealityKit. Loading different kinds of objects, adding sounds, and the ability to perform and detect collisions will be up next.

That’s it for this one. Thanks for reading.