Hosts are one of Sumerian's most distinctive selling points. A host is a 3D-animated character you can place into an AR or VR scene. Users can ask hosts questions, and developers can script a complex set of actions, behaviors, gestures, and movements for a host to perform as it holds conversations and walks around scenes. Amazon drew inspiration for hosts from all sorts of places, including online games like Second Life and The Sims, Roche said.

Sumerian currently has two default hosts—Cristine and Preston—but will launch a whole series of hosts over the course of this year. Amazon built a lot of nuance into these AI characters. Roche showed me a demo of Cristine where he dragged the host into the scene, and pulled open the inspector panel to customize her emotions, facial expressions, and gestures. Amazon will auto-generate gestures as the host talks based on natural language processing of the conversation. So if Cristine says "Hi," it might trigger a waving gesture.

With something called a point of interest system, you can check a box in the editor so the host's eyes always pay attention to the camera. So if you're wearing an HTC Vive Pro and walking around a 360-degree space, the host's gaze can follow you. In an AR app that uses your smartphone camera, Roche explained, Amazon's Rekognition deep-learning system can run facial analysis of both where you are and where your face is in the frame to make it look like the host is looking back through your screen directly at you. It gives you the illusion of eye contact.

Customers can also create custom hosts from scratch using Amazon's Maya SDK, but Amazon provides the basic skeleton from which you can adjust a host's appearance, dialect and inflections, language, and more. In the long term, Amazon is thinking about ways to make it easier to create hosts. Argenti talked about the idea of a host generator for first-person avatars, or using facial recognition to match rendered characters to real people.

"In conjunction with Rekognition, if we procedurally generate as many of these characters as possible, we can try to match you to the closest avatar. We'll take your photo and run reverse facial recognition and match it to a randomized character to give you a host that looks like a version of you."

Argenti explained how integrating other AWS services like the Amazon Comprehend natural language processing service could make hosts even more lifelike. Comprehend analyzes text to extract metadata on things like mood and sentiment. So a host could have a different facial expression or manner of speaking based on the mood of the person it's interacting with.

"If they're angry, maybe the host calms them down," Argenti said. "There's an evolution not only in the way we convey information, but how we present it through deep sentiment analysis."
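
To make that idea concrete, here is a minimal Python sketch of how a sentiment result might be turned into a host expression. It assumes a response shaped like the one returned by Comprehend's `detect_sentiment` call (a `Sentiment` label plus a `SentimentScore` breakdown). The expression names, the `choose_expression` helper, and the confidence threshold are hypothetical illustrations, not part of Sumerian or any Amazon API.

```python
# Hypothetical mapping from Comprehend sentiment labels to host expressions.
# The expression names are made up for illustration only.
EXPRESSION_FOR_SENTIMENT = {
    "POSITIVE": "smile",
    "NEGATIVE": "calm",       # an upset user gets a calming expression
    "NEUTRAL": "neutral",
    "MIXED": "attentive",
}


def choose_expression(response, min_confidence=0.6):
    """Pick an expression from a detect_sentiment-style response dict.

    Falls back to "neutral" when the label is missing, unknown, or
    scored below min_confidence (an assumed threshold).
    """
    label = response.get("Sentiment", "NEUTRAL")
    scores = response.get("SentimentScore", {})
    # Score keys are capitalized ("Negative"), labels are upper case ("NEGATIVE").
    confidence = scores.get(label.capitalize(), 0.0)
    if confidence < min_confidence:
        return "neutral"
    return EXPRESSION_FOR_SENTIMENT.get(label, "neutral")


# In a real app the response would come from the service, for example:
#   boto3.client("comprehend").detect_sentiment(Text=user_text, LanguageCode="en")
# A hand-written sample is used here so the sketch runs without AWS credentials.
sample = {
    "Sentiment": "NEGATIVE",
    "SentimentScore": {"Positive": 0.02, "Negative": 0.91, "Neutral": 0.05, "Mixed": 0.02},
}
expression = choose_expression(sample)
```

Keeping the mapping in a plain dictionary means a designer could retune how the host reacts without touching the logic, and the confidence fallback keeps the host from overreacting to ambiguous text.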