People make critical intuitive judgments to drive safely and smoothly. Can cars make them too?

In early 2016, I brought a GoPro camera to an intersection near my lab at Harvard to answer a question that had been nagging me: While driving, how often do people look at another person and attempt to understand and react to what that person is thinking?

After examining a few videos, I had an answer: People constantly make intuitive, subconscious judgments about what someone else wants or knows. In one 30-second segment of footage of a suburban street, there were well over 50 instances where one person looked at another and clearly thought, for example, “she isn’t going to cross,” or “he knows I’m here and is willing to yield,” or “she doesn’t see me, I should stop.” Humans are incredibly good at silently communicating with each other. That communication is the key to safe and considerate driving.

The video that made us realize how big a problem this really was

Machines, on the other hand, lack this critical ability. They can’t decipher our unspoken social communication or intuit what’s going on inside our heads. So how can we possibly give machines like self-driving cars the ability to read human intentions? To know what humans know?

This was a really hard problem, but my co-founder Walter and I had been developing technology that I was confident offered a solution. Sid, my co-founder and our CEO, asked companies building automated vehicles how they were approaching the problem. What he learned is that none of them had figured out a way to address it.

State-of-the-art approaches to interacting with the world using computer vision and motion planning simply didn’t cut it when it came to other humans. Software of that type uses physics to identify where people are and what their trajectory is, but can’t anticipate what a pedestrian might do next. People aren’t billiard balls — we just aren’t predictable that way.
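The billiard-ball limitation is easy to illustrate. A deliberately minimal sketch (hypothetical names, not anyone’s production code): a purely physics-based predictor extrapolates the last observed velocity, so it is structurally unable to anticipate a pedestrian who has decided to stop.

```python
def extrapolate(position, velocity, dt):
    """Constant-velocity prediction: treat the pedestrian like a
    billiard ball that keeps moving exactly as last observed."""
    x, y = position
    vx, vy = velocity
    return (x + vx * dt, y + vy * dt)

# A pedestrian last seen walking toward the lane at 1 m/s.
predicted = extrapolate(position=(0.0, 0.0), velocity=(1.0, 0.0), dt=2.0)
print(predicted)  # (2.0, 0.0): two meters into the lane

# But if her posture says "I've stopped to wait," her true future
# position is still (0.0, 0.0). No amount of kinematics recovers
# that; it requires a judgment about her state of mind.
```

Everything downstream of a prediction like this — braking, yielding, path planning — inherits the error, which is why intent has to enter the model somewhere.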

Moreover, these companies were tackling one of the largest-scale engineering challenges ever attempted, and it was clear that they had plenty of other problems to solve. So they focused on what was plausible and tractable and put off what has been called “the hardest of the hard problems” for autonomous vehicles: understanding other humans.

This makes sense when you’re approaching the self-driving challenge from an engineering and computer science perspective (solve the problems you know how to solve first!), but it means that instead of smoothly responding to a situation the way a human driver would, self-driving cars alternate between acting ‘paranoid’ (timid, skittish, and easily startled) and acting oblivious.

A jaywalker who stops in the middle of the road is incredibly confusing for current self-driving cars — our models understand her intentions perfectly

They frequently slow down or come to a complete stop in the presence of pedestrians or cyclists when there is no need to, only to start moving again at exactly the wrong time. This style of driving is nauseating for passengers and infuriating for human drivers, pedestrians, and cyclists. It’s why self-driving cars are so often rear-ended, and it can cause human drivers to behave erratically as well.

Somebody who is walking in the road but doesn’t want to cross right away is a conundrum today — our models handle it

It is now widely understood that there will be no meaningful real-world deployment of self-driving vehicles without solving the problem of human understanding. And here at Perceptive Automata, that’s what we’ve done. Let me explain our approach.

The beauty of behavioral science

We believed that human intuition could be cracked with the right combination of talent, technology, and out-of-the-box thinking, and the problem felt too urgent to wait for someone else to figure it out. We knew our team had a unique collection of skills and experience, and we had spent years in the lab at Harvard refining a technique that allowed us to solve this problem in a way nobody else can.

We’ve designed a model that can use the whole spectrum of subtle, unconscious insights that we, as humans, use to make incredibly sophisticated judgments about what’s going on in someone else’s head. You could say that, in a sense, our models develop their own human-like “intuition.”

Our models can tell that somebody wants to cross even though they haven’t started walking

We take sensor data from vehicles showing interactions with people: pedestrians, bicyclists, and other motorists. This data is incredibly rich raw material, because every time a person appears and interacts with a vehicle, they give off hundreds of signals that another human could use to understand their awareness and intention: their state of mind. We chop the raw sensor feed into short clips, manipulate them, and then show them to groups of people who answer questions about the depicted pedestrian’s (or motorist’s, or cyclist’s) level of intention and awareness based on what they see. For example, one respondent might think the pedestrian is waving at the car to go ahead, while another might think the pedestrian is asking the car to stop and let them cross. We repeat this process hundreds of thousands of times, across all sorts of interactions, and then use that data to train models that interpret the world the way people do.
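One way to picture the labeling step described above (a hypothetical sketch, not Perceptive Automata’s actual pipeline): rather than forcing annotators’ answers into a single “ground truth,” each clip’s responses can be aggregated into a probability distribution, so the training target preserves exactly the kind of human disagreement the example mentions.

```python
from collections import Counter

def soft_label(responses):
    """Aggregate annotator responses for one clip into a probability
    distribution over intention labels -- a 'soft' training target
    that preserves human disagreement instead of collapsing it to a
    single answer."""
    counts = Counter(responses)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

# Five annotators judge the same clip of a pedestrian at the curb.
responses = ["will_cross", "will_cross", "will_cross", "waiting", "waiting"]
target = soft_label(responses)
# The target preserves the ambiguity: 60% will_cross, 40% waiting.
```

A model trained against distributions like these is rewarded for matching the graded, sometimes-conflicted judgments people actually make, rather than a brittle yes/no answer.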

Somebody at a crosswalk who doesn’t know the car is there deserves careful attention, but since this man has no interest in crossing in front of us, we can proceed without unnecessary stopping

Once trained, our deep learning models allow self-driving cars to understand and predict human behavior and, in turn, react with human-like behavior. This has huge implications for safety, rider experience, and practical utility across the self-driving car industry.

In standard approaches to training AI, most of the things that humans know about the world are wasted. Perceptive Automata’s proprietary approach is capable of leveraging the incredibly rich judgments we make about the world around us.

On the road

In late 2016, Avery, my other co-founder and a machine learning expert, led the charge to build a prototype of our concept: a model trained using the techniques of behavioral science (including cognitive psychology, neuroscience, and psychophysics) to characterize the way humans evaluate other humans, with that information fed to an autonomous system.

Our models can tell that this cyclist is signaling that he wants to cross in front of us, even though he’s not moving, and know that he’s aware, even though he’s currently looking away

We took our prototype to First Round Capital and felt an instant bond. As the first investor in Uber, they deeply understand the promise and challenges of autonomous mobility, and they immediately grasped the problem we described. They saw our potential, and we’re proud to have them leading our seed round, which closed in early 2017. This investment has given us the means to build an amazing team of machine learning experts, experienced automotive software engineers, some of the sharpest business minds in autonomy, and more.

Today, Perceptive Automata’s human intuition AI module is already up and running in the vehicles of self-driving car companies around the world. With a growing team of field-leading researchers, experienced automotive engineers, and noted authorities in the autonomous vehicle business working out of our offices in Boston and Silicon Valley, we’ve built relationships with the people who will be deploying world-changing autonomous vehicle technology. Our systems for real-time understanding of people’s intentions, awareness, and states of mind are already helping these companies build autonomous vehicles that can operate safely and effectively in a world full of people. With our help, the vehicles they are building will be safe, pleasant to ride in, easy to interact with, and good citizens of the road.