Let's assume someone seriously wants to create a robot umpire or sports referee. Is it possible to build an accurate and trustworthy augmented reality solution today? Or must we wait for the technology to catch up?

It was the kind of play that gives an umpire, or a baseball fan, nightmares. It was Oct. 26, 1985, and the sixth game of the World Series was underway at Royals Stadium in Kansas City, Missouri. In the bottom of the ninth inning, the home team was staring into the abyss. Their opponents, the St. Louis Cardinals, were up three games to two and holding a 1-0 lead. If the Royals couldn’t put some runs on the scoreboard stat, the show was over.

The first batter, Jorge Orta, sent a ground ball toward first baseman Jack Clark, who snagged it and tossed it to pitcher Todd Worrell at first. Worrell tagged the base an instant before Orta’s foot touched. But at the critical instant, Orta’s body blocked the view of first-base umpire Don Denkinger, who called Orta safe. Video replays and photographs showed that the play wasn’t even close—Orta was half a step away from the bag when Worrell tagged it—but Denkinger refused to reverse his call. With a man on base and nobody out, the Royals suddenly had momentum. They scored two runs that inning to win the game. It wasn’t until after the game, when Denkinger had a chance to see the footage himself, that he realized his mistake. But by then, it was too late. The Royals took the next game and won the series.

Blunders like Denkinger’s have been a part of sports since the dawn of time. But will they be part of our future? High-definition imaging systems can already capture athletic action with razor-sharp precision from multiple angles, while artificial intelligence can sift through the incoming data with blistering speed. It would seem a no-brainer that bad calls should go the way of the clogged carburetor and the tangled VHS tape.

So why haven’t they?

The borderline pitch

Tennis offers a useful case study. Since 2006, professional tennis matches have been watched over by Hawk-Eye, a system that combines high-definition video feeds from an array of as many as six cameras to automatically determine a shot’s trajectory and its point of impact. It then displays not a video of the shot but a computer-generated simulation of the ball’s movement. Even with balls traveling in excess of 100 mph, Hawk-Eye is accurate on average to within 3.6 millimeters, or less than a quarter inch. Out on the court, Hawk-Eye’s judgment is binding. This technology doesn’t assist the referee. It is the referee.
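Once a point of impact has been estimated, the line call itself reduces to a simple comparison. The Python sketch below illustrates that idea only; it is not Hawk-Eye's actual algorithm. The `Impact` type, the one-dimensional geometry, and the function names are all hypothetical. It assumes the usual tennis rule that a ball touching any part of the line is in, and it uses the 3.6 mm average error quoted above to flag calls that are too close to distinguish.

```python
from dataclasses import dataclass


@dataclass
class Impact:
    """Hypothetical 1-D description of where the ball met the court.

    center_mm: distance from the outer edge of the line to the centre of
               the ball's contact patch (positive = beyond the line).
    half_width_mm: half the width of the contact patch along that axis.
    """
    center_mm: float
    half_width_mm: float


def nearest_edge_mm(impact: Impact) -> float:
    # Edge of the contact patch closest to the court, measured from the
    # outer edge of the line. Zero or negative means the ball touched it.
    return impact.center_mm - impact.half_width_mm


def line_call(impact: Impact) -> str:
    # Binary decision: any contact with the line counts as in.
    return "IN" if nearest_edge_mm(impact) <= 0 else "OUT"


def within_error_margin(impact: Impact, mean_error_mm: float = 3.6) -> bool:
    # True when the gap (or overlap) is smaller than the quoted average error,
    # meaning the call could plausibly have gone the other way.
    return abs(nearest_edge_mm(impact)) <= mean_error_mm


if __name__ == "__main__":
    close_one = Impact(center_mm=22.0, half_width_mm=20.0)
    print(line_call(close_one), within_error_margin(close_one))
```

In this toy example the ball is called out by 2 mm, which is inside the average error, so the output is a definite call on a shot the measurement cannot truly resolve.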

In adopting Hawk-Eye, tennis has put itself in the vanguard of automated officiating, but even so, it’s used only for a subset of decisions—the so-called boundary calls that determine whether a ball has crossed a line or not. These are binary, yes-or-no decisions that lend themselves well to machine processing. At other times, a ref must make judgments that hinge on subtler forms of discernment. During the 2011 U.S. Open, for instance, Serena Williams shouted, “Come on!” after returning a ball and before opponent Sam Stosur had a chance to hit it. Chair ump Eva Asderaki ruled that the interjection amounted to a hindrance and gave the point to Stosur.

Similar judgments exist in all sports. Was that collision of hockey players a check or a foul? When the basketball player blocked the shot, did she extend her arms too far? Did the soccer player intend to trip his opponent, or did he go for the ball and take him down by accident?

These calls are tricky enough for a human being but impossible for today’s machines. “Now, we’re getting into the really challenging stuff that nobody knows how to do yet,” says Willy Zwaenepoel, a professor of computer science at the École Polytechnique Fédérale de Lausanne. “These kinds of decisions have to do with intentions. That’s a challenge that’s harder by another order of magnitude.”

The inhuman element

Even when a system exists that can make a decision, many officials question whether it should be allowed to do so. In baseball, for example, broadcasters use a visualization system called PITCHf/x to determine whether a pitch is a ball or a strike. PITCHf/x uses three cameras to detect the ball’s trajectory through 3D space and projects it forward to decide whether it will intersect the strike zone or not. The system is built by California-based Sportvision, which also makes visualization systems for NASCAR racing and the America’s Cup.
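The idea of projecting a tracked trajectory forward to the strike zone can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Sportvision's implementation: it assumes the ball's motion is fitted with constant acceleration on each axis, that `y` measures distance from the plate in feet, and that the top and bottom of the zone are supplied by the caller. The `PitchFit` type and function names are hypothetical, and the ball's radius is ignored.

```python
import math
from dataclasses import dataclass

# Home plate is 17 inches wide; half of that, in feet.
PLATE_HALF_WIDTH_FT = 17 / 12 / 2


@dataclass
class PitchFit:
    """Hypothetical constant-acceleration fit (feet, feet/s, feet/s^2)."""
    x0: float  # horizontal position at release
    y0: float  # distance from the plate at release
    z0: float  # height at release
    vx: float
    vy: float  # negative when the ball travels toward the plate
    vz: float
    ax: float
    ay: float
    az: float


def time_to_plane(fit: PitchFit, y_plane: float = 0.0) -> float:
    # Solve y0 + vy*t + 0.5*ay*t^2 = y_plane for the earliest positive t.
    a, b, c = 0.5 * fit.ay, fit.vy, fit.y0 - y_plane
    if abs(a) < 1e-12:
        if b == 0:
            raise ValueError("ball is not moving toward the plate")
        roots = [-c / b]
    else:
        disc = b * b - 4 * a * c
        if disc < 0:
            raise ValueError("trajectory never reaches the plate")
        sq = math.sqrt(disc)
        roots = [(-b - sq) / (2 * a), (-b + sq) / (2 * a)]
    positive = [t for t in roots if t > 0]
    if not positive:
        raise ValueError("trajectory never reaches the plate")
    return min(positive)


def call_pitch(fit: PitchFit, zone_bottom: float, zone_top: float) -> str:
    # Project the fitted trajectory to the plate and test it against the zone.
    t = time_to_plane(fit)
    x = fit.x0 + fit.vx * t + 0.5 * fit.ax * t * t
    z = fit.z0 + fit.vz * t + 0.5 * fit.az * t * t
    in_zone = abs(x) <= PLATE_HALF_WIDTH_FT and zone_bottom <= z <= zone_top
    return "strike" if in_zone else "ball"


if __name__ == "__main__":
    pitch = PitchFit(0.0, 50.0, 6.0, 0.0, -130.0, -5.0, 0.0, 0.0, -32.174)
    print(call_pitch(pitch, zone_bottom=1.5, zone_top=3.5))
```

Even in this simplified form, the output depends on choices a human has to make, such as where the zone's top and bottom sit for a given batter.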

Though it was designed to capture data, not to make calls, PITCHf/x played ump during an experimental minor league game in 2015, and it performed well. That successful trial, though, did nothing to alleviate Major League Baseball’s long-standing hostility toward automated refereeing on both practical and philosophical grounds. “I don’t believe the current technology is sufficient to call balls and strikes on a real-time basis,” MLB Commissioner Rob Manfred explained to USA Today. “If and when we get to that technology—and sooner or later, we’re going to get there—there’s still a fundamental question about whether or not we want to remove that human element from the game.”


Harry Collins, a professor of social sciences at Cardiff University in Wales, shares Manfred’s misgivings. Together with colleagues Robert Evans and Christopher Higgins, he wrote a book called "Bad Call," arguing against automated refereeing. “Sports is a human activity,” Collins says. “Humans are imperfect; that’s OK. Everyone knows that sometimes referees are going to make a mistake. It’s worked that way for hundreds of years.”

The fact that computers can make faster and more accurate decisions than human beings shouldn’t distract us from recognizing that making a call is imposing judgment on the physical world, Collins argues. “Technology shouldn’t be presented as showing reality, when really it’s creating reality,” he says. He points out that when spectators at a tennis match watch a Hawk-Eye playback after a point, few likely realize that what they’re seeing is not a recording but a simulation.

“With the spread and improvement of these virtual realities, it’s important for the public to know the difference between what they see on their screens and what’s constructed,” Collins says.

For his part, Willy Zwaenepoel not only feels that robot referees will be a good thing, but he’s worked to make them a reality. In 2016, the Belgian academic went public with a project he called Collina (after legendary Italian soccer referee Pierluigi Collina) that would replace human refs entirely with a system of cameras and computers. “The hope is to solve the endless disputes over both individual referee calls and allegations of referee bias more globally,” he says. Not only would such a system reduce fan dissatisfaction, but it would also take pressure off human refs. “Referees are under a lot of scrutiny right now, because with cameras everywhere, the audience can see everything and the poor guy can only see from one angle,” Zwaenepoel says.

The idea received a 50,000 Swiss franc (US$50,715) prize from the Hasler Foundation, but the project was terminated before it could bear any concrete results. Zwaenepoel acknowledges that implementing a program like Collina is beyond the capabilities of current technology, but he sees no reason not to keep working toward it. “You try for a moonshot, and hopefully you get to change the world,” he says. “In a 10-year time frame, I think it’s possible, and it’s a worthwhile thing to do.”

For all of his philosophical qualms, Collins acknowledges that most fans embrace automatic calls. “It’s not all that problematic for the public,” he says. “They seem to enjoy it.”

Robot umps: Lessons for leaders