With all of the hype around self-driving cars, it may seem like mass-market availability is just around the corner. Waymo’s cars have logged over 3 million fully autonomous miles on the streets of Texas, Arizona, and California; Uber’s have been ferrying passengers around Pennsylvania and Arizona; and during TED2017 in April, Elon Musk boldly promised that Tesla’s would reach full autonomy by the end of this year. As of May, 44 companies are involved in developing self-driving vehicles, according to CB Insights.

But let’s be clear: The leap from the current state of autonomous vehicles to the day you’ll be shuttled around by truly driverless cars is bigger than you think.

The main obstacle can be boiled down to teaching cars how to operate reliably in scenarios that rarely happen in real life and are therefore difficult to gather data on. Anything from strange weather occurrences and other vehicles’ unpredictable driving patterns to more common but irregular situations, like emergency vehicles and snowfall, can pose a data problem. Because average training sessions offer no consistent opportunities to encounter these situations, self-driving cars must often undergo specialized training for them, which takes time.

Last month, for example, Waymo began targeted emergency-vehicle detection training for its minivans. To ensure a safe testing environment, Alphabet’s self-driving car unit partnered with a local Arizona police and fire station in a low-traffic neighborhood to record visual and audio samples of fire trucks, police cars, and motorbikes driving by its minivans under various lighting conditions and at different speeds, distances, and angles. This month, Waymo also began testing the performance of its cars in extreme heat by running them through “as many driving conditions as possible” on the roads of Death Valley. Similarly, back in February 2015, the then-Google project sent its cars to Washington state to seek out rain, after California’s drought provided little opportunity for adverse weather testing. Last January, Ford sent its cars to Mcity, Michigan (a town built specifically for controlled autonomous vehicle testing), to log its first miles in the snow.

Collecting data to train vehicles for rarer scenarios is even harder. In the course of reporting on the impact of automation on trucking towns, Quartz’s Mike Murphy and Dave Gershgorn spoke with a trucker who explained how the broad sides of a truck make it prone to tipping over, absent experienced maneuvering, in the wind patterns characteristic of the western US. The wind’s erratic nature also makes it difficult to model. “There are too many variables,” said Terry, a fellow trucker.

None of this is to say that a self-driving car must be exposed to every scenario under the sun in order to function robustly. Instead, developers tease out patterns and train vehicles on the fundamental principles that would allow them to fare well in most situations. But there’s a caveat: “Two situations that we as humans might find equivalent, may not be equivalent to whatever kind of software that you’re building,” cautioned Michael Wagner, a co-founder of software technology company Edge Case Research. “You sometimes don’t know whether or not prior development applies to a situation. You don’t necessarily know whether all your simulations are going to be realistic enough to uncover all of the different cases.”

In other words, a training data set that a human considers comprehensive may be insufficient for a machine. As a result, real-world testing is still crucial for discovering data gaps. That’s why Waymo and Uber, for example, both continue to accumulate miles on real roads as well as in simulations and controlled testing environments. During real-world drives, an Uber spokesperson added, the vehicle operator records newly discovered gaps in real time and relays them back to the engineering team.

“There is a huge difference between building a few vehicles to run in reasonably benign conditions with professional safety drivers, and building a fleet of millions of vehicles that have to run in an unconstrained world,” wrote Wagner in a 2016 paper co-authored with his co-founder. “It’s going to take a lot of work,” he said. But Wagner remains optimistic that we will get there—even if it takes a decade or two.