In the end, the car-image project involved 50 million images of street scenes gathered from Google Street View. In them, 22 million cars were identified and classified into more than 2,600 categories by make and model, across more than 3,000 ZIP codes and 39,000 voting districts.

But first, humans had to curate a database to train the A.I. software to understand the images.

The researchers recruited hundreds of people to pick out and classify cars in a sample of millions of pictures. Some of the online contractors did simple tasks like identifying the cars in images. Others were car experts who knew nuances like the subtle difference in the taillights on the 2007 and 2008 Honda Accords.

“Collecting and labeling a large data set is the most painful thing you can do in our field,” said Ms. Gebru, who received her Ph.D. from Stanford in September and now works for Microsoft Research.

But without experiencing that data-wrangling work, she added, “you don’t understand what is impeding progress in A.I. in the real world.”

Once the car-image engine was built, its speed and predictive accuracy were impressive. It classified the cars in all 50 million images in two weeks. The same task would take a human expert, spending 10 seconds per image, more than 15 years.
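Those figures check out with simple arithmetic. Here is a minimal back-of-envelope sketch in Python, using only the numbers quoted above (50 million images, 10 seconds of expert time per image, two weeks of machine time):

```python
# Back-of-envelope check of the throughput comparison above.
SECONDS_PER_YEAR = 365 * 24 * 3600

images = 50_000_000
human_seconds = images * 10                # 10 seconds of expert time per image
print(human_seconds / SECONDS_PER_YEAR)    # ~15.9 -> "more than 15 years"

machine_seconds = 14 * 24 * 3600           # two weeks of machine time
print(images / machine_seconds)            # ~41 images classified per second
```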

Identifying so many cars in such detail was a technical feat. But it was linking that new data set to public collections of socioeconomic and environmental information, and then tweaking the software to spot patterns and correlations, that made the Stanford project part of what computer scientists see as the broader application of image data.
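To make that linkage step concrete, here is a minimal sketch of the general technique, not the Stanford team's actual pipeline; the file names, column names, and the pickup-truck feature are all hypothetical assumptions:

```python
# A sketch of linking vision-derived car counts to public socioeconomic data.
# All file and column names below are illustrative assumptions.
import pandas as pd

cars = pd.read_csv("car_counts_by_zip.csv")    # columns: zip, category, count
census = pd.read_csv("census_by_zip.csv")      # columns: zip, median_income

# Example feature: share of pickup trucks among all cars seen in each ZIP.
totals = cars.groupby("zip")["count"].sum().rename("total")
pickups = (
    cars[cars["category"].str.contains("pickup", case=False)]
    .groupby("zip")["count"].sum().rename("pickups")
)
features = pd.concat([totals, pickups], axis=1).fillna(0).reset_index()
features["pickup_share"] = features["pickups"] / features["total"]

# Link the image-derived feature to public data and test for a correlation.
linked = features.merge(census, on="zip")
print(linked["pickup_share"].corr(linked["median_income"]))
```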

“There has been an explosion of computer vision research, but so far the societal impact has been largely absent,” said Serge Belongie, a computer scientist at Cornell Tech. “Being able to identify what is in a photo is not science that advances our understanding of the world.”