Last fall, University of Virginia computer science professor Vicente Ordóñez noticed a pattern in some of the guesses made by image-recognition software he was building. “It would see a picture of a kitchen and more often than not associate it with women, not men,” he says.

That got Ordóñez wondering whether he and other researchers were unconsciously injecting biases into their software. So he teamed up with colleagues to test two large collections of labeled photos used to “train” image-recognition software.

Their results are illuminating. Two prominent research-image collections—including one supported by Microsoft and Facebook—display a predictable gender bias in their depiction of activities such as cooking and sports. Images of shopping and washing are linked to women, for example, while coaching and shooting are tied to men.

Machine-learning software trained on the datasets didn’t just mirror those biases; it amplified them. If a photo set generally associated women with cooking, software trained by studying those photos and their labels created an even stronger association.
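One rough way to picture that amplification is to compare how often an activity is paired with women in the training labels with how often a trained model pairs the two in its predictions. The sketch below does only that. The counts, the `gender_ratio` helper, and the variable names are hypothetical and chosen for illustration; they are not figures or code from the researchers' study.

```python
from collections import Counter


def gender_ratio(pairs, activity, gender="woman"):
    """Return the fraction of `activity` examples labeled with `gender`."""
    counts = Counter(g for a, g in pairs if a == activity)
    total = sum(counts.values())
    if total == 0:
        raise ValueError(f"no examples for activity {activity!r}")
    return counts[gender] / total


# Hypothetical toy data: (activity, gender label) pairs, not real dataset counts.
training_labels = [("cooking", "woman")] * 66 + [("cooking", "man")] * 34
model_predictions = [("cooking", "woman")] * 84 + [("cooking", "man")] * 16

train_bias = gender_ratio(training_labels, "cooking")
predicted_bias = gender_ratio(model_predictions, "cooking")

# A positive gap means the model's output is more skewed than its training data.
amplification = predicted_bias - train_bias

print(f"training: {train_bias:.2f}, "
      f"predicted: {predicted_bias:.2f}, "
      f"amplification: {amplification:+.2f}")
```

With these made-up counts, the model's predictions tie cooking to women more often than the training labels do, which is the kind of gap the researchers describe.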

Mark Yatskar, a researcher at the Allen Institute for Artificial Intelligence, says that this phenomenon could also amplify other biases in data, such as those related to race. “This could work to not only reinforce existing social biases but actually make them worse,” says Yatskar, who worked with Ordóñez and others on the project while at the University of Washington.

As sophisticated machine-learning programs proliferate, such distortions matter. In the researchers’ tests, people pictured in kitchens, for example, became even more likely to be labeled “woman” than the training data alone would suggest. The researchers’ paper includes a photo of a man at a stove labeled “woman.”

If replicated in tech companies, these problems could affect photo-storage services, in-home assistants with cameras like the Amazon Look, or tools that use social media photos to discern consumer preferences. Google accidentally demonstrated the dangers of inappropriate image software in 2015, when its photo service tagged black people as gorillas.

As AI-based systems take on more complex tasks, the stakes will become higher. Yatskar describes a future robot that, when unsure of what someone is doing in the kitchen, offers a man a beer and a woman help washing dishes. “A system that takes action that can be clearly attributed to gender bias cannot effectively function with people,” he says.

Tech companies have come to lean heavily on software that learns from piles of data, after breakthroughs in machine learning roughly five years ago. More recently, researchers have begun to show how techniques considered cold and clinical can pick up unsavory biases.

Last summer, researchers from Boston University and Microsoft showed that software trained on text collected from Google News reproduced gender biases well documented in humans. When they asked software to complete the statement “Man is to computer programmer as woman is to X,” it replied, “homemaker.”
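Analogy puzzles like this are commonly answered with simple arithmetic on word vectors: subtract “man” from “programmer,” add “woman,” and pick the closest remaining word. The sketch below shows only those mechanics, and it is an assumption that the study worked this way. The three-dimensional vectors are hand-picked to mimic a biased embedding, so the answer is built into the toy data; they are not values learned from Google News, and `complete_analogy` is a hypothetical helper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def complete_analogy(vectors, a, b, c):
    """Answer 'a is to b as c is to ?' by nearest neighbor to b - a + c."""
    target = tuple(
        vb - va + vc
        for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])
    )
    # Exclude the three query words so the answer is a new word.
    candidates = (w for w in vectors if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Hypothetical toy embedding; axes loosely read as (gender, tech, domestic).
toy_vectors = {
    "man": (1.0, 0.0, 0.0),
    "woman": (-1.0, 0.0, 0.0),
    "programmer": (0.8, 1.0, 0.0),
    "engineer": (0.7, 0.9, 0.1),
    "homemaker": (-0.8, 0.0, 1.0),
}

answer = complete_analogy(toy_vectors, "man", "programmer", "woman")
print(answer)
```

Because “programmer” sits close to “man” in this toy space, moving toward “woman” drags the result away from the technical words and lands on “homemaker,” illustrating how a skew in the vectors alone can produce a biased completion.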

The new study shows that gender bias is built into two big sets of photos, released to help software better understand the content of images. The researchers looked at ImSitu, created by the University of Washington, and COCO, initially coordinated by Microsoft, and now also cosponsored by Facebook and startup MightyAI. Each collection contains more than 100,000 images of complex scenes drawn from the web, labeled with descriptions.

Both datasets contain many more images of men than women, and the objects and activities depicted with different genders show what the researchers call “significant” gender bias. In the COCO dataset, kitchen objects such as spoons and forks are strongly associated with women, while outdoor sporting equipment such as snowboards and tennis rackets is strongly associated with men.