French company Idemia’s algorithms scan faces by the million. The company’s facial recognition software serves police in the US, Australia, and France. Idemia software checks the faces of some cruise ship passengers landing in the US against Customs and Border Protection records. In 2017, a top FBI official told Congress that a facial recognition system that scours 30 million mugshots using Idemia technology helps “safeguard the American people.”

But Idemia’s algorithms don’t always see all faces equally clearly. July test results from the National Institute of Standards and Technology indicated that two of Idemia’s latest algorithms were significantly more likely to mix up black women’s faces than those of white women, or black or white men.

The NIST test challenged algorithms to verify that two photos showed the same face, similar to how a border agent would check passports. At sensitivity settings where Idemia’s algorithms falsely matched different white women’s faces at a rate of one in 10,000, they falsely matched black women’s faces about once in 1,000, or 10 times more frequently. A one in 10,000 false match rate is often used to evaluate facial recognition systems.
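To make that arithmetic concrete, here is a minimal Python sketch of this kind of comparison. It is not NIST's test code: the integer similarity scores are synthetic, and the helper names (`false_matches`, `threshold_for_one_in`) are hypothetical. It picks the sensitivity threshold at which one reference group sees one false match per 10,000 comparisons of different people, then counts false matches for a second group at that same threshold.

```python
from bisect import bisect_left


def false_matches(impostor_scores, threshold):
    """Count different-person pairs scored at or above the threshold."""
    ordered = sorted(impostor_scores)
    return len(ordered) - bisect_left(ordered, threshold)


def threshold_for_one_in(impostor_scores, one_in):
    """Lowest observed score that keeps false matches to at most 1 in `one_in`."""
    ordered = sorted(impostor_scores)
    total = len(ordered)
    for candidate in sorted(set(ordered)):
        count = total - bisect_left(ordered, candidate)
        if count * one_in <= total:  # integer check avoids float rounding
            return candidate
    # No observed score is strict enough, so reject everything seen.
    return ordered[-1] + 1


# Synthetic impostor scores (pairs of photos of DIFFERENT people).
# Assumption: group B's scores sit slightly higher than group A's.
group_a = list(range(10_000))            # scores 0 .. 9999
group_b = [s + 9 for s in range(10_000)]  # scores 9 .. 10008

# Calibrate on group A to a one-in-10,000 false match rate.
threshold = threshold_for_one_in(group_a, 10_000)

fm_a = false_matches(group_a, threshold)  # 1 false match in 10,000
fm_b = false_matches(group_b, threshold)  # 10 false matches in 10,000

print(f"threshold={threshold}")
print(f"group A: {fm_a} in {len(group_a)}")
print(f"group B: {fm_b} in {len(group_b)}")
```

With these made-up numbers, a small upward shift in one group's scores turns a one in 10,000 false match rate into one in 1,000 at the same threshold, which is the shape of the gap described above.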

Donnie Scott, who leads the US public security division at Idemia (previously known as Morpho), says the algorithms tested by NIST have not been released commercially, and that the company checks for demographic differences during product development. He says the differing results likely came from engineers pushing their technology to get the best overall accuracy on NIST’s closely watched tests. “There are physical differences in people and the algorithms are going to improve on different people at different rates,” he says.

Computer vision algorithms have never been so good at distinguishing human faces. NIST said last year that the best algorithms got 25 times better at finding a person in a large database between 2010 and 2018, and miss a true match just 0.2 percent of the time. That’s helped drive widespread use in government, commerce, and gadgets like the iPhone.

But NIST’s tests and other studies have repeatedly found that the algorithms have a harder time recognizing people with darker skin. The agency’s July report covered tests on code from more than 50 companies. Many top performers in that report show performance gaps similar to Idemia’s 10-fold difference in error rate for black and white women. NIST has published results of demographic tests of facial recognition algorithms since early 2017. The agency has also consistently found that the algorithms perform less well for women than men, an effect believed to be driven at least in part by the use of makeup.

“White males ... is the demographic that usually gives the lowest FMR,” or false match rate, the report states. “Black females ... is the demographic that usually gives the highest FMR.” NIST plans a detailed report this fall on how the technology works on different demographic groups.

NIST’s studies are considered the gold standard for evaluating facial recognition algorithms. Companies that do well use the results for marketing. Chinese and Russian companies have tended to dominate the rankings for overall accuracy, and tout their NIST results to win business at home. Idemia issued a press release in March boasting that it performed better than competitors for US federal contracts.

Many facial recognition algorithms are more likely to mix up black faces than white faces. Each chart represents a different algorithm tested by the National Institute of Standards and Technology. Those with a solid red line uppermost incorrectly match black women's faces more than other groups. Credit: NIST

The Department of Homeland Security has also found that darker skin challenges commercial facial recognition. In February, DHS staff published results from testing 11 commercial systems designed to check a person’s identity, as at an airport security checkpoint. Test subjects had their skin pigment measured. The systems that were tested generally took longer to process people with darker skin and were less accurate at identifying them—although some vendors performed better than others. The agency’s internal privacy watchdog has said DHS should publicly report the performance of its deployed facial recognition systems, like those in trials at airports, on different racial and ethnic groups.