China’s leading Internet search company, Baidu, says that data collected from its customers could be used to predict and preëmpt potentially deadly crowd gatherings in the real world.

Baidu has an incredible amount of data to mine. Out of a total population of 1.35 billion in China, more than 657 million people use its services, and some 302 million use its map services each month.

The Baidu research also highlights how the digital trails left by Internet users can be used to understand city dynamics. Baidu’s data is already being used in China to show city planners where to place transportation, facilities, or shops. However, some experts worry that such data mining might also help the government keep an eye out for social unrest.

The work comes from a lab at Baidu that is exploring ways to mine the company’s vast data for insights on social trends and behavior. The same group previously showed just how deserted some urban areas of China, known as “ghost cities,” really are (see “Data Mining Reveals the Extent of China’s Ghost Cities”).

To predict crowd problems, Baidu’s researchers trained a machine-learning system to analyze users’ online map queries and predict how they relate to the movements of real people, as measured by the mobile positioning data that Baidu collects through its mobile apps. “With more than 70 percent of overall market share in China, Baidu map has an innate advantage to tackle this problem,” the researchers write in a paper describing the work.
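One simple way to picture the relationship between map queries and later crowd sizes is a lagged correlation: shift the query series forward hour by hour and see which shift lines up best with the positioning counts. The sketch below is only an illustration under stated assumptions, not the researchers’ model; the function names and the hourly counts are hypothetical.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def best_lead_hours(queries, crowd, max_lag=6):
    """Return the lag, in hours, at which map queries best correlate with later crowd counts."""
    scores = {}
    for lag in range(max_lag + 1):
        # Pair the query count at hour t with the crowd count at hour t + lag.
        shifted_queries = queries[: len(queries) - lag]
        later_crowd = crowd[lag:]
        scores[lag] = pearson(shifted_queries, later_crowd)
    return max(scores, key=scores.get)


# Hypothetical hourly counts for one location (made up for illustration).
queries = [5, 6, 5, 30, 60, 20, 6, 5, 5, 5]
crowd = [100, 110, 100, 105, 100, 600, 1200, 400, 120, 100]

lead = best_lead_hours(queries, crowd)
print(f"Queries lead crowd counts by about {lead} hour(s)")
```

In this made-up series the query spike precedes the crowd spike, so the best-scoring lag is the gap between the two peaks. A real system would need far more data and a proper predictive model.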

A screenshot of Baidu’s crowd-control tool.

The researchers found they could determine, up to three hours in advance, when and where a dangerously large number of people might congregate. Drawing on historical data, such as data from the New Year’s Eve gathering in Shanghai in 2014 where 36 people were killed in a stampede, the company could use the approach to warn Chinese authorities about gatherings that might become dangerous. “It is possible to give a very early warning for abnormal crowd events, to avoid crowd disasters,” the researchers say.
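The idea of an early warning can be sketched as a simple anomaly check: compare the current number of map queries for a place against what is typical there, and raise a flag when the count is far above normal. This is a minimal sketch that assumes a fixed statistical threshold; it is not the method described in the paper, and the function name, the factor `k`, and all the numbers are hypothetical.

```python
from statistics import mean, stdev


def early_warnings(baseline, today, k=3.0):
    """Return the hours of `today` whose query count exceeds mean + k * std of `baseline`."""
    threshold = mean(baseline) + k * stdev(baseline)
    return [hour for hour, count in enumerate(today) if count > threshold]


# Hypothetical hourly map-query counts for one venue (made up for illustration).
baseline = [40, 45, 50, 42, 48, 44, 46, 43]  # typical hours
today = [41, 47, 44, 52, 95, 160, 210]       # hours leading up to an event

flagged = early_warnings(baseline, today)
print("Hours flagged as abnormal:", flagged)
```

Because queries about a destination tend to come before people arrive, a flag on the query count could in principle be raised ahead of the crowd itself.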

The crowd-monitoring technology may be incorporated into Baidu’s maps, which already feature heat maps showing people’s movements. Such maps can highlight areas that should be avoided because of overcrowding.

The map data collected by Baidu is anonymous, so it could not be used to track individuals. However, some experts say the kind of data collected by Baidu might provide a way for the government to monitor unrest.

“We have seen similar things from phone location data,” says Sandy Pentland, a professor at MIT’s Media Lab, whose group uses mobile data signals to track and predict behavior.

“I think the interesting question is the trade-off between community safety and individual privacy or government control,” Pentland adds. “In our data many of the spontaneous crowd events turned out to be protests and riots. Given that we can predict such events, will the government try to suppress political expression?”