Using search data collected by Google, researchers at the Pew Research Center have reconstructed the journeys taken by refugees flowing into Europe from the Middle East. The work represents a new way of tracking migration patterns, but the technique could eventually be misused.




In 2015 and 2016, some 2.5 million refugees left Syria and Iraq in search of asylum in the European Union, Norway, and Switzerland. A good portion of these migrants had smartphones, which they used to access maps, locate safe and open border crossings, look up details about their destinations, and seek travel advice on social media. In doing so, and entirely without realizing it, the refugees left a trail of digital breadcrumbs that the Pew Research Center would eventually use to recreate their journeys in startling detail.



The Pew researchers drew this information from publicly accessible Google Trends search data, which shows the paths taken by refugees, the times of day they were most likely to make their perilous journeys, and their eventual destinations. The resulting Pew report, titled “The Digital Footprint of Europe’s Refugees,” was published earlier today.

For the analysis, the researchers relied on a surprisingly simple formula: look for trends in Arabic-language search queries originating in countries where Arabic is not widely spoken. Most refugees crossed into Greece from Turkey by sea before continuing on to final destinations in Europe. The presence of the migrants in these countries, where little Arabic is spoken, accounted for the disproportionate number of Arabic-language searches made in Turkey and elsewhere during 2015 and 2016, the period covered by the new study.


When they compared search terms with migration data collected by government authorities, the researchers found several interesting correlations. For example, Google searches in Arabic for the word “Greece” mirrored fluctuations in the number of refugees crossing the Aegean Sea to Greece. And by pairing the search term “smuggler” with “Greece,” the researchers were able to determine how the migrants planned their journeys. The queries even allowed the researchers to pinpoint the time of day migrants were most likely to make their way across the Mediterranean; searches for “Greece” typically happened during the early morning hours, when the chances of being detected were minimal.


Revealingly, Arabic-language searches for “Greece” from within Turkey peaked in August 2015, and two months later the volume of migrants arriving in Greece also peaked. This two-month gap happens to be the time it takes to process asylum seekers traveling between the two countries.
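To make the idea of a lag between search interest and arrivals concrete, here is a minimal sketch of one way such a gap could be estimated: shift one monthly series against the other and keep the shift with the highest Pearson correlation. This is not the method described in the Pew report. The function names (`pearson`, `best_lag`) are hypothetical, and the numbers are invented for illustration, not taken from Pew or Google Trends.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def best_lag(searches, arrivals, max_lag):
    """Return (lag, r): the number of months by which arrivals trail
    searches that gives the highest correlation, and that correlation."""
    best = None
    for lag in range(max_lag + 1):
        # Pair searches in month t with arrivals in month t + lag.
        x = searches[:len(searches) - lag]
        y = arrivals[lag:]
        r = pearson(x, y)
        if best is None or r > best[1]:
            best = (lag, r)
    return best

# Invented monthly index values (assumption, for illustration only).
searches = [10, 20, 45, 100, 60, 30, 15, 10]
arrivals = [5, 8, 12, 22, 48, 100, 62, 28]

lag, r = best_lag(searches, arrivals, max_lag=3)
print(f"best lag: {lag} months, correlation: {r:.3f}")
```

With these made-up series, the arrivals curve is roughly the search curve shifted two months later, so the sketch picks a two-month lag. Real data would be noisier, and a high correlation at some lag would not by itself explain why the gap exists.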



The search patterns also closely mirrored asylum applications once the refugees were in Europe, where around 57 percent of them applied for asylum in Germany. Like Turkey, Germany has a largely non-Arabic-speaking population, and the number of asylum applications in Germany tracked the volume of Arabic-language searches made from within Germany.

Much of this analysis certainly seems intuitive, even a bit of a no-brainer. Of course we can track migration patterns by matching them to search queries. But this exercise does represent an important proof of concept. The technique clearly helps with after-the-fact analyses like this one, but it could conceivably be used by government agencies or other groups to track the movements of refugees in real time. That would allow them to anticipate those movements, and possibly even set up barriers to entry (both physical and bureaucratic). On the positive side, it could alert coast guard personnel to the presence of migrant ships at sea, or to other high-risk migration strategies. Moving forward, Google will need to be aware of how its data may be used and misused, and guard that data accordingly.


We reached out to Google for comment and will update this post should we hear back.

[Pew Research Center]