Abstract The pervasiveness of mobile devices, which is increasing daily, is generating a vast amount of geo-located data that allows us to gain further insights into human behavior. In particular, this technology enables users to communicate through mobile social media applications, such as Twitter, anytime and anywhere. Thus, geo-located tweets offer the possibility to carry out in-depth studies on human mobility. In this paper, we study the use of Twitter in transportation by identifying tweets posted from roads and rails in Europe between September 2012 and November 2013. We compute the percentage of highway and railway segments covered by tweets in 39 countries. The coverages differ widely from country to country, and their variability can be partially explained by differences in Twitter penetration rates. Still, some of these differences might be related to cultural factors regarding mobility habits and online social interaction. For particular road sections, our results show a positive correlation between the number of tweets on the road and the Average Annual Daily Traffic on highways in France and in the UK. Transport modality can be studied with these data as well, and here we find very heterogeneous usage patterns across the continent.

Citation: Lenormand M, Tugores A, Colet P, Ramasco JJ (2014) Tweets on the Road. PLoS ONE 9(8): e105407. https://doi.org/10.1371/journal.pone.0105407
Editor: Matjaz Perc, University of Maribor, Slovenia
Received: May 21, 2014; Accepted: July 19, 2014; Published: August 20, 2014
Copyright: © 2014 Lenormand et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The authors confirm that all data underlying the findings are fully available without restriction. The data analyzed are publicly available as they come from public online sites (Twitter https://dev.twitter.com and OpenStreetMap http://www.openstreetmap.org).
Funding: Partial financial support has been received from the Spanish Ministry of Economy (MINECO) and FEDER (EU) under projects MODASS (FIS2011-24785) and INTENSE@COSYP (FIS2012-30634), and from the EU Commission through projects EUNOIA, LASAGNE and INSIGHT. ML acknowledges funding from the Conselleria d'Educacio, Cultura i Universitats of the Government of the Balearic Islands and JJR from the Ramon y Cajal program of MINECO. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors confirm that José Javier Ramasco is a PLOS ONE Editorial Board member and this does not alter the authors' adherence to PLOS ONE Editorial policies and criteria.

Introduction An increasing amount of geo-located data is generated every day through mobile devices. This information allows for a better characterization of social interactions and human mobility patterns [1], [2]. Indeed, several data sets coming from different sources have been analyzed during the last few years. Some examples include cell phone records [3]–[16], credit card use information [17], GPS data from devices installed in cars [18], [19], geolocated tweets [20]–[25] or Foursquare data [26]. This information has led to notable insights into human mobility at the individual level [5], [24], but it also makes it possible to introduce new methods to extract origin-destination tables at a more aggregated scale [7], [13], [25], to study the structure of cities [16] and even to determine land use patterns [11], [12], [15], [25]. In this work, we analyze a Twitter database containing over 5 million geo-located tweets from 39 European countries with the aim of exploring the use of Twitter in transport networks. Two types of transportation systems are considered across the continent: highways and railways. Tweets on the road and on the rail between September 2012 and November 2013 have been identified, and the coverage of the total transportation system is analyzed country by country. Differences between countries arise due to the different adoption or penetration rates of geo-located Twitter technology. However, our results show that the penetration rate cannot explain the full picture: the remaining differences across countries may be related to the cultural diversity at play. The paper is structured as follows. In the first section, the datasets are described and the method used to identify tweets on highways and railways is outlined. In the second section, we present the results, starting with general features of the Twitter database and then comparing different European countries by the percentage of their highways and railways covered by tweets. Finally, the number of tweets on the road is compared with the Average Annual Daily Traffic (AADT) in France and in the United Kingdom to assess its capacity as a proxy to measure traffic loads.
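The notion of coverage used above (the percentage of highway or railway segments on which at least one tweet was posted) can be illustrated with a minimal sketch. It is not the procedure used in this paper: it assumes that each segment is a straight line between two endpoints in planar coordinates, that a tweet counts as "on" a segment when it lies within a hypothetical distance threshold `max_dist`, and the function names `point_segment_distance` and `coverage_percentage` are introduced here for illustration only.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the straight segment a-b.

    Assumption: all coordinates are planar (already projected), in the
    same unit as the distance threshold used later.
    """
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Position of the orthogonal projection of p on the line through a and b,
    # clamped to [0, 1] so that it stays on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def coverage_percentage(segments, tweets, max_dist):
    """Percentage of segments with at least one tweet within max_dist."""
    if not segments:
        return 0.0
    covered = sum(
        1
        for a, b in segments
        if any(point_segment_distance(p, a, b) <= max_dist for p in tweets)
    )
    return 100.0 * covered / len(segments)


# Toy data (not from the paper): four consecutive segments and two tweets.
segments = [((0, 0), (10, 0)), ((10, 0), (20, 0)),
            ((20, 0), (30, 0)), ((30, 0), (40, 0))]
tweets = [(5, 0.01), (25, 5.0)]
print(coverage_percentage(segments, tweets, max_dist=0.02))  # 25.0
```

In this toy case only the first segment has a tweet within the threshold, so one segment out of four is covered. A real analysis would need geographic distances and a spatial index rather than the brute-force loop shown here.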

Discussion In this work, we have investigated the use of Twitter in transport networks in Europe. To do so, we have extracted, from a Twitter database containing more than 5 million geo-located tweets, those posted from the highway and railway networks of 39 European countries. First, we show that the countries have different penetration rates for geo-located tweets, with no clear dependence on the economic performance of the country. Our results also show no clear difference between countries in terms of the topological features of the Twitter social network. Dividing the highway and railway systems into segments, we have also studied the coverage of the territory with geo-located tweets. European countries can be ranked according to their highway and railway coverage, which differs widely from country to country. Although some of this disparity can be explained by differences in penetration rate or by the use of different transport modalities, a large dispersion in the data still persists. Part of it could be due to cultural differences among European countries regarding the use of geo-located tools. Finally, we explore whether Twitter can be used as a proxy to measure traffic on highways by comparing the number of tweets and the Average Annual Daily Traffic (AADT) on the highways in the United Kingdom and France. We observe a positive correlation between the number of tweets and the AADT. However, the quality of this relationship is reduced by the short length of some AADT highway segments. We conclude that the number of tweets on the road (or on the rail) can be used as a valuable proxy to analyze the preferred transport modality as well as to study traffic congestion, provided that the segments are long enough to obtain significant statistics.
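The comparison between tweet counts and AADT, and the caveat about short segments, can be sketched as follows. This is only an illustration under stated assumptions, not the analysis carried out in this paper: it uses a plain Pearson correlation coefficient, a hypothetical minimum segment length `min_length_km` below which segments are discarded, and made-up records whose field names (`length_km`, `tweets`, `aadt`) are chosen here for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def tweets_vs_aadt(records, min_length_km):
    """Correlate tweet counts with AADT, keeping only long enough segments.

    Assumption: each record is a dict with the hypothetical keys
    'length_km', 'tweets' and 'aadt'.
    """
    kept = [r for r in records if r["length_km"] >= min_length_km]
    return pearson([r["tweets"] for r in kept], [r["aadt"] for r in kept])


# Made-up records (not data from the paper). The very short segment has
# heavy traffic but no tweets, which distorts the unfiltered correlation.
records = [
    {"length_km": 12.0, "tweets": 10, "aadt": 20000},
    {"length_km": 15.0, "tweets": 20, "aadt": 40000},
    {"length_km": 0.5, "tweets": 0, "aadt": 90000},
    {"length_km": 11.0, "tweets": 30, "aadt": 60000},
]
print(tweets_vs_aadt(records, min_length_km=0.0))
print(tweets_vs_aadt(records, min_length_km=1.0))
```

With these toy values, dropping the short segment changes the coefficient substantially, which mirrors the point made above: segments that are too short collect too few tweets for the count to reflect traffic.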

Author Contributions Conceived and designed the experiments: ML AT PC JJR. Performed the experiments: ML AT PC JJR. Analyzed the data: ML AT PC JJR. Contributed reagents/materials/analysis tools: ML AT PC JJR. Contributed to the writing of the manuscript: ML AT PC JJR.