A lack of useful information about where people go during their leisure time has hindered progress toward understanding what attracts people to, and repels them from, recreation sites around the world. Here, we show that crowd-sourced information can offer new perspectives on this old problem, revolutionizing the way we study people and understand their choices. We hypothesized that pictures could indicate visitors and, furthermore, that photographs uploaded to an image-sharing website could record people's choices and provide useful data worldwide. Our comparison of visitation data collected from 836 sites in 31 countries with data generated from geotagged photographs uploaded to flickr shows that the crowd-sourced data are indeed a suitable proxy for the more traditional time- and labor-intensive empirical estimates. This represents a significant advancement, as this new proxy measure of visitation can be applied almost anywhere: in developed and developing countries, data-poor and data-rich locations, urban areas and wilderness. Wherever people are taking and uploading pictures, we can use that information to indicate their visit and learn from it.

There is considerable variation among sites in the concordance between empirical and photograph-based visitation rates. However, if used carefully, this relationship could inform future studies of marginal or absolute changes in visitation rates. Many questions can be answered through scenario-based assessments of relative changes in visitation across alternative management regimes. Managers could use geotagged photographs to explore marginal changes in visitation rates with changes in ecosystem health, site access, or tourism infrastructure under alternative future scenarios, irrespective of the variability in the relationship between empirical and photograph-based visitation rates. Studies requiring absolute visitation rates should note that the slope of the relationship between empirical and photograph-based visitation rates is consistent across income levels, I, and attraction types, A; only the height of the function varies. In other words, globally, visitation rates derived from field data and images are consistently scaled with a slope of 0.70, but the absolute visitation rate varies with local socioeconomic conditions and attributes of the site. The precision of predictions will hinge on the similarity of the study sites in both geography and the types of attractions. Absolute visitation rates are less variable across sites within nations or from a single destination type, such as a state park or an art gallery (Fig. 2). Wherever some local visitation data are available, the height (and potentially the slope) of the relationship between empirical and photograph-based visitation rates could be adjusted to suit the research questions and particular study region. We encourage researchers and practitioners to seek additional socioeconomic factors beyond income and attraction type that explain local variability in the absolute visitation rate.
Despite the variability in the relationship between empirical and photograph-based visitation rates, this study presents promising new evidence that visitation rates can be quantified, in both relative and absolute terms, using geotagged photographs and a few easily measured variables of a site.
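The local adjustment described above, holding the global slope at 0.70 and re-estimating only the height from a handful of local observations, can be illustrated with a minimal sketch. It assumes the relationship is linear on whatever scale the fitted model uses (for example, after a transformation), and the function names `calibrate_height` and `predict_visitation` are hypothetical, introduced only for illustration. With the slope fixed, the least-squares height is simply the mean of the residuals.

```python
from statistics import mean

# Slope reported in the text; everything else here is an illustrative assumption.
GLOBAL_SLOPE = 0.70


def calibrate_height(photo_rates, empirical_rates, slope=GLOBAL_SLOPE):
    """Estimate the local height (intercept) with the slope held fixed.

    Both inputs are assumed to be on the scale on which the relationship
    is linear. With a fixed slope, the least-squares intercept is the
    mean of (empirical - slope * photo).
    """
    if not photo_rates or len(photo_rates) != len(empirical_rates):
        raise ValueError("need two non-empty sequences of equal length")
    return mean(e - slope * p for p, e in zip(photo_rates, empirical_rates))


def predict_visitation(photo_rate, height, slope=GLOBAL_SLOPE):
    """Predict the empirical visitation rate from a photograph-based rate."""
    return height + slope * photo_rate


if __name__ == "__main__":
    # Hypothetical local calibration data (not from the study).
    local_photo = [1.0, 2.0, 3.0]
    local_empirical = [1.7, 2.4, 3.1]
    height = calibrate_height(local_photo, local_empirical)
    print(round(predict_visitation(4.0, height), 3))
```

If enough local data are available, the slope could be re-estimated as well; fixing it is the simpler choice when only a few calibration sites exist.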

Home countries given by users on flickr correspond with the home countries of travelers recorded at immigration entry points, making crowd-sourced data useful not only for estimating visitation rates, but also for understanding where visitors originate. Because the time and money that people spend traveling indicate how much they value the destination, these data on the origin and destination of recreationists are enormously beneficial for economic methods for valuing recreation sites. One preferred approach for quantifying value is the “travel cost model”, which uses the cost of travel to estimate people's willingness to pay to recreate at particular sites [27]. Travel cost studies are often criticized for not accounting for people who visit multiple sites on a single trip away from home. Crowd-sourced visitation data can potentially address this issue, since users often upload images throughout their journey.

The ability to estimate visitation rates without survey data allows for models that can anticipate changes in visitation in response to changes in ecosystems, relative to other types of change (in built features, social capital, etc.). Random utility models are one example of an economic technique for quantifying the marginal benefits of natural environments and other attributes. Typically, telephone surveys are conducted asking respondents where they live, which recreational sites they visit and why. These individuals' choices about which sites to visit reflect their preferences for certain characteristics of sites and the tradeoff between the costs (e.g., travel) and benefits (e.g., presence of wildlife) of the trip. Here, we show that the same data can be gathered using the locations of photographs and spatial data on the presence of features such as swimming beaches, cultural events, or other attractions. Encouraging evidence that this approach is suitable for understanding people's choices comes from the match between flickr photos and known temporal aggregations of people in Zuccotti Park, Black Rock Desert and southern Vermont (Fig. 6). We offer these as initial examples and hope to spark further use of this approach to understand what attracts people to, and repels them from, particular places.
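To make the tradeoff between costs and benefits concrete, the sketch below uses a conditional (multinomial) logit, one common form of random utility model, to turn site attributes into choice probabilities. It is an illustration under stated assumptions, not the model used in this study: the site names, attributes and coefficient values are all hypothetical, and in practice the coefficients would be estimated from observed choices.

```python
import math


def choice_probabilities(sites, coefficients):
    """Conditional logit: P(site j) = exp(V_j) / sum_k exp(V_k).

    `sites` maps a site name to a dict of attribute values, and
    `coefficients` maps attribute names to utility weights. The
    deterministic utility V_j is the weighted sum of attributes.
    """
    utilities = {
        name: sum(coefficients[attr] * values[attr] for attr in coefficients)
        for name, values in sites.items()
    }
    # Subtract the maximum utility before exponentiating for numerical stability.
    max_utility = max(utilities.values())
    weights = {name: math.exp(u - max_utility) for name, u in utilities.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


if __name__ == "__main__":
    # Hypothetical values: travel cost lowers utility, wildlife presence raises it.
    coefficients = {"travel_cost": -0.05, "wildlife": 1.0}
    sites = {
        "near_beach": {"travel_cost": 10.0, "wildlife": 0.0},
        "far_reserve": {"travel_cost": 40.0, "wildlife": 1.0},
    }
    for name, p in choice_probabilities(sites, coefficients).items():
        print(name, round(p, 3))
```

In a photograph-based application, the observed choices would come from the locations of geotagged photographs rather than from survey responses.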

Of course, this method is imperfect. There may be biases in who is taking digital photographs and uploading them to social media sites. Different recreational activities may be more or less suited to taking photographs. Surfers, for example, while likely possessing cameras and internet access, may prefer not to take photographs while surfing. Also, the perceived value of a trip may influence whether an individual takes or shares photographs, resulting in a bias against images from visitors who travel shorter distances from home. We observe, for example, that tourists visiting Nepal from nearby Sri Lanka and India upload fewer photographs to flickr than predicted based on the overall trend (Fig. 5). Similarly, local visitors may be less inspired to take or share photographs of commonly visited sites. While we find strong correlations between the crowd-sourced information and empirical data at attractions such as national parks, we do not look at correlations between crowd-sourced information and visitation to more mundane locations, like shopping centers, that might be popular sites for recreation by local people. Further work is needed to explore the utility of this approach at locations that are not major attractions or landmarks. Other social media, such as geotagged tweets, might serve as more effective proxies for some types of recreation, particularly in urban areas.

New technologies and digital social media have begun making vast amounts of geolocated data available for a wide range of creative purposes, including art, targeted advertising, crime prevention and scientific research. Some authors are rightfully raising concerns about the appropriate and ethical use of these data and the potential for apophenia, the tendency to see patterns in “big data” where none actually exist [28]. In response to their calls for more critical assessments of digital data, this study vets a novel method for using geotagged photographs from flickr as a source of information for understanding where people go. We conclude that crowd-sourced information can not only break the log-jam of expensive empirical data requirements for predicting and valuing how changes in the landscape alter recreation and tourism, but also provide revolutionary information for understanding where people recreate, in ways unimaginable before the existence of the internet and social media.