Introduction

Ecology stands at the threshold of a potentially profound change. Ever‐increasing computational power and advances in Internet technologies and tools are together catalyzing new ways of pursuing ecological investigations. These emerging approaches facilitate greater communication, cooperation, collaboration, and sharing, not only of results, but also of data, analytical and modeling code, and potentially even fully documented workflows of the processes (warts and all) that lead to scientific insights. This vision of free and unfettered access to all stages of the scientific endeavor has been called “open science” (Nielsen 2011). As an integrative and highly multidisciplinary field, ecology particularly stands to benefit from this open science revolution, and many ecologists have expressed interest in enhancing the openness of ecology. To date, such conversations among ecologists have largely occurred online (e.g., discussed in Darling et al. 2013); thus it seems timely to present an introduction and path (Tao) to open science for ecologists who may or may not currently be active in the social media forums where the discussion is evolving. We give an overview of the rise of open science, the changes in mindset that open science requires, and the digital tools that can enable ecologists to put the open science mindset into practice.

The exchange of scientific information was institutionalized in the 1660s with the establishment of the Philosophical Transactions of the Royal Society of London and the Journal des Sçavans, the first scientific journals (Beaver and Rosen 1978). While these journals provided platforms for scientists to share their results and ideas, they were largely accessible only to elites—those who could afford a subscription themselves, or those who belonged to an institution that held copies (Nielsen 2011). Individual scientists (i.e., single authors) published in these journals to establish precedence of discovery; the notion of collaboration among scientists does not seem to have taken hold until the 1800s (Beaver and Rosen 1978).

The scientific world looks very different now. Advances in computing power and speed have accelerated not only individual scientists' discoveries but also their collaborative potential (Box 1). Modern scientists constitute a global college, its philosophical transactions enabled by the Internet (Wagner 2008), and collaboration has become the predominant norm for high‐impact research (Wüchty et al. 2007). Technological developments also have enabled the capture (at ever increasing rates) of a previously unimaginable volume of data and metadata (Reichman et al. 2011, Dietze et al. 2013), and have underlain the use of increasingly complex models and analysis techniques to understand these data. Traditional paper notebooks cannot meet the challenges of these new rates of accumulation, sharing, and recombination of ideas, research logs, data, and analyses (Ince et al. 2012, Strasser and Hampton 2012). The tools and approaches that together constitute open science can help ecologists to meet these challenges, by amplifying opportunities for collaboration and rewarding the creation of the consistent and machine‐readable documentation that is necessary for reproducibility of complex projects.

While interest in this new paradigm is on the rise (Fig. 1), it must be acknowledged that both technical and sociocultural obstacles impede adoption for some ecologists. For example, precedence, attribution, investment, and payoff are high‐stakes issues for professional scientists (Hackett 2005). Adopting open practices means ceding some control of these issues, learning new standards and practices for exerting control over others, and devoting precious time to revising familiar modes of research and communication in a seemingly foreign language (Box 2). Yet hewing to traditional practices carries its own risks for the individual investigator. Errors and oversights can persist far longer when experimental design, raw data, and data analysis are held in private; even once published, weeks and months can be wasted in chasing reproduction of results because methods are documented only as fully as a journal word count permits; labs can become isolated, their advancement slowed, for lack of substantive interaction with others. As has been demonstrated in other disciplines, open science can help to mitigate these risks, to the immediate benefit of the individual practitioner (Lawrence 2001, Davis and Fromerth 2007). A community can help the individual scientist identify pre‐publication errors, before they result in paper retractions, damaged reputations, and scientific backtracking.

Figure 1. Increasing usage of the term “open science” in the literature since 1995 in Web of Science and PubMed databases. Data from PubMed were downloaded via the rentrez (Winter and Chamberlain 2014) package in R, and Web of Science data were collected from manual searches. Results were normalized by total articles published each year to account for the increasing number of publications. Both data sources show an increase in the number of publications about open science, and an increase in annual citations of those papers.
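The normalization described in the Figure 1 caption can be illustrated with a minimal sketch. It is not the authors' code (the caption states that the PubMed data were retrieved with rentrez in R); the function name `normalize_by_year`, the per-100,000 scaling, and all counts below are hypothetical and chosen only to show the arithmetic of dividing each year's matching articles by that year's total.

```python
def normalize_by_year(term_counts, total_counts, per=100_000):
    """Return matching articles per `per` published articles, keyed by year.

    Both arguments are dicts mapping year -> article count. Years with a
    missing or non-positive total are skipped rather than guessed at.
    """
    rates = {}
    for year in sorted(term_counts):
        total = total_counts.get(year, 0)
        if total <= 0:
            continue  # no usable denominator for this year
        rates[year] = term_counts[year] / total * per
    return rates


# Hypothetical counts for illustration only; NOT the data behind Figure 1.
term_counts = {1995: 2, 2005: 10, 2013: 60}
total_counts = {1995: 400_000, 2005: 700_000, 2013: 1_200_000}

rates = normalize_by_year(term_counts, total_counts)
for year, rate in rates.items():
    print(f"{year}: {rate:.2f} per 100,000 articles")
```

Dividing by the yearly total separates growth in attention to the term from growth in the literature as a whole, which is the stated purpose of the normalization.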

Box 1. Technological advances driven by scientists

Every scientist now uses the Internet, but few are aware of how the Internet grew out of a highly collaborative and open process involving development of publicly available and commentable standard protocols (http://www.fcc.gov/openinternet; Cerf 2002). The availability of “open source” software (a term first coined in the 1990s) radically democratized and expanded participation in the Internet community in the late 1980s and early 1990s. “Open source” encompasses not only compilers and applications but also protocols and specifications such as the domain name system (DNS) that allows pinpointing specific networked computers (“hosts”) around the world, and HTTP/HTML specifications that provide the basis for the World Wide Web. Members of the scientific research community were early recipients of these advantages, with the National Science Foundation supporting and nurturing growth of the Internet‐based NSFNET from roughly 1985–1995 (National Science Foundation 2007). In that era, it was scientists who were largely communicating through the Internet (gopher, email), transferring their data (FTP), and running analyses on remote servers (telnet, shell access, X11), often with privileged access to fast networks and accounts on powerful computational servers. Within this computer‐savvy community, “power users” leveraged the Internet most effectively by learning computational skills that were largely command‐line based. The legendary, free GNU suite of software was standard issue for many computers joining the Internet in the late 1980s, and made that early generation of networked “scientific workstations” (from Sun, SGI, DEC, or NeXT) the sought‐after systems of their day. These early forays into powerful software helped birth the plethora of tools now available to the modern scientist.

Today, free, multi‐platform, open source tools from the Linux Foundation (free operating system), the Apache Software Foundation (free Web server), the Mozilla Foundation (free Web, email, and other applications), the PostgreSQL Global Development Group (free enterprise database), the Python Software Foundation (free programming language), and the R Foundation for Statistical Computing (analysis and statistical language) are enabling researchers across the globe to converse with one another via cutting‐edge communication tools, execute powerful data manipulation, and develop community‐vetted modeling and analysis tools at minimal individual cost.

Moreover, open science promises many longer term benefits to the scientific community. The adoption of standard best practices and cultural norms for public archiving of data and code will advance discovery and promote fairness in attribution. The use of open‐source tools and open‐access data and journals will help to further democratize science, diversifying perspectives and knowledge by promoting broader access for scientists in developing countries and at under‐resourced institutions, fostering the citizen science that is already a major source of data in some ecological sub‐disciplines (Cooper et al. 2014), and improving the communication of scientific findings both to the general public (Fausto et al. 2012) and to the non‐governmental organizations, managers, and policymakers tasked with putting science into practice.

Here, we discuss the changes in mindset and the tools that can help interested ecologists to find the path toward practicing open science themselves, to facilitate its practice by their students and other colleagues, or both.