Full Disclosure: The author is a data scientist and consultant working in the environmental realm.

The defining aspect of contemporary Internet usage has been the volume of data generated. As a greater proportion of the world’s population develops Internet literacy, people generate up to 2.5 quintillion bytes of data a day. This has spurred the development of new techniques and tactics for institutions adjusting to the age of big data. International and domestic environmental policy will not be left untouched by these developments. The influx of new environmental data can change enforcement and compliance efforts, as well as open up new ways to maintain corporate and governmental accountability. We’ll look at the impacts that data science, open data, and the ‘Internet of Things’ could have on modern environmental policy and management.

Data science is the combination of statistical techniques with computer science concepts. The running joke is that a data scientist knows more about statistics than a software engineer and more about computer science than a statistician. Data science is often treated as synonymous with machine learning techniques such as neural networks and random forest classification, which can create powerful predictive analytical tools.

These predictive tools can simplify enforcement by identifying potential bad actors. They could also be used to predict species distributions in response to changes in environmental conditions. For example, data from Instagram could be analyzed to detect pictures of wildlife. These pictures typically carry a time, date, and approximate location. After using natural language processing techniques to screen out pictures that may be old or out of date, you could learn about new locations for species. This type of information is especially important for climate and urban ecology, where changes in species distributions are a major topic of discussion.
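The screening step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a real pipeline: the `Sighting` record, the `STALE_HINTS` phrase list, and the bounding-box `known_range` are all hypothetical, and a simple keyword match stands in for proper natural language processing.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical phrases suggesting a photo was taken well before it was posted.
# A real system would use an NLP model rather than substring matching.
STALE_HINTS = ("throwback", "tbt", "years ago", "last summer")


@dataclass
class Sighting:
    species: str
    posted: date
    lat: float
    lon: float
    caption: str


def looks_stale(caption: str) -> bool:
    """Return True if the caption contains any of the stale-photo hints."""
    text = caption.lower()
    return any(hint in text for hint in STALE_HINTS)


def new_locations(sightings, known_range):
    """Return non-stale sightings that fall outside a species' known range.

    known_range maps a species name to an assumed bounding box:
    (min_lat, max_lat, min_lon, max_lon).
    """
    results = []
    for s in sightings:
        if looks_stale(s.caption):
            continue  # skip photos that are probably old
        box = known_range.get(s.species)
        if box is None:
            continue  # no baseline range to compare against
        min_lat, max_lat, min_lon, max_lon = box
        inside = min_lat <= s.lat <= max_lat and min_lon <= s.lon <= max_lon
        if not inside:
            results.append(s)
    return results
```

Given a list of sightings and a baseline range per species, `new_locations` returns only the recent-looking posts that fall outside that range, which are the candidates worth a closer look.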

Open data, meaning data that is freely available for the general public to use, has been an exciting topic in academia. The United States government has done a lot of work in this area with data.gov. Open data initiatives are even popping up at the state and local level, such as the availability of some major data sets from New York City. Open data, when combined with analytical tools, can be used to ensure accountability from NGOs and governments alike.

The main obstacles to the spread of open data are legal and confidentiality risks. For example, if a city publishes its water data and the data show that the water is consistently out of compliance with existing water regulations, the city can invite unwanted attention from state regulators and concerned citizens. On the other hand, actors can be given incentives to take part in open data initiatives through grants and the promise of coproduction in solving environmental ills.

Perhaps the most promising area is the use of sensors connected to the Internet to create live data streams. Sensors uploading data in this manner become part of what is known as the ‘Internet of Things’. These real-time data streams can be linked to prediction algorithms to anticipate potential environmental disasters before they become unmanageable. They could even provide environmental enforcers with notifications of failures to comply with existing regulations.
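As a rough illustration of how a live sensor stream might feed both early warnings and compliance notifications, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `monitor` function, the `limit` value, the window size, and the 80 percent warning fraction are hypothetical, and a simple moving average stands in for a real prediction algorithm.

```python
from collections import deque


def monitor(readings, limit, window=3, warn_fraction=0.8):
    """Yield (timestamp, kind, value) events from a stream of sensor readings.

    readings: iterable of (timestamp, value) pairs.
    limit: hypothetical regulatory limit for the measured quantity.
    A "violation" is a single reading above the limit; a "warning" is a
    moving average that has climbed above warn_fraction * limit.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` values
    for timestamp, value in readings:
        recent.append(value)
        average = sum(recent) / len(recent)
        if value > limit:
            yield (timestamp, "violation", value)
        elif average > warn_fraction * limit:
            yield (timestamp, "warning", value)


# Example with made-up readings and a made-up limit of 15.0 units.
stream = [(0, 10.0), (1, 13.0), (2, 14.0), (3, 16.0)]
events = list(monitor(stream, limit=15.0))
```

Because `monitor` is a generator, it can consume readings as they arrive rather than waiting for a complete data set, which is closer to how a live stream would behave.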

Data science is a rapidly evolving field being applied to a multitude of topic areas ranging from advertising to poaching prevention. Predictive tools can be used to anticipate environmental problems and target bad actors. Open data could enhance the use of these new tools and increase accountability. The ‘Internet of Things’ has the potential to revolutionize environmental monitoring by providing accessible live data streams. All of this can play a role in improving environmental management and accountability. Data science will play a big role in the 21st century, and environmental issues will not go untouched.

Image courtesy of Flickr. Originally published by S&S on June 28, 2016.