Last week, the international weekly science journal Nature announced that, starting this month, all research papers accepted for publication will be required to “include information on whether and how others can access the underlying data.”

This is an important step forward for the growing open science movement, because the raw research data behind a publication is one of the most valuable components of the replication process, if not the most valuable. To make an honest effort to replicate a study’s results and further validate its hypotheses, others must have access to this data. The more accessible the data is, the better.

Unfortunately, more often than not, researchers have to jump through hoops just to get the underlying data from the original author or, even more likely, never get it at all. Ideally, a researcher looking to explore a publication’s data would know exactly where to find it, immediately, without needing to email five people, wait six months, and then lose interest in the study altogether. This is the scientific dreamland Nature is taking a step toward by setting this new requirement for its contributors.

For researchers, sharing data effectively means overcoming substantial obstacles such as time, money, and technical capability. These deter scientists from sharing data, in addition to the fear that others will use their data to prove their findings incorrect. The open science community is essentially placing a high expectation on scientists to share every component of their research, but is failing to acknowledge and address the barriers that currently stand in the way. The harder it is for owners of data to make it available, the less likely they are to do so. So make it easy for them.

Making it as easy as possible to share data was at the forefront of every decision in the development process at Datazar, where we are creating the largest environment for open data collaboration. To do this, we allow scientists to upload data while they are still conducting their research, keeping it private until they are ready to share it with everyone. This makes it much easier to share a large amount of underlying research data; previously, an author would have had to go back, find all of the data they collected, and upload each file individually. To demonstrate what sharing data on the platform looks like for a researcher, we can walk through an example project, which also shows how to write the Data Availability Statement required for a Nature publication.

Project Creation

As a first step, a researcher selects the ‘Project’ tab to create a project where all of the data associated with this particular research can be stored.

After filling in basic information, such as the title and a short description of the research project, you can choose to set the status of the project. This project was set to ‘Private’, since we are still in the middle of conducting research and do not want the data available to everyone yet.

Uploading the Data

Visit this example project at https://www.datazar.com/project/pe3973b2c-fe4c-4f6b-9602-2874a368c37e

Once the project has been created, the author can begin uploading all of the data they are using, or have used, to conduct their research by selecting the ‘Upload a File’ button.

While uploading a file, a user can name it, describe it, and select its file type (Raw Data, Analysis, Visualization, etc.). Since we are dealing with data that was collected specifically for this research, “Raw Data” is selected and the file is then uploaded.

Once uploaded, the file appears on the project overview page.

More files are added as data is collected over the course of the research.

Making the Research Public

Now that the research is complete and published, we want to make the data available for everyone to see. In ‘Settings’, the top section allows you to change the status of your project from ‘Private’ to ‘Public’.

Now that the project has been made public, the status has changed in the overview section, and anyone exploring Datazar can view and download the files. You can also share your project through its unique URL. This is especially convenient when you need to state where your data is available, as in the statement Nature now requires in the ‘Methods’ section.

View this example file at https://www.datazar.com/file/ff7fe3a35-035d-48bb-83cc-fa2ae18b8061

After selecting a file from the ‘Project Overview’ page, you can see information about the file, including its downloads, views, description, and even other datasets associated with it. Each file has its own unique ID (highlighted above), which can be shared and searched for within Datazar.

Sharing the Location

As mentioned before, a data availability statement is now required by publications like Nature. So how would you correctly cite data in a statement according to these requirements? Using the example project and data above, as well as the guidelines published by Nature, a statement would look much like this:

Sequence data that support the findings of this study have been deposited in Datazar, https://www.datazar.com/project/pe3973b2c-fe4c-4f6b-9602-2874a368c37e
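
If you write these statements often, you could generate them from the project URL so the wording stays consistent. The following is only a minimal Python sketch: the function name and its parameters are hypothetical, not part of Datazar or of Nature’s guidelines, and the wording simply mirrors the example sentence above.

```python
from urllib.parse import urlparse


def build_data_availability_statement(data_type: str, repository: str, url: str) -> str:
    """Return a one-sentence statement shaped like the example above.

    Hypothetical helper for illustration only; adjust the wording to the
    journal's current guidelines before using it in a manuscript.
    """
    parsed = urlparse(url)
    # Readers must be able to follow the link, so require an absolute http(s) URL.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an absolute http(s) URL: {url!r}")
    return (
        f"{data_type} that support the findings of this study "
        f"have been deposited in {repository}, {url}"
    )


statement = build_data_availability_statement(
    "Sequence data",
    "Datazar",
    "https://www.datazar.com/project/pe3973b2c-fe4c-4f6b-9602-2874a368c37e",
)
print(statement)
```

Passing a link without a scheme, such as `datazar.com/project/...`, raises a `ValueError` instead of producing a statement that readers cannot follow.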

Moving Forward

As the open science movement continues to grow and become more prevalent in the research community, it is important to be mindful of a big reason why data is not shared in the first place: it is hard to do and takes a great amount of effort. At Datazar, we built something that can be incorporated into the current process, so that researchers can organize their data as they do the research, rather than retracing their steps and becoming overwhelmed at the end. Getting researchers to share their data should not be approached only by creating strict rules and consequences for breaking them. It should also be approached with empathy for the researcher. If you want someone to do something, make it as easy for them as possible.

Explore this project and its datasets, along with many others, at https://www.datazar.com/