As more public and private institutions make their datasets publicly available, the need to share and collaborate on those datasets has increased manyfold.

The sources of data and their veracity are one important aspect. The other is how people can collaborate on privately and publicly available datasets. Collaborating on datasets means being able to do much more than just upload them. Even with the same data source, as new data parameters are added each hour, collaborators need to download the same version of the dataset, keep regular track of data revisions, and share them without much hassle.

This blog post describes how you can use the dstack APIs as a convenient way to address these problems. Creating a dstack account and installing the dstack package are covered in another post.

Share your dataset with others

Are you working on a project where you have to share your dataset with someone else on the team? Do you want to keep track of revisions of the dataset that you have shared with your team? In this scenario, dstack offers APIs to publish a dataset so that it can be shared conveniently and tracked across revisions.

In general, you will:

1. Publish the dataset using the dstack APIs available with the dstack package.
2. Share the dataset using the web application.

Let us start by publishing a dataset. In the following example, you want to share a static dataset that you created yourself.
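As a rough sketch of what this step might look like, the snippet below builds a small static dataframe with pandas and hands it to a publishing function. The column names, the values, the stack name "static_dataset", and the helpers `build_static_dataset`, `publish`, and `fake_push` are all assumptions made for illustration. In particular, `push_fn` is only a stand-in for whatever publishing call your installed version of the dstack package provides; it is not a dstack function.

```python
import pandas as pd


def build_static_dataset() -> pd.DataFrame:
    """Create a small hand-written dataset (illustrative values only)."""
    return pd.DataFrame(
        {
            "city": ["Berlin", "Munich", "Hamburg"],
            "visits": [120, 85, 60],
        }
    )


def publish(df: pd.DataFrame, stack_name: str, push_fn):
    """Hand the dataframe to a caller-supplied publishing function.

    `push_fn` is a hypothetical stand-in for the publishing call of the
    dstack package; it takes a name and a dataframe.
    """
    return push_fn(stack_name, df)


# Local stand-in publisher: records the CSV that would be sent,
# so the sketch runs without an account or network access.
published = {}


def fake_push(name: str, frame: pd.DataFrame) -> str:
    published[name] = frame.to_csv(index=False)
    return name


df = build_static_dataset()
result = publish(df, "static_dataset", fake_push)
```

In real use you would replace `fake_push` with the actual publishing call from the dstack package. Keeping the publisher as a parameter also makes it easy to check locally what will be sent before sharing it with your team.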

The published dataframe can be accessed via the URL