Saving your data in the cloud means that when you send your scripts to your colleagues, you don’t have to send the data or any additional files along with them. When the data lives at a URL rather than a local path like “C://…” or “/home/…”, your script always points to the same address, no matter whose machine it runs on. In this post, we’ll go through how to use a cloud dataset in R scripts. We’re going to be using a dataset containing the population of Earth from 5000 BC until 2016.

There are several ways and packages to access a URL from R. The Datazar SDK uses the “httr” package. Let’s go ahead and grab the Datazar SDK for R. I’ve also included the R code here so you can just copy and paste it into your script.
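For a sense of what an httr-based request involves, here is a minimal sketch of a helper that fetches a JSON file over HTTP with Basic Authentication. This is not the SDK's actual code: the function names build_file_url and fetch_dataset and the api.example.com endpoint are placeholders made up for illustration, and the sketch assumes the “httr” and “jsonlite” packages are installed.

```r
# Hypothetical endpoint: the real Datazar API URL is not given in this post.
build_file_url <- function(fileId, base = "https://api.example.com/files") {
  paste0(base, "/", fileId)
}

# Illustrative helper (an assumption, not the SDK's datazarData function).
fetch_dataset <- function(username, token, fileId) {
  # Send a GET request with HTTP Basic Authentication credentials.
  resp <- httr::GET(
    build_file_url(fileId),
    httr::authenticate(username, token, type = "basic")
  )
  # Turn HTTP errors (401, 404, ...) into R errors instead of parsing them.
  httr::stop_for_status(resp)
  # Read the body as text and parse the JSON into R objects.
  body <- httr::content(resp, as = "text", encoding = "UTF-8")
  jsonlite::fromJSON(body)
}
```

With the SDK loaded you don't need any of this, since the SDK function handles the request and the parsing for you.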

We’ll be using the datazarData function.

Here are the parameters we need:

username

token

fileId

myUsername<-"aman"

myToken<-"mysupersecrettokenthaticantshow"

fileId<-"f7cb0a20c-2f1c-4ad5-9d05-900d7af97a9c"

data<-datazarData(myUsername,myToken,fileId)
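If you plan to share the script, one option is to keep the token out of the file and read it from an environment variable instead. The sketch below is only a suggestion: the helper get_credential and the variable names DATAZAR_USERNAME and DATAZAR_TOKEN are made up for this example and are not part of the Datazar SDK.

```r
# Hypothetical helper: read a credential from an environment variable
# and fail loudly if it has not been set.
get_credential <- function(name) {
  value <- Sys.getenv(name, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable ", name, " is not set", call. = FALSE)
  }
  value
}

# Usage, assuming both variables are set (for example in ~/.Renviron):
# myUsername <- get_credential("DATAZAR_USERNAME")
# myToken    <- get_credential("DATAZAR_TOKEN")
```

This way each colleague supplies their own credentials and the script itself never contains a secret.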

That’s it! All done. There’s no need to parse the JSON since the datazarData function takes care of it. Let’s go ahead and plot it so we can see what it looks like.

plot(data,"Year","Population")

R Plot of the streamed dataset.
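Since the data now arrives over the network, it can be worth checking that it has the shape you expect before relying on it. Here is a small sketch of such a check. The helper check_columns is made up for this example, and it assumes the dataset has columns named Year and Population, which this post does not confirm.

```r
# Hypothetical helper: stop early if the streamed data frame lacks the
# columns we expect. The default column names are an assumption.
check_columns <- function(df, required = c("Year", "Population")) {
  absent <- setdiff(required, names(df))
  if (length(absent) > 0) {
    stop("Missing column(s): ", paste(absent, collapse = ", "), call. = FALSE)
  }
  # Return the data frame invisibly so the call can sit in a pipeline.
  invisible(df)
}

# Usage, once `data` has been streamed:
# check_columns(data)
# str(data)   # column types
# head(data)  # first few rows
```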

Conclusion

We went over how to stream datasets directly from the cloud. This method uses HTTP “Basic Authentication” to authenticate your requests to the Datazar API while you’re streaming your datasets.
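If you’re curious what Basic Authentication looks like on the wire, the client joins the username and token with a colon, Base64-encodes the result, and sends it in an Authorization header. Here is a rough sketch of that step. The function name basic_auth_header is made up for this example, it assumes the “jsonlite” package is installed, and in practice “httr” builds this header for you.

```r
# Hypothetical helper: build the value of the Authorization header used by
# HTTP Basic Authentication ("Basic " followed by Base64 of "user:password").
basic_auth_header <- function(username, token) {
  credentials <- paste0(username, ":", token)
  encoded <- jsonlite::base64_enc(charToRaw(credentials))
  paste("Basic", encoded)
}
```

Base64 is an encoding, not encryption, so the credentials are only protected in transit when the request goes over HTTPS.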

I have included the R script in a project, so you can use that one if you want to.

R Script link.

Just change the parameters to your own Datazar username and token. Making this your standard practice keeps your data in one location, so you and your colleagues never have to change dataset paths in your scripts.

Hope you enjoyed this! Feel free to ask questions if you’re stuck somewhere.

Note: there’s a related post on how to do the exact same thing with Mathematica.