Data operations is an increasingly important part of data science because it enables companies to feed large business data back into production effectively. At STATWORX, we therefore operationalize our models and algorithms by translating them into Application Programming Interfaces (APIs). Representational State Transfer (REST) APIs are well suited to a modern microservices infrastructure: they are flexible, easy to deploy, scale, and maintain, and they can be accessed by multiple clients and client types at the same time. Their primary purpose is to simplify programming by abstracting the underlying implementation and exposing only the objects and actions needed for further development and interaction.

An additional advantage of APIs is that they make it easy to combine code written in different programming languages or by different development teams. APIs are naturally separated from each other, and communication with and between them is addressed by IP or URL over HTTP, typically using JSON or XML. Imagine, e.g., an infrastructure in which an API written in Python and one written in R communicate with each other and serve an application written in JavaScript.

In this blog post, I will show you how to translate a simple R script, which transforms tables from wide to long format, into a REST API with the R package Plumber and how to run it locally or with Docker. I have created this example API for our trainee program, and it serves our new data scientists and engineers as a starting point to familiarize themselves with the subject.

Translate the R Script

Transforming an R script into a REST API is quite easy. All you need, in addition to R and RStudio, is the package Plumber and, optionally, Docker. You interact with a REST API by sending a REST Request; the most commonly used request types are probably GET, PUT, POST, and DELETE. Here is the code of the example API, which transforms tables from wide to long or from long to wide format:

    ## transform wide to long and long to wide format
    #' @post /widelong
    #' @get /widelong
    function(req) {
      # library
      require(tidyr)
      require(dplyr)
      require(magrittr)
      require(httr)
      require(jsonlite)

      # post body
      body <- jsonlite::fromJSON(req$postBody)

      .data <- body$.data
      .trans <- body$.trans
      .key <- body$.key
      .value <- body$.value
      .select <- body$.select

      # wide or long transformation
      if(.trans == 'l' || .trans == 'long') {
        .data %<>% gather(key = !!.key, value = !!.value, !!.select)
        return(.data)
      } else if(.trans == 'w' || .trans == 'wide') {
        .data %<>% spread(key = !!.key, value = !!.value)
        return(.data)
      } else {
        print('Please specify the transformation')
      }
    }

As you can see, it is a standard R function extended by the special Plumber comments @post and @get, which enable the API to respond to those types of requests. Every incoming request must include the path /widelong. The path is needed because several API functions, each responding to a different path, can be stacked in one API. We could, e.g., add another function with the path /naremove to our API, which removes NAs from tables.
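To make the idea of stacking concrete, here is a minimal sketch of what such a /naremove function could look like if it were added to the same file. It is not part of the example API: the helper name `remove_na_rows` is hypothetical, and the sketch assumes the request body carries the table in a `.data` field, as the /widelong function does.

```r
# Hypothetical helper (not part of the original API):
# keep only the rows of a data frame that contain no NA.
remove_na_rows <- function(df) {
  df[stats::complete.cases(df), , drop = FALSE]
}

## remove rows with missing values (sketch of a second, stacked endpoint)
#' @post /naremove
function(req) {
  # assumption: the body has the same '.data' field as the /widelong body
  body <- jsonlite::fromJSON(req$postBody)
  remove_na_rows(body$.data)
}
```

Keeping the row filter in a plain helper means it can be tried out on a data frame in the R console without starting the API.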

The R function itself has one argument, req, which is used to receive a (POST) Request Body. In general, there are two ways to send additional arguments and objects to a REST API: the header and the body. I decided to use a body only and no header at all, which makes the API cleaner, safer and allows us to send larger objects. A header could, e.g., be used to set some optional function arguments, but should otherwise be used sparingly.
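Because everything the function needs arrives in the body, one possible refinement is to check the parsed body before transforming anything. The following is only a sketch under assumptions: `validate_body` is a hypothetical helper, and the choice of required fields and the 400 status code are my own, not taken from the example API. It relies on Plumber passing a response object when the function declares a second argument, `res`.

```r
# Hypothetical helper: returns NULL if the parsed body looks usable,
# otherwise a character string describing the problem.
validate_body <- function(body) {
  required <- c(".data", ".trans", ".key", ".value")
  missing_fields <- setdiff(required, names(body))
  if (length(missing_fields) > 0) {
    return(paste("Missing field(s):", paste(missing_fields, collapse = ", ")))
  }
  valid_trans <- c("l", "long", "w", "wide")
  # isTRUE() also rejects values that are not a single string
  if (!isTRUE(body$.trans %in% valid_trans)) {
    return("'.trans' must be one of 'l', 'long', 'w', 'wide'")
  }
  NULL
}

## sketch of a handler that validates the body before doing any work
#' @post /widelong
function(req, res) {
  body <- jsonlite::fromJSON(req$postBody)
  problem <- validate_body(body)
  if (!is.null(problem)) {
    res$status <- 400  # signal a client error instead of a normal response
    return(list(error = problem))
  }
  # the wide/long transformation of the example API would follow here;
  # this sketch simply returns the table unchanged
  body$.data
}
```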

Using a body is also the reason the API allows GET and POST Requests (@post, @get) at the same time. Some clients prefer to send a body with a GET Request when they are not permanently posting anything to the server, but many other clients cannot send a body with a GET Request at all. For those clients, a POST Request is mandatory. Typical clients are applications, Integrated Development Environments (IDEs), and other APIs. By accepting both request types, our API can therefore serve a wider range of clients.

For the request-response format of the API, I have decided to stick with the JavaScript Object Notation (JSON), which is probably the most common format. It would also be possible to use Extensible Markup Language (XML) with R Plumber instead. The decision for one or the other will most likely depend on which additional R packages you want to use or on which format the API's clients predominantly use. The R packages used to handle REST Requests in my example are jsonlite and httr. The three Tidyverse packages (tidyr, dplyr, and magrittr) are used to do the table transformation to wide or long.
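To illustrate what the JSON format means for a table, here is a small sketch of the round trip with jsonlite. The table and its values are made up for illustration; the field names follow the example API. The serialization step only approximates what an HTTP client does when it encodes a body as JSON.

```r
# A small wide table standing in for the '.data' field of a request
# (values are made up for illustration).
wide <- data.frame(
  time = c("2009-01-01", "2009-01-02"),
  X = c(1.5, 2.5),
  Y = c(3, 4)
)

# Request body using the field names of the example API
body <- list(
  .data = wide,
  .trans = "l",
  .key = "stock",
  .value = "price",
  .select = c("X", "Y")
)

# Serialize roughly as a client would; auto_unbox keeps
# length-one values as JSON scalars instead of arrays
json <- jsonlite::toJSON(body, auto_unbox = TRUE)

# Parse the JSON string as the API does with req$postBody
parsed <- jsonlite::fromJSON(json)

# The table comes back as a data frame, the other fields as vectors
str(parsed)
```

The data frame is sent as a JSON array with one object per row, which jsonlite turns back into a data frame on the receiving side.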

Run the API

The finished REST API can be run locally with R or RStudio as follows:

    library(plumber)

    widelong_api <- plumber::plumb("./path/to/directory/widelongwide.R")
    widelong_api$run(host = '127.0.0.1', port = 8000)

Upon starting the API, the Plumber package provides us with an IP address and a port, and a client, e.g., another R instance, can now begin to send REST Requests. It also opens a browser tool called Swagger, which can be useful to check whether your API is working as intended. Once the development of an API is finished, I would suggest building a Docker image and running it in a container. That makes the API highly portable and independent of its host system. Since we want to use most APIs in production and deploy them to, e.g., a company server or the cloud, this is especially important. Here is the Dockerfile to build the Docker image of the example API:

    FROM trestletech/plumber

    # Install dependencies
    RUN apt-get update --allow-releaseinfo-change && apt-get install -y \
        liblapack-dev \
        libpq-dev

    # Install R packages
    RUN R -e "install.packages(c('tidyr', 'dplyr', 'magrittr', 'httr', 'jsonlite'), \
        repos = 'http://cran.us.r-project.org')"

    # Add API
    COPY ./path/to/directory/widelongwide.R /widelongwide.R

    # Make port available
    EXPOSE 8000

    # Entrypoint
    ENTRYPOINT ["R", "-e", \
        "widelong <- plumber::plumb('widelongwide.R'); \
        widelong$run(host = '0.0.0.0', port= 8000)"]
    CMD ["/widelongwide.R"]

Send a REST Request

The wide-long example API can respond to any client that sends a POST or GET Request with a body in JSON format containing a table (in the example further below, read from a CSV file) and all the information needed to transform it. Here is an example of a web application that I have written for our trainee program to supplement the wide-long API:

The application is written in R Shiny, which is a great R package to transform your static plots and outputs into an interactive dashboard. If you are interested in how to create dashboards in R, check out other posts on our STATWORX Blog.

Last but not least, here is an example of how to send a REST Request from R or RStudio:

    library(httr)
    library(jsonlite)

    options(stringsAsFactors = FALSE)

    # url for local testing
    url <- "http://127.0.0.1:8000"

    # url for docker container
    url <- "http://0.0.0.0:8000"

    # read example stock data
    .data <- read.csv('./path/to/data/stocks.csv')

    # create example body
    body <- list(
      .data = .data,
      .trans = "w",
      .key = "stock",
      .value = "price",
      .select = c("X","Y","Z")
    )

    # set API path
    path <- 'widelong'

    # send POST Request to API
    raw.result <- POST(url = url, path = path, body = body, encode = 'json')

    # check status code
    raw.result$status_code

    # retrieve transformed example stock data
    .t_data <- fromJSON(rawToChar(raw.result$content))

As you can see, it is quite easy to make REST Requests in R. If you need some test data, you could use the stocks data example from the Tidyverse.
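If you do not have a stocks.csv file at hand, the following sketch creates one in the shape of the tidyr stocks example, using base R only. The seed and the temporary file location are my own choices for illustration; adjust the path to wherever your request script reads the data from.

```r
set.seed(1)  # arbitrary seed so the random prices are reproducible

# Wide table in the shape of the tidyr 'stocks' example:
# one row per day, one column per stock
stocks <- data.frame(
  time = as.Date("2009-01-01") + 0:9,
  X = rnorm(10, mean = 0, sd = 1),
  Y = rnorm(10, mean = 0, sd = 2),
  Z = rnorm(10, mean = 0, sd = 4)
)

# Hypothetical location for the test file (a temporary directory)
csv_path <- file.path(tempdir(), "stocks.csv")
write.csv(stocks, csv_path, row.names = FALSE)

# Reading it back gives the table to place in the '.data' field of a request
.data <- read.csv(csv_path)
head(.data)
```

A table in this wide shape is the input for the long transformation, so a request using it would set `.trans` to "l" or "long" and list the stock columns in `.select`.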

Summary

In this blog post, I showed you how to translate a simple R script, which transforms tables from wide to long format, into a REST API with the R package Plumber and how to run it locally or with Docker. I hope you enjoyed the read and learned something about operationalizing R scripts. You are of course welcome to copy and use any code from this blog post to start creating your own REST APIs with R.

Until then, stay tuned and visit our STATWORX Blog again soon.


About the author: Stephan Emmer. I am a data scientist at STATWORX, and working with data professionally on a day-to-day basis is just AWESOME!