by James Blair

Moving R resources from development to production can be a challenge, especially when the resource isn’t something like a shiny application or rmarkdown document that can be easily published and consumed. Consider, as an example, a customer success model created in R. This model is responsible for taking customer data and returning a predicted outcome, like the likelihood the customer will churn. Once this model is developed and validated, there needs to be some way for the model output to be leveraged by other systems and individuals within the company.

Traditionally, moving this model into production has involved one of two approaches: either running customer data through the model on a batch basis and caching the results in a database, or handing the model definition off to a development team to translate the work done in R into another language, such as Java or Scala. Both approaches have significant downsides. Batch processing works, but it misses real-time updates. For example, if the batch job runs every night and a customer calls in the next morning and has a heated conversation with support, the model output will have no record of that exchange when the customer calls the customer loyalty department later the same day to cancel their service. In essence, model output is served on a lag, which can sometimes lead to critical information loss. However, the other option requires a large investment of time and resources to convert an existing model into another language just for the purpose of exposing that model as a real-time service. Neither of these approaches is ideal; to solve this problem, the optimal solution is to expose the existing R model as a service that can be easily accessed by other parts of the organization.

plumber is an R package that allows existing R code to be exposed as a web service through special decorator comments. With minimal overhead, R programmers and analysts can use plumber to create REST APIs that expose their work to any number of internal and external systems. This solution provides real-time access to processes and services created entirely in R, and can effectively eliminate the need to perform batch operations or technical hand-offs in order to move R code into production.

This post will focus on a brief introduction to RESTful APIs, then an introduction to the plumber package and how it can be used to expose R services as API endpoints. In subsequent posts, we’ll build a functioning web API using plumber that integrates with Slack and provides real-time customer status reports.

Web APIs

For some, APIs (Application Programming Interfaces) are things heard of but seldom seen. However, whether seen or unseen, APIs are part of everyday digital life. In fact, you’ve likely used a web API from within R, even if you didn’t recognize it at the time! Several R packages are simply wrappers around popular web APIs, such as tidycensus and gh. Web APIs are a common framework for sharing information across a network, most commonly through HTTP.

HTTP

To understand how HTTP requests work, it’s helpful to know the players involved. A client makes a request to a server, which interprets the request and provides a response. An HTTP request can be thought of simply as a packet of information sent to the server, which the server attempts to interpret and respond to. Every time you visit a URL in a web browser, an HTTP request is made and the response is rendered by the browser as the website you see. It is possible to inspect this interaction using the development tools in a browser. Such an inspection shows that the request is composed of a URL and a request method, which in the case of a web browser accessing a website is GET.

Request

There are several components of an HTTP request, but here we’ll mention only a few.

URL: the address or endpoint for the request

Verb / method: a specific method invoked on the endpoint (GET, POST, DELETE, PUT)

Headers: additional data sent to the server, such as who is making the request and what type of response is expected

Body: data sent to the server outside of the headers, common for POST and PUT requests

In the browser example above, a GET request was made by the web browser to www.rstudio.com.

Response

The API response mirrors the request to some extent. It includes headers that contain information about the response and a body that contains any data returned by the API. The response also carries an HTTP status code that informs the client how the request was received, and the headers provide details about the content that’s being delivered. In the example of a web browser accessing www.rstudio.com, the response reports a status code of 200 along with details about the response content, including the fact that the content returned is HTML. This HTML content is what the browser renders into a webpage.

httr

The httr package provides a nice framework for working with HTTP requests in R. The following basic example demonstrates some of what we’ve already learned by using httr and httpbin.org, which provides a playground of sorts for HTTP requests.

    library(httr)

    # A simple GET request
    response <- GET("http://httpbin.org/get")
    response
    ## Response [http://httpbin.org/get]
    ##   Date: 2018-07-23 14:57
    ##   Status: 200
    ##   Content-Type: application/json
    ##   Size: 266 B
    ## {"args":{},"headers":{"Accept":"application/json, text/xml, application/...

In this example we’ve made a GET request to httpbin.org/get and received a response. We know our request was successful because we see that the status is 200. We also see that the response contains data in JSON format. The Getting started with httr page provides additional examples of working with HTTP requests and responses.

REST

Representational State Transfer (REST) is an architectural style for APIs that includes specific constraints for building APIs to ensure that they are consistent, performant, and scalable.
In order to be considered truly RESTful, an API must satisfy the following six constraints, the last of which (code on demand) is generally treated as optional:

Uniform interface: clearly defined interface between client and server

Stateless: state is managed via the requests themselves, not through reliance on an external service

Cacheable: responses should be cacheable in order to improve scalability

Client-Server: clear separation of client and server, each with its own distinct responsibilities in the exchange

Layered System: there may be intermediaries between the client and the server, but the client should be unaware of them

Code on Demand: the response can include logic executable by the client

We could spend a lot of time diving further into each of these specifications, but that is beyond the scope of this post. More detail about REST can be found here.
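To make the request components described earlier a little more concrete, here is a minimal sketch using httr. It composes a URL with a query string, splits it back into its parts, and defines a small helper that sends a GET request with an explicit header and summarises the response. The helper name `inspect_get` is hypothetical and not part of httr; httpbin.org is used only because it already appears in the example above.

```r
library(httr)

# Compose a URL with a query string; httr percent-encodes the value for us.
url <- modify_url("http://httpbin.org/get", query = list(text = "Hi there"))
url

# Split the URL back into its components without contacting the server.
parts <- parse_url(url)
parts$hostname
parts$path
parts$query$text

# Hypothetical helper (not part of httr): send a GET request with an explicit
# Accept header and summarise the pieces of the response discussed earlier.
inspect_get <- function(url) {
  response <- GET(url, add_headers(Accept = "application/json"))
  list(
    status = status_code(response),
    content_type = http_type(response),
    body = content(response, as = "text", encoding = "UTF-8")
  )
}

# This call needs network access, so it is left for interactive use:
# str(inspect_get(url))
```

Keeping the URL composition separate from the request itself makes it easy to check what will be sent before anything goes over the network.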

Plumber

Creating RESTful APIs using R is straightforward using the plumber package. Even if you have never written an API, plumber makes it easy to turn existing R functions into API endpoints. Developing plumber endpoints is simply a matter of providing specialized R comments before R functions. plumber recognizes both #' and #* comments, although the latter is recommended in order to avoid potential conflicts with roxygen2. The following defines a plumber endpoint that simply returns the data provided in the request query string.

    library(plumber)

    #* @apiTitle Simple API

    #* Echo provided text
    #* @param text The text to be echoed in the response
    #* @get /echo
    function(text = "") {
      list(
        message_echo = paste("The text is:", text)
      )
    }

Here we’ve defined a simple function that takes a parameter, text, and returns it with some additional text as part of a list. By default, plumber will serialize the object returned from a function into JSON using the jsonlite package. We’ve provided specialized comments to inform plumber that this endpoint is available at api-url/echo and will respond to GET requests.

There are a few ways this plumber script can be run locally. First, assuming the file is saved as plumber.R, the following code would start a local web server hosting the API.

    plumber::plumb("plumber.R")$run(port = 5762)

Once the web server has started, the API can be interacted with using any set of HTTP tools. We could even interact with it using httr as demonstrated earlier, although we would need to open a separate R session to do so since the current R session is busy serving the API.

The other method for running the API requires a recent preview build of the RStudio IDE. Recent preview builds include features that make it easier to work with plumber. When editing a plumber script in a recent version of the IDE, a “Run API” icon will appear in the top right-hand corner of the source editor.
Clicking this button will automatically run a line of code similar to the one we ran above to start a web server hosting the API. A swagger-generated UI will be rendered in the Viewer pane, and the API can be interacted with directly from within this UI.

Now that we have a running plumber API, we can query it using curl from the command line to investigate its behavior.

    $ curl "localhost:5762/echo" | jq '.'
    {
      "message_echo": [
        "The text is: "
      ]
    }

In this case, we queried the API without providing any additional data or parameters. As a result, the text parameter is the default empty string, as seen in the response. In order to pass a value to our underlying function, we can define a query string in the request as follows:

    $ curl "localhost:5762/echo?text=Hi%20there" | jq '.'
    {
      "message_echo": [
        "The text is: Hi there"
      ]
    }

In this case, the text parameter is defined as part of the query string, which is appended to the end of the URL. Additional parameters could be defined by separating each key-value pair with &. It’s also possible to pass the parameter as part of the request body. However, to leverage this method of data delivery, we need to update our API definition so that the /echo endpoint also accepts POST requests. We’ll also update our API to consider multiple parameters, and return the parsed parameters along with the entire request body.

    library(plumber)

    #* @apiTitle Simple API

    #* Echo provided text
    #* @param text The text to be echoed in the response
    #* @param number A number to be echoed in the response
    #* @get /echo
    #* @post /echo
    function(req, text = "", number = 0) {
      list(
        message_echo = paste("The text is:", text),
        number_echo = paste("The number is:", number),
        raw_body = req$postBody
      )
    }

With this new API definition, the following curl request can be made to pass parameters to the API via the request body.

    $ curl --data "text=Hi%20there&number=42&other_param=something%20else" "localhost:5762/echo" | jq '.'
    {
      "message_echo": [
        "The text is: Hi there"
      ],
      "number_echo": [
        "The number is: 42"
      ],
      "raw_body": [
        "text=Hi%20there&number=42&other_param=something%20else"
      ]
    }

Notice that we passed more than just text and number in the request body. plumber parses the request body and matches any arguments found in the R function definition. Additional arguments, like other_param in this case, are ignored. plumber can parse the request body if it is URL-encoded or JSON. The following example shows the same request, but with the request body encoded as JSON.

    $ curl --data '{"text":"Hi there", "number":"42", "other_param":"something else"}' "localhost:5762/echo" | jq '.'
    {
      "message_echo": [
        "The text is: Hi there"
      ],
      "number_echo": [
        "The number is: 42"
      ],
      "raw_body": [
        "{\"text\":\"Hi there\", \"number\":\"42\", \"other_param\":\"something else\"}"
      ]
    }

While these examples are fairly simple, they demonstrate how easily plumber turns R functions into API endpoints. Thanks to plumber, it is now a fairly straightforward process to expose R functions so they can be consumed and leveraged by any number of systems and processes. We’ve only scratched the surface of its capabilities and, as mentioned, future posts will walk through the creation of a Slack app using plumber. Comprehensive documentation for plumber can be found here.
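One detail in the responses above is worth a closer look: every value comes back wrapped in a JSON array. Since plumber serializes results with jsonlite, we can get a rough preview of this behavior without starting a server. The sketch below rewrites the endpoint logic as an ordinary named function (the name `echo_handler` is hypothetical) and serializes its result directly with `jsonlite::toJSON()`. It assumes plumber’s default serializer behaves like `toJSON()` with default settings, which is consistent with the curl output shown earlier.

```r
library(jsonlite)

# The same logic as the /echo endpoint, written as an ordinary named function
# (the name echo_handler is hypothetical) so it can be called without a server.
echo_handler <- function(text = "", number = 0) {
  list(
    message_echo = paste("The text is:", text),
    number_echo = paste("The number is:", number)
  )
}

result <- echo_handler(text = "Hi there", number = 42)

# By default jsonlite writes length-one vectors as JSON arrays, which matches
# the bracketed values in the curl output.
as_arrays <- toJSON(result)
as_arrays

# With auto_unbox = TRUE, length-one vectors become JSON scalars instead.
as_scalars <- toJSON(result, auto_unbox = TRUE)
as_scalars
```

If scalar values are preferred in the API response, the plumber documentation describes serializer annotations that should allow this; the details are worth checking there rather than relying on this sketch.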

Deploying

Up until now, we’ve just been interacting with our APIs in our local development environment. That’s great for development and testing, but when it comes time to expose an API to external services, we don’t want our laptop held responsible (at least, I don’t!). There are several deployment methods for plumber outlined in the documentation. The most straightforward method of deployment is to use RStudio Connect. When editing a plumber script in recent versions of the RStudio IDE, a blue publish button will appear in the top right-hand corner of the source editor. Clicking this button brings up a menu that enables the user to publish the API to an instance of RStudio Connect. Once published, API access and performance can be configured through RStudio Connect and the API can be leveraged by external systems and processes.
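Whether the API is running locally or has been published, other systems consume it the same way: through HTTP requests. As a minimal sketch of the httr approach mentioned earlier, the following helpers call the /echo endpoint from R. The helper names `echo_get` and `echo_post` are hypothetical, and the base URL assumes the local server started with `port = 5762`; a published API would have a different address and may require authentication that is not shown here.

```r
library(httr)

# Assumed base URL: the local server started earlier with port = 5762.
# Replace it with the address of wherever the API is actually hosted.
base_url <- "http://localhost:5762"

# Hypothetical helper: call /echo with the parameter in the query string.
echo_get <- function(text, base = base_url) {
  response <- GET(modify_url(base, path = "echo"), query = list(text = text))
  stop_for_status(response)
  content(response, as = "parsed", type = "application/json")
}

# Hypothetical helper: call /echo with parameters in a URL-encoded body.
echo_post <- function(text, number, base = base_url) {
  response <- POST(
    modify_url(base, path = "echo"),
    body = list(text = text, number = as.character(number)),
    encode = "form"
  )
  stop_for_status(response)
  content(response, as = "parsed", type = "application/json")
}

# Run these from a separate R session while the API is being served:
# echo_get("Hi there")
# echo_post("Hi there", 42)
```

Calling `stop_for_status()` turns HTTP error codes into R errors, so a failed request is not silently treated as a valid result.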