The article that eLife published today builds on years of work by the community towards the long-standing vision of a fully reproducible, living research article. It represents a ‘bringing together’ of two of the primary frameworks currently used for reproducible research, RMarkdown and Jupyter. Under the hood it relies on many open-source applications and packages. It was made possible because the authors of the article went the extra mile in making their work reproducible, and registered their code and source data on the Open Science Framework (OSF).

We were able to download the authors’ RMarkdown, R scripts and data from the OSF. We used Stencila’s converter software to convert the article from RMarkdown to JATS, the Journal Article Tag Suite – an XML format used by many publishers and upon which the DAR format is based. The converter uses Pandoc, a powerful and popular document conversion tool, and ensures that code cells in the RMarkdown are converted into the JATS extensions for reproducibility that DAR introduces. The result is an archivable article, in a format widely used by publishers, which also embeds the original source code. Stencila is continuing to work on improving its conversion software to ease the transition from existing formats to DAR.

Having source code embedded in the published article allows for transparency. But to achieve live execution from within the browser, we need a way to have R code executed remotely. To do this, Stencila’s interfaces connect to ‘execution contexts’ hosted within R or Python sessions. The execution contexts perform two steps. First, they analyse a chunk of code to determine the variables it depends upon, and any variables that it creates. This provides the automatic dependency analysis that enables Stencila’s reactive interfaces. Second, they actually ‘execute’ the chunk of code and return the result as data that can be rendered in the document or re-used in another cell.
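To make the two steps concrete, here is a minimal sketch in Python. It is not Stencila’s implementation: the function names `analyse` and `execute` are hypothetical, and the analysis is deliberately simplified (a name counts as an input only if the chunk reads it and never assigns it, and Python built-ins are ignored).

```python
import ast
import builtins


def analyse(code):
    """Step 1 (simplified): list the variables a chunk reads and creates."""
    tree = ast.parse(code)
    stored, loaded = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                stored.add(node.id)
            elif isinstance(node.ctx, ast.Load):
                loaded.add(node.id)
    # Inputs: names read but never assigned here, excluding built-ins.
    inputs = sorted(n for n in loaded - stored if not hasattr(builtins, n))
    outputs = sorted(stored)
    return inputs, outputs


def execute(code, scope):
    """Step 2: run the chunk in a shared scope; return its last expression."""
    tree = ast.parse(code)
    last = None
    # If the chunk ends with a bare expression, evaluate it separately
    # so that its value can be returned to the caller.
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        last = ast.Expression(tree.body.pop().value)
    exec(compile(tree, "<chunk>", "exec"), scope)
    if last is None:
        return None
    return eval(compile(last, "<chunk>", "eval"), scope)


# Hypothetical cell: depends on `price` and `quantity`, creates `total`.
chunk = "total = price * quantity\ntotal"
inputs, outputs = analyse(chunk)
scope = {"price": 2, "quantity": 3}
result = execute(chunk, scope)
```

With the inputs and outputs of every cell known, an interface can work out which cells need to be re-run when one of them is edited. The shared `scope` dictionary is what lets a value created in one cell be re-used in another.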

To reproduce this article, it is not only necessary to have the original source code embedded in the article; we also need to have an R session with the necessary R packages used by the code (e.g. ggplot2, cowplot) and the source data (CSV files which can be downloaded from OSF). Docker images provide a way of bundling source code, packages and data so that they can be re-executed. For this demo, rather than writing our own Docker images, we made use of the recently completed integration between Stencila and Binder – a project that was started by Daniel Nüst and Min Ragan-Kelley at the eLife Innovation Sprint in May 2018 (read more about this in Daniel’s blog post).

We created a folder for the eLife article in a GitHub repository which we could point Binder at. This folder contains two configuration files, runtime.txt and install.R, which tell Binder which version of R, and which R packages, to install into the Docker image. Thanks to Daniel and Min’s integration you can click on the “Launch Binder” button in the README of that folder to open an editable, interactive version of the article hosted by Binder.

To create a fully reproducible journal article, we wanted to go one step further and take a ‘progressive enhancement’ approach in which the reader starts with a static document but can choose to ‘turn on’ the executable parts. In this approach, the content of the article, including the reproducible figures, is pre-rendered, hosted by the publisher, and can be delivered to the reader just like any web page. However, when the reader chooses to edit any of the embedded code, the document is connected to a Docker container hosted by Binder. To do this, we created a tool called ‘Bindilla’ which acts as a bridge between the article in the browser and the container. Bindilla asks Binder to create a container for the image and then proxies requests for code execution to the execution contexts running in that container.
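The ‘start static, connect on demand’ idea can be sketched as a small lazy proxy. This is an illustration of the pattern only, not Bindilla’s actual code or API: the class `LazyBridge`, the `launch` callable and the stand-in `fake_launch` are all hypothetical, and no real container or network request is involved.

```python
class LazyBridge:
    """Hypothetical bridge that starts a backend only when it is first needed."""

    def __init__(self, launch):
        # `launch` is any callable that starts a container and returns a
        # function able to execute code inside it (an assumption made
        # for this sketch).
        self._launch = launch
        self._backend = None

    @property
    def connected(self):
        return self._backend is not None

    def execute(self, code):
        # The first execution request triggers the launch; later requests
        # are forwarded to the same running backend.
        if self._backend is None:
            self._backend = self._launch()
        return self._backend(code)


launches = []


def fake_launch():
    """Stand-in for asking a service to create a container."""
    launches.append("container")
    return lambda code: "ran: " + code


bridge = LazyBridge(fake_launch)
before = bridge.connected          # nothing launched while the reader only reads
first = bridge.execute("1 + 1")    # the first edit starts the container
second = bridge.execute("2 + 2")   # later edits re-use it
```

The point of the design is that readers who never edit the code cost the publisher nothing beyond serving a static page; a container is only requested for those who choose to interact.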