Tufte in R

Motivation The idea behind Tufte in R is to use R - the most powerful open-source statistical programming language - to replicate excellent visualisation practices developed by Edward Tufte. It’s not a novel approach - there are plenty of excellent R functions and related packages wrote by people who have much more expertise in programming than myself. I simply collect those resources in one place in an accessible and replicable format, adding a few bits of my own coding discoveries.

Format Each visualisation is provided in three graphical systems used in R: base graphics, lattice and ggplot2. As an example data I mainly use basic data sets easily accessible within R. Occasionally I use data from package psych developed by William Revelle, package MASS developed by Brian Ripley with collegues and various custom data I link via my Gist profile. This page was produced in RMarkdown using Michael Sachs’s tuftehandout , but with a modified CSS inspired by Dave Liepmann’s Tufte CSS. Its best if you view this page on a desktop computer rather than mobile devices.

Requirements You need the most recent version of R installed on you computer. You also need a basic understanding of R and there are some great online tutorials to get you started. I also recommend R Studio as an integrated development environment for R. I use resources from a number of R packages. You can install all those packages at once via R console using the command below: install.packages(c("CarletonStats", "devtools", "epanetReader", "fmsb", "ggplot2", "ggthemes", "latticeExtra", "MASS", "PerformanceAnalytics", "psych", "plyr", "prettyR", "plotrix", "proto", "RCurl", "reshape", "reshape2")) Minimal line plot Edward Tufte, The Visual Display of Quantitative Information (Cheshire, 1983), p. 65 We start by plotting the most basic graph from page 65 of The Visual Display of Quantitative Information - a minimal line plot. This one is important because it illustrates the most elemental principle - that of minimalism with reduced ‘data-ink’. As Tufte explains, the ‘data-ink’ (total ink used to print the graphic) ratio should equal to ‘1 - proportion of graphic that can be erased without loss of data-information’. The primary challenge is therefore to modify the default graphs produced with R so that we remove as much of ‘non-data ink’ as possible. As you will soon see, this is done by subtracting and deconstructing existing R graphs to get rid of as much ‘non-data ink’ as possible.

Minimal line plot in base graphics Parameter axis = F prevents from drawing all axes elements so they can be easily refined with axis() function. I use minimal and maximal values from the data to draw text() - it usually requires a bit of tweaking to get it right. Font is changed to serif with family . x <- 1967:1977 y <- c(0.5,1.8,4.6,5.3,5.3,5.7,5.4,5,5.5,6,5) pdf(width=10, height=6) plot(y ~ x, axes=F, xlab="", ylab="", pch=16, type="b") axis(1, at=x, label=x, tick=F, family="serif") axis(2, at=seq(1,6,1), label=sprintf("$%s", seq(300,400,20)), tick=F, las=2, family="serif") abline(h=6,lty=2) abline(h=5,lty=2) text(max(x), min(y)*2.5,"Per capita

budget expanditures

in constant dollars", adj=1, family="serif") text(max(x), max(y)/1.08, labels="5%", family="serif") dev.off()

Minimal line plot in lattice Arguments scales and par.settings have to be used heavily to customise scales and get rid of box. I used benbarnes axis hack from Stackoverflow to draw only axes ticks. library(lattice) x <- 1967:1977 y <- c(0.5,1.8,4.6,5.3,5.3,5.7,5.4,5,5.5,6,5) xyplot(y~x, xlab="", ylab="", pch=16, col=1, border = "transparent", type="o", abline=list(h = c(max(y),max(y)-1), lty = 2), scales=list(x=list(at=x,labels=x, fontfamily="serif", cex=1), y=list(at=seq(1,6,1), fontfamily="serif", cex=1, label=sprintf("$%s",seq(300,400,20)))), par.settings = list(axis.line = list(col = "transparent"), dot.line=list(lwd=0)), axis = function(side, line.col = "black", ...) { if(side %in% c("left","bottom")) {axis.default(side = side, line.col = "black", ...)}}) ltext(current.panel.limits()$xlim[2]/1.1, adj=1, fontfamily="serif", current.panel.limits()$ylim[1]/1.3, cex=1, "Per capita

budget expandures

in constant dollars") ltext(current.panel.limits()$xlim[2]/1.1, adj=1, fontfamily="serif", current.panel.limits()$ylim[1]/5.5, cex=1, "5%")

Minimal line plot in ggplot2 I use excellent package ggthemes by Jeffrey B. Arnold which provides a lot of useful functions for Tufte-like plots - including a dedicated theme_tufte() function. library(ggplot2) library(ggthemes) x <- 1967:1977 y <- c(0.5,1.8,4.6,5.3,5.3,5.7,5.4,5,5.5,6,5) d <- data.frame(x, y) ggplot(d, aes(x,y)) + geom_line() + geom_point(size=3) + theme_tufte(base_size = 15) + theme(axis.title=element_blank()) + geom_hline(yintercept = c(5,6), lty=2) + scale_y_continuous(breaks=seq(1, 6, 1), label=sprintf("$%s",seq(300,400,20))) + scale_x_continuous(breaks=x,label=x) + annotate("text", x = c(1977,1977.2), y = c(1.5,5.5), adj=1, family="serif", label = c("Per capita

budget expandures

in constant dollars", "5%"))

Minimal line plot - interactive plot with highcharter A new approach to create a dynamic plots with package highcharter - seems its based on Java wrappers and includes dedicated hc_theme_tufte() . library(highcharter) x <- 1967:1977 y <- c(290,318,372,385,385,372,386,380,390,400,380) d <- data.frame(x, y) highchart() %>% hc_chart(type = "scatter") %>% hc_subtitle(text = "Per capita budget expanditures in constant dollars") %>% hc_yAxis(labels = list(format = "${value}")) %>% hc_add_series(data = d) %>% hc_add_theme(hc_theme_tufte()) Range-frame (or quartile-frame) scatterplot Edward Tufte, The Visual Display of Quantitative Information (Cheshire, 1983), p. 130-133. This doesn’t really replicate Tufte range frame because its a bit tricky to draw custom axis lines in basic graphics. As a rough starting point I use summary() to display values for minimum, maximum, median, mean and both quartiles on the axes.

Range frame plot in base graphics x <- mtcars$wt y <- mtcars$mpg plot(x, y, main="", axes=FALSE, pch=16, cex=0.8, family="serif", xlab="Car weight (lb/1000)", ylab="Miles per gallon of fuel") axis(1,at=summary(x),labels=round(summary(x),1), tick=F, family="serif") axis(2,at=summary(y),labels=round(summary(y),1), tick=F, las=2, family="serif")

Range frame plot in base graphics with fancyaxis Steven Murdoch’s GitHub fancyaxis function - currently doesn’t work. # library(devtools) # source_url("https://raw.githubusercontent.com/sjmurdoch/fancyaxis/master/fancyaxis.R") # x <- mtcars$wt # y <- mtcars$mpg # plot(x, y, main="", axes=FALSE, pch=16, cex=0.8, # xlab="Car weight (lb/1000)", ylab="Miles per gallon of fuel") # fancyaxis(1, summary(x), digits=1) # fancyaxis(2, summary(y), digits=1)

Range frame plot in lattice Again, I used benbarnes axis hack from Stackoverflow to draw only axes ticks. Heavy use of par.settings to change the fontfamily to serif. library(lattice) x <- mtcars$wt y <- mtcars$mpg xyplot(y ~ x, mtcars, col=1, pch=16, fontfamily="serif", xlab="Car weight (lb/1000)", ylab="Miles per gallon of fuel", par.settings = list(axis.line = list(col="transparent"), par.xlab.text=list(fontfamily="serif"), par.ylab.text=list(fontfamily="serif")), scales = list(x=list(at=summary(mtcars$wt),labels=round(summary(mtcars$wt),1), fontfamily="serif"), y=list(at=summary(mtcars$mpg),labels=round(summary(mtcars$mpg),1), fontfamily="serif")), axis = function(side, line.col = "black", ...) { if(side %in% c("left","bottom")) {axis.default(side = side, line.col = "black", ...)}})

Range-frame plot in ggplot2 Another use of package ggthemes by Jeffrey B. Arnold - this time for geom_rangeframe() . library(ggplot2) library(ggthemes) ggplot(mtcars, aes(wt, mpg)) + geom_point() + geom_rangeframe() + theme_tufte() + xlab("Car weight (lb/1000)") + ylab("Miles per gallon of fuel") + theme(axis.title.x = element_text(vjust=-0.5), axis.title.y = element_text(vjust=1.5))

Range-frame plot in ggplot2 with qfplot Function qfplot by Mikhail Y. Popov - limited customisation, but simple to deploy. library(devtools) source_url('https://raw.githubusercontent.com/bearloga/Quartile-frame-Scatterplot/master/qfplot.R') qfplot(x=mtcars$wt, y=mtcars$mpg, xlab="Car weight (lb/1000)", ylab="Miles per gallon of fuel") Dot-dash (or rug) scatterplot

Dot-dash plot in base graphics with fancyaxis Edward Tufte, The Visual Display of Quantitative Information (Cheshire, 1983), p. 130-133. Steven Murdoch’s GitHub minimalrug function (currently doesn’t work). # library(devtools) # source_url("https://raw.githubusercontent.com/sjmurdoch/fancyaxis/master/fancyaxis.R") # x <- mtcars$wt # y <- mtcars$mpg # plot(x, y, main="", axes=FALSE, pch=16, cex=0.8, # xlab="Car weight (lb/1000)", ylab="Miles per gallon of fuel", # xlim=c(min(x)-0.2, max(x)+0.2), # ylim=c(min(y)-1.5, max(y)+1.5)) # axis(1, tick=F) # axis(2, tick=F, las=2) # minimalrug(x, side=1, line=-0.8) # minimalrug(y, side=2, line=-0.8)

Dot-dash plot in lattice A useful panel.rug() lattice function used to create a dot-dash axis with a neat solution from Josh O’Brien to control the length of dash margin lines. library(lattice) x <- mtcars$wt y <- mtcars$mpg xyplot(y ~ x, xlab="Car weight (lb/1000)", ylab="Miles per gallon of fuel", par.settings = list(axis.line = list(col="transparent")), panel = function(x, y,...) { panel.xyplot(x, y, col=1, pch=16) panel.rug(x, y, col=1, x.units = rep("snpc", 2), y.units = rep("snpc", 2), ...)})

Dot-dash plot in ggplot2 Here I use a geom_rug() function from ggplot2 . library(ggplot2) library(ggthemes) ggplot(mtcars, aes(wt, mpg)) + geom_point() + geom_rug() + theme_tufte(ticks=F) + xlab("Car weight (lb/1000)") + ylab("Miles per gallon of fuel") + theme(axis.title.x = element_text(vjust=-0.5), axis.title.y = element_text(vjust=1)) Marginal histogram scatterplot

Marginal histogram scatterplot in base graphics with fancyaxis Edward Tufte, The Visual Display of Quantitative Information (Cheshire, 1983), p. 133.

Steven Murdoch’s GitHub axisstripchart creates a marginal dot-histograms, although with limited customisation. library(devtools) source_url("https://raw.githubusercontent.com/sjmurdoch/fancyaxis/master/fancyaxis.R") x <- faithful$waiting y <- faithful$eruptions plot(x, y, main="", axes=FALSE, pch=16, cex=0.8, xlab="Time till next eruption (min)", ylab="Duration (sec)", xlim=c(min(x)/1.1, max(x)), ylim=c(min(y)/1.5, max(y))) axis(1, tick=F) axis(2, tick=F, las=2) axisstripchart(faithful$waiting, 1) axisstripchart(faithful$eruptions, 2)

Marginal histogram scatterplot in lattice (in preparation)

Marginal histogram scatterplot in ggplot2 with ggMarginal Dean Attali’s ggExtra package includes function ggMarginal to create margin histograms. There is no option to create dot-histograms, just the standard bar-based histograms. library(ggplot2) library(ggExtra) library(ggthemes) p <- ggplot(faithful, aes(waiting, eruptions)) + geom_point() + theme_tufte(ticks=F) ggMarginal(p, type = "histogram", fill="transparent") However, ggMarginal can be also used to quickly create margin densityplots using the same function: library(ggplot2) library(ggExtra) library(ggthemes) p <- ggplot(faithful, aes(waiting, eruptions)) + geom_point() + theme_tufte(ticks=F) + theme(axis.title=element_blank(), axis.text=element_blank()) ggMarginal(p, type = "density")





…and it can also be used to create margin boxplots:



library(ggplot2) library(ggExtra) library(ggthemes) p <- ggplot(faithful, aes(waiting, eruptions)) + geom_point() + theme_tufte(ticks=F) + theme(axis.title=element_blank(), axis.text=element_blank()) ggMarginal(p, type = "boxplot", size=10, fill="transparent") Minimal boxplot Edward Tufte, The Visual Display of Quantitative Information (Cheshire, 1983), p. 125 & 129. Argument pars is used to deconstruct the default base graphics boxplot.

Minimal boxplot in base graphics x <- quakes$mag y <- quakes$stations boxplot(y ~ x, main = "", axes = FALSE, xlab=" ", ylab=" ", pars = list(boxcol = "transparent", medlty = "blank", medpch=16, whisklty = c(1, 1), medcex = 0.7, outcex = 0, staplelty = "blank")) axis(1, at=1:length(unique(x)), label=sort(unique(x)), tick=F, family="serif") axis(2, las=2, tick=F, family="serif") text(min(x)/3, max(y)/1.1, pos = 4, family="serif", "Number of stations

reporting Richter Magnitude

of Fiji earthquakes (n=1000)")

Minimal boxplot in base graphics with chart.Boxplot This uses chart.Boxplot function from PerformanceAnalytics package with dedicated as.Tufte=T argument. This solution requires data to be in wide table format and it has a limited customisation options. library(PerformanceAnalytics) library(psych) d <- msq[,80:84] chart.Boxplot(d, main = "", xlab="average personality rating (based on n=3896)", ylab="", element.color = "transparent", as.Tufte=TRUE)

Minimal boxplot in lattice Argument par.settings is used to deconstruct the default lattice boxplot. x <- quakes$mag y <- quakes$stations bwplot(y ~ x, horizontal=F, xlab="", ylab="", do.out = FALSE, box.ratio = 0, scales=list(x=list(labels=sort(unique(x)), fontfamily="serif"), y=list(fontfamily="serif")), par.settings = list(axis.line = list(col = "transparent"), box.umbrella=list(lty=1, col= 1), box.dot=list(col= 1), box.rectangle = list(col= c("transparent")))) ltext(current.panel.limits()$xlim[1]+250, adj=1, current.panel.limits()$ylim[2]+50, fontfamily="serif", "Number of stations

reporting Richter Magnitude

of Fiji earthquakes (n=1000)")

Minimal boxplot in ggplot2 Function geom_tufteboxplot from package ggthemes is used to draw boxplot in ggplot2. library(ggplot2) library(ggthemes) ggplot(quakes, aes(factor(mag),stations)) + theme_tufte() + geom_tufteboxplot(outlier.colour="transparent") + theme(axis.title=element_blank()) + annotate("text", x = 8, y = 120, adj=1, family="serif", label = c("Number of stations

reporting Richter Magnitude

of Fiji earthquakes (n=1000)")) Minimal barchart

Minimal barchart in base graphics Edward Tufte, The Visual Display of Quantitative Information (Cheshire, 1983), p. 125 & 129. Basic graphics has an awkward way to change the width of bars in barplot - especially when you want to draw axis names separately. It requires some tweaking of arguments width and space , as well as location in axis() function. I use ablines to draw Tufte-like grid lines. library(psych) d <- colMeans(msq[,c(2,7,34,36,42,43,46,55,68)], na.rm = T)*10 barplot(d, xaxt="n", yaxt="n", ylab="", border=F, width=c(.35), space=1.8) axis(1, at=(1:length(d))-.26, labels=names(d), tick=F, family="serif") axis(2, at=seq(1, 5, 1), las=2, tick=F, family="serif") abline(h=seq(1, 5, 1), col="white", lwd=3) abline(h=0, col="gray", lwd=2) text(min(d)/2, max(d)/1.2, pos = 4, family="serif", "Average scores

on negative emotion traits

from 3896 participants

(Watson et al., 1988)")

Minimal barchart in lattice Lattice barchart draws bars horizontally by default and it gets messy if you change it to vertical bars. Function panel.abline is used to draw grid lines. library(lattice) library(psych) d <- colMeans(msq[,c(2,7,34,36,42,43,46,55,68)],na.rm = T)*10 barchart(sort(d), xlab="", ylab="", col = "grey", origin=1, border = "transparent", box.ratio=0.5, panel = function(x,y,...) { panel.barchart(x,y,...) panel.abline(v=seq(1,6,1), col="white", lwd=3)}, par.settings = list(axis.line = list(col = "transparent"))) ltext(current.panel.limits()$xlim[2]-50, adj=1, current.panel.limits()$ylim[1]-100, "Average scores

on negative emotion traits

from 3896 participants

(Watson et al., 1988)")

Minimal barchart in ggplot2 library(ggplot2) library(ggthemes) library(psych) library(reshape2) d <- melt(colMeans(msq[,c(2,7,34,36,42,43,46,55,68)],na.rm = T)*10) d$trait <- rownames(d) ggplot(d, aes(x=trait, y=value)) + theme_tufte(base_size=14, ticks=F) + geom_bar(width=0.25, fill="gray", stat = "identity") + theme(axis.title=element_blank()) + scale_y_continuous(breaks=seq(1, 5, 1)) + geom_hline(yintercept=seq(1, 5, 1), col="white", lwd=1) + annotate("text", x = 3.5, y = 5, adj=1, family="serif", label = c("Average scores

on negative emotion traits from 3896 participants

(Watson et al., 1988)"))

Minimal barchart - interactive with highcharter library(psych) library(reshape) library(highcharter) values <- 1 + abs(rnorm(12)) d <- melt(colMeans(msq[,c(2,7,34,36,42,43,46,55,68)], na.rm = T)*10) trait <- row.names(d) value <- as.vector(d[,1]) highchart() %>% hc_chart(type = "column") %>% hc_add_series(data = value) %>% hc_xAxis(categories = row.names(d)) %>% hc_add_theme(hc_theme_tufte2()) Slopegraph

Slopegraph in base graphics Edward Tufte, The Visual Display of Quantitative Information (Cheshire, 1983), p. 158. Using Thomas Leeper’s slopegraph package. The most promising slopegraph functions for base graphics and ggplot2 comes from Thomas Leeper slopegraph package. Thomas’s solutions have evolved gradually and it’s now the most efficient method to create slopegraphs in R. However, a major limitation is inability to efficently offset left and right side labels to avoid don’t overlap (as seen below). library(devtools) #install_github("leeper/slopegraph")#install Leeper's package from Github library(slopegraph) data(cancer) slopegraph(cancer, col.lines = 'gray', col.lab = 1, col.num = 1, xlim = c(-.2,5), main = "Estimate of % survival rates", xlabels = c('5 Year','10 Year','15 Year','20 Year'))

Slopegraph in lattice (might not happen) Issue with ggslopegraph has been logged here on Github

Slopegraph in ggplot2 with ggslopegraph (with bugs, in preparation)

Slopegraph in ggplot2 with plot_slopegraph James Keirstead’s GitHubs function (currently doesn’t work). # library(ggplot2) # library(ggthemes) # library(devtools) # library(RCurl) # library(plyr) # source_url("https://raw.githubusercontent.com/jkeirstead/r-slopegraph/master/slopegraph.r") # d <- read.csv(text = getURL("https://raw.githubusercontent.com/jkeirstead/r-slopegraph/master/cancer_survival_rates.csv")) # df <- build_slopegraph(d, x="year", y="value", group="group", method="tufte", min.space=0.04) # df <- transform(df, x=factor(x, levels=c(5,10,15,20), # labels=c("5 years","10 years","15 years","20 years")), y=round(y)) # plot_slopegraph(df) + labs(title="Estimates of % survival rates") + # theme_tufte(base_size=16, ticks=F) + theme(axis.title=element_blank()) Sparklines There is no ‘out-of-box’ solution in the existing packages that truly replicate Tufte-style sparklines. Main issues are scaling the size of the plot and labeling of the points - those factors are likely to change depending on the data set you’re plotting, so you will have to adjust specific parameters (which I highlight for every graphical system). To make the output more consistent, every sparkline plot will be automatically saved in the working directory in a vector format as a PDF (using pdf() and dev.off() functions). A word of warning - in its current format, making sparklines requires a bit more advanced knowledge of R. Its far from perfect - proceed with caution.

Sparklines in base graphics Sparklines in base graphics use some elements of functions from YaleToolkit developed by John Emerson and Walton Green. In particular, it’s a result of mine and Ben’s hacking of YaleToolkit functions on Stackoverflow. I’ve use a simple loop that takes a number of columns in a data set and creates as much sparklines as there are columns. In the same manner I use mfrow parameter in par() function to set the number of rows to a number of columns in data frame. Data on US Rate of Crime (per 100,000 people) from 1960 to 2014 comes from FBI UCS Annual Crime Reports compiled by disastercenter.com. library(RCurl) dd <- read.csv(text = getURL("https://gist.githubusercontent.com/GeekOnAcid/da022affd36310c96cd4/raw/9c2ac2b033979fcf14a8d9b2e3e390a4bcc6f0e3/us_nr_of_crimes_1960_2014.csv")) d <- dd[,c(2:11)] pdf("sparklines_base.pdf", height=10, width=6) par(mfrow=c(ncol(d),1), mar=c(1,0,0,8), oma=c(4,1,4,4)) for (i in 1:ncol(d)){ plot(d[,i], lwd=0.5, axes=F, ylab="", xlab="", main="", type="l", new=F) axis(4, at=d[nrow(d),i], labels=round(d[nrow(d),i]), tick=F, las=1, line=-1.5, family="serif", cex.axis=1.2) axis(4, at=d[nrow(d),i], labels=names(d[i]), tick=F, line=1.5, family="serif", cex.axis=1.4, las=1) text(which.max(d[,i]), max(d[,i]), labels=round(max(d[,i]),0), family="serif", cex=1.2, adj=c(0.5,3)) text(which.min(d[,i]), min(d[,i]), labels=round(min(d[,i]),0), family="serif", cex=1.2, adj=c(0.5,-2.5)) ymin <- min(d[,i]); tmin <- which.min(d[,i]); ymax<-max(d[,i]); tmax<-which.max(d[,i]); points(x=c(tmin,tmax), y=c(ymin,ymax), pch=19, col=c("red","blue"), cex=1) rect(0, summary(d[,i])[2], nrow(d), summary(d[,i])[4], border=0, col = rgb(190, 190, 190, alpha=90, maxColorValue=255))} axis(1, at=1:nrow(dd), labels=dd$Year, pos=c(-5), tick=F, family="serif", cex.axis=1.4) dev.off()

Sparklines in base graphics with plotSparklineTable This uses plotSparklineTable function from epanetReader package by Bradley Eck. This compact solution requires data to be in long table format and it has a limited customisation options. Great for making a rapid summaries with sparklines. library(epanetReader) library(reshape) library(RCurl) dd <- read.csv(text = getURL("https://gist.githubusercontent.com/GeekOnAcid/da022affd36310c96cd4/raw/9c2ac2b033979fcf14a8d9b2e3e390a4bcc6f0e3/us_nr_of_crimes_1960_2014.csv")) d <- melt(dd[,c(2:11)]) pdf("sparklines_base_epanetReader.pdf", height=6, width=10) plotSparklineTable(d, row.var = 'variable', col.vars = 'value') dev.off()

Sparklines in lattice You have much better control over the location and size of sparklines when you use lattice. The only problem are right-side labels for which you have to use grid library in order to ‘hack’ the view parameters with functions pushViewport() and popViewport() . You can learn more about this in an extensive collection of grid vignettes. library(lattice) library(latticeExtra) library(grid) library(reshape) library(RCurl) dd <- read.csv(text = getURL("https://gist.githubusercontent.com/GeekOnAcid/da022affd36310c96cd4/raw/9c2ac2b033979fcf14a8d9b2e3e390a4bcc6f0e3/us_nr_of_crimes_1960_2014.csv")) d <- melt(dd, id="Year") names(d)[1] <- "time" pdf("sparklines_lattice.pdf", height=10, width=8) xyplot(value~time | variable, d, xlab="", ylab="", strip=F, lwd=0.7, col=1, type="l", layout=c(1,length(unique(d$variable))), between = list(y = 1), scales=list(y=list(at=NULL, relation="free"), x=list(fontfamily="serif")), par.settings = list(axis.line = list(col = "transparent"), layout.widths=list(right.padding=20, left.padding=-5)), panel = function(x, y, ...) { panel.xyplot(x, y, ...) pushViewport(viewport(xscale=current.viewport()$xscale-5, yscale=current.viewport()$yscale, clip="off")) panel.text(x=tail(x,n=1), y=tail(y,n=1), labels=levels(d$variable)[panel.number()], fontfamily="serif", pos=4) popViewport() panel.text(x=x[which.max(y)], y=max(y), labels=round(max(y),0), cex=0.8, fontfamily="serif",adj=c(0.5,2.5)) panel.text(x=x[which.min(y)], y=min(y), labels=round(min(y),0), cex=0.8, fontfamily="serif",adj=c(0.5,-1.5)) panel.text(x=tail(x,n=1), y=tail(y,n=1), labels=round(tail(y,n=1),0), cex=0.8, fontfamily="serif", pos=4) panel.points(x[which.max(y)], max(y), pch=16, cex=1) panel.points(x[which.min(y)], min(y), pch=16, cex=1, col="red") panel.rect(min(x), quantile(y, 0.25), max(x), quantile(y, 0.75), col = "grey", border = "transparent", alpha = 0.4) }) dev.off()

Sparklines in ggplot2 Created with the help of Wouter Van Der Bijl as part of this StackOverflow question. library(ggplot2) library(ggthemes) library(dplyr) library(reshape) library(RCurl) dd <- read.csv(text = getURL("https://gist.githubusercontent.com/GeekOnAcid/da022affd36310c96cd4/raw/9c2ac2b033979fcf14a8d9b2e3e390a4bcc6f0e3/us_nr_of_crimes_1960_2014.csv")) d <- melt(dd, id="Year") names(d) <- c("Year","Crime.Type","Crime.Rate") d$Crime.Rate <- round(d$Crime.Rate,0) mins <- group_by(d, Crime.Type) %>% slice(which.min(Crime.Rate)) maxs <- group_by(d, Crime.Type) %>% slice(which.max(Crime.Rate)) ends <- group_by(d, Crime.Type) %>% filter(Year == max(Year)) quarts <- d %>% group_by(Crime.Type) %>% summarize(quart1 = quantile(Crime.Rate, 0.25), quart2 = quantile(Crime.Rate, 0.75)) %>% right_join(d) pdf("sparklines_ggplot.pdf", height=10, width=8) ggplot(d, aes(x=Year, y=Crime.Rate)) + facet_grid(Crime.Type ~ ., scales = "free_y") + geom_ribbon(data = quarts, aes(ymin = quart1, max = quart2), fill = 'grey90') + geom_line(size=0.3) + geom_point(data = mins, col = 'red') + geom_point(data = maxs, col = 'blue') + geom_text(data = mins, aes(label = Crime.Rate), vjust = -1) + geom_text(data = maxs, aes(label = Crime.Rate), vjust = 2.5) + geom_text(data = ends, aes(label = Crime.Rate), hjust = 0, nudge_x = 1) + geom_text(data = ends, aes(label = Crime.Type), hjust = 0, nudge_x = 5) + expand_limits(x = max(d$Year) + (0.25 * (max(d$Year) - min(d$Year)))) + scale_x_continuous(breaks = seq(1960, 2010, 10)) + scale_y_continuous(expand = c(0.1, 0)) + theme_tufte(base_size = 15, base_family = "Helvetica") + theme(axis.title=element_blank(), axis.text.y = element_blank(), axis.ticks = element_blank(), strip.text = element_blank()) dev.off() Stem-and-leaf display Edward Tufte, The Visual Display of Quantitative Information (Cheshire, 1983), p. 140 Based on John Tukey, Exploratory Data Analysis (Addison-Wesley, 1970). Stem-and-leaf display is not exactly a ‘Tuftesque’ solution as it invented in the beginning of 20 century but was only popularised in 1980s by John Tukey. A stem-and-leaf display is a display for presenting quantitative data in a graphical format, similar to a histogram, to assist in visualizing the shape of a distribution. Stem-and-leaf plot is the only visualisation in this collection thats printed in the console in R rather than being processed with any graphical system.

Stem-and-leaf display in console with base graphics Data shows the duration of the eruption for the Old Faithful geyser in Yellowstone National Park, Wyoming, USA. stem(faithful$eruptions) ## ## The decimal point is 1 digit(s) to the left of the | ## ## 16 | 070355555588 ## 18 | 000022233333335577777777888822335777888 ## 20 | 00002223378800035778 ## 22 | 0002335578023578 ## 24 | 00228 ## 26 | 23 ## 28 | 080 ## 30 | 7 ## 32 | 2337 ## 34 | 250077 ## 36 | 0000823577 ## 38 | 2333335582225577 ## 40 | 0000003357788888002233555577778 ## 42 | 03335555778800233333555577778 ## 44 | 02222335557780000000023333357778888 ## 46 | 0000233357700000023578 ## 48 | 00000022335800333 ## 50 | 0370

Stem-and-leaf display in console with CarletonStats Data set taken from package MASS. It shows risk factors associated with low infant birth weight. The data were collected at Baystate Medical Center, Springfield, Mass during 1986. The stemPlot function expands the basic stem plot by accepting a factor variable as a second argument to create stem plots for each of the levels. library(CarletonStats) library(MASS) stemPlot(birthwt$bwt, birthwt$smoke, varname="infant birth weight (in grams)", grpvarname="whether mother smoked during pregnancy (1) or not (0)") ## ## ***Stem and Leaf plot for infant birth weight (in grams) *** ## Grouped by levels of whether mother smoked during pregnancy (1) or not (0) ## ## 0 ## : ## The decimal point is 2 digit(s) to the right of the | ## ## 10 | 2 ## 12 | 3 ## 14 | 799 ## 16 | 03 ## 18 | 9037 ## 20 | 66809 ## 22 | 4480358 ## 24 | 14450025 ## 26 | 24423558 ## 28 | 144468822288 ## 30 | 6668990088 ## 32 | 003333377227 ## 34 | 0266794479 ## 36 | 011355037779 ## 38 | 0366814478 ## 40 | 00551577 ## 42 | ## 44 | 9 ## 46 | ## 48 | 9 ## ## ## 1 ## : ## The decimal point is 3 digit(s) to the right of the | ## ## 0 | 7 ## 1 | 1 ## 1 | 889999 ## 2 | 11112223344444444 ## 2 | 5555566677888899999 ## 3 | 0000011111233333444 ## 3 | 6666778999 ## 4 | 2