by Joseph Rickert

July was a big month for submitting new packages to CRAN; by my count, 251 unique and truly new packages were accepted. In addition to quantity, I was pleased to see quality and variety. For instance, tropicalSparse, a package for exploring some abstract mathematics, and eChem, a package for teaching analytical chemistry, exemplify R’s expansion into new fields.

Below are my “Top 40” picks organized into ten categories: Computational Methods, Data, Econometrics, Machine Learning, Mathematics, Science, Statistics, Time Series, Utilities, and Visualization.

Computational Methods

osqp v0.4.0: Provides bindings to the OSQP solver, a numerical optimization package written in C that solves convex quadratic programs using the alternating direction method of multipliers. See Stellato et al. (2018) for details.

sundialr v0.1.1: Provides a way to call the functions in the SUNDIALS C library for solving ODEs. There is a vignette.

Data

fredr v1.0.0: Provides an R client for the Federal Reserve Economic Data (FRED). There are vignettes on FRED Categories, Releases, Series, Sources, and Tags, as well as a Getting Started Guide.

jstor v0.3.2: Provides functions to import metadata, ngrams, and full-texts delivered by Data for Research by JSTOR. There is an Introduction, and vignettes on Automating File Import and Known Quirks.

rLandsat v0.1.0: Provides functions to search and acquire Landsat data using an API built by Development Seed and the U.S. Geological Survey. See README for how to use the package.

weathercan v0.2.7: Provides tools for downloading historical weather data from the Environment and Climate Change Canada website. Data can be downloaded from multiple stations over large date ranges, and automatically processed into a single dataset. There is an Introduction, a Glossary, and vignettes on Flags and Interpolation.

Econometrics

beezdemand v0.1.0: Provides tools to facilitate analyses performed in studies of behavioral economic demand, such as the data screening proposed by Stein et al. (2015), and model fitting, including linear (Hursh et al. (1989)), exponential (Hursh & Silberberg (2008)), and modified exponential (Koffarnus et al. (2015)) models. The vignette provides examples.

sgmodel v0.1.0: Provides functions to compute the solutions of a generic stochastic growth model for a given set of user-supplied parameters. See Merton (1971) and Tauchen (1986). There is a vignette.

Machine Learning

bigdatadist v1.0: Provides functions to compute distances between probability measures, entropy measures for samples of curves, distances and depth measures for functional data, and the Generalized Mahalanobis Kernel distance for high-dimensional data. For further details, see Martos et al. (2014) and Martos et al. (2018).

L0Learn v1.0.2: Provides an optimized toolkit for approximately solving L0-regularized learning problems. The algorithms are based on coordinate descent and local combinatorial search. For more details see Hazimeh and Mazumder (2018). There is a vignette.

Mathematics

tropicalSparse v0.1.0: Implements some basic tropical algebra functionality for sparse matrices by applying sparse matrix storage techniques. The functionality includes addition and multiplication of vectors and matrices, the dot product of vectors in tropical form, and the solution of some general equations using tropical algebra. Look here for the math.
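For readers new to tropical algebra, here is a minimal Python sketch of the operations mentioned above. It assumes the min-plus convention, where tropical addition is the minimum and tropical multiplication is ordinary addition. It uses dense lists for clarity, and the function names are mine. It does not reproduce tropicalSparse's sparse storage or its API.

```python
import math

INF = math.inf  # additive identity ("zero") in the min-plus convention


def trop_add(a, b):
    """Tropical addition: the minimum of the two values."""
    return min(a, b)


def trop_mul(a, b):
    """Tropical multiplication: ordinary addition."""
    return a + b


def trop_dot(u, v):
    """Tropical dot product: the minimum over elementwise sums."""
    return min(trop_mul(x, y) for x, y in zip(u, v))


def trop_matmul(A, B):
    """Tropical matrix product: entry (i, j) is trop_dot(row i of A, column j of B)."""
    cols = list(zip(*B))
    return [[trop_dot(row, col) for col in cols] for row in A]


A = [[0, 3], [INF, 1]]
B = [[2, INF], [5, 4]]
print(trop_matmul(A, B))
```

Entries equal to INF play the role that zeros play in an ordinary sparse matrix, which is why sparse storage is a natural fit for this algebra.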

Science

eChem v1.0.0: Provides tools for use in courses in analytical chemistry. Functions simulate cyclic voltammetry, linear-sweep voltammetry, single-pulse and double-pulse chronoamperometry, and chronocoulometry experiments using the implicit finite difference method outlined in Brown (2015). There is an Overview and vignettes on Using eChem, Computational details, and Examples.

RaceID v0.1.1: Enables inference of cell types and prediction of lineage trees using the StemID2 algorithm of Herman, Sagar and Grün (2018). There is a vignette.

updog v1.0.1: Implements empirical Bayes approaches to genotype polyploids from next-generation sequencing data while accounting for allelic bias, overdispersion, and sequencing error. See Gerard et al. (2018) for implementation details, along with vignettes on Oracle Calculations, Parallelization, Simulating Sequencing Data, and an Example.

Statistics

adaptMT v1.0.0: Implements adaptive p-value thresholding (AdaPT), including a framework that allows users to specify any algorithm to learn local false-discovery rate, as well as a pool of convenient functions that implement specific algorithms. See Lei and Fithian (2016). The vignette provides an introduction to the package.

biglmm v0.9-1: Provides regression for data too large to fit in memory. This package functions exactly like the biglm package, but works with later versions of R.

circumplex v0.1.2: Provides tools for analyzing and visualizing circular data, including a generalization of the bootstrapped structural summary method from Zimmermann & Wright (2017), and functions for creating publication-ready tables and figures from the results. There is an Introduction and a vignette on Analysis.

MultiFit v0.1.2: Provides functions to test for independence of two random vectors and learn and report the dependency structure. For more information, see Gorsky and Ma (2018) and the vignette.

PHEIndicatormethods v1.0.8: Provides functions to calculate commonly used public health statistics and their confidence intervals using methods approved for use in the production of Public Health England indicators, such as those presented via Fingertips. The statistical methods are referenced in the following publications: Breslow and Day (doi:10.1002/sim.4780080614), Dobson et al. (1991), Armitage and Berry (2002), and Wilson (1927). There is a vignette.
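As an illustration of the kind of interval involved, here is a minimal Python sketch of the Wilson score interval for a proportion. I am assuming that this is the method the Wilson (1927) reference points to. The function name and defaults are mine, and this is not the package's code.

```python
import math


def wilson_interval(successes, n, z=1.959963984540054):
    """Wilson score interval for a binomial proportion.

    z is the standard normal quantile; the default gives a 95% interval.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


lower, upper = wilson_interval(10, 20)
print(round(lower, 4), round(upper, 4))
```

Unlike the simple normal approximation, this interval stays inside [0, 1] even when the observed proportion is 0 or 1, which matters for rare events in small populations.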

robmixglm v1.0-2: Implements robust generalized linear models (GLM) using a mixture method, as described in Beath (2018). See the vignette for details.

SingleCaseES v0.4.0: Provides functions for calculating basic effect size indices for single-case designs, including several non-overlap measures and parametric effect size measures, and for estimating the gradual effects model developed by Swan and Pustejovsky (2018). There is a vignette on Definitions and Mathematical Details and another on Calculations.

spCP v1.0: Implements a spatially varying change-point model with unique intercepts, slopes, variance intercepts and slopes, and change points at each location. Inference is within the Bayesian setting using Markov chain Monte Carlo (MCMC). See the vignette for an example.

TDAstats v0.3.0: Provides a tool set for topological data analysis, specifically via the calculation of persistent homology in a Vietoris-Rips complex. For a general background on computing persistent homology for topological data analysis, see Otter et al. (2017). To learn more about how the permutation test is used for nonparametric statistical inference in topological data analysis, read Robinson & Turner (2017). There is an Introduction and a vignette on Hypothesis Testing with TDA.

trafo v1.0.0: Provides functions to estimate, select, and compare several families of transformations, including Bickel-Doksum (Bickel and Doksum (1981)), Box-Cox, Dual (Yang (2006)), Glog (Durbin et al. (2002)), Gpower1, Log, Log-shift opt (Feng et al. (2016)), Manly, Modulus (John and Draper (1980)), Neglog (Whittaker et al. (2005)), Reciprocal, and Yeo-Johnson. See the vignette for the math.
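To make one of these families concrete, here is a minimal Python sketch of the standard Box-Cox transformation for a fixed parameter. The function name is mine, and the sketch does not estimate the parameter or mirror the trafo interface.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a fixed lambda.

    Uses (y**lam - 1) / lam, with log(y) as the limiting case at lam = 0.
    """
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if abs(lam) < 1e-12:
        return [math.log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]


print(box_cox([1.0, 4.0, 9.0], 0.5))
```

In practice the parameter is chosen from the data, for example by maximum likelihood, which is the estimation and selection step the entry above describes.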

uniformly v0.1.0: Provides functions to uniformly sample from various geometric shapes, such as spheres, ellipsoids, and simplices. See the vignette.
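Uniform sampling from a shape is less obvious than it sounds. Here is a minimal Python sketch of two textbook constructions: normalizing a Gaussian vector to get a point on a sphere's surface, and scaling by a radius drawn as U^(1/d) to get a point inside the ball. I have not checked which algorithms the package uses, and the function names are mine.

```python
import math
import random


def sample_on_sphere(dim=3, rng=random):
    """Uniform point on the unit sphere surface: normalize a standard Gaussian vector."""
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in g))
        if norm > 0:  # retry in the (practically impossible) all-zero case
            return [x / norm for x in g]


def sample_in_ball(dim=3, rng=random):
    """Uniform point inside the unit ball: uniform direction, radius U**(1/dim)."""
    direction = sample_on_sphere(dim, rng)
    radius = rng.random() ** (1.0 / dim)
    return [radius * x for x in direction]


rng = random.Random(2018)  # seeded generator so runs are repeatable
print(sample_on_sphere(3, rng))
print(sample_in_ball(3, rng))
```

Drawing the radius uniformly on [0, 1] would cluster points near the center, which is why the radius is raised to the power 1/dim.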

Time Series

rollRegres v0.1.0: Implements methods for fast rolling and expanding linear regression models. The methods use rank-one updates and downdates of the upper triangular matrix from a QR decomposition. See Dongarra et al. (1979). The vignette provides some details.
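To show why updating and downdating beats refitting every window, here is a minimal Python sketch for the simplest case of one regressor plus an intercept. Note that it swaps in running sums rather than the QR-based rank-one updates the package describes. Running sums are easier to read but less numerically stable. The function name is mine.

```python
def rolling_simple_ols(x, y, width):
    """Rolling fit of y = intercept + slope * x over windows of fixed width.

    Each step adds the newest point to running sums (update) and removes the
    point that left the window (downdate), so the cost per window is constant.
    Returns one (intercept, slope) tuple per window, or None when the window
    has no variation in x.
    """
    if len(x) != len(y) or width < 2 or width > len(x):
        raise ValueError("need equal lengths and 2 <= width <= len(x)")
    sx = sy = sxx = sxy = 0.0
    fits = []
    for i in range(len(x)):
        # update: bring the newest observation into the sums
        sx += x[i]
        sy += y[i]
        sxx += x[i] * x[i]
        sxy += x[i] * y[i]
        if i >= width:
            # downdate: drop the observation that fell out of the window
            j = i - width
            sx -= x[j]
            sy -= y[j]
            sxx -= x[j] * x[j]
            sxy -= x[j] * y[j]
        if i >= width - 1:
            denom = width * sxx - sx * sx
            if denom == 0:
                fits.append(None)
                continue
            slope = (width * sxy - sx * sy) / denom
            intercept = (sy - slope * sx) / width
            fits.append((intercept, slope))
    return fits


print(rolling_simple_ols([0, 1, 2, 3, 4], [1, 3, 5, 7, 9], width=3))
```

The same update and downdate idea carries over to several regressors, where working on the triangular factor of a QR decomposition avoids the cancellation errors that running sums can accumulate.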

Utilities

anyLib v1.0.4: Provides functions to install and load a list of packages from CRAN, Bioconductor or GitHub. For GitHub packages, if you do not supply the full path including the maintainer name (e.g. “achateigner/topReviGO”), the package can be loaded but not installed. There is a brief vignette.

dbx v0.2.1: Provides select, insert, update, upsert, and delete database operations for PostgreSQL, MySQL, SQLite, and other databases. See the README for usage.

envnames v0.3.0: Provides functions to keep track of user-defined environment names that cannot be retrieved with the base R function environmentName(). The main function in this package, environment_name(), returns the name of the environment given as a parameter. The vignette offers an overview of the package.

librarian v1.3.0: Provides functions to automatically install, update, and load CRAN and GitHub packages in a single function call. See README for usage.

makeParallel v0.1.1: Provides functions to automate the transformation of serial R code into more efficient parallel versions. There is a Quickstart Guide and a vignette on Parallel Concepts.

metaDigitise v1.0.0: Provides functions to extract, summarize and digitize data from published figures in research papers. The vignette shows how to use the package.

RSuite v0.32-244: Provides a set of tools to be used with the R Suite for developing data-science workflows.

Visualization

ceterisParibus v0.3.0: Provides functions to create “What if?” plots of model responses around selected points in a feature space. The four vignettes offer several examples, including a Random Forests Example and a Classification Example.

cytofan v0.1.0: Implements fan plots for cytometry data in ggplot2. See Britton et al. (1998) for information on fan plots, and README for package usage.

fingertipscharts v0.0.1: Provides tools to recreate the visualizations that are displayed on the Fingertips website of U.K. public health data. The vignette explains how to use the package.