[This article was first published on, and kindly contributed to R-bloggers ]. (You can report issue about the content on this page here Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Numerical optimization is an important tool in the data scientist's toolbox. Many classical statistical problems boil down to finding the highest (or lowest) point on a multi-dimensional surface: the base R function optim provides many techniques for solving such maximum likelihood problems.

Counterintuitively, numerical optimizations are easiest (though rarely actually easy) when all of the variables are continuous and can take any value. When integer variables enter the mix, optimization becomes much, much harder. This typically happens when the optimization is constrained by a limited selection of objects, for example packages in a weight-limited cargo shipment, or stocks in a portfolio constrained by sector weightings and transaction costs. For tasks like these, you often need an algorithm for a specialized type of optimization: Mixed Integer Programming.

For problems like these, Dirk Schumacher has created the ompr package for R. This package provides a convenient syntax for describing the variables and contraints in an optimization problem. For example, take the classic “knapsack” problem of maximizing the total value of objects in a container subject to its maximum weight limit. In mathematical form, the problem can be described as an objective (quantity to be maximized) and a constraint:

$$

\begin{equation*}

\begin{array}{[email protected]{}ll}

\text{max} & \displaystyle\sum\limits_{i=1}^{n} v_{i}x_{i} & &\\

\text{subject to}& \displaystyle\sum\limits_{i=1}^{n} w_{i}x_{i} \leq W, & &\\

& x_{i} \in \{0,1\}, &i=1 ,\ldots, n&

\end{array}

\end{equation*}

$$

In ompr, the same problem is naturally expressed in R as follows:

model % add_variable(x[i], i = 1:n, type = "binary") %>% set_objective(sum_expr(v[i] * x[i], i = 1:n)) %>% add_constraint(sum_expr(w[i] * x[i], i = 1:n) <= W)

To actually solve the problem, you need to provide a "backend" solver algorithm to ompr. It's designed to integrate with any solver, and currently works with the ROI (R Optimization Infrastructure) package. ROI in turn provides a number of solver algorithms including GLPK, the GNU Linear Programming Kit, which you can use to solve problems like this.

Dirk provides a number of worked examples of the ompr package in use. Examples include the Traveling Salesman problem, solving Soduku puzzles, and selecting optimal warehouse locations to minimize delivery costs to customers.

The ompr package is available on Github. For more information, check out the ompr home page linked below.

Dirk Schumacher: ompr: Mathematical Programming in R