"Help! My simulation is taking too long to run! How can I make it go faster?"

I frequently talk with statistical programmers who claim that their "simulations are too slow" (by which they mean, "they take too long"). They suspect that their program is inefficient, but they aren't sure why. Here are some tips and tricks that I've given to help programmers speed up their statistical simulations. Some of the tips apply to programs written in a high-level language, such as SAS/IML, but others apply to scalar languages such as the SAS DATA step, or even C/C++.

These tips and others are detailed in my forthcoming book, Simulating Data with SAS.

Tip #1: Vectorize the program

In a matrix/vector language such as SAS/IML, R, or MATLAB, the program should use matrix and vector operations instead of scalar computations whenever possible. If a simulation program is running slowly, often the problem is that the programmer has failed to vectorize the computations. Clues that a program is not vectorized include the following:

The innermost loop is an iteration over the number of rows or number of columns of matrix (or vector).

The innermost loop is an iteration that adds, multiplies, or accumulates some quantity.

Tip #2: Profile the program to identify trouble spots

In either case, try to replace that loop with an equivalent matrix operation or a function call

In SAS you can use the TIME function to profile a program, which means to measure the execution times of various subsections of the program. (Other languages have similar functions that time the execution of statements.) In a simulation, your goal is to make the computations inside the innermost loop go as quickly as possible. But where should you start?

No matter what computer language you use, don’t waste your time improving the performance of a subsection of a program until you verify that it is a performance bottleneck. In other words, do not guess which computations take a long time; measure them! For an example, see my article about solving linear systems efficiently in which I compare two ways to solve the same problem.

Tip #3: Allocate arrays that are used in a loop

Quite often I see a program that generates a matrix of results one row at a time. Do not use concatenation operators to grow the size of matrix dynamically within a loop! Instead, allocate space for results prior to the loop that computes the results. Similarly, if you need to generate random numbers, fill an entire array of random values in a single call, rather than generating a few random values at a time.

A corollary to this tip is to avoid generating many small arrays and matrices inside the innermost loop. In general, computer languages (such as SAS/IML) that pass arguments by reference can be more efficient than "functional" computer languages in which every function allocates and returns a vector of values.

Tip #4: Reduce the size of the parameter grid

Sometimes simulations include parameters. For example, perhaps you are running a simulation in which you simulate data from a normal distribution with parameters (μ, σ). You might decide to explore the parameter space by having outer loops that iterate over the parameters on a fine grid. For example, a simulation program might look like the following:

proc iml; /* explore the parameter space: (mu, sigma) in [-5,5]x[0.5, 10] */ do mu = -5 to 5 by 0.1; do sigma = 0.5 to 10 by 0.5; do i = 1 to 10000; /* number of simulations */ /* simulate data from N(mu, sigma) */ /* compute and store some statistic */ end; end; end;

The outer loops account for 2,020 parameter values. If the inner loop performs 10,000 iterations, then the program iterates 2 x 107 times. If the computation within the inner loop requires s seconds, the whole computation requires 2s x 107 seconds. Even if s is only a millisecond, the computation still requires 20,000 seconds, or 5.5 minutes.

There are a few steps you can take to reduce the number of parameters:

Is the problem symmetric in the parameters? Perhaps negative values of μ are unnecessary because of symmetry. For example, in a t test, the magnitude of μ might be important, but the sign unimportant. Does each parameter value result in a qualitatively different distribution or model? If not, reduce the number of grid points.

For example, perhaps you can use a coarser grid in the parameter space, as follows:

do mu = -3 to 3 by 1; do sigma = 0.5 to 4.5 by 0.5; ... end; end;

The new parameter grid involves only 63 pairs of parameters instead of 2,020. Consequently, the simulation will run 2020/63 ≈ 32 times faster.

A related idea is to use a nonuniform set of parameter values instead of a uniform grid. You can use closely spaced parameter values in some regions of the parameter space, and widely spaced parameter values in other regions.

Tip #5: Reduce the number of simulations

Do you really need 10,000 simulations? Perhaps 1,000 is sufficient. If so, you've just enabled your program to run 10 times faster!

Use a small number of simulations while you develop and profile a program. After the program works perfectly, decide whether more simulations are necessary for a final run.

Tip #6: Distribute independent computations

If you have access to multiple computers, there are ways to distribute computations across various resources. For example, if you have access to two computers, you can submit half of your parameters values (for example, μ < 0) to one computer and the other half (μ ≥ 0) to the other computer. Some issues associated with doing this (for example, independence of random number streams) are challenging. These issues are discussed in my forthcoming book.

Tip #7: Estimate the run time for the program BEFORE you run it!

This tip won't make your simulation run faster, but it is a critical skill to master. Before you run a simulation program, estimate how long it will take for the program to complete. Simulation problems often scale linearly with the number of iterations, so use only a few iterations the first time that you run a simulation. For example, if you want to simulate something 1,000 times, make a "test run" with 10 iterations and time how long it takes. The full simulation will take about 100 times longer than the test run.

For example, the following SAS/IML program is a good benchmark program. It generates one billion random values by simulating 106 samples, with 1,000 observations in each sample:

proc iml; call randseed(1); N = 1e3; /* num obs in each sample */ NumSim = 1e4; /* num of simulated samples */ x = j(N,1); results = j(NumSim,1); t0 = time(); /* start timing */ do mu = -5 to 5 by 0.1; /* 101 iterations */ do i = 1 to NumSim; call randgen(x, "Normal", mu); results[i] = sum(x); end; /* save or summarize results here... */ end; t = time()-t0; /* finish timing */ print t;

Before I run this program for the first time, I set NumSim = 1e2 . When I run the modified program, I see how long it takes to run. By multiplying the run time by 100, I can estimate the run time for the full simulation.

If you estimate how long a program will run, you will know what to do while it runs. Sit at your desk? Get a cup of coffee? Go to lunch? Go home for the day?

Tip #8: Respect the size of a billion

Due to advances in computing power, some people like to joke that "a billion is the new million," by which they mean that today you can run billions of computations in the same time that it once took to run millions.

True, but remember that millions of computations used to take a long time!

Like Tip #6, this tip does not make your program run faster, but it enables you to make good decisions when you design your simulation study. The tip is simple: know how long it takes to simulate a billion random values on your computer. One way to do this is to run a simple computer experiment, such as the PROC IML program listed in Tip #7. For SAS DATA step programmers, the following program generates and accumulates a billion random values:

data _null_ ; call streaminit ( 1 ) ; keep count; count = 0 ; do i = 1 to 1e9; count = count + rand ( "Normal" ) ; end ; run ;

This is a simple simulation. It doesn't read or write a data set. It doesn't do any fancy computations. All it does is generate a billion random numbers and add them up.

Run this program on your computer and time how long it takes. (On my PC, which was built in 2010, it takes about a minute.) Any "real" simulation will take longer than that, but you can use the time as a lower bound for computations of a similar size.

Summary

Follow these tips and technique to make your statistical simulations run faster. Although simulation studies are, by their nature, "brute force" computations, the tips in this article represent best practices that ensure that your simulations are efficient and run as quickly as possible.

The tips in this article are applicable to a host of computer languages: SAS, R, MATLAB, Python, Julia, and so forth. If you are using the SAS DATA step to simulate data and are using SAS procedures to analyze the data, there are some additional SAS-specific tips that are vital to know. I'll describe those SAS-specific techniques in a future blog post.

Note (10Jul2012): I am flattered that so many people are sending me their SAS programs and asking me to help them make their simulations run faster. Although I wish I could help everyone, that is not feasible. If you are running a simulation in the SAS/IML language, post your questions (and any code) to the SAS/IML community. If you are running your simulation in Base SAS, post your statistical questions to the SAS Statistical Procedures community, or your DATA step/macro questions to the Macro and DATA Step community.