First, I'll create a 1000-dimension vector in the main memory and measure the speed of generating random uniformly distributed entries.

(def x1K (fv 1000))
(quick-bench (rand-uniform! x1K))

Evaluation count : 1420254 in 6 samples of 236709 calls.
             Execution time mean : 422.105972 ns
    Execution time std-deviation : 1.347492 ns
   Execution time lower quantile : 420.687916 ns ( 2.5%)
   Execution time upper quantile : 423.978301 ns (97.5%)
                   Overhead used : 1.369142 ns

It takes 422 nanoseconds per 1000 entries, which is 0.42 nanoseconds per entry.

Let's create a larger vector, one with a million entries, and try that.

(def x1M (fv 1000000))
(quick-bench (rand-uniform! x1M))

Evaluation count : 1890 in 6 samples of 315 calls.
             Execution time mean : 319.874178 µs
    Execution time std-deviation : 4.399502 µs
   Execution time lower quantile : 317.444041 µs ( 2.5%)
   Execution time upper quantile : 327.527853 µs (97.5%)
                   Overhead used : 1.369142 ns

Even faster per entry, 0.32 nanoseconds! This looks promising.

The next step is a billion entries. I'll create two vectors of 500 million entries each, which together make a billion. The reason I'm doing this instead of using one vector is that each entry takes 4 bytes, so a billion entries requires 4 GB of memory, while Java (and Intel MKL) buffers are indexed with integers, and the largest integer is 2,147,483,647.
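Since the limit in question is Java's own int indexing, the sizing argument can be sanity-checked with a few lines of plain Java (assuming, as Java's direct ByteBuffer does, that the backing buffer is addressed by byte offset with an int index):

```java
public class BufferLimit {
    public static void main(String[] args) {
        // A billion 4-byte floats vs. two half-billion-entry halves.
        long billionFloats = 1_000_000_000L * Float.BYTES; // 4,000,000,000 bytes
        long halfBillion   =   500_000_000L * Float.BYTES; // 2,000,000,000 bytes

        // One billion-entry float vector overflows the int index range...
        System.out.println(billionFloats > Integer.MAX_VALUE);  // true
        // ...but each 500-million-entry half fits just under the limit.
        System.out.println(halfBillion <= Integer.MAX_VALUE);   // true
    }
}
```

So two vectors of 500 million entries each squeeze in just under the 2,147,483,647-byte ceiling, while a single billion-entry vector would not.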

So, generating 1 billion random numbers on the CPU takes…

(def x500M (fv 500000000))
(def y500M (fv 500000000))
(quick-bench (do (rand-uniform! x500M) (rand-uniform! y500M)))

Evaluation count : 6 in 6 samples of 1 calls.
             Execution time mean : 347.586774 ms
    Execution time std-deviation : 5.861722 ms
   Execution time lower quantile : 341.168665 ms ( 2.5%)
   Execution time upper quantile : 356.432065 ms (97.5%)
                   Overhead used : 1.369142 ns

347 milliseconds. Even that is a billion random numbers in the blink of an eye, if we choose the slowest eyes according to Wikipedia. But this is Clojure and Neanderthal, so why not try the GPU? Of course!

But before that, let's also mention that we have only demonstrated uniformly distributed random numbers. We often need normally distributed numbers, which are a bit more difficult to generate, so rand-normal is expected to be slower. Let's see…

(quick-bench (do (rand-normal! x500M) (rand-normal! y500M)))

Evaluation count : 6 in 6 samples of 1 calls.
             Execution time mean : 1.870732 sec
    Execution time std-deviation : 4.685253 ms
   Execution time lower quantile : 1.865571 sec ( 2.5%)
   Execution time upper quantile : 1.876130 sec (97.5%)
                   Overhead used : 1.369142 ns

Almost 2 seconds, which is about 1.9 nanoseconds per entry, more than five times slower than rand-uniform. Although this is still extremely fast (see the next section about motivation), it is slower than a blink of an eye!
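To get a feel for why normal variates cost more than uniform ones, consider one classic method, the Box-Muller transform, which spends a logarithm, a square root, and two trigonometric calls to turn two uniforms into two normals. (This is only an illustration of the extra work involved; MKL may well use a different, faster algorithm internally.)

```java
import java.util.Random;

public class BoxMuller {
    public static void main(String[] args) {
        Random r = new Random(42);
        // Two independent uniforms on (0, 1)...
        double u1 = r.nextDouble();
        double u2 = r.nextDouble();
        // ...become two independent standard normals, at the price of
        // a log, a sqrt, a cos, and a sin per pair.
        double radius = Math.sqrt(-2.0 * Math.log(u1));
        double z0 = radius * Math.cos(2.0 * Math.PI * u2);
        double z1 = radius * Math.sin(2.0 * Math.PI * u2);
        System.out.println(z0 + " " + z1);
    }
}
```

Each uniform entry, by contrast, comes straight out of the underlying generator, which is why a roughly fivefold slowdown for rand-normal is unsurprising.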