Ideally, random number generators shouldn't be trivial to predict! Although you probably shouldn't use PCG-family generators for tasks with strong cryptographic security needs, they're much less predictable than most popular general-purpose RNGs.

What is Predictability

A random number generator is predictable if, after observing some of its “random” output, we can make accurate predictions about which “random values” are coming up next. To most people, predictability seems like the antithesis of randomness, yet it is in part a matter of perspective. For example, here are some random numbers generated by /dev/random on my computer:

0x2089b25b, 0xa904bd46
0x17de4ab5, 0x3fbae22f
0x93fa9859, 0x58b8aa89
0x2911cfb7, 0xcd2e9be0
0x6faf0d43, 0xcfc13aec
0xeebd2f36, 0x4c19793d
0x89ff6083, 0x9666c577
0x2b387d0c, 0x17c76c76
0x175a6408, 0xe3b39734
0xa8168511, 0xf798a21c

If you've never seen this page before, they ought to look pretty random. But if you come back and read this page tomorrow, they'll be the same, and they won't seem quite as random. You could predict that if you came back next week and read this page, the exact same numbers would be here, and if someone asked you “What comes after 0x17de4ab5?”, you could be pretty sure the answer is 0x3fbae22f. Thus the numbers above both “look random” and are “totally predictable”. In that sense, it is possible for an entirely predictable random number generator to pass a battery of statistical tests for randomness.

Whether predictability matters depends on the application. If you're performing a simulation, it may not matter at all. But if you were running a lottery, it might matter a lot. Surprisingly, the general-purpose random number generators in most widespread use are easily predicted. (In contrast, the RNGs used to construct stream ciphers for secure communication are believed to be infeasible to predict, and are known as cryptographically secure.)

How Can Predictability Arise?

In the basics of pseudorandom number generation, we introduced an analogy for algorithmic random number generation—the idea that a random-number generation scheme is like a book of “random numbers”, where the seed for the generator tells us which page to begin reading from. The neat thing about algorithmic generation is that the contents of this monstrously huge book are not explicitly stored; they are computed as needed (using our position in the book). Nevertheless, the contents of the book itself never change, only our reading position (which is what is stored in the internal state of the generator).

Although in some sense the sequence for a given generator is fixed, the book is so huge that a brute-force strategy of simply looking through the entire book to figure out where we are reading from is impractical. But that's just a brute-force strategy—what if there were some cleverer strategy to figure out where in the book we are? Because the content of the book is calculated algorithmically, perhaps it is possible to somehow do the opposite calculation…?

At present, there is no actual proof that any algorithmic random number generator is impossible to predict given some of its output (even the “cryptographically secure” ones!). Proving a generator is impossible to predict amounts to proving the existence of one-way functions, and such a proof would show that P ≠ NP (see Wikipedia for more details). Nevertheless, in practice, there are random number generators that no one knows how to predict (and most computer scientists believe P ≠ NP). But there are also generators that are trivial to predict.

Predicting LCGs

As an example, let's see how easy it is to predict a linear congruential generator. These are defined according to the formula

\begin{equation*}
r_{n+1} = a \, r_{n} + c \mod m
\end{equation*}

for some constants \(a\), \(c\), and \(m\). From that, we can say that

\begin{align*}
r_1 &= a \, r_0 + c \mod m \\
r_2 &= a \, r_1 + c \mod m
\end{align*}

and, rearranging,

\begin{align*}
a &= \frac{r_1 - r_2}{r_0 - r_1} \mod m \\
c &= \frac{r_0 r_2 - r_1^2}{r_0 - r_1} \mod m.
\end{align*}

And that's it. If we know the modulus, given just three values from the LCG we can calculate the multiplicative and additive constants. (If you'd like to try, note that we can perform modular division using the modular multiplicative inverse, which is available in Mathematica via the PowerMod function, for example.)

In fact, sometimes almost no mathematics is required. Consider this C++ program:

```cpp
#include <random>
#include <iostream>

int main()
{
    std::default_random_engine rng;
    std::cout << "Minimum: " << rng.min() << "\n";
    std::cout << "Maximum: " << rng.max() << "\n";
    for (int i = 0; i < 3; ++i)
        std::cout << rng() << "\n";
}
```

With the GNU C++ compiler and library, it prints out

Minimum: 1
Maximum: 2147483646
16807
282475249
1622650073

We might immediately notice that \(16807^2 = 282475249\), and suspect that \(a = 16807,\ c = 0\). We might also realize that \(m = \mathrm{maximum}+1 = 2147483647\). The third number confirms our suspicions.

With the LLVM C++ library, it prints out

Minimum: 1
Maximum: 2147483646
48271
182605794
1291394886

Based on the above experience with the GNU library, you might suspect that \(a = 48271,\ c = 0\) purely from inspecting the first output, and you'd be right. That's because these implementations define std::default_random_engine to be std::minstd_rand0 and std::minstd_rand, respectively; their constants are well known as part of their specification. If you know it's std::minstd_rand, you only need to see one number to predict all of its future output.

Other Trivially Predictable Generators

In fact, any generator that outputs its entire internal state is trivial to predict. The page discussing other random number generators gives several examples, but one notable one is the Mersenne Twister. If you look online you can find several examples, such as this one, where people figure out the state of this generator from its output.

Cryptographically Secure Generators

Cryptographically secure generators are generators that are strongly believed to be very difficult to predict, in part because they have received considerable scrutiny from the cryptographic community and no one has found a good way to break them. One such example is arc4random on OS X, iOS, and various BSD Unix variants, but there are many others. In an effort to make it very unlikely that they can be predicted, most cryptographically secure generators are fairly expensive to compute.

Generators that May Be Hard to Predict

Some generators do not make any claims about their cryptographic security, but nevertheless aren't trivial to predict. For example, some general-purpose generators run several generation schemes and combine their results to create the final output (e.g., Ran from the most recent edition of Numerical Recipes). It isn't obvious to me how someone could predict these generators, but that doesn't mean it is impossible. For example, some schemes for combining multiple linear-feedback shift-register RNGs are prone to correlation attacks.

I would class these generators as more secure than trivially predictable ones. A casual user may not know how to predict them, and may not be able to find any published work showing them to be predictable, but without detailed analysis from the cryptographic community, it's hard to be confident that they actually cannot be predicted in reasonable time.