
There are published techniques for cracking LCGs, but to my eye those techniques seem very brittle: very minor changes can add nonlinearity that renders techniques like the LLL algorithm unusable. Or am I mistaken, and these variations are still easy to crack?

Background

One thing that perplexes me somewhat about the cryptographic community is how people seem to hate LCGs and write them off without much effort to repair their flaws, and yet also like LFSRs and have repaired them often.

Raw LCGs and LFSRs Are Both Weak

In Schneier's Applied Cryptography (1996), he writes damningly about LCGs:

In contrast, the book sends mixed messages about LFSRs, saying positive things like:

Because of the simple feedback sequence, a large body of mathematical theory can be applied to analyzing LFSRs. Cryptographers like to analyze sequences to convince themselves that they are random enough to be secure. LFSRs are the most common type of shift registers used in cryptography.

but also saying:

On the other hand, an astonishingly large number of seemingly complex shift-register-based generators have been cracked. And certainly military cryptanalysis institutions such as the NSA have cracked a lot more. Sometimes it’s amazing to see the simple ones proposed again and again.

More recently, Schindler 2009 concurs with this latter advice, saying:

Consequently, LFSRs do not fulfill requirement R2 [lack of predictability], and they are absolutely inappropriate for sensitive cryptographic applications.

More Secure Variations…?

Much of the rest of the RNG chapter of Applied Cryptography focuses on LFSR variants that combine LFSRs in complex nonlinear ways to make them more challenging to crack.

But in what seems to me like a strange contrast, it is completely dismissive of attempts to improve LCGs, saying:

Various people examined the combination of linear congruential generators (Wichmann & Hill 1982, L’Ecuyer 1988). The results are no more cryptographically secure, but the combinations have longer periods and perform better in some randomness tests.

A source of surprise to me here is that unlike much of the rest of the book, the claims aren't supported by the citations. Neither paper cited mentions security or cryptography at all, so where is Schneier getting the claim that “results are no more cryptographically secure”?

It's interesting because Knuth 1985 made a quite different claim, saying:

Although we have seen that linear congruential sequences do not lead to secure encryptions by themselves, slight modifications would defeat the methods presented here. In particular, there appears to be no way to decipher the sequence of high-order bits generated by linear congruential sequences that have been shuffled [...]

Of course, these references are quite old, but most discussions I see seem to refer back to the 1980s and 1990s for their dismissal of LCGs.

How Brittle Are LCG-Cracking Techniques…?

So, this leads (finally) to my question. Just how brittle are these techniques for cracking LCGs? Are they really so fatally flawed that they can't be repaired?

Let's make it real with some example C code:

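    #include <stdio.h>
    #include <stdint.h>

    /* SEED, MULT, INC and XOR are expected as compile-time macros
       (for example via #define or the compiler's -D option). */
    int main() {
        uint32_t result;
        uint64_t state = SEED;
        for (int i=0; i < 32; ++i) {
            state = state * MULT + INC;   /* 64-bit LCG step, mod 2^64 */
            result = state >> 32;         /* keep only the high 32 bits */
            result ^= XOR;
            printf("0x%08x, ", result);
        }
        printf("\n");
    }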

Case 0: Everything Known

Thus, if we set SEED=0x35e647cfd3423fd0ull, MULT=0xc278c0d1c04a88d9ull, INC=0, and XOR=0, the program produces

0x59502137, 0xb6152ece, 0xbbd2cb88, 0xef05249f, 0x3ec02cd5, 0x2b0eca82, 0x0a3120be, 0x5116f6fb, 0x8b06b68c, 0x01367995, 0xca5789bd, 0xa40f57ff, 0x5f6d75bb, 0x544951f7, 0x8f9e70c8, 0x74307957, 0x70aab16c, 0x0ec42e72, 0x9bb2a42d, 0x2c5aa6aa, 0xe3cff469, 0x37881c03, 0x8d7853ba, 0xd6beb049, 0xa9fc0e6e, 0xbbc5bd2b, 0x33462a03, 0xad508c7e, 0xe31313e9, 0xf30418ae, 0xbefc1b02, 0xc0134d22,

Case 1: Trivial Case

Now SEED is unknown, MULT=0xc278c0d1c04a88d9ull, INC=0, and XOR=0, and the output is

0x8b1294a5, 0xae5cbf0d, 0x2da164bd, 0xcbe27c6d, 0x6d800d17, 0x8f576a33, 0x6ea4915b, 0x97ada3d5, 0x8ab31e5d, 0x0bb313d2, 0xfbee8ebf, 0xf1d09659, 0x5a54428e, 0x34d32f9a, 0xe800efdb, 0x5a313abd, 0x844a1328, 0xed9cf267, 0x5883910f, 0x7a44aa80, 0x0e34d575, 0x7e3453df, 0x2267bf41, 0x8c234c85, 0xa359f8b8, 0xf78f0126, 0x7902934e, 0x5ae97dc1, 0x1ba40108, 0x67f5ca64, 0x7aed8c5e, 0xceccf54b,

The above should be trivial to break. Even brute force is practical: the first output reveals the top 32 bits of the state, so we only need to search the 2^32 possible values of the hidden low 32 bits.
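A minimal sketch of that brute-force search, in C like the generator above. It assumes INC=0 and XOR=0 as in Case 1: the first output gives the top 32 bits of the state, and each guess for the low 32 bits is checked against the following outputs. The helper name `recover_state` and the range parameters `lo_begin` and `lo_end` are hypothetical; the range is there so the candidates can be split across threads or tried on a small window.

```c
#include <assert.h>
#include <stdint.h>

#define MULT 0xc278c0d1c04a88d9ull

/* Try every low-32-bit candidate in [lo_begin, lo_end) for the state that
   produced out[0], and accept the first one that also reproduces
   out[1..n-1]. Returns 1 and stores that state in *state_out on success,
   0 if no candidate in the range matches. Assumes INC = 0 and XOR = 0. */
static int recover_state(const uint32_t *out, int n,
                         uint64_t lo_begin, uint64_t lo_end,
                         uint64_t *state_out)
{
    assert(n >= 2);  /* at least one later output is needed as a check */
    for (uint64_t lo = lo_begin; lo < lo_end; ++lo) {
        uint64_t candidate = ((uint64_t)out[0] << 32) | lo;
        uint64_t s = candidate;
        int ok = 1;
        for (int i = 1; i < n && ok; ++i) {
            s *= MULT;                              /* LCG step, mod 2^64 */
            ok = ((uint32_t)(s >> 32) == out[i]);   /* compare high word */
        }
        if (ok) {
            *state_out = candidate;
            return 1;
        }
    }
    return 0;
}
```

Calling `recover_state(out, 3, 0, 1ull << 32, &s)` with the first three Case 1 outputs would scan the whole space. Each extra output filters candidates by a factor of about 2^32, so three outputs should leave a single candidate with overwhelming probability.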

Case 2: Known Techniques

Now SEED and INC are unknown, MULT=0xc278c0d1c04a88d9ull and XOR=0, and the output is

0x8c005b3e, 0x27e3338e, 0x1bb199bb, 0x46449299, 0x4b747cca, 0x290032ca, 0x2a6e907f, 0x6b1bd36f, 0xab7f4d33, 0x9b7a73be, 0xe9ae522c, 0x171e7e55, 0x95b0dcd2, 0xd93e6986, 0xddd1a6d2, 0xf2e197e5, 0x8e621adc, 0x0ac2dd7e, 0x31fafcce, 0xc7e19a1a, 0x5f9b0788, 0x9f3a790e, 0xe0e76b17, 0x6fcf2716, 0x0106a4fb, 0x3e64838a, 0x508cc169, 0x690a7b96, 0xde80a6cc, 0xbbcc6546, 0x76e80fe9, 0x6683486d,

This should also be breakable, but we need to use techniques from the literature because a brute force approach is infeasible. I would love to see code that actually breaks this and works out what the next numbers will be.
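The unknown increment can be removed by working with differences of consecutive states. The sketch below only illustrates that step; it is not a lattice attack and does not by itself break Case 2. The helper names are hypothetical. The key facts are that the differences follow a purely multiplicative recurrence, and that the high word of each difference can be read off the truncated outputs up to one unknown borrow bit.

```c
#include <assert.h>
#include <stdint.h>

#define MULT 0xc278c0d1c04a88d9ull

/* One LCG step modulo 2^64 (unsigned arithmetic wraps). */
static uint64_t lcg_step(uint64_t state, uint64_t inc)
{
    return state * MULT + inc;
}

/* Difference of two consecutive states. The increment cancels:
   (M*s1 + c) - (M*s0 + c) = M*(s1 - s0), so
   state_diff(s1, s2) == MULT * state_diff(s0, s1) for any inc. */
static uint64_t state_diff(uint64_t prev, uint64_t next)
{
    return next - prev;
}

/* High word of a state difference, computed from truncated outputs only.
   borrow is 1 when the hidden low word of the later state is smaller than
   that of the earlier one. An attacker does not know it and would have to
   try both values. */
static uint32_t diff_high_from_outputs(uint32_t out_prev, uint32_t out_next,
                                       int borrow)
{
    assert(borrow == 0 || borrow == 1);
    return out_next - out_prev - (uint32_t)borrow;
}
```

So after differencing, the problem looks like Case 1 (no increment) except that each known high word carries a one-bit uncertainty.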

Case 3: Added Nonlinearity (Harder?)

Now SEED, INC, and XOR are unknown, MULT=0xc278c0d1c04a88d9ull, and the output is

0x5e3af925, 0x1b7f8e1a, 0x268c64d1, 0x4b614b92, 0xba6c7a4d, 0xe4103860, 0xfe373528, 0x768f9a04, 0xedab3415, 0x2605ff3f, 0xb01e70bb, 0xff65e40a, 0x50980bee, 0xe9fb0d78, 0xdf3754bd, 0x46cce80d, 0xfe1395ac, 0x2e663615, 0xdebea707, 0x4d2cd17d, 0x30f0c21c, 0xb15a64ee, 0x21d38d72, 0xbb8e8c6d, 0x114447d1, 0x5362837c, 0xed46a733, 0x37526997, 0xf7ac14c2, 0xc33e7134, 0x96a9d739, 0x3ee606ba

So, this is the crux of my question. How much harder is this variation than the previous one? From my reading of the standard techniques for breaking truncated LCGs, this added nonlinearity is a problem. We can cancel out the XOR, but that breaks the subtraction we were already doing to cancel out the increment; we can cancel one or the other, but not both.
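The conflict can be seen on a toy example. This is a minimal sketch with hypothetical helper names and made-up small values, not the constants of any case above: XORing two masked outputs removes the mask, while subtracting them generally does not give the difference of the unmasked values.

```c
#include <stdint.h>

/* XOR of two masked outputs: the unknown mask cancels, because
   (a ^ m) ^ (b ^ m) == a ^ b. */
static uint32_t xor_pair(uint32_t masked_a, uint32_t masked_b)
{
    return masked_a ^ masked_b;
}

/* Difference of two masked outputs (mod 2^32). In general this is NOT
   b - a, because XOR with the mask does not commute with subtraction. */
static uint32_t sub_pair(uint32_t masked_a, uint32_t masked_b)
{
    return masked_b - masked_a;
}

/* Example with a = 1, b = 2, mask = 3:
   masked values are a ^ 3 = 2 and b ^ 3 = 1.
   xor_pair(2, 1) = 3, which equals a ^ b.
   sub_pair(2, 1) = 0xffffffff, whereas b - a = 1. */
```

The XOR of outputs is mask-free but no longer satisfies a linear relation mod 2^64, and the difference of outputs is what the increment-cancelling step needs but is corrupted by the mask.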

(If it's still easy, what are my constants? Want to share actual code that breaks it?)

Case 4: Any Harder?

So, now SEED, INC, XOR, and MULT are all unknown, and the output is

0xc325ad70, 0x5ac2c779, 0xafa3561b, 0xa39f9107, 0x256264b5, 0x8d07c2e8, 0x55d53f9e, 0xb090eacc, 0x8b5a28a9, 0xa5e2a296, 0xc0650347, 0x0718efdb, 0x66c331c5, 0xd00236cf, 0x22118dc5, 0x4f9d67d0, 0xa6793bfe, 0x00ad774d, 0xd8337c8f, 0x49aab5b3, 0x96e419c8, 0xd6a9385b, 0x47108063, 0xde06326f, 0x2d4cd28c, 0xb0f97be8, 0x494f5df1, 0x8d53de30, 0x0eee9cbc, 0x9ea6beb1, 0xbefa03b6, 0x5a3d1fea,

How hard is this to break? (If it's still easy, what are my constants? Want to share actual code that breaks it?)

Previous StackExchange Questions

I'm aware of previous discussion here, here, and here, but none of these answers match this question; they address how weak a particular kind of LCG is (often starting with an easily broken example), rather than the brittleness of the techniques used.

Update: Question Clarified

(Some users wanted to close my original question as “Too broad”, so I clarified it, resolving the issue.)

So, to be clear, my question is:

I know that Case 1 is easy, and Case 2 has published techniques for solving it. But, my questions are:

Are there any efficient known techniques for cracking Case 3 or 4? If so, what are they?

Is there any implementation code out there for cracking Case 2?

I hope that this is straightforward enough to be answerable, or for people to say “I've done lots of work in cryptography and I don't know of an answer”.