Some time ago I had a crazy/funny idea for a local privilege escalation: run a privilege-granting operation in an infinite loop and wait for a random bit flip in the CPU/RAM that would make a 'can this user do this' check return 'true' instead of 'false'. Is this theoretically possible? Yes. And practically? Almost impossible, because a bit flip is unlikely, and a bit flip in just the right place is even less likely. Nevertheless, I thought the idea was quite interesting and decided to dig into the topic. This post summarizes what I've found out and mentions a few papers/posts that might be worth reading.

As a start note: most of the data I found is kinda outdated (year 2003, etc), so links to newer data are most welcome!

Is a random bit flip possible?

Yes. Actually it's more likely than I expected, and that's why ECC RAM (ECC being Error-Correcting Code) modules are so widely used in servers.

I found two cool papers with some statistics:
- "DRAM Errors in the Wild: A Large-Scale Field Study", Bianca Schroeder (University of Toronto), Eduardo Pinheiro (Google Inc.), Wolf-Dietrich Weber (Google Inc.), SIGMETRICS, 2009.
- "Soft Errors in Electronic Memory – A White Paper", Tezzaron Semiconductor, 2003/2004.

It's worth looking at both of these for detailed information, but in short: the number 3751 appears in the first paper as the average number of Correctable Errors (corrected when ECC is used; in non-ECC RAM they stay uncorrected) in a DIMM over nearly 2.5 years of constant work. That gives 4.11 CE per day (i.e. ~4 random bit flips per day that were corrected thanks to ECC). The full table is presented below:

[Table: DRAM error statistics from the first paper. CE is Correctable Errors, UE is Uncorrectable Errors.]

The second paper contains a table presenting collected failure rates for different types of memory:

[Table: failure rates for different memory types from the second paper. FIT is Failure in Time: errors per 10^9 hours of use.]

So yeah, random bit flips (actually called Soft Errors) do happen.

Also, some time ago I saw a cool case study about tracing a software error to a random bit flip: "Attack of the Cosmic Rays!" by Nelson Elhage.

Linux users might want to take a look at the [...] command's output, specifically the Memory-related sections. If you have ECC RAM it will show you the number of detected and corrected errors. Otherwise you might just see an "Error Correction Type: None" entry (thanks go to Tavis for showing me this).

What influences the odds of a Soft Error happening?

Actually quite a few things (in random order; source: the above papers and wiki pages):
- Temperature
- Alpha particles
- Cosmic rays
- Lower voltage
- Higher speeds
- Construction of the die
- and others... (see the papers for details)

A random bit flip being a security problem? Surely you're joking.

Actually I'm not.

Let's start with the Gameboy Color Boot ROM post, where the author described how he bypassed the anti-ROM-dump mechanism by introducing random bit flips in the CPU. It's a fascinating read!

The second item I would like to point out here is a paper: "Using Memory Errors to Attack a Virtual Machine", Sudhakar Govindavajhala and Andrew W. Appel, Princeton University, 2003.

The paper is about Java and .NET VMs and describes how to create a memory layout in which most random bit flips would cause a Write-What-Where condition, which is exploitable in a straightforward way and leads to code execution.

They also describe how they tested the idea: using a 50W spotlight to heat up the memory (in short: it worked and took about 1 minute of heating, though some nasty system crashes also appeared):

Huh, using a spotlight for hacking, now that's cool!

So, an interesting Sci-Fi idea (well, maybe not so "-Fi" after all) would be a "hacking gun" that, when pointed at a CPU/RAM, would flip just the right bit ;)

Did you try to flip a bit?

Actually I'm still trying. I've modified OSAmber (a pet bootloader+minimal kernel of mine) to scan memory for any bit flips, but no luck so far (even though I've heated the RAM quite a lot a few times).

I'll update this post if I get anywhere with this experiment. But for now, a screen shot will have to do:

And that's it for now. Guess the next RAM I'll buy will be ECC. Cheers ;)

P.S.
Looks like this post was put on reddit (under ReverseEngineering) - the comments on reddit are often worth checking out :)

I never got to update the results of the experiments, so I'll do that now:

I've been testing for a few days, changing the way the detection works from time to time:
- At first the memory was all zeroed and I tried to detect any bits that flipped to 1.
- But I decided that might not work, so I changed the content of each memory cell (where a cell for me was a 32-bit block) to be equal to a simple hash of its address (this way the content differed across the memory), and checked whether the values changed.
- In both cases I scanned 2GB of memory, taking care to disable caching by the CPU...
- and I heated the RAM using a powerful flashlight (it got really hot, though I don't have exact measurements).
- In both cases, after running for a few days non-stop, I didn't detect any bit flips.

So no luck with my flipping. I guess it might be fun to redo the experiment and let it run for a longer time. And make it more scientific by actually noting the temperature.

Random note: I had to restart the experiment after a few hours since I forgot to enable the A20 line in OSAmber. Oops.
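For the curious, the second detection scheme (fill every 32-bit cell with a hash of its own address, then re-scan and compare) can be sketched in a few lines. This is only a simulation of the detection logic, not the OSAmber code: the real experiment ran on bare metal with CPU caching disabled, which Python cannot do. The hash function, the function names, and the tiny 4 KiB "memory" are all assumptions made for illustration; the post does not say which hash was used.

```python
from array import array

WORD_BYTES = 4          # the post treats a 32-bit block as one memory cell
MASK32 = 0xFFFFFFFF

def address_hash(addr):
    """Hypothetical 32-bit hash of an address (the post does not name its hash)."""
    x = (addr * 2654435761) & MASK32  # multiplicative mixing step
    return x ^ (x >> 16)              # fold the high bits into the low bits

def fill(memory, base=0):
    """Write the expected pattern: each word holds the hash of its own address."""
    for i in range(len(memory)):
        memory[i] = address_hash(base + i * WORD_BYTES)

def scan(memory, base=0):
    """Return (address, bit_index) for every bit that differs from the pattern."""
    flips = []
    for i, value in enumerate(memory):
        addr = base + i * WORD_BYTES
        diff = value ^ address_hash(addr)   # set bits mark flipped positions
        bit = 0
        while diff:
            if diff & 1:
                flips.append((addr, bit))
            diff >>= 1
            bit += 1
    return flips

# Simulated 4 KiB region; type code 'L' holds at least 32 bits per item.
memory = array('L', [0] * 1024)
fill(memory)
clean = scan(memory)       # nothing has changed yet

memory[10] ^= 1 << 7       # inject one artificial bit flip for the demo
found = scan(memory)
print(clean, found)        # prints: [] [(40, 7)]
```

Compared with the all-zero pattern, an address-derived pattern can catch flips in both directions (1 to 0 as well as 0 to 1), since roughly half the bits start out set.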
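The numbers quoted from the papers are easy to sanity-check. The sketch below redoes the 3751-errors-in-2.5-years arithmetic, converts between a yearly error count and FIT (taking FIT as failures per 10^9 device-hours), and gives a rough feel for why waiting for a flip in one specific bit is hopeless. The function names are made up for this example, the 1 GiB DIMM size is a hypothetical value, and the waiting-time estimate assumes flips are independent and spread uniformly over all bits, which real hardware does not guarantee.

```python
HOURS_PER_YEAR = 365 * 24
FIT_HOURS = 1e9   # FIT = failures per 10^9 device-hours

def errors_per_day(total_errors, years):
    """Average daily error count over an observation period given in years."""
    return total_errors / (years * 365)

def fit_to_errors_per_year(fit):
    """Expected errors per device per year for a given FIT rate."""
    return fit * HOURS_PER_YEAR / FIT_HOURS

def errors_per_year_to_fit(errors_per_year):
    """Inverse conversion: yearly error count to a FIT rate."""
    return errors_per_year * FIT_HOURS / HOURS_PER_YEAR

def expected_days_to_hit_bit(flips_per_day, total_bits):
    """Mean wait until one specific bit flips (assumes uniform, independent flips)."""
    return total_bits / flips_per_day

ce_per_day = errors_per_day(3751, 2.5)
print(round(ce_per_day, 2))   # 4.11

dimm_bits = 8 * 2**30         # hypothetical 1 GiB DIMM, in bits
print(expected_days_to_hit_bit(ce_per_day, dimm_bits))
```

Under these assumptions the expected wait for one particular bit comes out at roughly two billion days, which matches the "almost impossible in practice" verdict from the beginning of the post.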