SSD Reliability: Is Your Data Really Safe?

Back in 2008, Intel made a case to us about storage bottlenecking its Nehalem architecture. We were at IDF in San Francisco, the company was introducing its first solid-state drives, and its representatives stood on stage, describing the ways in which a conventional hard drive slowed down a Core i7 processor. Three years later, we've seen over and over in benchmarks that SSDs deliver legitimate performance gains, changing the computing experience fairly dramatically.

IMFT: NAND Transition

With that said, performance isn’t everything. When it comes to your data, all of the speed in the world means little if you can't trust the device holding that important information. After all, when you read about Corsair's Force 3 recall, OCZ's firmware updates to prevent BSODs, Crucial's link power management issues, and Intel's SSD 320 that loses capacity after a power failure, all within a two-month period, you have to acknowledge that we're dealing with a technology that's simply a lot newer (and consequently less mature) than mechanical storage.

This topic is even more relevant now, in the wake of a swift shift from 3x nm NAND to flash memory manufactured at 25 nm. We've talked to some very bright minds in solid-state drive design, and the theme is consistent. It's more difficult to overcome the challenges presented by flash manufactured at 25 nm than it was at 34 nm. But today's buyers should still expect better performance and reliability compared to previous-generation products. Succinctly, concerns about the lower number of program/erase cycles inherent to NAND cells created using smaller geometry continue to be overblown.

Drive               P/E Cycles   Total Terabytes Written (JEDEC formula)   Years till Write Exhaustion (10 GB/day, WA = 1.75)
25 nm, 80 GB SSD    3000         68.5 TBW                                  18.7 years
25 nm, 160 GB SSD   3000         137.1 TBW                                 37.5 years
34 nm, 80 GB SSD    5000         114.2 TBW                                 31.3 years
34 nm, 160 GB SSD   5000         228.5 TBW                                 62.6 years

You shouldn’t have to worry about the number of P/E cycles that your SSD can sustain. The previous generation of consumer-oriented SSDs used 3x nm MLC NAND generally rated for 5000 cycles. In other words, you could write to and then erase data 5000 times before the NAND cells started losing their ability to retain data. On an 80 GB drive, that translated into writing 114 TB before conceivably starting to experience the effects of write exhaustion. Considering that the average desktop user writes, at most, 10 GB a day, it would take about 31 years to completely wear the drive out. With 25 nm NAND, this figure drops down to 18 years. Of course, we're oversimplifying a complex calculation. Issues like write amplification, compression, and garbage collection can affect those estimates. But overall, there is no reason you should have to monitor write endurance like some sort of doomsday clock on your desktop.
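The back-of-the-envelope math above can be sketched in a few lines. This is our reconstruction, not the exact JEDEC method: the factor of two in the denominator is an assumption we add so the output reproduces the table's figures (capacity × cycles ÷ write amplification alone gives numbers twice as large), and the function name and parameters are our own.

```python
# Rough SSD write-endurance estimate reproducing the table's figures.
# Assumption: total bytes written before exhaustion is
#   TBW = (capacity * P/E cycles) / (write amplification * 2),
# where the extra factor of 2 is our guess at a safety margin baked
# into the table; the article does not spell the formula out.

def write_endurance(capacity_gb, pe_cycles, wa=1.75, gb_per_day=10):
    tbw = capacity_gb * pe_cycles / (wa * 2) / 1000  # terabytes written
    years = tbw * 1000 / gb_per_day / 365            # lifetime at the daily write rate
    return tbw, years

tbw, years = write_endurance(80, 5000)      # 34 nm, 80 GB drive
print(f"{tbw:.1f} TBW, {years:.1f} years")  # prints "114.3 TBW, 31.3 years"
```

With 25 nm NAND rated for 3000 cycles, the same 80 GB drive works out to roughly 68.6 TBW and about 18.8 years, matching the table after rounding.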

Of course, we know that SSDs still fail. All it takes is 10 minutes of flipping through customer reviews on Newegg's listings. But write-cycle exhaustion isn't the problem. Sometimes firmware is to blame. We know this because of the firmware updates vendors issue specifically targeting a documented problem. Other failures are electronic in nature. A capacitor or memory IC might go out, taking the SSD with it. Even so, we'd expect fewer issues with SSDs than hard drives, which have moving parts that invariably wear out over time. Does solid-state drives' lack of moving parts translate into higher reliability? Is the data on your SSD any safer than it would be on a hard drive?

With that question weighing on an increasing number of enthusiasts' and IT professionals' minds, we set out to investigate SSD reliability and sort the facts from the fiction.