Starting with the basics…hard disk drives that are not flash based operate by having small “heads” move back and forth across spinning platters (usually made of glass or aluminum with a magnetic coating these days). Think of it like a record player arm where the needle is the head.

The head can read or write, depending on its instructions. Each hard drive has multiple platters, with one head per platter surface (so typically two heads per platter). Good so far?

OK, so the drive writes data and reads data…but it’s reading incredibly small magnetic data from the platters. You know how sometimes the magnetic cards in your wallet get erased from being around magnets? Those bits are HUGE compared to the ones and zeros on a hard drive platter.

The bits in the drive are not stored in your wallet, of course; they’re inside a metal case which is inside of your computer…but occasionally strange things will happen, like maybe something bumped the disk while the head was reading or writing, and it creates a gouge on the surface of the disk (remember, the platters are spinning at 5,400RPM at least!), or maybe dust got into the drive and is blocking the sector, or maybe even cosmic rays have flipped a bit on the platter (really!).

Whatever it is, something stops the head from being able to read the data on the disk. This is called an Unrecoverable Read Error (or URE for short).

The likelihood of encountering a URE depends on things like the build quality of the head and the rotational speed, but mostly on how densely the bits are packed onto the platter (remember how your magnetic stripe has big bits? Bigger bits are harder to accidentally flip).

Manufacturers often figure out how likely a URE is for a particular drive, and they usually put it in the documentation for that drive. You can find it if you go to Pricewatch and pick a hard drive at random, then look up the product manual for that part number. Here’s an example of the Western Digital Green Line, or maybe you’d prefer a Seagate Barracuda.

Anyway, get the product manual, and look in the specs for something like “Non-recoverable read errors per bits read”, or “Non-recoverable read errors”. What this gives you is the likelihood of your encountering a URE.

Both of those examples have “1 per 10^14 bits read”. If we use the handy-dandy Google calculator, you can see that it really means one URE per 11.3 terabytes read.
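If you'd rather skip the calculator, a couple of lines of Python (variable names are mine) show where the 11.3 comes from:

```python
# One URE per 10^14 bits read, straight from the spec sheet.
ure_interval_bits = 10**14

bytes_per_ure = ure_interval_bits / 8   # 1.25e13 bytes
print(bytes_per_ure / 10**12)           # 12.5  (decimal terabytes)
print(bytes_per_ure / 2**40)            # ~11.37 (binary tebibytes, the "11.3" above)
```

Note that 11.3 is the binary (TiB) figure; in decimal terabytes it works out to 12.5, but the difference doesn't change the story.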

So, putting all of this together…if you have a 3TB drive and you fill it to capacity, you could probably read from it over 3 times completely before you’d encounter a URE, and lose the data that was held by that particular sector.

That’s kind of unsettling, right? 3 full reads on a 3TB drive, then a high likelihood of going kaput on the 4th read-through?

Fortunately, we use RAID levels which can protect us.

Let's start with RAID-1, which is mirrored disks. We are reading along happily when suddenly we get a URE on one of the drives! That would be sad, except that on the other drive we have an exact copy of the data as it was originally written! The RAID controller reads the data from the other drive, re-writes it on the drive that had the URE, and we continue merrily along.

Suppose we lose a RAID-1 drive. That leaves us with a good copy, but assuming the array was full, when we put a replacement drive into the array, we’ve got to re-read all of the data on the good drive. Keeping in mind that we’re likely to experience a failure after 11.3TB, what is the statistical probability that we’ll experience a URE reading 3TB? Around 1 in 4.
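Here's a quick sketch of that math in Python. It treats every bit as an independent 1-in-10^14 chance, which is a simplification (real failures cluster, and drives remap bad sectors), and the function name is my own:

```python
def p_ure(terabytes, rate=1e-14):
    """Chance of at least one URE while reading `terabytes` of data,
    assuming each bit independently fails at `rate`."""
    bits = terabytes * 10**12 * 8   # decimal terabytes -> bits
    return 1 - (1 - rate) ** bits

print(p_ure(3))   # ~0.21 -- in the ballpark of the "around 1 in 4" above
```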

Now, let's move on to RAID-5, which needs at least 3 drives. Unlike the exact copy in RAID-1, RAID-5 stores parity information, so if any individual piece of data is lost, it isn't recovered by copying; it's recalculated from the remaining data and parity bits.

So when we encounter a URE during normal RAID operations, the array calculates what the missing data was, the data is re-written so we’ll have it next time, and the array carries on business as usual. But when a drive dies, we have to replace it, and that’s when things get hairy.

In order to rebuild the array, the new drive needs to be populated, and to do that, the entire contents of the remaining drives need to be read so the missing data can be recalculated from the parity. Assuming we have a RAID-5 array with three 3TB disks, we're now reading 6 terabytes of information. What is the statistical likelihood of encountering a URE?

1 in 2. A coin-flip.

Have a RAID-5 array with four 3TB disks? That's 9TB to re-read, putting the odds of failure at roughly 4 in 5. You can see how quickly this goes downhill.
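The scaling is easy to sketch. This is a crude linear estimate against the ~11.3TB URE interval (function name is mine, and real probabilities obviously cap at 1, which is where the ratio breaks down for bigger arrays):

```python
def rebuild_read_tb(n_disks, tb_per_disk):
    """TB that must be read to rebuild RAID-5 after one disk dies:
    the full contents of the n-1 surviving disks."""
    return (n_disks - 1) * tb_per_disk

for n in (3, 4, 5):
    tb = rebuild_read_tb(n, 3)
    # disks, TB read during rebuild, crude odds of hitting a URE
    print(n, tb, round(tb / 11.3, 2))
```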

Now, a lot of people see this, freak out, and say “oh my god, I’m never using RAID-5 again! RAID-5 is the devil! It’s EVIL!”, but remember what is driving the numbers…it’s the URE rate.

Check out this Hitachi Ultrastar. Its URE rate is 1 per 10^16 bits read, which is a kind-of-amazing 1.1 petabytes…so the odds of your Ultrastar-based RAID-5 array dying during a rebuild because of a URE are…very low.
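Plugging the better rate into the same independent-bit sketch as before (again a simplification, with names of my own) shows just how much the URE rate drives the outcome:

```python
def p_ure(terabytes, rate):
    """Chance of at least one URE reading `terabytes`, treating each bit
    as an independent failure at `rate` (a simplifying assumption)."""
    bits = terabytes * 10**12 * 8
    return 1 - (1 - rate) ** bits

# Rebuilding a 4x3TB RAID-5 array means reading 9TB from the survivors:
print(p_ure(9, 1e-14))   # ~0.51 with a consumer-class 1-per-10^14 drive
print(p_ure(9, 1e-16))   # well under 1% with the Ultrastar's 1-per-10^16 rate
```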

So you can’t vilify RAID-5 across the board. It’s very much a matter of what you’re using it for, the quality of the drives involved, and the capacity of the array (or really, how much data is stored on the array).

Does that help you understand? Have questions, comments, or suggestions? Please leave them below, thanks!