I have a buddy named Matt who's a digital pack-rat. He's the guy you call when you need a movie or a TV show, because chances are he's got it. Among the finer things in his collection is the entire multi-decade run of "Doctor Who," including lots of rare black-and-white Hartnell stuff. He's long since reached the point where he could put his collection on random play and die of old age before he sees a repeat.

Matt's not alone, either—we're coming to a point where everyone and their dog has at least a digital music and photo collection, and tons of folks (especially folks in the Ars reader demographic) have collections of ripped movies and TV shows on top of that. All that stuff has to reside somewhere, and to that end there's a huge array of network-attached storage devices—NAS boxes, as we in the biz say—that can keep the data safe, with redundancy and protection that you wouldn't get from storing the collection on your computer's main hard drive or on a single external disk.

I recently got my hands on a Drobo FS from Data Robotics, and I've been using it intensively for some time now. If you're interested in the Drobo, then this two-part review is perhaps the longest and most thorough look at the device you'll find anywhere. Indeed, it's more than just a review—in Part 1 I dig into Data Robotics' patent filings so I can explain how the device works. In Part 2, I'll describe how the Drobo functions in day-to-day use.

How a WHS snob ended up with a Drobo

Though my own media collection pales next to Matt's, I'm no slouch—I've got my own sprawling collection of stuff, and since late 2007 I've been a user and evangelist of Windows Home Server, my favorite Microsoft product. Windows Home Server has two killer features: first, it will painlessly and automatically back up any Windows PC on your home network; and second, you can toss a bunch of differently sized hard drives into it and it will treat them all as a single usable glob of space. Microsoft accomplished the second feat with a technology they called Drive Extender. It uses some clever tricks and a specialized service or two to elegantly create the functional illusion of a single giant hard drive out of multiple hard drives, without the server administrator having to manage RAID groups or LUNs or any of the things that high-end servers and disk arrays need to approximate the same functionality. It was great. Over the years, I seamlessly grew my WHS from two terabytes of space on four 500 GB disks up to more than five terabytes on several mixed-size disks.

I say "was great," though, because the times are changing. Version 2 of Windows Home Server, currently under development, has seen Drive Extender grow, change, and then get canceled, a victim of Microsoft's storage server line trying to be all things to all people. There have always been other options in the home storage arena besides Windows Home Server, and those continue to evolve—Synology, Linksys, Netgear, and other companies all have devices that do more or less the same things as WHS, with varying degrees of success and sophistication, and the truly brave and skilled can even roll their own solution with tools like FreeNAS. FreeNAS even includes ZFS, a filesystem designed by Sun from the ground up with lots of advanced, fancy features which offer huge advantages over other more pedestrian filesystems—things like built-in snapshots and the ability to (sort of) seamlessly grow your pooled storage as you add disks.

I met the news of Drive Extender's demise with a lot of nerd angst. My home-built WHS rig was beginning to creak with age, and I was eager to get the release version of WHS 2 installed on some shiny new kit and migrate over to it. Without Drive Extender, though, one of the two most compelling reasons to use WHS has been removed (and the other reason, backups of all my Windows-based PCs, doesn't help me because I don't have any Windows-based PCs in the house anymore). Rather than deal with installing an older server OS on brand new hardware with all the potential annoyances that brings—WHS v1, after all, is based on Windows Small Business Server 2003—it was time to look for something else.

Most of the consumer NAS boxes available didn't hold my interest. I'd lived for too long in a world where I could upgrade my hard drives any time I wanted, and most consumer boxes use some flavor of traditional RAID—meaning that upgrades are complicated and potentially destructive. That limited the choices to things with disk pooling and seamless expansion, and I gave some ZFS-based options a hard look before deciding to pass. The ease of use still isn't anywhere near what I was used to, and the process of adding capacity to an existing ZFS setup is still a little...well, hairy.

That left one option, and it was one I'd actively ridiculed in the past: Data Robotics.

Data Robotics has been around for years. Their main product is a line of direct- and network-attached storage devices called "Drobos," squat little black cubes with a reputation for being both terribly easy to use and terribly slow. Data Robotics' launch product in 2007 was a four-bay device that attached to your computer (which was preferably a Mac) via Firewire 400/800 or USB 2.0, with no way for it to act as a NAS without an expensive Ethernet add-on module. In spite of its limitations, what made the original Drobo great was that there was practically nothing to configure—you shoved disks into it and it acted like a giant external hard drive, and it used a proprietary RAID-like scheme called BeyondRAID to keep the data protected (more on that in a bit).

Much like with Windows Home Server, times have changed. Far from just the single computer-attached Firewire box, Data Robotics now has a whole passel of different Drobos, from an upgraded version of the original four-bay Drobo all the way up to much larger eight-bay offerings for small businesses that need easy-to-manage NAS systems (one box, the DroboElite, even does iSCSI, if you're into that kind of thing). As derisive as I'd been about Data Robotics' products in the past, I couldn't stop myself from looking again and again at the Drobo FS, a five-bay box with built-in gigabit Ethernet that shares data to Windows and Mac and Linux hosts alike. It was the only thing I could find that brought forward the WHS-style disk flexibility that had become my hardest requirement...plus, it was cool-looking and black and had colored lights on it.

So I bought one.

A bit about RAID

At this point, I must pause and talk about RAID, because RAID and its descendants and variants are an important part of most big storage solutions. Ars has done more than one feature about what RAID is and how it works, so I won't bore you with an in-depth recap, but I do need to set the stage here and at least touch on traditional RAID types in order to show that what a Drobo does is different.

RAID is an acronym that originally stood for Redundant Array of Inexpensive Disks, though it's been bastardized often enough as Redundant Array of Independent Disks that the second meaning has become the dominant one. The opposite of RAID is SLED, Single Large Expensive Disk. You don't see SLED used much these days as an acronym because, frankly, large disks just aren't that expensive anymore. Even in the enterprise space, RAID arrays made up of 2TB SATA disks are becoming more and more common.

RAID's purpose is twofold. First and foremost, RAID is a method to increase a system's reliability (that is, its ability to stay online and able to do useful work). Almost every RAID implementation involves making sure individual blocks of data are stored in multiple places, so that a single disk failure doesn't cost you any data. RAID's second purpose, aggregating lots of individual disks into what appear to be fewer (but bigger) disks, is a side effect of the first—you take a bunch of smaller, inexpensive disks and glom them together so that they all act like one big disk, and you end up with lots of places to put your data.

There are several different schemes (called "RAID levels") for distributing information across disks with RAID. There's RAID 0, which stripes data across disks; RAID 1, which mirrors data across disks; RAID 5, which stripes data across disks and also saves parity information as it goes in order to recover from failures; and several others which I won't go into. For a good and proper in-depth explanation of RAID, check the old Ars feature The Skinny on RAID or hit up Wikipedia's RAID page, both of which have lots of good info.
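If you want to see the difference in miniature, here's a toy Python sketch—nothing like real controller firmware, just integers standing in for chunks of data—that shows where each chunk lands under RAID 0, RAID 1, and RAID 5:

```python
def raid0(chunks, n_disks=3):
    """RAID 0: stripe chunks round-robin across disks; no redundancy."""
    disks = [[] for _ in range(n_disks)]
    for i, chunk in enumerate(chunks):
        disks[i % n_disks].append(chunk)
    return disks

def raid1(chunks, n_disks=2):
    """RAID 1: every disk holds a full copy of every chunk."""
    return [list(chunks) for _ in range(n_disks)]

def raid5(chunks, n_disks=3):
    """RAID 5: stripe chunks, writing one XOR parity chunk per stripe;
    the parity slot rotates from disk to disk. (For simplicity, this
    assumes the chunk count fills whole stripes.)"""
    disks = [[] for _ in range(n_disks)]
    data_per_stripe = n_disks - 1
    for s in range(0, len(chunks), data_per_stripe):
        stripe = chunks[s:s + data_per_stripe]
        parity = 0
        for chunk in stripe:
            parity ^= chunk
        parity_disk = (s // data_per_stripe) % n_disks  # rotate parity
        data = iter(stripe)
        for d in range(n_disks):
            disks[d].append(parity if d == parity_disk else next(data))
    return disks

print(raid0([1, 2, 3, 4, 5, 6]))  # [[1, 4], [2, 5], [3, 6]]
print(raid1([1, 2, 3]))           # [[1, 2, 3], [1, 2, 3]]
print(raid5([1, 2, 3, 4]))        # [[3, 3], [1, 7], [2, 4]]
```

Note how RAID 0 gives you all the space and none of the safety, RAID 1 gives you all the safety and half the space, and RAID 5 splits the difference by spending one disk's worth of space on parity.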

File Versus Block — FIGHT!

There are two major drawbacks to traditional RAID that make it a poor substitute for WHS's disk pooling where home users' typical needs are concerned. First, once a RAID array is created out of disks, it can't be made larger or smaller unless you destroy it and start over. In other words, to resize a RAID array, you must delete every bit of data on it. That's bad. Second, RAID arrays should be made out of disks that are the same size, and if they're not, all disks are treated as equal in size to the smallest disk, meaning a RAID 5 array made up of three 250 GB disks and one 1 TB disk will look and act like it's made out of four 250 GB disks.

Drobo gets around this with a proprietary technology it calls "BeyondRAID." What makes BeyondRAID different from traditional RAID is that it's not really RAID at all. Traditional RAID occurs at the block level, while BeyondRAID is a file-level imitation of several different block-level RAID methods, used in parallel in order to fill up all of the available space on different-sized disks. To understand this in context, we have to make our second (and last, I promise!) digression, and talk about file-level versus block-level.

Things that happen at the file level are the things we as users are often most concerned about. This is where the underlying structures of the disk are organized by the computer's filesystem (NTFS if we're on Windows, HFS+ if we're on OS X, and ext3 or whatever crazy thing tickles your fancy if you're on Linux, though if you're a Linux user you've probably skipped out on this article already and gone back to tinkering with your home-built ZFS server) into objects that we fiddle with—files, directories, and their metadata (things like permissions, dates, and timestamps). Filesystems make files and directories out of things called clusters, which are organizational units of a fixed size (for example, they're all four kilobytes by default on NTFS), and as far as your operating system is concerned, clusters are the smallest thing it understands. Generally, all of the things that happen at the file level, including the cluster building-blocks, are logical constructs that are defined and controlled by the software that you've installed on your computer.

Everything has to be made of something, though, and it's not just turtles all the way down—clusters are made of blocks, and blocks are physical constructs; things of hardware. A cluster is something that the operating system's filesystem driver gets to define, but blocks are things that are burned into a hard drive at the factory. On just about every modern-day consumer hard drive, blocks are 512-byte chunks. In the old days, block size was something a user could fiddle with by performing a low-level format of a hard disk, but today it's almost unheard of for an end-user to have to do such a thing (mostly because modern hard drives are precise, reliable machines whose heads don't drift the way old drives' heads did, so there's no need to re-align the blocks to the heads over the life of a disk). The operating system up above is concerning itself with writing files and directories and things into clusters, but all the disk cares about is blocks. Blocks, blocks, blocks.
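To put some numbers on the cluster/block relationship—using the NTFS default 4 KiB cluster and the standard 512-byte block mentioned above—here's a trivial sketch:

```python
CLUSTER = 4 * 1024  # NTFS default cluster size: 4 KiB
BLOCK = 512         # typical consumer hard-drive block size

def clusters_for(file_size_bytes):
    """Files occupy whole clusters, so even a 1-byte file takes one;
    this is just ceiling division."""
    return -(-file_size_bytes // CLUSTER)

def blocks_per_cluster():
    """How many physical blocks the drive touches per logical cluster."""
    return CLUSTER // BLOCK

print(clusters_for(1))       # 1 (a 1-byte file still burns a whole cluster)
print(clusters_for(10_000))  # 3 (10,000 bytes needs three 4 KiB clusters)
print(blocks_per_cluster())  # 8
```

That eight-to-one ratio is the abstraction gap in a nutshell: every time the filesystem touches one cluster, the drive underneath is shuffling eight blocks.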

There are layers of communication and abstraction designed to translate operations that happen at the file level into operations at the block level, so that when you save "pictures_of_my_dog.pptx" the operating system knows to make the right changes to the clusters that make up pictures_of_my_dog.pptx, and the hard drive knows to twiddle the right bits in the blocks that make up those clusters. (If this sounds to some of you a bit like the seven-layer burrito OSI model, then gold star for you, because it's the same concept—just like with the OSI model, you're dealing with an arrangement of things that are all made up of the same underlying bits but that increase in complexity as you go further up the stack.)

This all ties back to RAID because RAID deals with things happening at the block level. All that striping and mirroring and parity calculation and stuff is done with blocks, not with files.
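Here's what that block-level parity trick actually buys you, in a toy Python sketch: build a parity block by XORing three data blocks together, "lose" one, and rebuild it from the survivors—the same arithmetic a RAID 5 controller performs, minus all the hardware:

```python
import os

# Three 512-byte data blocks, filled with random bytes to stand in
# for real file data.
blocks = [os.urandom(512) for _ in range(3)]

# The parity block: byte-by-byte XOR of all the data blocks.
parity = bytes(a ^ b ^ c for a, b, c in zip(*blocks))

# Pretend the disk holding block 1 just died.
lost = blocks[1]
survivors = [blocks[0], blocks[2], parity]

# XORing the surviving data blocks with the parity block
# reconstructs the missing one exactly.
rebuilt = bytes(a ^ b ^ c for a, b, c in zip(*survivors))

assert rebuilt == lost
print("rebuilt block matches the lost one")
```

The controller doesn't know or care that those bytes were half of pictures_of_my_dog.pptx—parity math works on blocks, and only blocks.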