As a baby nerd, sitting in the back room of my parents' house and dialing into BBSs in the late '80s and early '90s, I dreamed of "big iron." Like Kevin Flynn, I tried to imagine what happened to my data on the other end of the phone line—where were the modem's bleeps and hisses and scratches headed? What did that place look like? I imagined huge, frigid halls of raised floors under an actinic glare, populated by endless rows of towering, Cyclopean racks of complex equipment, studded with countless blinkenlights that sparkled off into the distance. More than any other experience, it was this desire to know what's behind the curtain, to see and learn about the systems that provide service to other systems, that pushed me toward a career in IT.

Even though I'm now on the other side of that curtain regularly, it's a fascination I carry with me still. Walking into a new data center for the first time, I immediately want to know what every rack is, what each new piece of equipment does, and how it all ties together. And, with the typical geek reverence for the biggest and loudest—the same reverence that makes shows like "WORLD'S MOST GIGANTIC CONSTRUCTION EQUIPMENT" so popular—I really like learning about storage arrays, since in a big data center those tend to be some of the largest and most imposing machines. I've upgraded my share of hard drives in home computers, and the sight of racks and racks of those same spinning platters (and, increasingly, SSDs) is still an awesome thing to behold, even after all these years.

Recently, Ars published a blurb about Apple's purchase of 12 petabytes of Isilon storage, reportedly to hold iTunes movie content. 12PB is a pretty big chunk by anyone's accounting. Assuming we're figuring in powers of two, that's 12,288TB, which is in turn more than twelve and a half million gigabytes. In terms of the amount of storage folks typically have lying around the house, this is a lot, and if I were still that baby nerd living at home, I'd be goggle-eyed at the thought of that much spinning disk, trying to imagine what it looks like and how it works.
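
For anyone who wants to check that arithmetic, here is a quick sketch of the powers-of-two conversion. The variable names are mine, and the only assumption is the binary convention of 1,024 units per step mentioned above.

```python
# Power-of-two unit conversion for the 12PB figure quoted above.
# Assumption: binary units, so each step up or down is a factor of 1024.
PETABYTES = 12

terabytes = PETABYTES * 1024   # 1PB = 1024TB
gigabytes = terabytes * 1024   # 1TB = 1024GB

print(f"{PETABYTES}PB = {terabytes:,}TB = {gigabytes:,}GB")
# prints: 12PB = 12,288TB = 12,582,912GB
```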

Fortunately, I don't have to wonder, and neither do you. This article was written in the spirit of that geeky desire to know about the big stuff. We're going to run through an in-depth examination of what Isilon storage is, what makes it a good fit for keeping a few petabytes of movies, and why it works the way it works.

But before we start...

I need to get an important disclaimer out of the way up-front—I work for EMC, which purchased Isilon around the start of 2011. This puts me in both a very good and very bad position to write this article—good because I've got some familiarity with the subject matter, both generally and specifically; bad because I'm employed by the company that now sells Isilon gear.

A sales pitch, though, is not at all my intention. This article is not me writing as an EMC employee, but rather me writing just as another storage geek and longtime member of the Ars forum community, giving my best shot at an informative, technical write-up of Isilon. I do not intend to get into any kind of competitive comparison between Isilon and other storage, nor do I intend to sling any FUD. I've tried to keep the image of baby nerd-me in my head while pulling the piece together, and the overriding question I've asked with every paragraph has been, "If I were still sitting there, would I want to know this?"

I've also done my best to strip the marketing out and focus strictly on the technical, on the why and how without any of the rah-rah-rah cheerleading. And, to preemptively answer two questions that I'm sure will arise: no, EMC and Isilon are not paying me to write this article (unless you count drawing my normal salary as payment), and no, EMC and Isilon have had no editorial input on the article's content. This is just me, writing as me, and (for what it's worth) Ars is the only one paying me to do this.

Meet Isilon

Isilon as a company has been around for about a decade and has gained a lot of traction with customers who have large numbers of big files that they need to keep up with. Traditionally, Isilon has done a lot of business with Hollywood studios—one of their proudest claims to fame is that Isilon storage was used to hold some of the tremendous amount of data generated during the production of James Cameron's Avatar. Indeed, Isilon storage is well-suited as a repository for huge files because of how the system keeps files organized.

I keep saying "files" because files are what Isilon storage is centered around storing and serving, which makes it a type of NAS (that's Network-Attached Storage). Further, Isilon is what's commonly described as a "scale-out" NAS: rather than a single monolithic cabinet to which you add disks as you grow in capacity, Isilon takes a node-based approach, so every node you add brings processors and cache along with capacity.
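
To make the scale-out idea a little more concrete, here is a toy model in Python. It is only a sketch: the `Node` class, the `cluster_totals` helper, and every figure in it are hypothetical and are not taken from any real Isilon product. The point is simply that growing the cluster by one node grows compute and cache in step with capacity.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One hypothetical scale-out node; all figures are made up."""
    capacity_tb: int
    cpu_cores: int
    cache_gb: int


def cluster_totals(nodes):
    """Sum capacity, CPU cores, and cache across every node in the cluster."""
    return (
        sum(n.capacity_tb for n in nodes),
        sum(n.cpu_cores for n in nodes),
        sum(n.cache_gb for n in nodes),
    )


# Start with three identical nodes, then grow the cluster by one.
cluster = [Node(capacity_tb=36, cpu_cores=8, cache_gb=32) for _ in range(3)]
before = cluster_totals(cluster)

cluster.append(Node(capacity_tb=36, cpu_cores=8, cache_gb=32))
after = cluster_totals(cluster)

print("before:", before)  # (108, 24, 96)
print("after: ", after)   # (144, 32, 128)
```

In a monolithic design, the equivalent step would add only the capacity figure while the controller's processors and cache stayed fixed.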

We'll get deeper into that in a bit. First, we need to pause and talk about what exactly NAS means, and how it differs from that other big storage acronym, SAN.

SANs and block-level storage

"SAN" stands for "Storage Area Network," and traditionally refers to a big, standalone disk array which makes pieces of itself available to servers—"hosts," in SAN terminology—as if those pieces were directly attached to the hosts, using block-level protocols like Fibre Channel or iSCSI. "Block-level protocols" mean exactly that, too—when a host is attached to a SAN via Fibre Channel, for example, the traffic that it passes back and forth over the FC connection consists of the same basic commands that the host would be using to manage a local hard disk drive, because to the host, that's what a SAN-attached chunk of storage is. Indeed, if you're on a Windows server with some SAN storage allocated to it and you look in the Disk Management MMC snap-in, you'll see that SAN storage showing up as one or more volumes to which you can assign drive letters, just as if they were regular hard disk drives.

The connection from a host to a SAN is made through a specialized interface card called an HBA, or Host Bus Adapter, so called because it provides a direct high-speed link between the host's bus and an external source (older SCSI HBAs actually link the host's SCSI bus with the external source; an HBA today plugs into the host's PCI or PCIe bus). HBAs take many forms, but the most common type today is a Fibre Channel HBA, for connecting hosts to Fibre Channel SANs, usually (but not always) over fiber optic connections. HBAs are most often connected to Fibre Channel switches, which work mostly like Ethernet switches and can connect many hosts to one or more FC SANs.
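
If "block-level" still feels abstract, the following Python sketch may help. It is a toy, in-memory stand-in for a disk, not real Fibre Channel or iSCSI code, and the `ToyBlockDevice` class, its method names, and the 512-byte block size are all assumptions made for illustration. What it shows is the shape of the conversation: the host asks for fixed-size blocks by number, and the storage has no idea what files, if any, those blocks belong to.

```python
BLOCK_SIZE = 512  # bytes per block; an assumed, commonly seen sector size


class ToyBlockDevice:
    """In-memory stand-in for a chunk of block storage (hypothetical)."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self._store = bytearray(num_blocks * BLOCK_SIZE)

    def _check(self, block_number):
        # Reject block numbers that fall outside the device.
        if not 0 <= block_number < self.num_blocks:
            raise IndexError(f"block {block_number} out of range")

    def read_block(self, block_number):
        """Return exactly one block's worth of bytes."""
        self._check(block_number)
        start = block_number * BLOCK_SIZE
        return bytes(self._store[start:start + BLOCK_SIZE])

    def write_block(self, block_number, data):
        """Overwrite exactly one block; partial blocks are rejected."""
        self._check(block_number)
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block long")
        start = block_number * BLOCK_SIZE
        self._store[start:start + BLOCK_SIZE] = data


# The "host" side: no filenames or directories, only numbered blocks.
disk = ToyBlockDevice(num_blocks=8)
disk.write_block(3, b"A" * BLOCK_SIZE)

print(disk.read_block(3)[:4])  # b'AAAA'
print(disk.read_block(0)[:4])  # b'\x00\x00\x00\x00' (never written)
```

Anything resembling a file, a folder, or a permission has to be layered on top by the host's own filesystem, which is why SAN-attached storage looks to the host like a blank local disk.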