Why we care about file systems





Computer platform advocacy can bubble up in the strangest places. In a recent interview at a conference in Australia, Linux creator Linus Torvalds got the Macintosh community in an uproar when he described Mac OS X's file system as "complete and utter crap, which is scary."

What did he mean? What is a "file system" anyway, and why would we care why one is better than another? At first glance, it might seem that file systems are boring technical widgetry that would never impact our lives directly, but in fact, the humble file system has a huge influence on how we use and interact with computers.

This article will start off by defining what a file system is and what it does. Then we'll take a look back at how various file systems evolved, operating system by operating system, and why new ones were introduced. Finally, we'll take a brief glance into our temporal vortex and see how file systems might change in the future.

What is a file system?

Briefly put, a file system is a clearly defined method that the computer's operating system uses to store, catalog, and retrieve files. Files are central to everything we use a computer for: all applications, images, movies, and documents are files, and they all need to be stored somewhere. For most computers, this place is the hard disk drive, but files can exist on all sorts of media: flash drives, CDs and DVDs, or even tape backup systems.

A file system needs to keep track of not only the bits that make up the file itself and where they are logically placed on the hard drive, but also information about the file. The most important piece of information is the file's name: without it, the humans would find it nearly impossible to locate the file again. The file system also has to know how to organize files in a hierarchy, again for the benefit of those pesky humans. This hierarchy is usually built out of directories (also called folders). The last thing the file system has to worry about is metadata.
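To make that bookkeeping concrete, here is a minimal sketch of a toy, in-memory "file system" in Python. Everything in it (the `ToyFileSystem` class, the four-byte `BLOCK_SIZE`, the nested-dictionary directory tree) is invented for illustration and does not reflect how any real file system lays out its data; it only shows the three jobs described above: storing the bits in blocks, cataloging them by name in a hierarchy, and retrieving them again.

```python
# Toy model only: a real file system keeps these structures on disk,
# not in Python objects. All names and sizes here are hypothetical.
BLOCK_SIZE = 4  # bytes per block, deliberately tiny


class ToyFileSystem:
    def __init__(self, num_blocks):
        self.blocks = [None] * num_blocks  # the "disk": fixed-size slots
        # Directory tree: a name maps to a dict (a directory)
        # or to a list of block numbers (a file).
        self.root = {}

    def _walk(self, path, create):
        """Return (parent directory, file name) for a slash-separated path."""
        parts = path.strip("/").split("/")
        directory = self.root
        for part in parts[:-1]:
            if create:
                directory = directory.setdefault(part, {})
            else:
                directory = directory[part]
        return directory, parts[-1]

    def write(self, path, data):
        """Store data in free blocks and catalog it under its name.

        Simplification: overwriting a file does not free its old blocks.
        """
        chunks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
        free = [i for i, block in enumerate(self.blocks) if block is None]
        if len(free) < len(chunks):
            raise OSError("disk full")
        locations = free[:len(chunks)]
        for block_no, chunk in zip(locations, chunks):
            self.blocks[block_no] = chunk
        directory, name = self._walk(path, create=True)
        directory[name] = locations  # name -> where the bits live

    def read(self, path):
        """Look the name up in the hierarchy and reassemble the blocks."""
        directory, name = self._walk(path, create=False)
        return b"".join(self.blocks[n] for n in directory[name])


fs = ToyFileSystem(num_blocks=8)
fs.write("/docs/note.txt", b"hello world")
print(fs.root)                    # the catalog: names and block locations
print(fs.read("/docs/note.txt"))  # the original bytes, reassembled
```

Note that the human only ever deals with the name and the path; the list of block numbers is the computer's business.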

Metadata

Metadata literally means "data about data" and that's exactly what it is. While the term may sound relatively recent and modern, all file systems right from the very beginning had to store at least some metadata along with the file and file name. One important bit of metadata is the file's modification date: not always necessary for the computer, but again important for those humans to know so that they can be sure they are working on the latest version of a file. A bit of metadata that is unimportant to people, but crucial to the computer, is the exact physical location (or locations) of the file on the storage device.

Other examples of metadata include attributes, such as hidden or read-only, that the operating system uses to decide how to display the file and who gets to modify it. Multiuser operating systems store file permissions as metadata. Modern file systems go absolutely nuts with metadata, adding all sorts of crazy attributes that can be tailored for individual types of files: artist and album names for music files, or tags for photos that make them easier to sort later.
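For the curious, most of this basic metadata is easy to inspect. The short Python sketch below uses the standard library's `os.stat` call to ask the operating system what the file system knows about a file. The `describe` helper and the dictionary keys it returns are made up for this example, and the "read-only" test (checking only the owner's write bit) is a simplification.

```python
import os
import stat
import tempfile
from datetime import datetime


def describe(path):
    """Collect a few pieces of metadata the file system stores for path."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified": datetime.fromtimestamp(info.st_mtime),
        # Permission bits rendered in the familiar "-rw-r--r--" style.
        "permissions": stat.filemode(info.st_mode),
        # Simplified read-only check: is the owner's write bit cleared?
        "read_only": not (info.st_mode & stat.S_IWUSR),
    }


# Create a throwaway file so the example runs anywhere.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as handle:
    handle.write(b"hello")
    sample_path = handle.name

for key, value in describe(sample_path).items():
    print(f"{key}: {value}")

os.remove(sample_path)
```

The physical location of the file's blocks is deliberately absent here: that is the kind of metadata the file system keeps to itself.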

Advanced file system features

As operating systems have matured, more and more features have been added to their file systems. More metadata options are one such improvement, but there have been others, such as the ability to index files for faster searches, new storage designs that reduce file fragmentation, and more robust error-correction abilities. One of the biggest advances in file systems has been the addition of journaling, which keeps a log of changes that the computer is about to make to each file. This means that if the computer crashes or the power goes out halfway through the file operation, it will be able to check the log and either finish or abandon the operation quickly without corrupting the file. Journaling also makes restarting the computer after a crash much faster, as the operating system doesn't have to scan the entire file system to find out if anything is out of sync.
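The journaling idea can be sketched in a few lines of Python. This is a toy working at the level of whole files, not a description of how any real journaling file system is implemented (real ones log changes to on-disk structures, and many journal only metadata). The file name `journal.log`, the function names, and the `crash_before_apply` flag used to simulate a power cut are all assumptions made for the example.

```python
import json
import os

JOURNAL = "journal.log"  # hypothetical location of the log


def _apply(path, text):
    """Write the new contents to a temp file, then swap it into place."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(text)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def journaled_write(path, text, journal=JOURNAL, crash_before_apply=False):
    # Step 1: record what we are about to do, and force it to disk.
    with open(journal, "w") as log:
        json.dump({"path": path, "text": text}, log)
        log.flush()
        os.fsync(log.fileno())
    if crash_before_apply:
        return  # simulate the power going out right here
    # Step 2: perform the change. Step 3: clear the log entry.
    _apply(path, text)
    os.remove(journal)


def recover(journal=JOURNAL):
    """Run at startup: finish or abandon whatever the log says was pending."""
    if not os.path.exists(journal):
        return False  # nothing was in flight, no scan needed
    try:
        with open(journal) as log:
            entry = json.load(log)
    except ValueError:
        os.remove(journal)  # the log entry itself is incomplete: abandon
        return False
    _apply(entry["path"], entry["text"])  # the entry is complete: finish
    os.remove(journal)
    return True
```

The point of the design is in `recover`: after a crash, the computer only has to look at one small log rather than inspect every file, which is why a journaled system comes back up so quickly.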