Ceph: Block Storage for the 21st Century

By: Steven J. Vaughan-Nichols

Storage used to be so simple. You had a Single Large Expensive Drive (SLED) and you stored all your data on it.

Then, we moved on to redundant arrays of inexpensive disks (RAID), and things got more complex. But, it was still pretty easy. Unless you were using Small Computer System Interface (SCSI). I still get the heebie-jeebies thinking about chaining SCSI drives together.

But, even as hard drives were replaced by solid-state drives (SSDs), physical drives couldn’t keep up with modern server data needs, never mind those of clouds and containers. This is where Ceph and software-defined storage (SDS) have stepped in.

Ceph is an open-source SDS system. It’s designed to run on commercial off-the-shelf (COTS) hardware. From the user’s viewpoint, the underlying hardware doesn’t matter, whether it’s cache-enabled hard drives or RAID SSDs.

There are two ways to create this kind of SDS. One way is with a distributed file system (DFS), which is basically the files and directories you’ve been using for years. The only real difference between a DFS and conventional storage is that instead of storing files on a single drive or array of drives, they’re stored across drives on multiple servers.

The other method, which Ceph uses, is an object store, where each piece of data is stored in a flat, non-hierarchical namespace and identified by an arbitrary, unique identifier. Each object’s metadata (the details about it) is stored along with the data itself.
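To make the flat-namespace idea concrete, here is a minimal Python sketch of a toy object store. It is not how Ceph is implemented; the class name FlatObjectStore and its put/get methods are made up for illustration. It only shows the two properties described above: every object lives in one flat namespace under an arbitrary unique ID, and the metadata travels with the data.

```python
import uuid


class FlatObjectStore:
    """Toy in-memory object store (hypothetical, for illustration only)."""

    def __init__(self):
        # One dict is the entire namespace: object ID -> (data, metadata).
        # There are no directories and no hierarchy.
        self._objects = {}

    def put(self, data: bytes, **metadata) -> str:
        # The identifier is arbitrary and unique; it encodes no path.
        object_id = uuid.uuid4().hex
        self._objects[object_id] = (data, dict(metadata))
        return object_id

    def get(self, object_id: str):
        # Data and metadata are returned together.
        return self._objects[object_id]


store = FlatObjectStore()
oid = store.put(b"hello", content_type="text/plain", owner="alice")
data, meta = store.get(oid)
print(oid, data, meta)
```

Compare that with a file system, where you would find the same data by walking a path such as a directory and a file name.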

Ceph can present your storage as a Ceph Block Device, also known as a RADOS Block Device (RBD). This is a virtual drive, which can be attached to bare-metal or virtual machine (VM) Linux-based servers. To manage its storage, Ceph uses its Reliable Autonomic Distributed Object Store (RADOS), which facilitates block storage capabilities such as snapshots and replication.

Within RADOS, an object is THE unit of storage. In turn, objects are stored in object pools. Each pool has a name (e.g., “foo”) and forms a distinct object namespace. Each pool also defines how the object is stored, designates a replication level (2x, 3x, etc.), and delineates a mapping rule, describing how replicas should be distributed across the storage cluster. (For example, each replica should live in a separate rack.)
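As a rough illustration of what a pool defines, the following Python sketch models a pool with a name and a replication level, plus a simple "one replica per rack" mapping rule. Treat it as an assumption-laden toy: the Pool class, the place function, and the rack and OSD names are all invented here, and Ceph's own placement algorithm (CRUSH) is considerably more sophisticated.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Pool:
    name: str      # pool name, e.g. "foo"
    replicas: int  # replication level, e.g. 3 for 3x


def place(pool: Pool, object_name: str, racks: dict) -> list:
    """Pick one OSD from each of `pool.replicas` distinct racks.

    Illustration only; this is not Ceph's actual placement algorithm.
    """
    if pool.replicas > len(racks):
        raise ValueError("not enough racks for one replica per rack")
    # Hash the pool and object names so placement is deterministic.
    digest = hashlib.sha256(f"{pool.name}/{object_name}".encode()).digest()
    seed = int.from_bytes(digest[:8], "big")
    rack_names = sorted(racks)
    start = seed % len(rack_names)
    chosen = []
    for i in range(pool.replicas):
        # Step through consecutive racks so no rack is used twice.
        rack = rack_names[(start + i) % len(rack_names)]
        osds = racks[rack]
        chosen.append((rack, osds[seed % len(osds)]))
    return chosen


# Hypothetical cluster layout: four racks with two OSDs each.
racks = {
    "rack-a": ["osd.0", "osd.1"],
    "rack-b": ["osd.2", "osd.3"],
    "rack-c": ["osd.4", "osd.5"],
    "rack-d": ["osd.6", "osd.7"],
}
placement = place(Pool("foo", 3), "my-object", racks)
print(placement)
```

The point is that any client can compute where an object's replicas belong from the pool's rule and the object's name, and that each of the three replicas ends up in a different rack.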

Finally, the Ceph storage cluster is made up of object storage daemons (OSDs) and the devices they manage. A single cluster can hold multiple pools, which is part of what makes Ceph so scalable. You can start with little more storage than you have on your desktop and surge up to petabytes.

While these things are all neat, what makes most people (and anyone who cares about the bottom line) excited about Ceph is that it makes storage much more affordable.

You can access Ceph storage in multiple ways.

If you’re just using object storage, you can use the RADOS Gateway, an object storage interface built on top of librados that gives applications a RESTful gateway to Ceph storage clusters. (Ceph Object Storage supports two Representational State Transfer (REST)-based application programming interfaces (APIs): one compatible with Amazon Simple Storage Service (S3) and one compatible with OpenStack Swift. Ceph also has its own native API.)

Ceph also includes two other access modes. The first of these is as file storage. This uses the Portable Operating System Interface (POSIX)-compliant Ceph file system (CephFS). To users this looks like a DFS. You can even use such old-fashioned storage access means as Network File System (NFS) with CephFS.

You can also mount Ceph as a block device. In this mode, Ceph automatically stripes and replicates the data across the cluster. Ceph’s RADOS Block Device (RBD) also integrates with Linux’s built-in Kernel-based Virtual Machine (KVM) hypervisor. This enables you to provide Ceph storage to KVM virtual machines running on your Ceph clients.
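Striping is easier to picture with a little arithmetic. The sketch below assumes the virtual drive is cut into fixed-size objects of 4 MiB each; that size, along with the function names locate and split_write, is an assumption made for this example, and real striping layouts can be more elaborate. It shows how a byte offset on the drive maps to an object, and how one write that crosses an object boundary is split into pieces.

```python
OBJECT_SIZE = 4 * 1024 * 1024  # assumed object size in bytes (illustrative)


def locate(offset: int, object_size: int = OBJECT_SIZE):
    """Map a byte offset on the virtual drive to (object index, offset in object)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(offset, object_size)


def split_write(offset: int, length: int, object_size: int = OBJECT_SIZE):
    """Split one write into per-object (index, offset in object, length) pieces."""
    pieces = []
    end = offset + length
    while offset < end:
        index, within = divmod(offset, object_size)
        # Write up to the end of this object or the end of the request.
        chunk = min(object_size - within, end - offset)
        pieces.append((index, within, chunk))
        offset += chunk
    return pieces


MIB = 1024 * 1024
print(locate(10 * MIB))                # (2, 2097152)
print(split_write(3 * MIB, 2 * MIB))   # [(0, 3145728, 1048576), (1, 0, 1048576)]
```

Each of those objects can then be placed and replicated independently, which is how a single virtual drive ends up spread across the cluster.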

Setting up Ceph isn’t too difficult, but keep in mind that to use Ceph efficiently you’ll need ample system resources. For example, Ceph defaults to a replica factor of three. This means that for every 1 terabyte (TB) of objects, you need 3 TB of capacity. Ceph also recommends that your OSD data, OSD journal, and OS reside on separate disks. In other words, adding in a swap drive, you’ll need four physical disks.
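The capacity math is simple enough to check in a few lines of Python. This is a back-of-the-envelope sketch only: the function names are invented here, and it ignores journals, file system overhead, and the free space you would want to keep in reserve.

```python
def raw_capacity_tb(usable_tb: float, replicas: int = 3) -> float:
    """Raw disk capacity needed to store `usable_tb` of objects."""
    return usable_tb * replicas


def usable_capacity_tb(raw_tb: float, replicas: int = 3) -> float:
    """Object data that fits in `raw_tb` of raw disk capacity."""
    return raw_tb / replicas


print(raw_capacity_tb(1))      # 3: 1 TB of objects needs 3 TB of disk at 3x
print(usable_capacity_tb(12))  # 4.0: 12 TB of disk holds 4 TB of objects at 3x
```

Dropping to a replica factor of two would cut the raw requirement to 2 TB per terabyte of objects, at the cost of one fewer copy of your data.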

Why should you use Ceph? To quote Red Hat, Ceph’s owner, “Efficient, agile, and massively scalable, Ceph significantly lowers the cost of storing enterprise information in the cloud and helps you manage exponential data growth, so you can focus on making your data available.”

Ceph is also very flexible. Whether you want to access storage as blocks, files, or objects, it’s there for you. In today’s world where we need quick, reliable access to huge data stores, not to mention Big Data, SDS programs like Ceph are as necessary now as RAID was back in the day.