I’ve got a cool new Amazon S3 feature to tell you about, but I need to start with a definition!

Let’s define durability (with respect to an object stored in S3) as the probability that the object will remain intact and accessible after a period of one year. 100% durability would mean that there’s no possible way for the object to be lost, 90% durability would mean that there’s a 1-in-10 chance of losing it during that year, and so forth.

We’ve always said that Amazon S3 provides a “highly durable” storage infrastructure and that objects are stored redundantly across multiple facilities within an S3 region. But we’ve never provided a metric, or explained what level of failure it can withstand without losing any data.

Let’s change that!

Using the definition that I stated above, the durability of an object stored in Amazon S3 is 99.999999999%. If you store 10,000 objects with us, on average we may lose one of them every 10 million years or so. This storage is designed in such a way that we can sustain the concurrent loss of data in two separate storage facilities.

If you are using S3 for permanent storage, I’m sure that you fully appreciate the need for this level of durability. It is comforting to know that you can simply store your data in S3 without having to worry about backups, scaling, device failures, fires, theft, meteor strikes, earthquakes, or toddlers.

But wait, there’s less!

Not every application actually needs this much durability. In some cases, the object stored in S3 is simply a cloud-based copy of an object that actually lives somewhere else. In other cases, the object can be regenerated or re-derived from other information. Our research has shown that a number of interesting applications simply don’t need eleven 9’s worth of durability.

To accommodate these applications we’re introducing a new concept to S3. Each S3 object now has an associated storage class. All of your existing objects have the STANDARD storage class, and are stored with eleven 9’s of durability. If you don’t need this level of durability, you can use the new REDUCED_REDUNDANCY storage class instead. You can set this on new objects when you store them in S3, or you can copy an object to itself while specifying a different storage class.
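Here is a minimal sketch of both options in Python. It assumes an S3 client object shaped like the one returned by boto3.client("s3"); boto3 is my choice for illustration and is not something this post prescribes. The helper names put_reduced and change_storage_class are hypothetical.

```python
# Assumption: `s3` behaves like a boto3 S3 client (boto3.client("s3")).
# The helper names below are illustrative, not part of any AWS toolkit.

REDUCED_REDUNDANCY = "REDUCED_REDUNDANCY"
STANDARD = "STANDARD"


def put_reduced(s3, bucket, key, body):
    """Store a new object using the REDUCED_REDUNDANCY storage class."""
    return s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=body,
        StorageClass=REDUCED_REDUNDANCY,
    )


def change_storage_class(s3, bucket, key, storage_class):
    """Copy an object onto itself, specifying a different storage class."""
    return s3.copy_object(
        Bucket=bucket,
        Key=key,
        CopySource={"Bucket": bucket, "Key": key},  # source == destination
        StorageClass=storage_class,
    )
```

With boto3 installed and credentials configured, a call might look like change_storage_class(boto3.client("s3"), "my-bucket", "thumbs/cat.png", REDUCED_REDUNDANCY); the bucket and key names are made up.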

The new REDUCED_REDUNDANCY storage class activates a new feature known as Reduced Redundancy Storage, or RRS. Objects stored using RRS have a durability of 99.99%, or four 9’s. If you store 10,000 objects with us, on average we may lose one of them every year. RRS is designed to sustain the loss of data in a single facility.
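If you want to check the arithmetic behind the loss rates quoted for the two storage classes, here is a small sketch. It treats durability as an independent per-object, per-year survival probability, which is a simplifying assumption on my part; the function name is hypothetical.

```python
from fractions import Fraction


def expected_annual_losses(num_objects, durability_percent):
    """Expected number of objects lost per year.

    durability_percent is passed as a string (e.g. "99.99") so that the
    arithmetic stays exact instead of picking up floating-point error.
    Assumption: each object is lost independently with the same probability.
    """
    loss_probability = 1 - Fraction(durability_percent) / 100
    return num_objects * loss_probability


standard = expected_annual_losses(10_000, "99.999999999")
rrs = expected_annual_losses(10_000, "99.99")

print(1 / standard)  # years per lost object, STANDARD: prints 10000000
print(1 / rrs)       # years per lost object, RRS: prints 1
```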

RRS pricing starts at a base tier of $0.10 per Gigabyte per month, 33% cheaper than the more durable storage.

If Amazon S3 detects that an object has been lost, any subsequent requests for that object will return the HTTP 405 (“Method Not Allowed”) status code. Your application can then handle this error in an appropriate fashion. If the object lives elsewhere you would fetch it, put it back into S3 (using the same key), and then retry the retrieval operation. If the object was designed to be derived from other information, you would do the processing (perhaps it is an image scaling or transcoding task), put the new image back into S3 (again, using the same key), and retry the retrieval operation.
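The fetch, restore, and retry flow described above might be sketched like this in Python. It assumes a client shaped like a boto3 S3 client, and it assumes that a failed request raises an exception carrying a response dictionary with the HTTP status code, as botocore's ClientError does. The get_with_recovery helper and its regenerate callback are hypothetical names.

```python
# Assumptions: `s3` behaves like a boto3 S3 client, and failed requests raise
# an exception with a `.response` dict like botocore's ClientError.
# `regenerate` is a caller-supplied function that rebuilds or re-fetches the
# object's bytes from wherever the original lives.

LOST_OBJECT_STATUS = 405  # the status code this post says a lost object returns


def get_with_recovery(s3, bucket, key, regenerate):
    """Return the object's bytes, restoring the object first if it was lost."""
    try:
        return s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    except Exception as err:
        response = getattr(err, "response", None) or {}
        status = response.get("ResponseMetadata", {}).get("HTTPStatusCode")
        if status != LOST_OBJECT_STATUS:
            raise  # some other failure; leave it to the caller

    # The object was lost: rebuild it and put it back under the same key.
    s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=regenerate(key),
        StorageClass="REDUCED_REDUNDANCY",
    )
    # Retry the retrieval operation.
    return s3.get_object(Bucket=bucket, Key=key)["Body"].read()
```

Passing the regeneration step in as a callback keeps the same helper usable for both cases: re-fetching a copy that lives elsewhere, or re-deriving a thumbnail or transcoded file.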

Update (for HTTP protocol geeks only):

I’d like to provide clarification regarding our choice of the HTTP 405 (Method Not Allowed) status code. Although 410 (Gone) may seem more appropriate, the HTTP 1.1 spec says that this condition is expected to be permanent and that clients “SHOULD delete references to the Request-URI”. In other words, the 410 status code indicates that the object has intentionally been removed and will not return. That is not necessarily true when data is lost. The object owner may wish to resolve the data loss by re-uploading the object, in which case it would have been inappropriate for S3 to return a 410 status code. We believe that 405 is most appropriate because other methods (e.g. PUT, POST, and DELETE) remain valid for the object even if the object’s data has gone missing. The object’s name (its URI) remains valid, but the data for the object is gone. The 422 and 424 status codes are specific to WebDAV and don’t apply here.

We expect to see management tools and toolkits add support for RRS in the very near future.

You can use either storage class with Amazon CloudFront, of course.

I anticipate many unanticipated uses for this cool new feature; please feel free to leave me a comment with your ideas.

— Jeff;

PS – check out Amazon CTO Werner Vogels’ take on RRS. His post goes into a bit more detail on how S3 was designed so that it will never lose data: “Core to the design of S3 is that we go to great lengths to never, ever lose a single bit. We use several techniques to ensure the durability of the data our customers trust us with…”