In a live video stream from Amazon headquarters on January 18, Amazon CTO Werner Vogels and Amazon Web Services general manager Swami Sivasubramanian announced the availability of a new database-as-a-service offering based on the company's Dynamo storage engine. Called DynamoDB, the service is a "NoSQL" database with key-value access, offered as a fully managed database service for Amazon Web Services customers.

Dynamo, described in a paper Amazon published in 2007, is the basis of a number of AWS services, including Amazon's Simple Storage Service (S3). It provides high availability through replication across multiple storage nodes, along with an "eventual consistency" model: reads and writes to replicas happen without locking, and conflicting versions of the data are reconciled during later replication. Dynamo uses a vector clock versioning system to track changes, and it can resolve multiple concurrent changes made on different replicas.
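The vector clock idea is straightforward: each replica keeps a counter per node, and two versions conflict only when neither clock strictly dominates the other. A minimal sketch in Python (illustrative only; this is not Dynamo's actual code):

```python
# Minimal vector-clock sketch (illustrative, not Dynamo's implementation).
# Each replica bumps its own counter on a write; two versions conflict
# when neither clock has seen everything the other has.

def increment(clock, node):
    """Return a copy of `clock` with `node`'s counter incremented."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new

def dominates(a, b):
    """True if clock `a` has seen every event clock `b` has."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def compare(a, b):
    """Order two versions, or flag them as concurrent."""
    if a == b:
        return "equal"
    if dominates(a, b):
        return "a-newer"
    if dominates(b, a):
        return "b-newer"
    return "conflict"  # concurrent writes; must be reconciled later
```

Two successive writes on one node order cleanly; writes on different nodes descending from the same version come back as a conflict, which is exactly the case the article says Dynamo hands off for reconciliation.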

Depending on the application, the Dynamo model can be configured for stronger consistency (sacrificing performance by waiting for replies from all the storage nodes to determine which copy is most up to date) or for faster reads (with the possibility of getting stale data). The system can also be scaled up continuously with new hardware without affecting existing storage; it uses a scoring system to weight how much demand is placed on each new node based on that node's capacity and performance. (For a deeper explanation of Dynamo, see our upcoming feature on cloud distributed storage systems.)
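That trade-off boils down to how many replica responses a read waits for. A toy sketch (hypothetical and heavily simplified; real systems deal with timeouts, quorums, and network order):

```python
# Toy sketch of the read-consistency trade-off (hypothetical, simplified).
# Each replica holds a (version, value) pair. A "consistent" read waits
# for all replicas and keeps the highest version; a fast read returns the
# first response it gets, which may be stale.

def read(replicas, consistent=True):
    if consistent:
        # Wait for every replica, then pick the most recent version.
        version, value = max(replicas)
        return value
    # Fast path: take whichever replica answers first (modeled here as
    # the first in the list), possibly returning stale data.
    return replicas[0][1]
```

With three replicas where the first to respond holds an old version, the consistent read returns the newest value and the fast read returns the stale one, which is the performance-versus-freshness choice the paragraph describes.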

Amazon has been using and developing Dynamo internally for years now, and DynamoDB is its latest evolution. Vogels said that it builds on Amazon's experience with other non-relational databases and cloud services built using Dynamo, including S3 and SimpleDB. In a blog post accompanying the release, Vogels said that DynamoDB "automatically spreads the data and traffic for a (database) table over a sufficient number of servers to meet the request capacity specified by the customer." He said that the expected response time for any query against a DynamoDB table is in the "single-digit milliseconds." The service uses SSD storage and can be spread across multiple AWS availability zones for resiliency and faster responses. DynamoDB is also integrated with Amazon's Elastic MapReduce, the company's implementation of the Hadoop distributed processing engine, which can be used to run large-scale, complex analytical queries against large volumes of data.
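Spreading a table's data and traffic across servers by key is, at heart, hash partitioning. A minimal sketch of the idea (DynamoDB's internal placement scheme is not public; the hash choice and node count here are invented for illustration):

```python
import hashlib

# Minimal hash-partitioning sketch (illustrative; not DynamoDB's actual
# placement logic). Items are routed to storage nodes by hashing their
# key, so requests spread roughly evenly across nodes, and capacity can
# grow by adding nodes.

def node_for_key(key, num_nodes):
    """Map an item key to one of `num_nodes` storage nodes."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes
```

Because the mapping is deterministic, every request for the same key lands on the same node, while different keys scatter across the fleet.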

Don MacAskill, CEO of photo site SmugMug, appeared with Vogels and Sivasubramanian during the announcement. SmugMug, an AWS customer, has been using DynamoDB in private beta. "We have a lot of scalability challenges," MacAskill said. "We have to store billions of photos." Many of the storage issues had been resolved with Amazon's S3 service, he said, but "we still had a huge monster on our back with databases." The databases that drive SmugMug's Web applications, he said, have had to be scaled up and out continuously—on bigger and more hardware—so the company "had to invest an awful lot of time and energy (and) capital expense into building, scaling, monitoring and handling all the problems around these databases. We've always wanted to not have to worry about that anymore."

MacAskill said that the company had already been using a non-relational interface on top of its MySQL databases. But DynamoDB's relatively simple API, its support for variable consistency, and its extremely low latency were a major win for SmugMug, he said, along with being able to hand off responsibility for backup, replication, monitoring, and provisioning to Amazon.

Amazon is offering a basic level of DynamoDB as part of AWS' "free tier," with 100 MB of storage, 5 writes per second, and 10 reads per second of capacity. Beyond that, the service is billed at flat rates on the write, read, and storage capacity provisioned: one cent per hour for every 10 "units of write capacity" (each unit being one write per second, for items up to a kilobyte in size), one cent per hour for every 50 units of read capacity at that same item size, and one dollar per gigabyte of storage per month. There's no fee for the first gigabyte of data transfer per month, and all data transfers into DynamoDB are free; it's getting data out that starts to add up, at 12 cents per gigabyte for the first 10 terabytes of data per month and on a sliding scale down from there.
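Those rates make a back-of-the-envelope monthly estimate easy. A sketch (not an official billing formula; it assumes a 30-day month, items no larger than a kilobyte, and ignores data-transfer charges):

```python
# Back-of-the-envelope monthly cost from the launch pricing. A sketch,
# not an official billing formula: assumes a 30-day month, items <= 1 KB,
# and ignores data-transfer fees.
#   $0.01/hour per 10 units of write capacity (writes per second)
#   $0.01/hour per 50 units of read capacity (reads per second)
#   $1.00 per gigabyte of storage per month

HOURS_PER_MONTH = 30 * 24  # 720

def monthly_cost(writes_per_sec, reads_per_sec, storage_gb):
    write_cost = (writes_per_sec / 10) * 0.01 * HOURS_PER_MONTH
    read_cost = (reads_per_sec / 50) * 0.01 * HOURS_PER_MONTH
    storage_cost = storage_gb * 1.00
    return write_cost + read_cost + storage_cost
```

For example, a table provisioned at 100 writes per second and 500 reads per second with 50 GB stored would run about $72 for writes, $72 for reads, and $50 for storage, or roughly $194 a month at these rates.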