Elizabeth Weise

USA TODAY

SAN FRANCISCO — It didn't quite break the Internet, but a 4-hour outage at Amazon's AWS cloud computing division caused headaches for hundreds of thousands of websites across the United States.

Little known to consumers familiar with Amazon's online shopping site, Amazon Web Services is a giant provider of the back-end of the Internet. It provides cloud-based storage and web services to sites like Netflix, Spotify, Pinterest and Buzzfeed, as well as tens of thousands of smaller sites, so they don’t have to build their own server farms and can rapidly deploy computing power without investing in infrastructure.

For example, a business might store its videos, images or databases on an AWS server and access them via the Internet.

Not all AWS clients were affected by the outage, which hit one of AWS's main storage systems, but some experienced slowdowns after a big portion of the S3 system went offline Tuesday afternoon.

Amazon wasn't able to update its own service health dashboard for the first two hours of the outage because the dashboard itself was hosted on AWS.

"This is a pretty big outage," said Dave Bartoletti, a cloud analyst with Forrester. "AWS had not had a lot of outages and when they happen, they're famous. People still talk about the one in September of 2015 that lasted five hours," he said.

The S3 system is used by 148,213 sites according to market research firm SimilarTech. It has "north of three to four trillion pieces of data stored in it," Bartoletti said.

The outage appeared to have begun around 12:35 pm ET, according to Catchpoint Systems, a digital experience monitoring company. Operations were fully recovered by 4:49 pm ET, Amazon said. The Seattle-based company did not comment on the cause of the outage.

The most common causes of this type of outage are software related, said Lydia Leong, a cloud analyst with Gartner. "Either a bug in the code or human error. Right now we don't know what it was."

The system that went down was the first of what now are three AWS regions in the United States. It is still the largest and is also where AWS rolls out new features, "so it's disproportionately big," she said.

AWS began as a profitable sideline to Amazon’s main online sales business but has since grown to become the major player in cloud computing as well as a major money-maker in its own right for Amazon. In the fourth quarter of 2016, the division accounted for 8% of Amazon’s total revenue.

AWS S3 is used by businesses both large and small. “More than anything else, S3 customers need to be able to get at their data, because often S3 is used to store images. So no S3, no nice picture or fancy logo on your website,” said Leong.

Alabama bamboo site

That was exactly the problem faced by Lewis Bamboo, a small, family-owned bamboo nursery in Oakman, Alabama.

“As our business is in bamboo plants, pictures are a very important part of selling our product online. We use Amazon S3 to store and distribute our website images. When Amazon’s servers went down, so did the majority of our website,” said the company's chief technology officer Daniel Mullaly.

“Thankfully we also store the images locally and I was able to serve the images directly from our server instead,” he said.

The effects of the outage varied depending on the site and how it used AWS. Modern websites usually pull data from multiple databases in the cloud that can be stored all over the world, so a photo might come from one place, a price list from another and a customer database from a third.

For that reason, entire websites rarely go down, but various parts of them may take a long time to load or not load at all, leaving broken links or images.

Companies have been steadily moving storage to the cloud because it is cheaper, easily accessible and more resilient. But the downside is that when there are problems, there's a cascade effect.

It's possible to contract with multiple companies to avoid potential problems, but that strategy is pricey, so many companies make peace with the knowledge that on rare occasions they're going to have a very bad day.

"Only the most paranoid, and very large companies, distribute their files across not just AWS but also Microsoft and Google, and replicate them geographically across regions — but that's very, very expensive," Gartner's Leong said.