Monday, January 18, 2016 at 8:56AM

Which cloud should you use? It may depend on what you need to do with it. What Zach Bjornson needs to do is process large amounts of scientific data as fast as possible, which means reading data into memory as fast as possible. So he built a benchmark, using Google's new multi-cloud PerfKitBenchmarker, to figure out which cloud was best for the job.

The results are in a very detailed article: AWS S3 vs Google Cloud vs Azure: Cloud Storage Performance. Feel free to datamine the results for more insights, but overall his conclusions are:

Amazon and Azure provide the lowest latency, while Google provides the highest throughput, for both uploads and downloads. This means that AWS and Azure excel for smaller files, while GCE excels for larger files, and this highlights the importance of benchmarking with data that are comparable in size to what your application uses.

The substantial limitations on AWS EC2 network throughput must be taken into consideration when designing high-speed data processing systems.

Google's unique multi-region buckets keep costs down when working with data from multiple datacenters in the same region (e.g. continent).

Object storage scales automatically to provide high aggregate throughput.

Finally, note that I’m only showing data from API access (which is the exact same boto code for AWS and Google), and I have unsurprisingly observed substantial differences in performance from different clients (the vendor-specific CLIs, node.js API package, cURL’ing URLs, etc.).
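To see why low latency favors small files and high throughput favors large ones, here is a minimal sketch of a simple transfer-time model (fixed latency plus size divided by throughput). The model itself and every number in it are assumptions chosen for illustration. They are not values from Zach's benchmark, and the function names are hypothetical.

```python
def transfer_time(size_mb, latency_s, throughput_mb_s):
    """Estimated seconds to fetch one object: fixed latency plus streaming time."""
    return latency_s + size_mb / throughput_mb_s


def crossover_size_mb(low_latency, high_throughput):
    """Object size (MB) at which two providers take the same time.

    Each argument is a (latency_s, throughput_mb_s) tuple. Returns None when
    there is no crossover, i.e. one provider is at least as good on both axes.
    """
    lat_a, thr_a = low_latency
    lat_b, thr_b = high_throughput
    if thr_b <= thr_a or lat_b <= lat_a:
        return None
    # Solve lat_a + s / thr_a == lat_b + s / thr_b for s.
    return (lat_b - lat_a) / (1.0 / thr_a - 1.0 / thr_b)


# Hypothetical providers, NOT measurements from the article.
provider_a = (0.02, 50.0)    # 20 ms latency, 50 MB/s throughput
provider_b = (0.10, 200.0)   # 100 ms latency, 200 MB/s throughput

crossover = crossover_size_mb(provider_a, provider_b)

for size_mb in (1, 100):
    t_a = transfer_time(size_mb, *provider_a)
    t_b = transfer_time(size_mb, *provider_b)
    winner = "A (low latency)" if t_a < t_b else "B (high throughput)"
    print(f"{size_mb} MB: A={t_a:.3f}s B={t_b:.3f}s -> {winner}")
```

With these made-up numbers the low-latency provider wins below the crossover size and the high-throughput provider wins above it, which is the reason to benchmark with objects close in size to your real workload.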

All benchmark caveats apply, of course. Zach explains more about how the benchmarks were run:

I actually ran the tests quite a few times (~30 times for GCS, ~15 times for AWS, fewer for Azure) over the course of about four months, and in the blog I focused on the consistent aspects: the ranking of the medians was stable, e.g. in every repeat of the benchmarks, Google had the highest throughput and S3 and Azure always had the lowest latency.

S3 appears to have a throughput cap at ~91 MB/s (figure 1-right where the line becomes perfectly flat).

AWS has relatively low caps on VM network throughput, which puts an effective cap on S3 throughput as well. For what it's worth, the numerical throughput and latency values were surprisingly stable. The benchmarks take about 20 minutes to run, so there's also a fair amount of averaging going on right there.
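A cap like this shows up as a flat line because the observed throughput is the smaller of what the storage service could deliver and what the VM's network allows. The sketch below assumes that simple `min` relationship. The ~91 MB/s figure is the plateau mentioned above, while the uncapped throughputs per object size are hypothetical values for illustration only.

```python
# Approximate S3 plateau reported above, in MB/s.
S3_OBSERVED_CAP_MB_S = 91.0


def observed_throughput(uncapped_mb_s, cap_mb_s=S3_OBSERVED_CAP_MB_S):
    """Throughput the client sees: limited by whichever bottleneck is lower."""
    return min(uncapped_mb_s, cap_mb_s)


# Hypothetical uncapped throughput (MB/s) by object size (MB), NOT benchmark data.
uncapped_by_size_mb = {1: 20.0, 10: 60.0, 100: 120.0, 1000: 150.0}

observed = {
    size_mb: observed_throughput(mb_s)
    for size_mb, mb_s in uncapped_by_size_mb.items()
}

for size_mb, mb_s in observed.items():
    print(f"{size_mb} MB object: {mb_s:.0f} MB/s")
```

Once the uncapped value exceeds the cap, every larger object reports the same throughput, which is the perfectly flat region in the curve.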

Related Articles