For the past 4 years, SchoolStatus has been an Amazon Web Services (AWS) shop. We’ve used almost all of the APIs and features available on the AWS platform. For the most part, we’ve been pretty happy with the platform. AWS provides us with a consistent experience and has enabled us to grow at a measured pace.

In fact, to say that we’re happy AWS customers is an understatement. We’re probably an 8 or 9 on the Net Promoter Score measure of customer happiness, and we actively evangelize AWS to other companies.

In any case, recently, a project we’re working on has a requirement for language translation and tone analysis. After some Googling, I was excited to see that the fairly new IBM Bluemix platform had both of these features available in a RESTful API.

I created an account in Bluemix and off I went.

Downtime from the Beginning

Full disclosure: Before I talk about downtime, I want to remind everyone that AWS has had its share of downtime. I think we all remember the US-EAST-1 inspired end-of-the-world experiences of 2011, 2012, and 2015. When AWS goes down, it’s typically a huge cascading failure that takes hours or even days to recover fully. We’ve mitigated these events by ensuring that nothing we run depends on a single Availability Zone (AZ).

About two weeks ago, one of our developers started plumbing the Bluemix translation APIs into our infrastructure. He asked me to create a credential for him in Bluemix. No problem, I thought. I logged into the Bluemix console and was promptly greeted with an error message. I tried Safari and Chrome, and the same error message persisted.

This is going to be a problem.

Now, I want to give Bluemix kudos here. Whoever is running their Twitter account is doing a good job. I sent a message to their Twitter handle, and they responded in a few minutes, empathized, provided an update, and let me know when things were better.

Strange API Conventions

After we finally got the console issues resolved, I created the account. It gave a weird error when the end user tried to click through from the email, so IBM support staff had to delete the account. We then recreated the user and it worked, and the dev was able to get in.

Our team was able to get the translation API going without much issue. It was fast… and appeared to be accurate. But when we went to use the same API key with the IBM Bluemix Tone Analyzer API, we found our key didn’t work.

It appears that each Bluemix API uses its own API key and secret. Why? I have no idea. It makes life more difficult for developers, who now have to keep separate API credentials for each service, and honestly… it’s just weird. Google, Amazon, etc. don’t follow this convention.

Downtime, Downtime, Downtime

Fast forward to yesterday. We finally got everything plumbed in… published it to Production and then finally went to test the brand new API. Guess what? At that moment, IBM Bluemix’s APIs went down. Hard. Bluemix, after an 8-second delay, responded with an HTML error page in its API response.

We’re expecting JSON; they’re returning a proxy error page.

This was a great teachable moment for our developers… always expect the remote endpoint to fail. The natural solution would be:

    case response.code
    when 200
      puts "Do what we're supposed to"
    when 500...600
      puts "Error handle like a boss"
    end

Guess what? Bluemix is returning a 200 HTTP response code on error. Not a 5xx, which would be typical.

Side note: We finally settled on checking the Content-Type of the response. If it isn’t JSON, we treat the response as an error, whatever the status code says.
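A minimal sketch of that content-type check, assuming your HTTP client exposes the status code, the Content-Type header, and the raw body. The names parse_api_response and UpstreamError are made up for illustration; this is not our production code, and it is not tied to any particular HTTP library.

```ruby
require "json"

# Hypothetical error type for any response we cannot trust.
class UpstreamError < StandardError; end

# Returns the parsed JSON body, or raises UpstreamError.
# `code`, `content_type`, and `body` are whatever your HTTP client exposes;
# the parameter names are illustrative only.
def parse_api_response(code:, content_type:, body:)
  raise UpstreamError, "HTTP #{code}" unless (200...300).cover?(code.to_i)

  # A 200 that carries an HTML proxy page is rejected here,
  # before anything downstream tries to treat it as JSON.
  media_type = content_type.to_s.split(";").first.to_s.strip.downcase
  unless media_type == "application/json"
    raise UpstreamError, "expected JSON, got #{media_type.inspect}"
  end

  JSON.parse(body)
rescue JSON::ParserError => e
  # The header claimed JSON but the body did not parse.
  raise UpstreamError, "invalid JSON body: #{e.message}"
end
```

The point of the design is that the caller only has to rescue one error type, whether the failure shows up as a bad status code, an HTML page behind a 200, or a body that claims to be JSON and isn't.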

Looking at the IBM Bluemix Status Page, it appears that they’ve had 12 downtime events in the past seven days. That’s a lot for a production API that’s been around for a while.

To us, it’s an unacceptable amount of downtime when we’re building apps on which our customers depend.

We’ll likely plumb in Google Translate as our primary engine and only use the Bluemix APIs when required. For instance, the Tone Analyzer API appears to be a feature unique to Watson. We have a specific use case for this API, and we’ll be required to use it. It makes me nervous going forward.
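If a second translation engine ends up sitting behind the primary one, the wiring could look roughly like the sketch below. This is an assumption about how such a setup might be structured, not something we have built: translate_with_fallback and AllEnginesFailed are hypothetical names, and each engine is just a callable standing in for a real vendor client, which is not shown.

```ruby
# Hypothetical error raised when no engine produced a translation.
class AllEnginesFailed < StandardError; end

# Tries each engine in order and returns the first successful result.
# `engines` is a Hash of name => callable taking (text, target_language).
def translate_with_fallback(text, target, engines)
  errors = []
  engines.each do |name, engine|
    begin
      return { engine: name, text: engine.call(text, target) }
    rescue StandardError => e
      # Remember why this engine failed and move on to the next one.
      errors << "#{name}: #{e.message}"
    end
  end
  raise AllEnginesFailed, errors.join("; ")
end
```

None of this helps with the Tone Analyzer, of course. With only one provider for a feature, there is nothing to fall back to.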

Is Bluemix ready for prime time?