To provide a first impression of the performance of these services, I’ve created a benchmark using data curated by Kotzias et al. (2015).[14]

Three subsets of reviews from well-known data sets with 1,000 instances each are included in this compilation: Amazon product reviews, movie reviews from the IMDB data set and Yelp restaurant reviews.

Amazon Comprehend

Amazon’s natural language processing solution Comprehend was launched last year and, currently, supports English and Spanish documents.

Requests are measured in units of 100 characters, with a 3 unit minimum per request.

Just like the other services listed here, Amazon Comprehend has tiers based on the number of units requested per month. Up to 10 million units, the price per 1,000 units is $0.10. For units above the 50 million mark, the price per 1,000 units drops to $0.025.
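
To make the unit arithmetic concrete, here is a minimal sketch of how the billable units of a single request could be estimated. The class and method names are hypothetical, and two assumptions are baked in: partial units are rounded up, and Java's `String.length()` is used as a stand-in for the character count Amazon actually bills. Only the first pricing tier quoted above is modelled.

```java
// Hypothetical helper for estimating Amazon Comprehend billing units.
// Assumptions: partial units are rounded up, and String.length()
// approximates the billed character count.
final class ComprehendUnits {
    static final int CHARS_PER_UNIT = 100;
    static final int MIN_UNITS_PER_REQUEST = 3;
    // Price per 1,000 units in the first tier (up to 10 million units).
    static final double FIRST_TIER_PRICE_PER_1000_UNITS = 0.10;

    // Number of units billed for one request containing the given text.
    static long unitsFor(String text) {
        long units = (text.length() + CHARS_PER_UNIT - 1) / CHARS_PER_UNIT; // ceiling division
        return Math.max(units, MIN_UNITS_PER_REQUEST);
    }

    // Cost in USD for a monthly total that stays within the first tier.
    static double firstTierCost(long totalUnits) {
        return totalUnits / 1000.0 * FIRST_TIER_PRICE_PER_1000_UNITS;
    }
}
```

Under these assumptions, a 550-character review counts as six units, while a one-sentence review is still billed at the three-unit minimum.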

Given a credentials provider, a text and a language code, a prediction of the sentiment can be requested as follows:

Sentiment analysis with Amazon Comprehend

The API supports batch requests with up to 25 documents (with, at most, 5,000 characters) and generates a probability distribution over four classes: negative, mixed, neutral and positive.
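
The batch limit and the four-class output suggest two small client-side helpers, sketched below: one that splits a list of documents into groups of at most 25, and one that picks the most probable class from a returned distribution. Both are hypothetical utilities rather than part of the AWS SDK, the distribution is modelled as a plain `Map` purely for illustration, and the 5,000-character limit per document is not enforced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical client-side helpers; not part of the AWS SDK.
final class SentimentBatches {
    static final int MAX_DOCS_PER_BATCH = 25;

    // Splits the documents into consecutive batches of at most 25 entries.
    static List<List<String>> partition(List<String> docs) {
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < docs.size(); start += MAX_DOCS_PER_BATCH) {
            int end = Math.min(start + MAX_DOCS_PER_BATCH, docs.size());
            batches.add(new ArrayList<>(docs.subList(start, end)));
        }
        return batches;
    }

    // Returns the class with the highest probability, or null for an empty map.
    static String mostLikely(Map<String, Double> distribution) {
        String best = null;
        double bestProbability = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> entry : distribution.entrySet()) {
            if (entry.getValue() > bestProbability) {
                best = entry.getKey();
                bestProbability = entry.getValue();
            }
        }
        return best;
    }
}
```

With 1,000 reviews per data set, this partitioning would yield 40 batch requests instead of 1,000 single-document requests.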

Unsurprisingly, Comprehend achieved the best performance on the 1,000 Amazon product reviews. Combined with accuracy rates of close to 90% on the other two data sets, this makes Amazon’s API the runner-up in the benchmark.

Google Cloud Natural Language API

Google’s Cloud Natural Language API supports nine languages and generates two sentiment analysis values: score and magnitude.

The score of a document’s sentiment indicates the overall emotion of a document.

The magnitude indicates how much emotional content is present within a document and is often proportional to the length of the document.

Documents that express few emotions or mixed emotions have a neutral score around 0.0. The magnitude value can be used to disambiguate these two cases. Low-emotion documents have low magnitude values, while mixed emotions are associated with higher magnitude values.
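
The interplay of the two values can be sketched as a small decision rule. The thresholds below are illustrative assumptions of mine, not values prescribed by Google, and would need tuning for a real use case; the class name is hypothetical as well.

```java
// Hypothetical interpretation of a (score, magnitude) pair.
// Both thresholds are illustrative assumptions, not official values.
final class GoogleSentimentReading {
    static final double SCORE_THRESHOLD = 0.25;
    static final double MAGNITUDE_THRESHOLD = 1.0;

    static String interpret(double score, double magnitude) {
        if (score >= SCORE_THRESHOLD) {
            return "positive";
        }
        if (score <= -SCORE_THRESHOLD) {
            return "negative";
        }
        // The score is close to 0.0: use the magnitude to tell a document
        // with mixed emotions from one with little emotional content.
        return magnitude >= MAGNITUDE_THRESHOLD ? "mixed" : "neutral";
    }
}
```

Because the magnitude tends to grow with document length, a fixed magnitude threshold like this one is only plausible for documents of similar size, such as the single-sentence reviews in this benchmark.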

The pricing model is based on units of 1,000 characters per document. For monthly requests in the range between 5,000 and 1 million units, the price per 1,000 units is $1. The price decreases to $0.25 for requests in the range between 5 million and 20 million units.

Assuming the GOOGLE_APPLICATION_CREDENTIALS environment variable is set to the path of the JSON file containing the project credentials, the following code performs sentiment analysis for a given text:

Sentiment analysis with Google’s Cloud Natural Language API

Google’s service makes up for the lack of batch processing with an excellent accuracy of 92.1%, achieving the best performance on two out of three data sets. Performance-wise, the Cloud Natural Language API is the clear winner of our competition.

Microsoft Text Analytics API

Microsoft’s sentiment analyzer performs binary classification and, consequently, assigns a probability to every document. When a text cannot be analyzed or has no sentiment, the service always returns 0.5 exactly.
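
For a binary benchmark, the returned probability has to be turned into a label. A minimal sketch of one way to do this follows; the 0.5 decision boundary and the separate treatment of an exact 0.5 as "undetermined" are my assumptions, and the class and enum names are hypothetical.

```java
// Hypothetical mapping from a Text Analytics score to a label.
// Assumption: scores above 0.5 count as positive, scores below as negative.
final class TextAnalyticsLabel {
    enum Label { POSITIVE, NEGATIVE, UNDETERMINED }

    static Label fromScore(double score) {
        // An exact 0.5 signals that the text could not be analyzed
        // or carries no sentiment, so no label is assigned.
        if (score == 0.5) {
            return Label.UNDETERMINED;
        }
        return score > 0.5 ? Label.POSITIVE : Label.NEGATIVE;
    }
}
```

How undetermined documents are scored is a design choice: counting them as errors penalizes the service, while dropping them inflates its accuracy.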

A free tier with 5,000 transactions per month allows you to explore the API without a financial commitment. The entry-level Standard S0 tier is offered at a price of $74.71 per month and comes with 25,000 requests. The most expensive publicly disclosed tier, Standard S4, sets you back $4,999.99 per month and includes 10 million requests.

The price per 1,000 transactions above the tier limits ranges from $0.50 to $3.

The API supports 15 European languages and batch requests of up to 1,000 documents.

Unfortunately, these features are not matched by its performance in our test. With an average accuracy of 81.8%, the Text Analytics API trailed Google’s service by more than ten percentage points.

A beta version of a Java SDK is available, but it was easier to work with Unirest and GSON:

Watson Natural Language Understanding

IBM Watson’s sentiment analyzer supports ten languages and returns a score ranging from -1 to +1.

The billing unit consists of 10,000 characters. Under the free Lite plan, 30,000 units are available per month. The entry-level tier of the standard plan covers the first 250,000 units per month at a price of $3 per 1,000 units. After 5 million units, the price drops down to $0.20.

In our test, the Watson API performed significantly better than Microsoft’s Text Analytics API but worse than Amazon Comprehend.

The Java SDK does not appear to support batch requests.