Sentiment analysis is a common Natural Language Processing (NLP) task that can help you sort huge volumes of data, from online reviews of your products to NPS responses and conversations on Twitter.

In this post, you’ll learn how to do sentiment analysis in Python and how to build a simple sentiment classifier with SaaS tools like MonkeyLearn. Just sign up for free to build models and use APIs in Python, or check out this online sentiment analyzer to get an idea of how MonkeyLearn’s tools work.

What Is Sentiment Analysis?

Sentiment analysis is a set of Natural Language Processing (NLP) techniques that extract opinions in natural language text. Simply put, the objective of sentiment analysis is to categorize the sentiment of a text by sorting it into positive, neutral, and negative.

For example, you could use sentiment analysis tools to monitor brand sentiment on social media to discover:

Trending topics in relation to your business

Highly urgent brand mentions (e.g. those that could damage your reputation)

Exactly what you’re doing right or wrong

Getting machines to perform sentiment analysis is no easy feat, and involves skills from machine learning experts.

Tutorial: How to Do Sentiment Analysis in Python

Now, you can do sentiment analysis by rolling out your own application from scratch, or maybe by using one of the many excellent open-source libraries out there, such as scikit-learn.

However, implementing a machine learning solution on your own can be a daunting task that requires data scientists.

You will need to gather quality data to train the models, source some hardware (maybe even GPUs) to run your software on, and test relentlessly to get a data analysis solution that works. Once you’ve built your model, and it works, you’ll need more resources to integrate the new module into your existing solution, maintain it, and keep it updated.

Instead, you might be better off using a SaaS API in Python to perform sentiment analysis, provided by cloud solutions like MonkeyLearn. In this tutorial, you will learn how to use MonkeyLearn’s API in Python to connect to a sentiment analysis model.

First, sign up to access your MonkeyLearn API.

Now, let’s say you are happy with how our sentiment analysis classification model categorizes texts, and you want to automate access to the model and perform sentiment analysis programmatically. In the API tab, there are instructions on how to integrate using your own code, whether written in Python, Ruby, PHP, Node, or Java.

You can also send plain requests to our API, and parse the JSON responses yourself. However, we built SDKs in multiple languages to make integrating our API simpler for developers.
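As a rough illustration of what a plain request might look like without the SDK, the sketch below builds a POST request using only the Python standard library. The endpoint path, the `Token` authorization scheme, and the `{"data": [...]}` payload shape are assumptions rather than details given in this post, so check them against the instructions in the API tab before relying on them. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint root; confirm it against the API tab of your model.
API_ROOT = "https://api.monkeylearn.com/v3"


def build_classify_request(api_key, model_id, texts):
    """Build (but do not send) a POST request that classifies `texts`."""
    url = f"{API_ROOT}/classifiers/{model_id}/classify/"
    payload = json.dumps({"data": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            # Assumed auth scheme: "Token <your API key>".
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
    )


def classify(api_key, model_id, texts):
    """Send the request and return the parsed JSON response."""
    request = build_classify_request(api_key, model_id, texts)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect the URL, headers, and body before any network call is made.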

Now that the introductions are out of the way, let’s get down to business. First of all, to use our API you need to get an API key. Sign up for free to get yours. Then, install the Python SDK:

    pip install monkeylearn

You can also clone the repository and run the setup.py script:

    $ git clone git@github.com:monkeylearn/monkeylearn-python.git
    $ cd monkeylearn-python
    $ python setup.py install

And that’s it for setup.

You’re ready to run a sentiment analysis on your texts with the following code:

    from monkeylearn import MonkeyLearn

    ml = MonkeyLearn('<<Your API key here>>')
    data = ['The restaurant was great!', 'The curtains were disgusting']
    model_id = 'cl_pi3C7JiL'
    result = ml.classifiers.classify(model_id, data)

    print(result.body)

The output will be a Python list of dicts generated from the JSON sent by MonkeyLearn, with one dict per input text, and should look something like this:

    [{
        'text': 'The restaurant was great!',
        'classifications': [{
            'tag_name': 'Positive',
            'confidence': 0.993,
            'tag_id': 33767179
        }],
        'error': False,
        'external_id': None
    }, {
        'text': 'The curtains were disgusting',
        'classifications': [{
            'tag_name': 'Negative',
            'confidence': 0.979,
            'tag_id': 33767178
        }],
        'error': False,
        'external_id': None
    }]

We return the input texts in the same order, each paired with the output of the model. Now you’re ready to start automating processes and getting insights from your data.
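Because the response keeps the input order and nests the tags under `classifications`, a small helper can flatten it into rows that are easier to work with. The sketch below is one possible way to do that. The function name `top_sentiments` is hypothetical, and the field names are taken from the sample output shown above.

```python
def top_sentiments(results):
    """Return (text, tag_name, confidence) for each classified text.

    Entries flagged with an error, or with no classifications, yield
    (text, None, None) so the output stays aligned with the input order.
    """
    rows = []
    for item in results:
        classifications = item.get("classifications") or []
        if item.get("error") or not classifications:
            rows.append((item.get("text"), None, None))
            continue
        # Keep the highest-confidence tag for this text.
        best = max(classifications, key=lambda c: c["confidence"])
        rows.append((item["text"], best["tag_name"], best["confidence"]))
    return rows


# Sample response copied from the output shown above.
sample = [
    {
        "text": "The restaurant was great!",
        "classifications": [
            {"tag_name": "Positive", "confidence": 0.993, "tag_id": 33767179}
        ],
        "error": False,
        "external_id": None,
    },
    {
        "text": "The curtains were disgusting",
        "classifications": [
            {"tag_name": "Negative", "confidence": 0.979, "tag_id": 33767178}
        ],
        "error": False,
        "external_id": None,
    },
]

for text, tag, confidence in top_sentiments(sample):
    print(f"{tag}\t{confidence}\t{text}")
```

With the SDK, the same helper could be applied to `result.body` from the earlier snippet.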

For full documentation of our API and its features, check out our docs.

Build Your Own Sentiment Analysis Classifier

It’s important to remember that machine learning models perform well on texts that are similar to the texts used to train them. For example, if you train a sentiment analysis model using survey responses, it will likely deliver highly accurate results for new survey responses, but less accurate results for tweets.

Generic sentiment analysis models are great for getting started right away, but you’ll probably need a custom model trained with your own data and labeling criteria for more accurate results.

With MonkeyLearn you can easily build a custom classifier by:

uploading your own texts

defining your own tags and criteria for positive, neutral and negative sentiments

training models via a simple user interface

Before you get started, you’ll need training data for your model:

Data Training Set For Your Model

The single most important thing for a machine learning model is the training data. Without good data, the model will never be good; as the saying goes, garbage in, garbage out.

For this example, you can use this dataset, composed of texts from hotel reviews. The dataset is a CSV file with two columns: Text and Sentiment, where Sentiment is either negative or positive.

Not all the texts of the dataset are tagged. MonkeyLearn will train a model with the tagged texts, and then you can keep improving the model by tagging more texts yourself using our UI.
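Before uploading, it can help to check how many rows are already tagged. The sketch below counts the values in the Sentiment column using the standard `csv` module. The sample rows are hypothetical stand-ins in the same two-column layout, not rows from the real dataset, and `count_tags` is an illustrative helper name.

```python
import csv
import io
from collections import Counter

# Hypothetical rows in the two-column layout; not taken from the real dataset.
SAMPLE_CSV = """Text,Sentiment
"The room was spotless and quiet.",positive
"The shower was broken for two days.",negative
"We arrived late on Friday.",
"""


def count_tags(csv_file):
    """Count rows per Sentiment value; blank values are counted as 'untagged'."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        tag = (row.get("Sentiment") or "").strip().lower()
        counts[tag or "untagged"] += 1
    return counts


counts = count_tags(io.StringIO(SAMPLE_CSV))
print(dict(counts))
```

To run it on a real file, pass an open file handle instead, for example `open(path, newline="", encoding="utf-8")` with the path to your downloaded CSV.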

Now, let’s upload this data to MonkeyLearn to train a sentiment model for hotel reviews.

Training Your Custom Sentiment Analysis Model

Creating a custom model is simple. All you need to do is upload your data and tag it if needed, and the model will learn from this data. MonkeyLearn automatically chooses the best parameters and handles the training for you. Sign up to start building custom models for free.

1. Create a text classifier

First, go to the dashboard, then click Create a Model, and choose Classifier:

You’ll be prompted to choose a more specific classification model, so we can automatically tune it to your needs. Choose Sentiment Analysis:

2. Upload the data from the dataset

Next, you have to upload the data for your classifier. There are many ways to do this but, in this case, choose CSV and upload the example dataset from earlier.

The final step is to tell MonkeyLearn how to interpret the columns in the file. If you were to upload an untagged file this wouldn’t be an issue. However, since our dataset has some tags already, you need to check Advanced and select Use as Tag on the tag column:

3. Test the model

You’re done! The model has been trained and is now ready to use.

In the Run tab, you can find all the options for testing and using your model, just like with the pre-trained sentiment analysis model from before.

4. Keep improving the model

Remember, in the dataset, we included some untagged texts as well. You can keep training and testing your model by going to the Train tab and tagging more texts. This is also known as active learning and will improve your model.

Calling the Model API with Python

You’re done! Now the model is ready to use.

You can perform classification and get sentiment labels for your texts in pretty much the same way as with the public model you used earlier:

    from monkeylearn import MonkeyLearn

    ml = MonkeyLearn('<<Your API key here>>')
    data = ['The room was great!', 'The curtains were disgusting']
    model_id = '<<Your model ID here>>'
    result = ml.classifiers.classify(model_id, data)

    print(result.body)

And the output for this code will be similar as well:

    [{
        'text': 'The room was great!',
        'classifications': [{
            'tag_name': 'positive',
            'confidence': 0.836,
            'tag_id': 103237939
        }],
        'error': False,
        'external_id': None
    }, {
        'text': 'The curtains were disgusting',
        'classifications': [{
            'tag_name': 'negative',
            'confidence': 0.924,
            'tag_id': 103237938
        }],
        'error': False,
        'external_id': None
    }]
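Each classification carries a confidence score, so one possible way to decide which texts to tag next is to set aside the predictions the model is least sure about. The sketch below splits results around a threshold. The value 0.9 is an arbitrary assumption chosen for illustration, and `split_by_confidence` is a hypothetical helper, not part of the SDK.

```python
def split_by_confidence(results, threshold=0.9):
    """Split texts into (confident, uncertain) lists around `threshold`."""
    confident, uncertain = [], []
    for item in results:
        classifications = item.get("classifications") or []
        # Texts with no classifications are treated as confidence 0.0.
        top = max((c["confidence"] for c in classifications), default=0.0)
        (confident if top >= threshold else uncertain).append(item["text"])
    return confident, uncertain


# Trimmed copy of the sample output shown above.
sample = [
    {
        "text": "The room was great!",
        "classifications": [{"tag_name": "positive", "confidence": 0.836}],
    },
    {
        "text": "The curtains were disgusting",
        "classifications": [{"tag_name": "negative", "confidence": 0.924}],
    },
]

confident, uncertain = split_by_confidence(sample, threshold=0.9)
print("Review manually:", uncertain)
```

The right threshold depends on your data, so treat any fixed value as a starting point to tune.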

An important side note is that you can do all of this from the API – you can create a classifier, upload data to it, create or delete tags, and so on. If you’re interested in how this is done, check out our API documentation.

Wrap-Up

Sentiment analysis is a powerful tool with many applications. However, using it to automate processes and get insightful data is not always simple.

Instead of setting up your own algorithms from scratch to run sentiment analysis, we recommend using a pre-built model, at least to start with, so that you can understand how sentiment analysis works, as well as the benefits it can bring to your business. It’s also time-consuming to set up your own machine learning infrastructure, not to mention costly, since you’ll need extra resources and hardware.

With MonkeyLearn, you can start doing sentiment analysis right now, either with a pre-trained model or by training your own. We recommend the latter so that you can tailor models to your business using data and tags that are relevant to the problems you’re trying to solve – leading to better insights for your business.

MonkeyLearn also has clear documentation on how to set up your own models using our API. First, you’ll need an API key, which you can get when you sign up for free to MonkeyLearn. Then, all that’s left is to get started with sentiment analysis by installing the Python SDK. So, what are you waiting for?!