A sentiment analysis of a Premier League game: Man United vs Arsenal

One of the biggest rivalries in the English Premier League is the one between Manchester United and Arsenal. It has cooled somewhat in recent years since the retirement of Alex Ferguson, but it is still a hugely anticipated match between two of the world's biggest clubs.

One of the major outlets for fans of these two clubs to voice their opinions and feelings about upcoming games is Twitter. With a Twitter fanbase of 9.31M for Man United and 8.48M for Arsenal, it can be a great source of data about how the fans are feeling in the build-up to a game and their emotions in the hours after it.

On the 19th of November 2016, Man United and Arsenal went head to head in a game that, in all honesty, won't go down in history as a great one. Although both teams are battling to stay in the running for the title, the game ended in a 1-1 draw. There was nothing all that exciting for the Twitter followers to remark on (with the exception of the equalising goal).

But we can still see some interesting results when we run a sentiment analysis on the tweets posted about the game. A sentiment analysis essentially takes a piece of text and assigns emotions to the specific words being used. The text as a whole can then be classed as positive or negative, and we can work out the specific emotions being expressed. We can then plot these emotions on a graph and examine how they change over time.

For example, the tweet below can be classed as negative overall. Each of the words used (“abysmal”, “gutless” etc.) can be grouped into specific emotions, which helps us understand the feelings being expressed in the tweet.
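As a rough illustration of that word-to-emotion step, here is a minimal Python sketch. It is not the workflow used for this analysis: the tiny lexicon, the function names and the example sentence are all made up for illustration, and a real analysis would rely on a full emotion lexicon rather than a handful of words.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon, for illustration only. Each word maps to
# the polarity and emotions we assume it carries.
EMOTION_LEXICON = {
    "abysmal": {"negative", "sadness"},
    "gutless": {"negative", "disgust"},
    "brilliant": {"positive", "joy"},
    "equaliser": {"surprise"},
}


def score_tweet(text):
    """Count how often each emotion label is triggered by words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:
        for emotion in EMOTION_LEXICON.get(word, ()):
            counts[emotion] += 1
    return counts


def polarity(counts):
    """Class the whole text as positive, negative or neutral."""
    diff = counts["positive"] - counts["negative"]
    if diff > 0:
        return "positive"
    if diff < 0:
        return "negative"
    return "neutral"


# Invented example text, not a real tweet from the dataset.
example = "Abysmal first half, gutless performance."
counts = score_tweet(example)
print(polarity(counts), dict(counts))
```

Words that are not in the lexicon are simply ignored, so the quality of the result depends almost entirely on how well the lexicon covers the language fans actually use.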

Below is a graph of the results. As you can see, tweets about this game started growing strongly an hour or two before the match, peaked towards the end of the match, and declined steadily until ten hours after the match.

Two interesting points worth highlighting from the results are the levels of surprise and trust:

Looking at the surprise, we can see a clear spike towards the end of the game, most likely caused when Olivier Giroud scored the equalising goal in the 89th minute.

Analysing the trust is quite interesting. A huge number of people tweeting felt a lot of trust before the game kicked off. It then drops slightly after kickoff but starts to rise again halfway through the game. Possibly at halftime, with the score at 0-0, the fans felt it was still all to play for.
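To give an idea of how emotion curves like these can be built, here is a minimal Python sketch that groups already-tagged tweets into hourly buckets and counts each emotion per bucket. The function name, the input shape (a timestamp plus a list of emotion labels) and the sample timestamps are assumptions made for illustration, not the actual data or pipeline behind the graph.

```python
from collections import Counter, defaultdict
from datetime import datetime


def emotions_per_hour(tagged_tweets):
    """Group (timestamp, emotions) pairs by hour and count each emotion."""
    buckets = defaultdict(Counter)
    for timestamp, emotions in tagged_tweets:
        # Truncate the timestamp to the start of its hour.
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].update(emotions)
    # Return the buckets in chronological order, ready for plotting.
    return dict(sorted(buckets.items()))


# Invented sample data, not real tweets from the match.
sample = [
    (datetime(2016, 11, 19, 11, 40), ["trust"]),
    (datetime(2016, 11, 19, 11, 55), ["trust", "anticipation"]),
    (datetime(2016, 11, 19, 14, 20), ["surprise"]),
]

for hour, counts in emotions_per_hour(sample).items():
    print(hour.strftime("%H:%M"), dict(counts))
```

Plotting one line per emotion from these hourly counts gives the kind of view where a late spike in surprise or a pre-match rise in trust becomes visible.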

It's clear to see the potential use cases for a system such as this: a complex analysis of a large, constantly updating dataset, scheduled to run at predefined intervals. For example, we’ve used this previously to explore sentiment around the US presidential election.

The difficulty with an analytical project like this is setting it up. Building a data pipeline that goes from gathering the data, to building the analysis workflow, to scheduling that workflow to run periodically, and then to displaying the results usually takes a lot of expertise and overhead. However, Idiro Analytics have developed a tool called Red Sqirl which can perform each of these steps in one intuitive interface.

Modern sports and data analytics now go hand in hand; it would be hard to imagine a professional sports organisation that isn't using data analytics in some form. And as data becomes more easily obtainable, many more opportunities open up. With the right tools, data analytics can be accessible to a lot more people.

Red Sqirl

Red Sqirl is a flexible drag-and-drop Big Data analytics platform with a unique open architecture.

Red Sqirl makes it easy for your analysts and data scientists to analyse the data you hold on your Hadoop platform.

For more information visit RedSqirl.com, and for a guide on how to build the entire process of analysing Twitter data using Red Sqirl, as outlined above, please read our detailed guide.

Title image courtesy of Premier League ©