



The tiny vertical black lines mark the mean score for each emoticon.





The complete data collection, analysis and plotting code can be found on There is no ordering to the colour scale. The colours just help differentiate each row.The complete data collection, analysis and plotting code can be found on this github repo





Okay, so here's a list of observations and (partial) explanations for some surprises

o.O and :* score higher than :-)

I think the ubiquity of :-) is its burden. People feel :-) for all sorts of reasons. Also, the score for o.O is computed over a much smaller number of tweets, and is possibly unstable. I can understand people using :-) at sad stuff, but what kind of a person uses :-( for happy tweets? (There aren't many of these, but a couple of them are too far right.) Let's look at one of those tweets:



Wow I was sleeping sooooo good which doesn't happen very often & They called from work & woke me up .. Now I can't go back to sleep :-(

That makes sense. It's a tweet that turned sour half way through, but overall, had a pretty high density of positive words, so it's no surprise that our scorer tagged it with a positive score Here's a tweet with a 8D in it:



Got to take a pic with heage ! Who has by far been the most fun, funny and candid lecturer(in my... http://t.co/WaOTW8D2YO



Notice anything funny? It's a happy tweet, but the emoticon we were looking for, is conspicuously absent! Actually, the 8D does occur in the tweet - albeit in a url http://t.co/WaOTW 8D 2YO

Thanks to Twitter's automatic url compression using t.co, it's entirely possible to see an arbitrary collection of alphanumeric characters in a tweet without any semantic information. So be wary of the scores for stuff like 8D and xD.

So the next time you can't tell what someone is trying to convey with an emoticon, this chart might come in handy as a reference. In the meantime, if you're happy and you know it, contort your pupils o.O

All code used has been made available on github

Clearly, a :) is happier than a :( but what about a :-* and a :-D ? Or a :-| and a :-o ? In this post I attempt to rank emoticons in order of how happy someone has to be to use each one. (And punctuate horribly to avoid mixing punctuation with the emoticon)To start off, I need a collection of emoticons associated with some text. And where else would I find this, but that gigantic compendium of everyday emotions, the definitive corpus of our age - Twitter.If you'd rather read the code yourself, it's available here , but the methodology is this: I collect lots of tweets containing emoticons, assign each one a 'sentiment' score[1], and then order the emoticons based on the average sentiment score of tweets containing each emoticonThe tweet gathering process is fairly direct. I parse tweets obtained from the streaming API[2] which contain any of a set of predefined emoticons and write them out to a file. If you want to, have a look at the Python code here . For the purpose of the R analysis , the tweet texts are already in a file. Each line is then (a) parsed for the emoticons it contains, and (b) assigned a sentiment score[3].Finally, we plot each tweet on an emoticon-score plot. Like so:Notes[1] A linear scale where positive is happy, negative is unhappy[2] Twitter's Search API handles punctuation poorly, so that's not an option[3] Assignment of this score is done via a relatively simple lookup mechanism. This file provides a good evaluation