A few months ago, I started looking into the Twitter API and I have developed twitter4s, an asynchronous non-blocking Twitter client in Scala.

In this article, we introduce twitter4s and show how to download tweets from a user timeline and perform some simple data analysis on them.

The code shown in this tutorial is available on GitHub.

Getting Started

The Twitter API is accessed by means of a registered app, so the first step is to register ours. Log in with your Twitter account and have a look at the Twitter API terms and conditions.

If you are happy with them, register your app at http://apps.twitter.com.

In order to do so, you will need to provide an app name and a brief description of what it does. After the registration, a consumer key and consumer secret will be provided: save them, as we will need them when setting up twitter4s.

Also, generate an access key and access secret and make sure you have the correct permissions: for this tutorial “Read Only” is enough.

Finally, please note that the Twitter API has rate limits: have a look at Twitter’s developer website for more information.

Also, have a look at the rate limits chart where the rate limit for each endpoint is summarized.

Setup

If it is not already there, add Maven Central as a resolver in your SBT configuration:

resolvers += "Maven central" at "http://repo1.maven.org/maven2/"

You also need to include the library as a dependency:

libraryDependencies ++= Seq( "com.danielasfregola" %% "twitter4s" % "0.2.1" )

Usage

Add your consumer and access tokens to your configuration file and initialize your Twitter client:

import com.danielasfregola.twitter4s.TwitterClient

val client = new TwitterClient()

Alternatively, you can also specify your tokens directly when creating the client:

import com.danielasfregola.twitter4s.TwitterClient
import com.danielasfregola.twitter4s.entities.{AccessToken, ConsumerToken}

val consumerToken = ConsumerToken(key = "my-consumer-key", secret = "my-consumer-secret")
val accessToken = AccessToken(key = "my-access-key", secret = "my-access-secret")
val client = new TwitterClient(consumerToken, accessToken)

Now that our Twitter client has been initialized, we are ready to use it! 😀

Have a look at its documentation for a complete list of the supported functionalities.

Top Hashtags in Timeline

As an example, let’s collect the tweets in a user timeline and display the top 10 hashtags used. In this tutorial, we will download and analyze tweets by Martin Odersky (the creator of Scala).

First, we need to get the tweets from the user timeline:

client.getUserTimelineForUser(screen_name = "odersky", count = 200)

The return type of the method getUserTimelineForUser (see scaladoc) is Future[Seq[Tweet]].
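Because the result is a Future, the tweets are only available once the request completes, so any processing is attached with combinators such as map rather than by reading the value directly. The following is a minimal sketch of that pattern using only the Scala standard library. SimpleTweet and fetchTimeline are hypothetical stand-ins invented for this illustration; they are not part of twitter4s.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureTimelineSketch {
  // Hypothetical stand-in for a tweet; the library's Tweet has many more fields.
  final case class SimpleTweet(text: String)

  // Hypothetical stand-in for the client call: produces the tweets asynchronously.
  def fetchTimeline(screenName: String): Future[Seq[SimpleTweet]] =
    Future {
      Seq(SimpleTweet("Learning #scala"), SimpleTweet("Slides from #scaladays"))
    }

  // Transform the eventual result without blocking the calling thread,
  // and fall back to 0 if the underlying request fails.
  def tweetCount(screenName: String): Future[Int] =
    fetchTimeline(screenName)
      .map(_.size)
      .recover { case _: Exception => 0 }

  def main(args: Array[String]): Unit = {
    // Blocking here is for demonstration only, so that the program
    // does not exit before the Future has completed.
    val count = Await.result(tweetCount("odersky"), 5.seconds)
    println(s"Downloaded $count tweets")
  }
}
```

In application code you would normally keep composing Futures and avoid Await; it is used here only to keep the sketch self-contained.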

Note that a Tweet is quite a rich case class that contains a lot of information (see its scaladoc): it has more than 22 fields!

The need for such large case classes is the reason why this library doesn’t support Scala versions older than 2.11: earlier versions allow at most 22 fields in a case class.

In order to retrieve the hashtags used in a Tweet, we don’t have to parse the text of the tweet, as the Twitter API has already done all the hard work for us: we just need to access the Entities field and count how many times each hashtag is used:

def getTopHashtags(tweets: Seq[Tweet], n: Int = 10): Seq[(String, Int)] = {
  val hashtags: Seq[Seq[HashTag]] = tweets.map { tweet =>
    tweet.entities.map(_.hashtags).getOrElse(Seq.empty)
  }
  val hashtagTexts: Seq[String] = hashtags.flatten.map(_.text.toLowerCase)
  val hashtagFrequencies: Map[String, Int] = hashtagTexts.groupBy(identity).mapValues(_.size)
  hashtagFrequencies.toSeq.sortBy { case (entity, frequency) => -frequency }.take(n)
}
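To try the counting logic without calling the Twitter API, the same steps can be run against hand-made data. The sketch below uses only the Scala standard library. SimpleHashTag, SimpleEntities and SimpleTweet are hypothetical, simplified stand-ins for the library's entities, and the sample tweets are invented. Unlike the function above, it also breaks ties alphabetically so that the ordering is deterministic.

```scala
object HashtagCountSketch {
  // Hypothetical, simplified stand-ins for the library's entities.
  final case class SimpleHashTag(text: String)
  final case class SimpleEntities(hashtags: Seq[SimpleHashTag])
  final case class SimpleTweet(text: String, entities: Option[SimpleEntities])

  def topHashtags(tweets: Seq[SimpleTweet], n: Int = 10): Seq[(String, Int)] = {
    // Tweets without entities contribute no hashtags.
    val texts: Seq[String] =
      tweets.flatMap(_.entities.toSeq).flatMap(_.hashtags).map(_.text.toLowerCase)

    // Group identical hashtags and count the size of each group.
    val frequencies: Map[String, Int] =
      texts.groupBy(identity).map { case (tag, occurrences) => tag -> occurrences.size }

    // Most frequent first; ties are ordered alphabetically by hashtag.
    frequencies.toSeq.sortBy { case (tag, frequency) => (-frequency, tag) }.take(n)
  }

  def main(args: Array[String]): Unit = {
    val tweets = Seq(
      SimpleTweet("first", Some(SimpleEntities(Seq(SimpleHashTag("Scala"), SimpleHashTag("scaladays"))))),
      SimpleTweet("second", Some(SimpleEntities(Seq(SimpleHashTag("scala"))))),
      SimpleTweet("third", None)
    )
    topHashtags(tweets).foreach { case (tag, frequency) => println(s"$tag: $frequency") }
  }
}
```

With this sample data, "Scala" and "scala" are counted as the same hashtag because the text is lower-cased before grouping.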

Let’s put everything together and add some code to print the results with a nice layout:

import com.danielasfregola.twitter4s.TwitterClient
import com.danielasfregola.twitter4s.entities.{HashTag, Tweet}

import scala.concurrent.ExecutionContext.Implicits.global

object UserTopHashtags extends App {

  def getTopHashtags(tweets: Seq[Tweet], n: Int = 10): Seq[(String, Int)] = {
    val hashtags: Seq[Seq[HashTag]] = tweets.map { tweet =>
      tweet.entities.map(_.hashtags).getOrElse(Seq.empty)
    }
    val hashtagTexts: Seq[String] = hashtags.flatten.map(_.text.toLowerCase)
    val hashtagFrequencies: Map[String, Int] = hashtagTexts.groupBy(identity).mapValues(_.size)
    hashtagFrequencies.toSeq.sortBy { case (entity, frequency) => -frequency }.take(n)
  }

  val client = new TwitterClient()

  val user = "odersky"

  client.getUserTimelineForUser(screen_name = user, count = 200).map { tweets =>
    val topHashtags: Seq[((String, Int), Int)] = getTopHashtags(tweets).zipWithIndex
    val rankings = topHashtags.map { case ((entity, frequency), idx) =>
      s"[${idx + 1}] $entity (found $frequency times)"
    }
    println(s"${user.toUpperCase}'S TOP HASHTAGS:")
    println(rankings.mkString("\n"))
  }
}

At the time of this writing, running the code above generates the following output:

ODERSKY'S TOP HASHTAGS:
[1] scala (found 25 times)
[2] scaladays (found 5 times)
[3] scalajs (found 4 times)
[4] progfun (found 3 times)
[5] coursera (found 2 times)
[6] scalax (found 1 times)
[7] community (found 1 times)
[8] aws (found 1 times)
[9] iexpectmoreofapple (found 1 times)
[10] scalamatsuri (found 1 times)

Summary

In this article we have introduced a new asynchronous non-blocking Scala Client for Twitter, called twitter4s.

We have described how to register our app and set up the Twitter client, and we have provided sample code to download tweets from a user timeline and analyze their hashtags.

The code shown in this tutorial can be found here.