
Background

All the code is available at https://github.com/bulkan/queshuns

Since reading this post by Simon Willison I’ve been interested in Redis and have been following its development. After a quick play with Redis, I went looking for a project that would use it as a data store. I then came across this blog post by Mirko Froehlich, in which he walks through the steps and code to create a Twitter filter using Redis as the datastore and Sinatra as the web app. This post explains how I built queshuns.com using Python and the tools listed below.

tweetstream - provides the interface to the Twitter Streaming API

CherryPy - used for handling the web app side, no need for an ORM

Jinja2 - HTML templating

jQuery - for doing the AJAXy stuff and visual effects

redis-py - Python client for Redis

Redis - the “database”; see the Redis documentation for how to install it

Retrieving tweets

The first thing we need to do is retrieve tweets from the Twitter Streaming API. Thankfully there is already a Python module, tweetstream, that provides a nice interface to it. For more information about tweetstream, see the usage guide on its Cheeseshop page.

Here is the code for filter_daemon.py. When executed as a script from the command line, it starts streaming tweets from Twitter that contain any of the words “why”, “how”, “when”, “lol” or “feeling”. It keeps only the tweets that end in a question mark and do not contain an “@”.

```python
import time

import redis
import tweetstream

from datetime import datetime

try:
    import simplejson as json
except ImportError:
    import json


class FilterRedis(object):

    key = "tweets"
    r = redis.Redis()
    r.connect()
    num_tweets = 20
    trim_threshold = 100

    def __init__(self):
        self.trim_count = 0

    def push(self, data):
        self.r.push(self.key, data, True)

        # Trim once every trim_threshold pushes instead of on every push.
        self.trim_count += 1
        if self.trim_count >= self.trim_threshold:
            self.r.ltrim(self.key, 0, self.num_tweets)
            self.trim_count = 0

    def tweets(self, limit=15, since=0):
        data = self.r.lrange(self.key, 0, limit - 1)
        return [json.loads(x) for x in data if int(json.loads(x)['received_at']) > since]


if __name__ == '__main__':
    fr = FilterRedis()

    words = ["why", "how", "when", "lol", "feeling"]

    username = "your twitter username"
    password = "password for twitter account"

    with tweetstream.TrackStream(username, password, words) as stream:
        for tweet in stream:
            # Skip stream messages that carry no tweet text.
            if 'text' not in tweet:
                continue
            # Skip anything containing an "@" and anything that is not a question.
            if '@' in tweet['text'] or not tweet['text'].endswith('?'):
                continue
            fr.push(json.dumps({'id': tweet['id'],
                                'text': tweet['text'],
                                'username': tweet['user']['screen_name'],
                                'userid': tweet['user']['id'],
                                'name': tweet['user']['name'],
                                'profile_image_url': tweet['user']['profile_image_url'],
                                'received_at': time.time()}))
            print tweet['user']['screen_name'], ':', tweet['text'].encode('utf-8')
```
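The filtering rule inside the streaming loop can be read as a single predicate. Here is a minimal sketch of that rule on its own, written for Python 3 and using plain dictionaries in place of real stream payloads. The function name keep_tweet and the sample data are made up for illustration and are not part of filter_daemon.py.

```python
def keep_tweet(tweet):
    """Return True if a stream payload should be stored (hypothetical helper)."""
    # Not every stream message is a tweet, so the text may be missing.
    text = tweet.get("text")
    if text is None:
        return False
    # Drop anything that contains an "@" and anything that is not a question.
    return "@" not in text and text.endswith("?")


# Made-up payloads covering each branch of the rule.
samples = [
    {"text": "why is the sky blue?"},
    {"text": "@bob how are you?"},
    {"text": "feeling great today"},
    {"delete": {"status": {"id": 1}}},
]

kept = [t["text"] for t in samples if keep_tweet(t)]
print(kept)
```

Pulling the rule out like this would make it easy to try different filters without touching the streaming code.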

In this script I define a class, FilterRedis, which abstracts some methods that will be used by both filter_daemon.py and, later, the web app itself.

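The important part of this class is the push method, which pushes data onto a Redis list. It also keeps a count of pushed items, and each time that count reaches the threshold of 100 it trims the list back to roughly the 20 most recent tweets (LTRIM with bounds 0 and 20, both inclusive), dropping the older ones and resetting the count. The tweets method then reads the most recent entries starting from index 0.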
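To make the trimming behaviour concrete, here is a rough Python 3 sketch that uses an ordinary list in place of Redis. The class name CappedList is hypothetical, and it assumes that new items end up at index 0, which is what the tweets method and the templates rely on.

```python
class CappedList:
    """In-memory stand-in for the Redis list used by FilterRedis (hypothetical)."""

    def __init__(self, num_items=20, trim_threshold=100):
        self.items = []
        self.num_items = num_items
        self.trim_threshold = trim_threshold
        self.trim_count = 0

    def push(self, item):
        # Assumption: the newest item goes to index 0.
        self.items.insert(0, item)
        self.trim_count += 1
        if self.trim_count >= self.trim_threshold:
            # Mirrors LTRIM key 0 num_items: both bounds are inclusive,
            # so num_items + 1 elements survive the trim.
            self.items = self.items[: self.num_items + 1]
            self.trim_count = 0


capped = CappedList()
for i in range(250):
    capped.push(i)

print(len(capped.items), capped.items[0])
```

Because the trim only runs every 100 pushes, the list can temporarily grow well past 20 items between trims. That trades a little memory for fewer round trips to Redis.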

The schema for the tweet data that gets pushed into the Redis list is a dictionary of values that gets jsonified (we could probably use the new Redis hash type instead):

```
{
    "id": "the tweet id",
    "text": "text of the tweet",
    "username": "",
    "userid": "userid",
    "name": "name of the twitter user",
    "profile_image_url": "url to profile image",
    "received_at": time.time()
}
```

‘received_at’ is important because we will be using that to find new tweets to display in the web app.
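As a small illustration of how received_at can be used to pick out new tweets, here is a Python 3 sketch that works on a list of JSON strings like the ones stored in Redis. The helper name newer_than and the sample values are invented for this example. Unlike the tweets method above, it decodes each item only once and compares the raw float instead of truncating it with int().

```python
import json


def newer_than(raw_items, since=0.0):
    """Decode JSON strings and keep those received after `since` (hypothetical helper)."""
    decoded = (json.loads(raw) for raw in raw_items)
    return [tweet for tweet in decoded if tweet["received_at"] > since]


# Newest first, matching a list that is read from index 0.
raw_items = [
    json.dumps({"id": 3, "text": "how?", "received_at": 1300.5}),
    json.dumps({"id": 2, "text": "why?", "received_at": 1200.25}),
    json.dumps({"id": 1, "text": "when?", "received_at": 1100.0}),
]

# Pretend the browser has already seen tweet 2.
fresh = newer_than(raw_items, since=1200.25)
print([tweet["id"] for tweet in fresh])
```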

Web App

I picked CherryPy to write the web application because I wanted to learn it for the future, when I need to write small web frontends that don’t need an ORM. CherryPy also has a built-in HTTP server that is sufficient for websites with small loads. I initially used it to run queshuns.com, which is now being run with mod_python. For templating I used Jinja2, because it’s similar in syntax to the Django templating language that I am familiar with.

The following is the code for questions_app.py which is the CherryPy application.

```python
import time
import os

import cherrypy
import jinja2

from filter_daemon import *

try:
    import json
except:
    import simplejson as json

from simplejson import JSONEncoder
encoder = JSONEncoder()

def jsonify_tool_callback(*args, **kwargs):
    response = cherrypy.response
    response.headers['Content-Type'] = 'application/json'
    response.body = encoder.iterencode(response.body)

cherrypy.tools.jsonify = cherrypy.Tool('before_finalize', jsonify_tool_callback, priority=30)

root_path = os.path.dirname(__file__)


env = jinja2.Environment(loader=jinja2.FileSystemLoader(os.path.join(root_path, 'templates')))
def render_template(template, **context):
    global env
    template = env.get_template(template + '.jinja')
    return template.render(context)


class Questions(object):
    _cp_config = {
        'tools.encode.on': True,
        'tools.encode.encoding': 'utf8',
    }

    fr = FilterRedis()

    @cherrypy.expose()
    def index(self):
        tweets = self.fr.tweets(since=0)
        return render_template('index', tweets=tweets)

    @cherrypy.expose()
    @cherrypy.tools.jsonify()
    def latest(self, since, nt):
        if not since:
            since = 0

        tweets = self.fr.tweets(limit=5, since=float(since))
        return render_template('tweets', tweets=tweets)

if __name__ == '__main__':
    cherrypy.quickstart(Questions())
```

The index method of the web app fetches the most recent tweets from Redis (up to 15, the default limit of the tweets method). The other exposed method is latest, which accepts an argument since that is used to get only the newer tweets (since is the latest tweet’s received_at value). nt is used to create a different URL each time so that IE doesn’t cache the response. This method returns JSON.
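One detail worth spelling out is the shape of that JSON. The page script shown further down reads data[0], which suggests the rendered HTML arrives as the first element of a JSON array. Here is a small Python 3 sketch of that round trip. It assumes the response body is a one-element list by the time the jsonify tool encodes it, which is an assumption about CherryPy’s internals and not something the code above states.

```python
import json

# Stand-in for the HTML produced by render_template('tweets', ...).
rendered_html = "<div class='latest' style='display: none;'>...</div>"

# Assumed server side: the body is encoded as a one-element JSON array.
payload = json.dumps([rendered_html])

# Client side: the equivalent of `data[0]` in refreshTweets.
data = json.loads(payload)
fragment = data[0]
print(fragment == rendered_html)
```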

The templates are located in a directory called templates :)

Here is the template for the root/index of the site, index.jinja:

```html
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <title>Queshuns</title>
    <script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.3.2/jquery.min.js"></script>
</head>
<body>
    <script type="text/javascript">
    function refreshTweets() {
        $.getJSON('/latest', {since: window.latestTweet, nt: (new Date()).getTime()},
            function(data) {
                $('#content').prepend(data[0]);
                $('.latest').slideDown('slow', function() { $(this).removeClass('latest'); });
                $('#content div:gt(50)').remove();
                setTimeout(refreshTweets, 10000);
            });
    };

    $(function() { setTimeout(refreshTweets, 10000); });
    </script>

    <div id='content'>
    {% for tweet in tweets %}
        <div>
            <h1><a href="http://twitter.com/{{ tweet.username }}/status/{{ tweet.id }}" class="more">{{ tweet.username }}</a></h1>
            <div>
                <p>
                    <img height=45 width=48 src="{{ tweet.profile_image_url }}">
                    <span>{{ tweet.text }}</span>
                </p>
            </div>
        </div>
    {% endfor %}
    </div>

    {% if tweets %}
    <script type="text/javascript">
        window.latestTweet = {{ tweets.0.received_at }};
    </script>
    {% else %}
    <script type="text/javascript">
        window.latestTweet = 0;
    </script>
    {% endif %}
</body>
</html>
```

This template renders a list of tweets and also assigns the first tweet’s received_at value to a variable on the window object. The refreshTweets function passes that value on to /latest as a GET parameter. refreshTweets tries to get new tweets, prepends them to the content div and then slides the latest tweets into view. This is the template used to render the HTML for the latest tweets:
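The polling logic is easier to follow with the browser taken out of the picture. Here is a rough Python 3 simulation of what refreshTweets does on each round. The names refresh, fetch_latest and store are hypothetical, the in-memory store stands in for the /latest endpoint, and max_items only approximates the div:gt(50) cleanup, which counts div elements and not tweets.

```python
def refresh(displayed, latest_seen, fetch_latest, max_items=51):
    """One polling round: returns the new display list and the new marker."""
    new_tweets = fetch_latest(latest_seen)  # newest first, may be empty
    if new_tweets:
        # Prepend, like $('#content').prepend(...) in the page script.
        displayed = new_tweets + displayed
        # The tweets template resets window.latestTweet the same way.
        latest_seen = new_tweets[0]["received_at"]
    return displayed[:max_items], latest_seen


# Hypothetical in-memory "server": tweets stored newest first.
store = [{"id": 2, "received_at": 20.0}, {"id": 1, "received_at": 10.0}]


def fetch_latest(since, limit=5):
    # Same idea as FilterRedis.tweets(limit=5, since=...).
    return [t for t in store[:limit] if t["received_at"] > since]


displayed, marker = refresh([], 0, fetch_latest)

# A new tweet arrives between two polling rounds.
store.insert(0, {"id": 3, "received_at": 30.0})
displayed, marker = refresh(displayed, marker, fetch_latest)
print([t["id"] for t in displayed], marker)
```

A round that finds nothing new leaves both the list and the marker untouched, so the same since value is sent again on the next poll.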

```html
{% if tweets %}
<div class='latest' style='display: none;'>
    {% for tweet in tweets %}
    <div>
        <h1><a href="http://twitter.com/{{ tweet.username }}/status/{{ tweet.id }}" class="more">{{ tweet.username }}</a></h1>
        <div class="entry">
            <p>
                <img align='left' height=45 width=48 src="{{ tweet.profile_image_url }}"></img>
                <span>{{ tweet.text }}</span>
            </p>
        </div>
    </div>
    {% endfor %}

    <script>
        window.latestTweet = {{ tweets.0.received_at }};
    </script>
</div>
{% endif %}
```

I explicitly set the latest div to “display: none” so that I can animate it.

Now we should be able to run filter_daemon.py to start retrieving tweets, then start questions_app.py to look at the web app. In your browser go to http://localhost:8080/ and, if everything went correctly, you should see a list of tweets that updates every 10 seconds.

That’s it. Hope this was helpful.