As I mentioned in my last post on our mobile site, one of our key goals was to avoid javascript unless absolutely necessary. If you use Google Analytics (GA) as your stats package, this poses a problem, since the supported way to run GA is via a chunk of javascript at the bottom of every page. To make matters worse, the ga.js file is not gzipped, so you’re loading 9K that would otherwise be about 4K, on a platform where every byte counts. By contrast, if you could serve just the tracking gif, it is 47 bytes, with no javascript that might fail to run on B-grade or below devices.

A few weeks ago, Google announced support for analytics inside mobile apps and some cursory support for mobile sites:

Google Analytics now tracks mobile websites and mobile apps so you can better measure your mobile marketing efforts. If you’re optimizing content for mobile users and have created a mobile website, Google Analytics can track traffic to your mobile website from all web-enabled devices, whether or not the device runs JavaScript. This is made possible by adding a server side code snippet to your mobile website which will become available to all accounts in the coming weeks (download snippet instructions). We will be supporting PHP, Perl, JSP and ASPX sites in this release. Of course, you can still track visits to your regular website coming from high-end, Javascript enabled phones.

And that is the extent of the documentation you will find anywhere on Google on how to run analytics without javascript. The code included is handy if you happen to run one of their supported platforms, but the Walker’s mobile site runs on the Python side of AppEngine, so their code doesn’t do us much good as is. Thankfully, since they provide the source, we can translate the PHP or Perl into Python without too much trouble and make it AppEngine friendly.

How it works

Regular Google Analytics works by serving some javascript and a small 1px x 1px gif file to your site from Google. The gif lets Google learn many things from the HTTP request your browser makes, such as your browser, OS, where you came from, your rough geo location, etc. The javascript lets them learn all kinds of nifty things about your screen, flash version, events that fire, etc. And Google tracks you through a site by setting some cookies on that gif they serve you.

To use GA without javascript, we can still do most of that by generating our own gif file and passing the information back to Google through our server. That is, we serve the gif ourselves, assign and track our own cookie, gather the request information as you move through the site, and send it to Google in an HTTP request with the appropriate query string. Google then compiles it and treats it as regular old analytics.
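To make the "HTTP request with the appropriate query string" concrete, here is a minimal sketch of how such a hit URL could be assembled. The parameter names (utmwv, utmn, utmhn, utmr, utmp, utmac, utmvid) are the ones used in the handler further down; the function name build_hit_url and the example values are my own assumptions for illustration, not part of Google’s snippet.

```python
try:
    from urllib import urlencode        # Python 2, as on AppEngine at the time
except ImportError:
    from urllib.parse import urlencode  # Python 3

UTM_GIF_LOCATION = "http://www.google-analytics.com/__utm.gif"


def build_hit_url(account, host, path, referer, visitor_id, cache_buster):
    """Assemble the gif URL for one page view (hypothetical helper)."""
    params = [
        ("utmwv", "4.4sh"),           # tracker version string
        ("utmn", str(cache_buster)),  # random number so the hit is never cached
        ("utmhn", host),              # host name of the tracked site
        ("utmr", referer or "-"),     # referrer, "-" when there is none
        ("utmp", path),               # path of the page being viewed
        ("utmac", account),           # analytics account ID
        ("utmvid", visitor_id),       # our own visitor ID, kept in a cookie
    ]
    # urlencode percent-escapes each value and joins the pairs with "&".
    return UTM_GIF_LOCATION + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_hit_url("MO-XXXXXX-X", "m.example.org", "/events/", "", "0xabc", 12345))
```

The server, not the phone, requests this URL, which is why none of it depends on javascript running on the device.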

The Code

To make this work in AppEngine, we create a URL in our webapp that we’ll serve the gif from. I’m using “/ga/”:

[python]
def main():
    application = webapp.WSGIApplication(
        [('/', home.MainHandler),
         # edited out extra lines here
         ('/ga/', ga.GaHandler),
        ],
        debug=False)
    wsgiref.handlers.CGIHandler().run(application)
[/python]

And here’s the big handler for /ga/. I based it mostly off the php and some of the perl (click to expand the full code):

[code lang="python" collapse="true"]
from google.appengine.ext import webapp
from google.appengine.api import urlfetch
import re, hashlib, random, time, datetime, cgi, urllib, uuid

# google analytics stuff
VERSION = "4.4sh"
COOKIE_NAME = "__utmmobile"

# The path the cookie will be available to, edit this to use a different cookie path.
COOKIE_PATH = "/"

# Two years in seconds.
COOKIE_USER_PERSISTENCE = 63072000

# The raw bytes of the 1x1 gif we serve.
GIF_DATA = [
    chr(0x47), chr(0x49), chr(0x46), chr(0x38), chr(0x39), chr(0x61),
    chr(0x01), chr(0x00), chr(0x01), chr(0x00), chr(0x80), chr(0xff),
    chr(0x00), chr(0xff), chr(0xff), chr(0xff), chr(0x00), chr(0x00),
    chr(0x00), chr(0x2c), chr(0x00), chr(0x00), chr(0x00), chr(0x00),
    chr(0x01), chr(0x00), chr(0x01), chr(0x00), chr(0x00), chr(0x02),
    chr(0x02), chr(0x44), chr(0x01), chr(0x00), chr(0x3b)
]

class GaHandler(webapp.RequestHandler):

    def getIP(self, remoteAddress):
        if remoteAddress == '' or remoteAddress is None:
            return ''
        # Capture the first three octets of the IP address and replace the fourth
        # with 0, e.g. 124.45.3.123 becomes 124.45.3.0
        res = re.findall(r'\d+\.\d+\.\d+\.', remoteAddress)
        if res:
            return res[0] + "0"
        else:
            return ""

    def getVisitorId(self, guid, account, userAgent, cookie):
        # If there is a value in the cookie, don't change it.
        if cookie is not None:
            return cookie
        message = ""
        if guid is not None:
            # Create the visitor id using the guid.
            message = guid + account
        else:
            # Otherwise this is a new user, create a new random id.
            message = userAgent + str(uuid.uuid1(self.getRandomNumber()))
        m = hashlib.md5()
        m.update(message)
        md5String = m.hexdigest()
        return str("0x" + md5String[0:16])

    def getRandomNumber(self):
        return random.randrange(0, 0x7fffffff)

    def sendRequestToGoogleAnalytics(self, utmUrl):
        '''
        Make a tracking request to Google Analytics from this server.
        Copies the headers from the original request to the new one.
        If the request contains the utmdebug parameter, exceptions encountered
        communicating with Google Analytics are thrown.
        '''
        headers = {
            "User-Agent": self.request.headers.get('User-Agent', ''),
            "Accept-Language": self.request.headers.get('Accept-Language', ''),
        }
        if len(self.request.get("utmdebug")) != 0:
            data = urlfetch.fetch(utmUrl, headers=headers)
        else:
            try:
                data = urlfetch.fetch(utmUrl, headers=headers)
            except Exception:
                # Never let a failed tracking call break the page.
                pass

    def get(self):
        '''
        Track a page view, updates all the cookies and campaign tracker,
        makes a server side request to Google Analytics and writes the transparent
        gif byte data to the response.
        '''
        timeStamp = time.time()
        domainName = self.request.headers.get('Host', '')
        domainName = domainName.partition(':')[0]
        if len(domainName) == 0:
            domainName = "m.walkerart.org"

        # Get the referrer from the utmr parameter, this is the referrer to the
        # page that contains the tracking pixel, not the referrer for tracking
        # pixel.
        documentReferer = self.request.get("utmr")
        if len(documentReferer) == 0:
            documentReferer = "-"
        else:
            documentReferer = urllib.unquote_plus(documentReferer)

        documentPath = self.request.get("utmp")
        if len(documentPath) == 0:
            documentPath = ""
        else:
            documentPath = urllib.unquote_plus(documentPath)

        account = self.request.get("utmac")
        # Default to '' so a missing header doesn't break string handling below.
        userAgent = self.request.headers.get('User-Agent', '')

        # Try and get visitor cookie from the request.
        cookie = self.request.cookies.get(COOKIE_NAME)
        visitorId = str(self.getVisitorId(self.request.headers.get('X-DCMGUID'), account, userAgent, cookie))

        # Always try and add the cookie to the response.
        # The expiry is labelled GMT, so build it from a UTC timestamp.
        d = datetime.datetime.utcfromtimestamp(timeStamp + COOKIE_USER_PERSISTENCE)
        expireDate = d.strftime('%a, %d-%b-%Y %H:%M:%S GMT')
        self.response.headers.add_header('Set-Cookie', COOKIE_NAME + '=' + visitorId + '; path=' + COOKIE_PATH + '; expires=' + expireDate + ';')

        utmGifLocation = "http://www.google-analytics.com/__utm.gif"
        myIP = self.getIP(self.request.remote_addr)

        # Construct the gif hit url.
        utmUrl = (utmGifLocation + "?" + "utmwv=" + VERSION +
            "&utmn=" + str(self.getRandomNumber()) +
            "&utmhn=" + urllib.pathname2url(domainName) +
            "&utmr=" + urllib.pathname2url(documentReferer) +
            "&utmp=" + urllib.pathname2url(documentPath) +
            "&utmac=" + account +
            "&utmcc=__utma%3D999.999.999.999.999.1%3B" +
            "&utmvid=" + str(visitorId) +
            "&utmip=" + str(myIP))

        # We don't send requests when we're developing.
        if domainName != 'localhost':
            self.sendRequestToGoogleAnalytics(utmUrl)

        # If the debug parameter is on, add a header to the response that contains
        # the url that was used to contact Google Analytics.
        if len(self.request.get("utmdebug")) != 0:
            self.response.headers.add_header("X-GA-MOBILE-URL", utmUrl)

        # Finally write the gif data to the response.
        self.response.headers.add_header('Content-Type', 'image/gif')
        self.response.headers.add_header('Cache-Control', 'private, no-cache, no-cache=Set-Cookie, proxy-revalidate')
        self.response.headers.add_header('Pragma', 'no-cache')
        self.response.headers.add_header('Expires', 'Wed, 17 Sep 1975 21:32:10 GMT')
        self.response.out.write(''.join(GIF_DATA))
[/code]
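If you want to sanity-check the two pure helpers in the handler (getIP and getVisitorId) without spinning up AppEngine, something like the following standalone sketch might help. The function names anonymize_ip and make_visitor_id are my own, and I encode the message to bytes before hashing so the same code also runs on Python 3; otherwise it is meant to mirror the logic above, not replace it.

```python
import hashlib
import random
import re
import uuid


def anonymize_ip(remote_addr):
    """Keep the first three octets and zero the last, like getIP in the handler."""
    if not remote_addr:
        return ""
    match = re.match(r"(\d+\.\d+\.\d+\.)\d+", remote_addr)
    return match.group(1) + "0" if match else ""


def make_visitor_id(account, user_agent, guid=None):
    """Hash guid + account when a guid exists, else user agent + a fresh uuid."""
    if guid is not None:
        message = guid + account
    else:
        node = random.randrange(0, 0x7fffffff)
        message = user_agent + str(uuid.uuid1(node))
    digest = hashlib.md5(message.encode("utf-8")).hexdigest()
    # Only the first 16 hex digits are kept, prefixed with "0x".
    return "0x" + digest[:16]


if __name__ == "__main__":
    print(anonymize_ip("124.45.3.123"))
    print(make_visitor_id("MO-XXXXXX-X", "TestAgent/1.0"))
```

A visitor with a guid always hashes to the same ID, while a visitor without one gets a new random ID that only sticks because we write it into the cookie.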

Now that we know what to do with requests to /ga/ when we get them, we need to generate the URL that the visitor’s browser will request. With normal Django, we would be able to use a template context processor to automatically insert it into the page’s template values. Since AppEngine doesn’t use that, we have our own helper functions instead, some of which I showed in my last post. Here are the updated helper functions, with the googleAnalyticsGetImageUrl function included:

[code lang="python"]
import random
import urllib

import settings

def googleAnalyticsGetImageUrl(request):
    url = ""
    url += '/ga/' + "?"
    url += "utmac=" + settings.GA_ACCOUNT
    url += "&utmn=" + str(random.randrange(0, 0x7fffffff))

    # The referrer may be missing or empty; GA expects "-" in that case.
    referer = request.referrer or "-"
    query = urllib.urlencode(request.GET)  # $_SERVER["QUERY_STRING"];
    path = request.path  # $_SERVER["REQUEST_URI"];

    url += "&utmr=" + urllib.pathname2url(referer)
    if len(path) != 0:
        url += "&utmp=" + urllib.pathname2url(path)
    url += "&guid=ON"
    return {'gaImgUrl': url}

def getTemplateValues(request):
    myDict = {}
    # ua_test is one of the helpers from the last post.
    myDict.update(ua_test(request))
    myDict.update(googleAnalyticsGetImageUrl(request))
    return myDict
[/code]

Assuming we use getTemplateValues to set up our initial template_values dict, we should have a variable named ‘gaImgUrl’ in our page. To use it, all we need to do is put this at the bottom of every page on the site:

[code lang="html"]

<img src="{{ gaImgUrl }}" alt="analytics" />

[/code]

My settings file contains the GA_ACCOUNT variable, but replaces the standard UA-XXXXXX-X account ID with MO-XXXXXX-X. I’m assuming the MO- prefix tells Google that this is a mobile site, so it should accept the proxied requests.
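If you would rather derive the mobile ID from your regular one than type it twice, a tiny helper along these lines could live next to the setting. The name to_mobile_account is hypothetical, and all it does is swap whatever prefix the ID carries for MO-, which is the only change I made by hand.

```python
def to_mobile_account(account_id):
    """Swap the ID's prefix for "MO-", keeping the numeric part (hypothetical helper)."""
    prefix, sep, rest = account_id.partition("-")
    if not sep or not rest:
        raise ValueError("unexpected account ID format: %r" % account_id)
    return "MO-" + rest


if __name__ == "__main__":
    # Example with a made-up account number.
    print(to_mobile_account("UA-123456-1"))
```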

One thing to keep in mind with this technique is that you cannot cache your rendered templates. The image you serve will necessarily have a different query string every time, and if you cached it, you would ruin your analytics. Instead, you should cache nearly everything from your view functions, except the gaImgUrl variable.