Introduction

The most widely used way to implement analytics is to include a tracking code in every web page you want to track. Your analytics provider generates the code together with a tracking id that identifies your application. When a client program (such as a web browser) visits your page, the tracking code runs and sends data to the provider under that tracking id. This is the client-side way of tracking users.

Why server-side?

Proxy servers and "ad blocker" browser plugins such as ABP or uBlock usually block access to your analytics provider. In effect, your users block analytics depending on how sensitive they are about their privacy, so the collected data is not reliable. You probably also want to give your analytics provider only the data required for your own analysis, not everything it is able to gather by running code on your users' machines. Finally, your application may not be used through a browser running JavaScript at all: you may want to gather data about API usage, mobile requests, custom events and so on, or follow your own tracking logic.

Django middleware

Django middleware is a good place to do the tracking because it sits between every HttpRequest and HttpResponse. It is basically a regular Python class with some methods called in the request phase and others called in the response phase. For details about Django middleware, check the documentation: https://docs.djangoproject.com/en/
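To make the two phases concrete, here is a minimal framework-free sketch of the hook shape. The Request and Response classes, the LoggingMiddleware name and the handle() driver are hypothetical stand-ins for illustration only; they are not Django's API, and a real middleware receives Django's own request and response objects.

```python
class Request:
    """Stand-in for a Django request (hypothetical, illustration only)."""

    def __init__(self, path):
        self.path = path
        self.log = []


class Response:
    """Stand-in for a Django response (hypothetical, illustration only)."""

    def __init__(self, status_code=200):
        self.status_code = status_code


class LoggingMiddleware:
    """One hook per phase, mirroring the shape described above."""

    def process_request(self, request):
        # Runs before the view; returning None lets processing continue.
        request.log.append('request:' + request.path)
        return None

    def process_response(self, request, response):
        # Runs after the view; it must hand back a response object.
        request.log.append('response:%d' % response.status_code)
        return response


def handle(request, view, middleware):
    """Tiny driver imitating what a framework does around a view."""
    early = middleware.process_request(request)
    response = early if early is not None else view(request)
    return middleware.process_response(request, response)


request = Request('/blog/')
response = handle(request, lambda req: Response(200), LoggingMiddleware())
print(request.log)
```

The point of the sketch is only the ordering: the request hook sees the request before the view runs, and the response hook sees both the request and the finished response.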

Tracking logic, Requirements, Dependencies

So our middleware needs to do the following things:

1. Generate a unique tracking id for each user making requests
2. Ensure that every request from one user is tracked with the same tracking id
3. Exclude some preset web pages from tracking (for example, admin pages or RSS pages)
4. Exclude error HTTP responses from tracking (track only HTTP status code 200)
5. Do the tracking asynchronously (obviously we do not want our users to wait for an external resource)
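The two exclusion rules (preset pages and non-200 responses) can be sketched as a single predicate. The function name should_track and the example prefixes are assumptions made for this illustration, not part of the middleware shown later.

```python
def should_track(path, status_code, ignore_paths=()):
    """Return True when a response qualifies for tracking."""
    # Track successful responses only.
    if status_code != 200:
        return False
    # Skip every path that starts with one of the excluded prefixes.
    return not any(path.startswith(prefix) for prefix in ignore_paths)


# Hypothetical exclusion list for the example.
IGNORE = ('/admin/', '/rss/')

print(should_track('/blog/', 200, IGNORE))
print(should_track('/admin/users/', 200, IGNORE))
print(should_track('/blog/', 404, IGNORE))
```

Keeping the decision in one small function makes it easy to unit test separately from the middleware itself.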

We are going to implement requirement 2 using cookies, so another requirement arises. We are required by law to:

2.1. Inform the user that we are using cookies to track them.

The "Django messages framework" (https://docs.djangoproject.com/en/2.0/ref/contrib/messages/) is handy for implementing requirement 2.1.
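The cookie-based part of the logic (a new id for a new visitor, the same id for a returning one) can be sketched without Django as follows. The helper name resolve_visitor_id, the plain dict standing in for the request cookies and the default cookie name are assumptions for this illustration.

```python
import uuid


def resolve_visitor_id(cookies, cookie_name='project_stats'):
    """Return (visitor_id, is_new) for the given cookie mapping."""
    existing = cookies.get(cookie_name)
    if existing:
        # Returning visitor: keep the id they already carry.
        return existing, False
    # New visitor: generate a random id; the caller stores it in a cookie
    # and shows the cookie notice.
    return str(uuid.uuid4()), True


first_id, first_visit = resolve_visitor_id({})
same_id, second_visit = resolve_visitor_id({'project_stats': first_id})
print(first_visit, second_visit, first_id == same_id)
```

The is_new flag is what would trigger the one-time notice to the user, since it is only true when no tracking cookie was sent.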

We will also use the Celery infrastructure described in another article to implement requirement 5 (asynchronous tracking).
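To illustrate why the tracking call is handed off instead of made inline, here is a small sketch that uses a standard-library thread pool as a stand-in for Celery. It is not Celery and not the task used in this article; the function names and the simulated delay are assumptions for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)
sent = []


def send_hit(params):
    """Pretend to call a slow external analytics endpoint."""
    time.sleep(0.2)  # simulated network latency
    sent.append(params)


def handle_request(path):
    """Queue the hit and return immediately, without waiting for it."""
    executor.submit(send_hit, {'dp': path})
    return 'response for ' + path


body = handle_request('/blog/')
# The response is ready here even though the hit may still be in flight.
executor.shutdown(wait=True)  # only so the example can inspect the result
print(body, sent)
```

A task queue such as Celery gives the same decoupling across processes, so a slow or failing provider never delays the response to the user.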

For this demonstration we will use Google Analytics as the analytics provider, through its Measurement Protocol.
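With the Measurement Protocol, a hit is essentially a set of form-encoded key/value pairs sent over HTTP. The sketch below builds and encodes a minimal pageview payload using only the parameters that also appear in this article's tracker; the helper name build_payload and the example client id are made up for illustration.

```python
from urllib.parse import parse_qs, urlencode


def build_payload(tracking_id, client_id, path):
    """Return a minimal pageview payload as a dict."""
    return {
        'v': '1',            # protocol version
        'tid': tracking_id,  # tracking id of the property
        'cid': client_id,    # anonymous visitor id
        't': 'pageview',     # hit type
        'dp': path,          # document path
    }


payload = build_payload(
    'UA-xxxxxxxx-x',
    '35009a79-1a05-49d7-b876-2b884d0f825b',  # arbitrary example id
    '/blog/',
)
body = urlencode(payload)  # form-encoded request body
print(body)
```

The encoded string is what ends up as the body of the HTTP POST to the provider's collection URL.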

Of course, we suggest that you use your own analytics infrastructure and be the owner of your data. You can use either proprietary or open source products. If you use Google Analytics, be aware that your data is retained "forever" and may be used by Google and/or its associates.

Analytics application

Assuming that you already have a project in the current directory, create the new app:

$ python manage.py startapp analytics

Create a file named "tracker.py". We will use it as our tracking library:

./analytics/tracker.py

import random

from django.conf import settings

VERSION = settings.ANALYTICS_API_VERSION
COOKIE_NAME = settings.ANALYTICS_COOKIE_NAME
COOKIE_PATH = settings.ANALYTICS_COOKIE_PATH
COOKIE_AGE = settings.ANALYTICS_COOKIE_AGE
ANALYTICS_ID = settings.ANALYTICS_ID


def cookie_exists(request):
    """Return True if the visitor already carries our tracking cookie."""
    return bool(request.COOKIES.get(COOKIE_NAME))


def set_cookie(visitor_id, response):
    """Store the visitor id in the tracking cookie."""
    response.set_cookie(
        COOKIE_NAME,
        value=visitor_id,
        max_age=COOKIE_AGE,
        path=COOKIE_PATH,
    )
    return response


def build_params(request, path=None, event=None, referer=None,
                 visitor_id=None, site=None):
    """Build the Measurement Protocol parameters for a pageview hit."""
    meta = request.META
    referer = referer or request.GET.get('r', '')
    path = path or request.GET.get('p', '/')
    user_agent = meta.get('HTTP_USER_AGENT', 'Unknown')
    # Fall back to the cookie when no visitor id is passed explicitly.
    visitor_id = visitor_id or request.COOKIES.get(COOKIE_NAME)
    visitor_ip = meta.get('REMOTE_ADDR', '')
    try:
        # A plain Django request has no current_page attribute;
        # in that case the hit is sent without a page title.
        pagetitle = request.current_page.get_page_title()
    except Exception:
        pagetitle = None
    return {
        'v': VERSION,
        'z': str(random.randint(0, 0x7fffffff)),  # cache buster
        't': 'pageview',
        'dt': pagetitle,
        'dh': site,
        'dr': referer,
        'dp': path,
        'tid': ANALYTICS_ID,
        'cid': visitor_id,
        'uip': visitor_ip,
        'ua': user_agent,
    }

Let's create the Celery task that will submit every pageview to our analytics provider:

./analytics/tasks.py

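import requests
from celery import shared_task


@shared_task(ignore_result=True)
def submit_tracking(params, provider_url):
    """Send one hit to the analytics provider."""
    # The timeout keeps a slow provider from blocking a worker indefinitely.
    response = requests.post(provider_url, data=params, timeout=10)
    response.raise_for_status()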

The middleware class contains two methods: process_request(), which runs in the request phase, and process_response(), which runs in the response phase.

./analytics/middleware.py

import uuid

from django.conf import settings
from django.contrib import messages
from django.contrib.sites.shortcuts import get_current_site

from analytics.tasks import submit_tracking
from analytics.tracker import build_params, cookie_exists, set_cookie

provider_url = settings.ANALYTICS_PROVIDER_URL


class AnalyticsMiddleware(object):
    # Old-style middleware, registered through MIDDLEWARE_CLASSES.
    # On Django 1.10+ with the MIDDLEWARE setting, inherit from
    # django.utils.deprecation.MiddlewareMixin instead of object.

    def process_request(self, request):
        if not cookie_exists(request):
            # First visit: show the cookie notice and create a visitor id.
            site = get_current_site(request)
            messages.info(
                request,
                '<strong>Welcome!</strong> ' + site.domain +
                ' is using <strong>cookie</strong> technology'
                ' for tracking everything you do.',
                extra_tags='alert-info')
            request.session['visitor_id'] = str(uuid.uuid4())
            request.session['site'] = site.domain
        return None

    def process_response(self, request, response):
        # Track successful responses only.
        if response.status_code != 200:
            return response
        # Skip the paths excluded in the settings.
        ignored = getattr(settings, 'ANALYTICS_IGNORE_PATH', [])
        if any(request.path.startswith(p) for p in ignored):
            return response
        referer = request.META.get('HTTP_REFERER', '')
        visitor_id = request.session.get('visitor_id')
        site = request.session.get('site')
        params = build_params(request, path=request.path, referer=referer,
                              visitor_id=visitor_id, site=site)
        # build_params falls back to the cookie value when the session has
        # no visitor id, so reuse its result to keep the id stable.
        response = set_cookie(params['cid'], response)
        submit_tracking.delay(params, provider_url)
        return response

Now all we have to do is register the middleware class, the new application and some variables in our project's settings file:

./project/settings.py

MIDDLEWARE_CLASSES = (
    # ...
    # Must come after the session and messages middleware,
    # because process_request() uses both.
    'analytics.middleware.AnalyticsMiddleware',
    # ...
)

# ...

INSTALLED_APPS = (
    # ...
    'analytics',
    # ...
)

# ...

# Analytics configuration
ANALYTICS_COOKIE_NAME = 'project_stats'
ANALYTICS_COOKIE_PATH = '/'
ANALYTICS_COOKIE_AGE = 31556926  # 1 year in seconds
ANALYTICS_ID = 'UA-xxxxxxxx-x'
ANALYTICS_API_VERSION = '1'
ANALYTICS_IGNORE_PATH = ['/page1/', '/page2/']
ANALYTICS_PROVIDER_URL = 'https://www.google-analytics.com/collect'

To render the messages in your base template, you can use something like this:

./project/templates/base.html

{% if messages %}
  {% for message in messages %}
    <div class="alert {{ message.extra_tags }} alert-dismissible" role="alert">
      <button type="button" class="close" data-dismiss="alert" aria-label="Close">
        <span aria-hidden="true">&times;</span>
      </button>
      {{ message|safe }}
    </div>
  {% endfor %}
{% endif %}

Notice that we are using Bootstrap alerts to display our messages to the user. The specific Bootstrap alert class is passed through the messages framework's extra_tags. This way we can mark our messages as "info", "warning" and so on, and render them with the matching color.

Downside

There are some things you have to consider when you are thinking about server-side tracking: