In this post we share two scripts that let you cluster the users of your website based on how they move their mouse through the site and what they click on.

We have deployed this A.I method in conjunction with other predictive A.I to great effect across numerous sites. We thought you guys would also appreciate the opportunity to do so.

Clustering based on Behaviour

The world of Artificial Intelligence is full of complex names based on mathematical equations or TWAs (Three Word Acronyms). As such, it is a pleasant surprise to find that the section of A.I known as Clustering is just what it sounds like. A Clustering Algorithm groups events, people or things based on their proximity to each other, and these clusters can be used to form insights.

In the world of business, Clustering has been used with great success across a wide variety of applications, but its strength is perhaps best highlighted by how it was implemented at Netflix, and by the success the company has had as a direct result. When a human groups people (or animals, or anything), they divide by factors such as gender, geographical location, income and hobbies. This method has found success, and certainly still has its place, but a clustering algorithm sets those preconceived categories aside and simply clusters based on what it finds in the data.

For Netflix, this meant stepping away from the groupings listed above and simply letting the algorithm cluster users as it saw fit, regardless of who they were. An example of how beautifully simple this is: if Netflix grouped you, the customer, by geography, it would make recommendations based on what your neighbour watches. That is a flawed system, as the two of you have different tastes (you may be more Rick and Morty to their Big Bang Theory). So Netflix developed a global clustering system, which makes recommendations based on what other users with similar tastes watched, irrespective of gender, location or age.

Essentially, Clustering does away with human bias and lets the AI find its own patterns. In the labs at Remi, we have been exploring behavioural clustering in relation to the way visitors use websites, and we thought we’d share the A.I as it has proved an invaluable tool. The code defines behaviour groups by capturing mouse movement and then clustering individuals who have similar patterns or journeys through a site. We’ve used it in conjunction with Website Goals to predict how likely a user is to convert. We’re now also integrating it with our Optimizely account to let a separate Reinforcement Learning AI alter the website in order to increase conversions.

Below, we are pleased to present a (relatively) simple general framework to achieve such a grouping of user behaviours.

Imagine a client who owns the website www.example.com. They are interested in analysing the behaviour of the different users who visit that website, knowing that potential customers visit with a variety of intentions. To list but a few: there are people who are just browsing the service, those contemplating purchasing it, and those who know they will be making a purchase as soon as humanly possible. In each case, mouse tracking can play a part in determining a user’s intention, helping the website owner to streamline their website and analyse the journeys of the different types of visitor.

The following scripts will enable the client to keep track of a user’s mouse movements and analyse them for common behaviours.

A user navigating through the Remi.ai site.

Tracker.js

tracker.js is a JavaScript snippet that needs to be included in the source code of the client’s website. Once the website is live with this snippet, the mouse movements of every visitor are gathered into a packet of data points and sent to a server in JSON format, where the data is saved in a database. The data points collected include: clicks, click targets, the x,y coordinates of the mouse, the width and height of the browser, and a timestamp.

Saving to a database

With access to a server, the client would need to set up a receiving API URL that accepts the JSON data packet and saves it into a database. This URL would need to be included in the tracker.js code.
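As a rough illustration of such a receiving URL, here is a minimal sketch using only the Python standard library. It is not the endpoint we run ourselves: the database file name, the table layout and the function names (open_db, save_packet, serve) are all assumptions made for this example. It simply checks that the body is valid JSON and stores it unchanged alongside the visitor’s IP.

```python
import json
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer

DB_PATH = "tracker.db"  # hypothetical file name, chosen for this sketch


def open_db(path=DB_PATH):
    """Open the database and make sure the raw-packet table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS packets ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, ip TEXT, body TEXT)"
    )
    return conn


def save_packet(conn, ip, raw_body):
    """Check that the body parses as JSON, then store it unchanged."""
    json.loads(raw_body)  # raises ValueError on malformed JSON
    conn.execute("INSERT INTO packets (ip, body) VALUES (?, ?)", (ip, raw_body))
    conn.commit()


class TrackerHandler(BaseHTTPRequestHandler):
    """Accepts POST requests whose body is one JSON data packet."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        raw_body = self.rfile.read(length).decode("utf-8")
        conn = open_db()
        try:
            save_packet(conn, self.client_address[0], raw_body)
            self.send_response(204)  # stored, nothing to send back
        except ValueError:
            self.send_response(400)  # body was not valid JSON
        finally:
            conn.close()
        self.end_headers()


def serve(port=8000):
    """Start the receiving server (blocks until interrupted)."""
    HTTPServer(("", port), TrackerHandler).serve_forever()
```

Calling serve() starts the listener, and its address is what would go into tracker.js. A production setup would also need HTTPS and cross-origin handling, which this sketch leaves out.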

Because the data packet is in JSON format, how it is saved depends on the type of database. With a document database such as MongoDB, the data can be saved directly. A relational (SQL) database, however, needs the data packet to be converted into a tabular format before it can be inserted. This conversion can be done by applying the packetToDF function in the following Python script and entering each row into the database (see saveToRelational).
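To make the idea of flattening concrete, here is a small sketch of the same two steps. It is a stand-in, not the packetToDF and saveToRelational functions themselves: the packet shape, the field names (events, t, isClick, target) and the functions packet_to_rows and save_rows are assumptions for illustration, and the real tracker.js payload may look different.

```python
import sqlite3

# Hypothetical packet shape; the real tracker.js payload may differ.
sample_packet = {
    "width": 1280,
    "height": 720,
    "events": [
        {"x": 100, "y": 200, "t": 1500000000000, "isClick": False, "target": None},
        {"x": 640, "y": 360, "t": 1500000000500, "isClick": True, "target": "button#buy"},
    ],
}


def packet_to_rows(packet, ip):
    """Flatten one packet into a list of tuples, one per mouse event."""
    rows = []
    for event in packet["events"]:
        rows.append((
            ip,
            event["t"],
            event["x"],
            event["y"],
            packet["width"],   # browser size is repeated on every row
            packet["height"],
            int(event["isClick"]),
            event.get("target"),
        ))
    return rows


def save_rows(conn, rows):
    """Insert flattened rows into a relational table, creating it if needed."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "ip TEXT, t INTEGER, x INTEGER, y INTEGER, "
        "width INTEGER, height INTEGER, isClick INTEGER, target TEXT)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
```

Each mouse event becomes one row, with the packet-level values (browser width and height) copied onto every row so that the table can be queried without going back to the original JSON.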

An example of the HTTP payload sent to the server:

Backend Analysis

With the data saved on the server, we can begin to analyse it. The Python script below gathers all JSON packets (in the example file) and converts them into one big data frame with columns such as x, y, width, height and isClick. The data is first separated by user, identified by IP address, and each user’s activity is then split into sessions: if a user is idle for more than 10 minutes, we say that the user has started a new session. This preprocessing gets the data ready for clustering.
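The session rule can be sketched in a few lines. This version uses plain Python tuples rather than a data frame, and it assumes that timestamps are in milliseconds and that each event is an (ip, t, x, y) tuple; both are assumptions for the example, as is the function name split_sessions.

```python
from collections import defaultdict

SESSION_GAP_MS = 10 * 60 * 1000  # 10 minutes of idle time starts a new session


def split_sessions(events, gap_ms=SESSION_GAP_MS):
    """Group (ip, t, x, y) events into per-user sessions.

    Returns a dict mapping (ip, session_index) to a time-ordered list of events.
    """
    # First separate the events by user (identified by IP address).
    by_ip = defaultdict(list)
    for event in events:
        by_ip[event[0]].append(event)

    sessions = {}
    for ip, ip_events in by_ip.items():
        ip_events.sort(key=lambda e: e[1])  # order by timestamp
        index = 0
        previous_t = None
        for event in ip_events:
            # A gap longer than the threshold means a new session begins.
            if previous_t is not None and event[1] - previous_t > gap_ms:
                index += 1
            sessions.setdefault((ip, index), []).append(event)
            previous_t = event[1]
    return sessions
```

Keying sessions by (ip, session_index) keeps a returning visitor’s separate visits apart, so each visit can be clustered as its own journey.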

In the code, the default clustering variables are the normalised x,y coordinates, which creates clusters of user sessions based only on the mouse movements within each session. However, if the client wants to cluster on more than mouse movements, for example on clicks or on the timing of mouse actions as well, those variables can be included in the clusteringOfSessions function.
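One way to picture the normalised x,y variables is the sketch below. It is an assumption about how a session could be turned into a comparable vector, not a copy of what clusteringOfSessions does: coordinates are divided by the browser width and height, and the path is resampled to a fixed number of points (N_POINTS, a value picked for this example) so that sessions of different lengths line up.

```python
import numpy as np

N_POINTS = 20  # hypothetical number of samples per session path


def session_features(session, n_points=N_POINTS):
    """Turn a session into a fixed-length vector of normalised coordinates.

    `session` is a list of (x, y, width, height) tuples in time order.
    """
    data = np.asarray(session, dtype=float)
    nx = data[:, 0] / data[:, 2]  # x as a fraction of browser width
    ny = data[:, 1] / data[:, 3]  # y as a fraction of browser height

    # Resample to n_points positions evenly spaced in event order, so that
    # sessions with different numbers of events give same-sized vectors.
    old = np.linspace(0.0, 1.0, num=len(data))
    new = np.linspace(0.0, 1.0, num=n_points)
    return np.concatenate([np.interp(new, old, nx), np.interp(new, old, ny)])
```

Extra variables, such as the share of events that were clicks, could be appended to the end of the same vector before clustering.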

An example of cluster outputs.

For the sake of initial simplicity, we’ve started with a built-in clustering function (fclusterdata) from the scipy.cluster.hierarchy module. In a few weeks we will share a more in-depth clustering algorithm of our own to give a deeper understanding of how clustering works.
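For readers who have not used fclusterdata before, here is a minimal, self-contained call. The feature values are made up for the example, and the threshold and criterion are choices for this sketch rather than the settings used in our script.

```python
import numpy as np
from scipy.cluster.hierarchy import fclusterdata

# Made-up feature vectors, one row per session.
features = np.array([
    [0.10, 0.10],
    [0.12, 0.11],
    [0.11, 0.09],
    [0.90, 0.85],
    [0.88, 0.90],
])

# Rows end up in the same cluster when they are linked by distances below t.
labels = fclusterdata(features, t=0.5, criterion="distance", method="single")
```

fclusterdata returns one integer label per row, so labels can be attached straight back to the sessions. Here the first three rows share one label and the last two share another, because the two groups are far more than 0.5 apart.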

This post covers just one of the areas of A.I we have explored, so if you have found it interesting and would like to see some of the work we have done elsewhere, or some of the products we have built, please visit remi.ai.

By Shahrad Jamshidi & Calum Hamilton