We’ve been trying to figure out event tracking for a while, and it’s something that can be deceptively complex. Getting your first events sent to a tracking system like Google Analytics is done in a few minutes and a few lines of code.

But, as your teams and projects get bigger, those few minutes and few lines of code start to add up. Before you know it, your codebase and reporting are littered with unused or outdated events. Your developers are bitter and hesitant to add even more random tracking logic to the code, and your analysts are scared of the angry developers and just make do with the sub-optimal decisions that were made earlier on.

This is a bit of an exaggeration (I hope). But we definitely did notice that our way of working, with all our reporting business logic hardcoded in the product, could use an update.

I’d like to walk you through the system that we have currently set up and are quite happy with. There are a few key elements that we’ve focused on to make this workflow fit our needs:

Simple tech: Anyone should be able to understand what’s going on quickly. Our codebase should have a minimal amount of edge-cases and experimental business requirements.

Separation of concerns for developers and analysts: Developers should be able to focus on product development, while analysts should be empowered to gather whatever insights are needed. These responsibilities shouldn’t negatively impact each other.

Tracking/analytics system agnostic: If we wanted to plug in any other package for tracking, data visualisation, or whatever, it should be super easy and doable by our analysts (or PMs, designers, etc.).

Rapid iteration: All of the above serves the purpose of moving and learning quickly.

I’ll start by explaining our problem, then move on to the front-end implementation, and finally close the loop in Google Tag Manager.

Our starting point: straight Google Analytics

We come from using Google Analytics and will still be using it in some form over the coming years. In Google Analytics, events consist of four properties: a category, an action, a label and a value. Next to that, every event has a whole bunch of default dimensions (e.g. the user’s country or screen size) and potentially some custom ones you set up yourself.

Imagine a user on our platform, who has chosen a specific game to play, and clicks the tile to navigate there. You can imagine a few potential data points:

It’s a tile

The user clicked the tile

The tile contains a game (as opposed to say, a category)

The game’s unique identifier is 1337

When you’re just starting out with tracking some basic events, this is more than enough. These data points fit quite nicely in the Google Analytics event model and its four fields:

{ category: "tile", action: "clicked", label: "game", value: 1337 }

However, as soon as we want to add more data, it quickly starts to fall apart. What if we wanted to know which position the tile had on our page grid? We immediately have to resort to combining data in the same fields, maybe like so:

{ category: "tile_game", action: "clicked", label: "x1y5", value: 1337 }

// Or maybe

{ category: "tile_clicked", action: "game", label: "x1y5", value: 1337 }

// Or...

{ category: "tile_x1y5", action: "clicked", label: "game", value: 1337 }

// :(

Apart from complicating our reporting (we now need to resort to partial matches to group events), it’s also becoming painfully obvious that whatever is contained in any of these four fields is completely arbitrary. We cannot have a single rule that says what is in an event’s label or value, as it depends on what is in its category. We’re also not quite sure what is in a category anymore: at some point it was a component, but by now it has turned into some sort of ‘more specific component’.
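To make the reporting problem concrete, here is a minimal sketch of what grouping looks like once data is packed into the category field. The event list is made up for this example (the "button" event in particular is hypothetical), and it mixes the three competing conventions from above:

```javascript
// Hypothetical events, using the three competing conventions shown above.
const events = [
  { category: "tile_game", action: "clicked", label: "x1y5", value: 1337 },
  { category: "tile_clicked", action: "game", label: "x1y5", value: 1337 },
  { category: "tile_x1y5", action: "clicked", label: "game", value: 1337 },
  { category: "button", action: "clicked", label: "login", value: 0 },
];

// An exact match on the category no longer finds any tile events...
const exactMatches = events.filter((e) => e.category === "tile");

// ...so every report has to fall back to a partial (prefix) match.
const prefixMatches = events.filter((e) => e.category.startsWith("tile"));

console.log(exactMatches.length); // 0
console.log(prefixMatches.length); // 3
```

Even after the prefix match, each of the three tile events still keeps its position and type in a different field, so any further breakdown needs per-convention parsing.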

Note that Google Analytics’ limited format does make sense: Google Analytics is not just an event tracking tool but a complete analytics product. It is very easy to get started with, which makes the whole package super effective for beginners. It’s just when you start to get a bit more advanced that things can get tricky.

A new format: simplicity!

Let’s forget about Google Analytics and its limitations for a second. The people who are implementing these events are developers. The developers have access to huge amounts of information, along with every possible user interaction. Ideally, they do not have to worry about what tracking system is used or what that particular system’s preferred format is. Ideally, the developers just send everything, allowing our analysts to cherry-pick what they need later.

We’re using a very basic format, which still allows for any amount of data. We decided on defining an event with the following set of properties: noun, verb and data. This is what the previous example now looks like:

{
  noun: "tile",
  verb: "clicked",
  data: {
    id: 1337,
    type: "game",
    position: { x: 0, y: 5 }
  }
}

Much nicer. Now we have a lot of room to play with: any additional information can simply go into the data object.

Note that one rule we try to adhere to is that for any noun, the data structure should be the same (e.g. tile will always have id, type and position as its data). This will help us later on when we process the event.
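If you want to guard this rule in code, a small check could look something like the sketch below. The `expectedDataKeys` registry and the `hasExpectedShape` helper are hypothetical names invented for this example, not part of our actual setup:

```javascript
// Hypothetical registry: the data keys we expect for each noun.
const expectedDataKeys = {
  tile: ["id", "type", "position"],
};

// Returns true when the event's data has exactly the keys registered for its noun.
function hasExpectedShape(event) {
  const expected = expectedDataKeys[event.noun];
  if (!expected) return false; // unknown noun
  const actual = Object.keys(event.data || {});
  return (
    actual.length === expected.length &&
    expected.every((key) => actual.includes(key))
  );
}

const good = {
  noun: "tile",
  verb: "clicked",
  data: { id: 1337, type: "game", position: { x: 0, y: 5 } },
};
const bad = { noun: "tile", verb: "clicked", data: { id: 1337 } };

console.log(hasExpectedShape(good)); // true
console.log(hasExpectedShape(bad)); // false
```

A check like this could run in development builds or tests only, so that a drifting data structure is caught before it reaches the reports.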

The last part we need is context. This will be a reflection of our application’s state, giving us additional information about what our application looked like at the time of an event, similar to the way custom dimensions are used in Google Analytics. This can contain things like the current page path, the user’s viewport or even everything currently visible on the page (yolo!). Whatever we think (or our analyst said) would be useful information to have alongside the events, let’s just put it in there.

Here’s an example (our production context is many times bigger, but I don’t want to scare anyone):

{
  domain: "poki.com",
  applicationVersion: "v13.3.7",
  page: {
    path: "/awesome-games"
  },
  previousPage: {
    path: "/"
  }
}

We keep track of the context as our state changes, separately from our events.

Pro-tip: If you’re using Redux, try using a single Reselect selector to directly select the context from your state! This makes managing context extremely elegant and straightforward.
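As a rough sketch of that idea, here is a plain selector function that derives the context from the state in one place. The state shape (`app`, `router`, `ui`) is entirely hypothetical, and the example deliberately avoids any library so it runs on its own; with Reselect you would typically wrap the same logic in `createSelector` so the context is only recomputed when its inputs change.

```javascript
// Hypothetical application state; the real shape depends on your reducers.
const state = {
  app: { domain: "poki.com", version: "v13.3.7" },
  router: { currentPath: "/awesome-games", previousPath: "/" },
  ui: { modalOpen: false }, // not relevant for tracking, so left out of the context
};

// Plain selector: builds the tracking context from the state in a single function.
function selectContext(state) {
  return {
    domain: state.app.domain,
    applicationVersion: state.app.version,
    page: { path: state.router.currentPath },
    previousPage: { path: state.router.previousPath },
  };
}

console.log(selectContext(state).page.path); // prints /awesome-games
```

Keeping this in one selector means there is a single place to look when an analyst asks what the context contains.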

So, how do we actually get this data into our tracking tool?

Our combination of events and context is quite nice: we now have a format that’s flexible and designed developer-first. Developers can simply think about what information is available, spar with our analysts to discuss potential data points and then just ‘send’ it over. No worrying about how it should be organised or how it will be used.

However, just making some JavaScript objects will not make data magically appear in reports. It still needs to be converted, sent over to a tool like Google Analytics and then be reported on in a sane way.

For this, we’ll be using Google Tag Manager (GTM). GTM can do a lot of things, but for our purposes you could define it as follows: GTM is a system that takes JavaScript objects as input, transforms them and sends them anywhere you’d like.

Plus it has a flexible but powerful GUI that anybody (read: our analysts) can understand and use. Exactly what we need.

Talking to GTM

Let’s start by getting our stuff into GTM. After you’ve created a container and implemented it on your website, you’ll need a tiny bit of JavaScript to send events and change the context.

Here’s our code:

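function pushContext(context) {
  // Store the latest application context in the Data Layer.
  window.dataLayer.push({ context });
}

function pushEvent(eventNoun, eventVerb, eventData) {
  window.dataLayer.push({
    // GTM uses the `event` key to recognise this push as an event.
    event: `${eventNoun}-${eventVerb}`,
    eventNoun,
    eventVerb,
    eventData,
  });
}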

All that’s required is pushing some objects into the window.dataLayer array, which is created by GTM. Also note that we send an additional field, event. This is required by GTM to understand that we’re sending an event.
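To see what actually ends up in the Data Layer, here is a small sketch you can run outside the browser (in Node, for instance). The `window` object is a stand-in created for this example, since on a real page the GTM snippet sets up `window.dataLayer`, and the two call sites at the bottom are hypothetical:

```javascript
// Stand-in for the browser global; on a real page GTM's snippet sets up window.dataLayer.
const window = { dataLayer: [] };

function pushContext(context) {
  window.dataLayer.push({ context });
}

function pushEvent(eventNoun, eventVerb, eventData) {
  window.dataLayer.push({
    event: `${eventNoun}-${eventVerb}`,
    eventNoun,
    eventVerb,
    eventData,
  });
}

// Hypothetical call sites: a page change followed by a tile click.
pushContext({ page: { path: "/awesome-games" }, previousPage: { path: "/" } });
pushEvent("tile", "clicked", { id: 1337, type: "game", position: { x: 0, y: 5 } });

console.log(window.dataLayer.length); // 2
console.log(window.dataLayer[1].event); // prints tile-clicked
```

Note that the context push has no event key, so it only updates the data that is available to later events.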

Something that deserves a special mention: because GTM lives on your page, pushing to the Data Layer does not trigger a network request by itself. The Data Layer array is simply kept in the client’s memory.

Bring forth the analysts: Setting up GTM

Phew. All the tech work is done! This is where the developer’s job ends, and where the analyst’s job starts (I hope you’re still here).

The first time you use GTM it might seem a little complicated, but it’s really not that bad. It’s super powerful when used properly.

GTM is built on three main components: tags, triggers and variables. Tags are where business logic is executed, like tracking an event in Google Analytics. Triggers contain logic for when a tag must be fired, for example upon receiving a tile-clicked event in our Data Layer. Variables are used throughout tags and triggers to refer to data from the Data Layer.

Let’s implement our event. First we’re going to need variables for the data we send. We’ll keep it simple and only do the most basic event we started with: tracking tile clicks of type game with their id. This is what we should be getting from the Data Layer:

{
  event: "tile-clicked",
  eventNoun: "tile",
  eventVerb: "clicked",
  eventData: {
    id: 1337,
    type: "game"
  }
}

Variable configuration

First up are variables. We want to be able to refer to all of the above, so we create a new variable of type “Data Layer Variable” for eventNoun, eventVerb, eventData.id and eventData.type.

The “Data Layer Variable Name” should contain the path to the data in the Data Layer.

I recommend using naming conventions to keep the system nice and organised.

Once we’re done, we’ll see the following in our variable overview:

Tag configuration

Next up, we create a Google Analytics tag for tracking tile events:

We’ve now ‘hard-coded’ the tracking ID, but you can use a variable for that too!

At some point we might want to create a separate tag for different verbs, but for now we can contain all tile events in this single tag.

Trigger configuration

Create a new trigger of type “Custom Event”, that responds to our tile-clicked events of type game.

Now, go back to your tag, edit it and set this up as its trigger. That’s it!

A quick recap

So what we did is the following: as soon as we detect a tile-clicked event of type game in the Data Layer, our trigger is triggered. The trigger fires our tag right away, which uses our variables to send the correct event to Google Analytics. It works!
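If it helps to think about this flow in code, the sketch below mimics it in plain JavaScript. It is only a mental model, not how GTM works internally: the function names are made up, and the mapping onto the four Google Analytics fields is an assumption based on the very first example in this post.

```javascript
// Trigger: fires only for "tile-clicked" events whose data type is "game".
function tileGameTrigger(entry) {
  return (
    entry.event === "tile-clicked" &&
    Boolean(entry.eventData) &&
    entry.eventData.type === "game"
  );
}

// Tag: maps the Data Layer variables onto the four Google Analytics fields.
// This particular mapping is an assumption for illustration.
function tileTag(entry) {
  return {
    category: entry.eventNoun,
    action: entry.eventVerb,
    label: entry.eventData.type,
    value: entry.eventData.id,
  };
}

// Runs every Data Layer entry through the trigger and collects what the tag would send.
function runTags(dataLayer) {
  return dataLayer.filter(tileGameTrigger).map(tileTag);
}

const dataLayer = [
  { context: { page: { path: "/awesome-games" } } },
  { event: "tile-clicked", eventNoun: "tile", eventVerb: "clicked", eventData: { id: 1337, type: "game" } },
  { event: "tile-clicked", eventNoun: "tile", eventVerb: "clicked", eventData: { id: 42, type: "category" } },
];

const sent = runTags(dataLayer);
console.log(sent.length); // 1
console.log(sent[0].value); // 1337
```

The context push and the category tile click are both ignored, because the trigger only lets game tile clicks through.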

Yay!

We didn’t end up implementing anything from our context, but it works the same way. We can simply add more Data Layer variables and refer to the context instead of eventData. The only difference here is that our eventData is always tied to the last event, but our context could be the same across events, or change multiple times without any events being sent.

As a side-note: I highly recommend getting up to speed with GTM’s preview system. It’s a well-made debugging tool that gives you all the insights necessary to double-check and overcome problems in your configuration.

Calculated variables

GTM also allows you to do a bunch of more advanced things, one of the most powerful being calculated variables.

For example, imagine we’d like to know how long somebody was on a page before clicking a tile. First, we ask a developer to keep track of the timestamp at which a page was opened. Then we create a calculated variable that simply subtracts that timestamp from the current time, and we refer to this variable in our tag (instead of the developer having to do this calculation and send it to the Data Layer every time).

Example of a calculated variable, saving us complexity in the codebase and empowering analysts
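The calculation itself is tiny. Here is a sketch of the logic in plain JavaScript, assuming the developer stores the timestamp in a context field called `openedAt` (a hypothetical name). In GTM, logic like this would typically live in a variable of type “Custom JavaScript”, reading the timestamp through a Data Layer variable.

```javascript
// Hypothetical context field set by the developer when a page is opened (ms since epoch).
const context = { page: { path: "/awesome-games", openedAt: 1700000000000 } };

// What the calculated variable does: subtract the page-open timestamp from "now".
function millisecondsOnPage(context, now = Date.now()) {
  return now - context.page.openedAt;
}

// Passing `now` explicitly makes the example deterministic.
console.log(millisecondsOnPage(context, 1700000004500)); // 4500
```

The developer only has to provide the raw timestamp once; any derived metric stays in the analyst’s hands.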

You can imagine there are many more cases like this, and you’ll start to feel like a true event tracking ninja soon enough.

Why is this simple again?

That might have seemed like a lot of steps to replace a single call to Google Analytics, but once the set-up is done everything becomes easier.

What if we also wanted to start tracking category tile clicks?

We simply remove the condition from our trigger that says “Event Data Type” must be game.

What if we’d like to change the format of our event in GA?

We just adjust the tag in GTM.

What if we want to test out a different analytics tool?

We create a new tag and use our already defined triggers and variables.

None of these ‘issues’ will require more than a few minutes to resolve. Nor do any of them require a developer or a redeployment of our website.

Of course, this is different when starting out or when new features are designed and developed. Developers and analysts will still have to actually talk to each other (I’m sure they’ll manage) to discuss what could or needs to end up in the Data Layer. But you’ll be building a growing dictionary of potential data and a shared vocabulary, greatly empowering your team to collaborate more effectively.

That’s about it. Happy tracking!