Our starting point is your standard Google Analytics code block.

<!doctype html>
<html>
  <script async src="https://www.googletagmanager.com/gtag/js?id=ASSHOLE"></script>
  <script>
    window.dataLayer = window.dataLayer || [];
    function gtag(){dataLayer.push(arguments);}
    gtag('js', new Date());
    gtag('config', 'ASSHOLE');
  </script>
</html>

This isn’t working correctly. Your user’s adblocker is disallowing requests to the remote script. It’s most likely doing this by reading blocking patterns from a community-built list of bad strings and matching them against every request URL. The pattern isn’t going to be the whole URL, because that would be too specific. It probably also isn’t just the domain, googletagmanager.com, because you can access it just fine. So it’s bound to be something in between those two, such as googletagmanager.com/gtag/js. For simplicity, we’ll just match on the domain in what follows.
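
As a rough illustration of that kind of matching, here is a minimal sketch. The pattern list and the isBlocked helper are hypothetical; real filter lists use a richer pattern syntax than plain substrings.

```javascript
// Hypothetical filter list: plain substrings, not a real adblocker's syntax.
const blockedPatterns = ['googletagmanager.com/gtag/js', 'google-analytics.com'];

// Returns true when any pattern occurs somewhere in the request URL.
function isBlocked(requestUrl, patterns = blockedPatterns) {
  return patterns.some(pattern => requestUrl.includes(pattern));
}

console.log(isBlocked('https://www.googletagmanager.com/gtag/js?id=ASSHOLE')); // true
console.log(isBlocked('https://www.googletagmanager.com/')); // false
```

The second call shows why a pattern narrower than the bare domain still lets you visit the domain itself.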

Let’s go all Web 3.0 and set up a Service Worker.

<!doctype html>
<html>
  <script>
    navigator.serviceWorker.register('sw.js');
  </script>
  ...
</html>

Why? It turns out Service Workers have the very useful ability to intercept all HTTP requests made by the pages they control. They do this by listening for the fetch event.

self.addEventListener('fetch', event => {
  const url = event.request.url;
  if (url.includes('googletagmanager.com') ||
      url.includes('google-analytics.com')) {
    const req = new Request(
      '/proxy/?url=' + btoa(url), event.request);
    event.respondWith(fetch(req));
  }
});

All requests made to domains that we know to be blocked will be intercepted. Besides googletagmanager.com, we know the analytics script will eventually make requests to the google-analytics.com domain, so we include that one too.

These would-be-blocked requests will be rerouted to the /proxy/ endpoint on our very own web server, with the blocked URL as a query parameter. We can’t do this in the clear, though, or else the filter would still match the URL and block it. So we encode the URL as a Base64 string, which is what the btoa function is doing.
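
To see why the encoding is enough to slip past a substring filter, here is a small sketch. The toProxyPath helper is a hypothetical name that just wraps the same string concatenation the Service Worker does.

```javascript
// Builds the same-origin proxy path for a would-be-blocked URL
// (hypothetical helper; same concatenation as in the Service Worker).
function toProxyPath(blockedUrl) {
  return '/proxy/?url=' + btoa(blockedUrl);
}

const original = 'https://www.googletagmanager.com/gtag/js?id=ASSHOLE';
const proxied = toProxyPath(original);

// Base64 output never contains a dot, so the domain cannot appear in it.
console.log(proxied.includes('googletagmanager.com')); // false

// Decoding the parameter recovers the original URL unchanged.
console.log(atob(proxied.slice('/proxy/?url='.length)) === original); // true
```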

Now we just need to configure a web server. We’re using nginx.

...

location /proxy/ {
    set $decoded_url '';
    rewrite_by_lua_block {
        ngx.var.decoded_url = ngx.decode_base64(ngx.var.arg_url)
    }
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_pass $decoded_url;
    ...
}

The /proxy/ endpoint decodes the url query parameter using ngx_http_lua_module’s rewrite_by_lua_block, and proxies the request to the decoded URL. The catch, though, is that the request will hit Google’s servers from our proxy’s IP, which is obviously unacceptable, as we’ll stop getting location data and ruin our site demographics studies. To adjust for this, we set the X-Forwarded-For header to include the client’s IP and send it along as well.
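
For readers who don’t speak nginx, the two steps can be sketched in JavaScript. This only imitates what the location block does; decodeTarget and addForwardedFor are hypothetical names, the /collect path is made up, and the IP addresses come from the reserved documentation ranges.

```javascript
// Imitates ngx.decode_base64(ngx.var.arg_url): find the raw `url`
// argument in the query string and Base64-decode it.
function decodeTarget(rawQuery) {
  for (const pair of rawQuery.split('&')) {
    if (pair.startsWith('url=')) return atob(pair.slice(4));
  }
  return null; // no url argument present
}

// Imitates $proxy_add_x_forwarded_for: append the client address to any
// X-Forwarded-For value the incoming request already carried.
function addForwardedFor(existingHeader, clientAddr) {
  return existingHeader ? existingHeader + ', ' + clientAddr : clientAddr;
}

const query = 'url=' + btoa('https://www.google-analytics.com/collect');
console.log(decodeTarget(query)); // https://www.google-analytics.com/collect
console.log(addForwardedFor('', '203.0.113.7')); // 203.0.113.7
console.log(addForwardedFor('198.51.100.4', '203.0.113.7')); // 198.51.100.4, 203.0.113.7
```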

Done. With just a little bit of work, we can now bypass all pattern-list-based adblockers. As long as all proxied requests happen over HTTPS, third-party cookie goodies will be passed along to the tracker as well.

Of course, this isn’t that hard to counter. The adblocker would only need to detect our specific assholery, decode the query parameter and re-run it through its lists. To go corporate, we could encrypt, instead of encode, the URL with a specific key made available to both client and server. The adblocker would then have to actually parse our SW’s script, get the key, and decrypt the URL.
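
The simple counter, against the plain Base64 version, might look roughly like this. It is a sketch under the assumption that the blocker uses plain substring patterns; the function names and example.com URLs are hypothetical.

```javascript
// Hypothetical substring patterns, as in the Service Worker's own check.
const patterns = ['googletagmanager.com', 'google-analytics.com'];

function matchesList(url) {
  return patterns.some(pattern => url.includes(pattern));
}

// Checks the URL itself, then tries to Base64-decode every query
// parameter value and runs the decoded text through the list again.
function isBlockedAfterDecoding(requestUrl) {
  if (matchesList(requestUrl)) return true;
  const query = requestUrl.split('?')[1] || '';
  for (const pair of query.split('&')) {
    const value = pair.slice(pair.indexOf('=') + 1);
    try {
      if (matchesList(atob(value))) return true;
    } catch (e) {
      // Not valid Base64, so nothing to re-check for this parameter.
    }
  }
  return false;
}

const sneaky = 'https://example.com/proxy/?url=' +
  btoa('https://www.google-analytics.com/collect');
console.log(isBlockedAfterDecoding(sneaky)); // true
console.log(isBlockedAfterDecoding('https://example.com/page?x=1')); // false
```

Once the parameter is encrypted rather than encoded, the atob step yields noise and this check stops matching, which is the escalation described above.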

Maybe we could even program the Service Worker to periodically renegotiate the encryption key with the server. There’s nothing an adblocker could do about this, except block the renegotiation endpoint, so the best counter would be to start blocking the proxying endpoint itself. At which point we would start randomly generating it and keeping a catch-all location on the web server.

Don’t do this.