I was conducting some experiments on how Googlebot parses and renders JavaScript, and I came across a couple of interesting things about the way it does so. The first is that Googlebot’s Math.random() function produces an entirely deterministic series. I created a small script which uses this to identify Googlebot in an obfuscated fashion:

The first time Googlebot calls Math.random() the result will always be 0.14881141134537756, and the second call will always be 0.19426893815398216. The script I linked to above simply uses this fact, but obfuscates it a little and ‘seeds’ it with something that doesn’t look too arbitrary.
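Stripped of the obfuscation, the check might look something like the sketch below. This is not the script from the post: the helper name looksLikeGooglebot and its injectable rng parameter are my own assumptions, added so the logic can be tried without Googlebot. It also assumes nothing else on the page has called Math.random() first, since that would shift the position in the series.

```javascript
// The two values the post reports for Googlebot's first two Math.random() calls.
const REPORTED_FIRST = 0.14881141134537756;
const REPORTED_SECOND = 0.19426893815398216;

// Hypothetical helper (not from the original script).
// `rng` defaults to Math.random; pass a stub to test the logic.
function looksLikeGooglebot(rng = Math.random) {
  const first = rng();
  const second = rng();
  // Exact comparison is intentional: the claim is that the values are identical every time.
  return first === REPORTED_FIRST && second === REPORTED_SECOND;
}
```

In a normal browser this should return false essentially always. It would only return true where the generator really does start with those two values.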

Crawling at Google Scale

Consider the amount of work Google has to do to crawl the whole web AND now run JavaScript. Optimisations will need to be abundant, and I imagine that having a deterministic random number function is probably:

Faster

More secure

Predictable – Googlebot can trust a page will render the same on each visit

Speeding up the clock…

Googlebot also seems to run JavaScript with a sped-up clock, which makes sense: why actually wait 5 seconds when you are a bot? So Google runs the timer a lot faster. If you create a simple ticker script and put it through the Google Search Console ‘Fetch & Render’ function, it returns almost instantly, but with results looking like this:

The second date is a date from the future! Marty McFly would be proud.
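For anyone wanting to repeat the experiment, a ticker along these lines would do. It is a minimal sketch rather than the script I ran: the name runTicker and the injectable now and schedule parameters are assumptions, added so the function can be exercised with a fake clock.

```javascript
// Hypothetical ticker: records the time now, then again once a timer of `delayMs` fires.
// `now` and `schedule` default to the real clock and timer; both can be stubbed.
function runTicker(delayMs, report, now = Date.now, schedule = setTimeout) {
  const startedAt = now();
  schedule(() => {
    const finishedAt = now();
    report({
      start: new Date(startedAt).toISOString(),
      end: new Date(finishedAt).toISOString(),
      // How much time the page believes has passed, in milliseconds.
      elapsedMs: finishedAt - startedAt,
    });
  }, delayMs);
}
```

On a page you might call runTicker(5000, r => { document.body.textContent = JSON.stringify(r); }) and then compare the reported times against how long the render actually took.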

Since when?

I did wonder if the random number generation sometimes updates, but a Google search for 0.14881141134537756 turns up over 18,000 results, so it seems to be quite stable. After discovering this I Googled about a bit and found an old Hacker News comment by ‘KMag’:

At some point, some SEO figured out that random() was always returning 0.5. I’m not sure if anyone figured out that JavaScript always saw the date as sometime in the Summer of 2006, but I presume that has changed.

So it seems things have been similar to this for some time, but instead of random() always returning 0.5, it now returns a deterministic series. The date is actually accurate initially, but can go into the future, as seen above. KMag went on to say:

I hope they now set the random seed and the date using a keyed cryptographic hash of all of the loaded javascript and page text, so it’s deterministic but very difficult to game.

That doesn’t seem to be the case. I’m not sure this lets you do much that you couldn’t already do based on User-Agent / IP, but perhaps it does let you do it with some plausible deniability!
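For the curious, KMag’s suggestion of a keyed hash could be sketched roughly as follows. This is purely illustrative and says nothing about what Google actually does. It assumes Node.js (for its built-in crypto module), and the function names, the choice of HMAC-SHA256 and the mulberry32 generator are all my own assumptions.

```javascript
const crypto = require('crypto');

// mulberry32: a small, widely used seedable 32-bit PRNG. Any seedable generator would do.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical: derive the seed from HMAC-SHA256(secretKey, pageText).
// The same page and key always give the same series, but without the key
// a page author cannot predict which series their page will get.
function seededRandomFor(pageText, secretKey) {
  const digest = crypto.createHmac('sha256', secretKey).update(pageText).digest();
  return mulberry32(digest.readUInt32BE(0));
}
```

A crawler built this way would still render a page identically on every visit, while a hard-coded check for the first value in the series would stop working, because that value would depend on the page and the key.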