By Gareth Renowden • 23/06/2011. This post was syndicated from Hot Topic.

What is it with so-called sceptics and dodgy statistics? Is comprehension failure a precondition for believing six impossible things before breakfast, or does wishful thinking trump all common sense? It can’t be an unwillingness to do research, because they find cherries in the most unlikely places. Recently Richard Treadgold put his finger into the world of web statistics and pulled out a plum:

Just a quick note to draw your attention to a new feature on the sidebar: scroll down one page and you should see it. There’s a little table showing the recent Alexa rankings for the Climate Conversation, SciBlogs and Hot Topic. At the moment we’re leading them by big margins. […] it’s humbling to see that this modest little blog is more popular and thousands more people visit it than other, brasher sites around the country that even get into the newspapers.

What should be humbling is the fact that Treadgold’s claim is almost certainly nonsense. To show this, I need to explain something about web statistics — and in particular the Alexa metric Treadgold has discovered. Apologies for this detour off the climate beat — normal service will be resumed shortly…

Alexa is an Amazon-owned web stats service. It compiles rankings and readership data on web sites by analysing data supplied by people who have installed an Alexa “toolbar” in their browser. This “toolbar community” is a small subset of total web users, and is likely to be biased in a number of ways. For example, the toolbar is only available for the Internet Explorer, Firefox and Chrome browsers, so Alexa stats exclude people surfing with Apple’s Safari or other browsers (about 20% of visitors to Hot Topic). Alexa extrapolates from this sample to rank web sites. The resulting data are interesting, but only statistically reliable for sites with lots of traffic. This is what their FAQ says about comparisons between the stats produced by Google Analytics and Alexa:

Alexa looks at traffic patterns across the web as a whole. Our traffic data are based on the past three months of global traffic according to our data sources, primarily our community of Alexa Toolbar users. This helps us get an excellent portrait of how traffic is shaped across the web. It’s a great ’big picture’ analysis, like using a satellite in orbit to look at weather patterns across the globe. It is an excellent tool for comparing sites and understanding them in relation to one another, but it is not meant to serve as a tool for analyzing visitors’ behavior in depth at a particular site in the manner of Google Analytics. This is especially true for sites that are ranked worse than 100,000. We have limited data available for such sites, and because of that, our traffic estimates will not be as robust or reliable as they are for sites approaching #1 on the list.

The ranking they refer to is their global ranking, not the ranks they derive for sites in small countries like NZ. In other words, if you rank somewhere in the top 100,000 sites worldwide you’ll be part of their “big picture”, and there’s a good chance your Alexa stats will be based on a large enough sample to be useful and reliable. Treadgold’s Climate Conversation blog ranks at around 500,000, well outside that range. By way of comparison, David Farrar’s Kiwiblog.co.nz is ranked #68,226 in the world (#88 in NZ). Climate Conversation is so far down in Alexa’s long tail that the Alexa rank Treadgold is keen to trumpet is effectively meaningless.
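
For anyone who wants to see why a small sample hurts small sites most, here is a rough sketch in Python. It is not Alexa's method, which isn't described in their FAQ. It assumes a simple model in which every visitor independently has the same chance of being a toolbar user, and the panel fraction and visitor counts are made-up numbers chosen purely for illustration.

```python
import math


def relative_standard_error(true_visitors: int, panel_fraction: float) -> float:
    """Relative standard error of a visitor count estimated from a panel.

    Assumed model (not Alexa's documented method): each visitor is in the
    panel independently with probability `panel_fraction`, and the site's
    traffic is estimated by scaling the panel count back up.
    """
    if true_visitors <= 0:
        raise ValueError("true_visitors must be positive")
    if not 0 < panel_fraction <= 1:
        raise ValueError("panel_fraction must be in (0, 1]")
    expected_panel_visitors = true_visitors * panel_fraction
    # Binomial sampling: sd of the scaled-up estimate, divided by the true count.
    return math.sqrt((1 - panel_fraction) / expected_panel_visitors)


# Hypothetical values: 1 in 1,000 web users carries the toolbar.
PANEL_FRACTION = 0.001

for label, visitors in [("large site", 1_000_000), ("small site", 5_000)]:
    rse = relative_standard_error(visitors, PANEL_FRACTION)
    print(
        f"{label}: about {visitors * PANEL_FRACTION:.0f} panel visitors, "
        f"relative error about {rse:.0%}"
    )
```

Under those assumptions the small site's estimate rests on about five toolbar users, so one or two extra visits from panel members can move it a long way. Any bias in who installs the toolbar comes on top of this sampling noise.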

A more realistic way to track blog readership in a small country like New Zealand is to look at the visitor statistics provided by services like Google Analytics, Statcounter, Sitemeter, Woopra and so on. To use these services, the blog owner installs a plug-in or adds a code snippet to their blog software. Visitors are then tracked by the stats service as they visit, often in near real time.

There have been a number of efforts over the years to compile rankings of New Zealand blogs. Tim Selwyn’s nzblogosphere was one of the earliest, but is sadly no longer updated. Open Parachute blogger Ken Perrott has filled the gap by compiling a ranking based on data from blogs with publicly accessible stats. Ken’s May ranking lists over 240 blogs. The aforementioned Kiwiblog is #2 in the list, with 277,553 visits in the month. Sciblogs is #7, with a very creditable 45,308, and Hot Topic is #14 with 17,144. Treadgold’s Climate Conversation is not listed.

Back in January, the last time Treadgold got his knickers in a twist over site stats, I sent him an email explaining web stats in some detail, and provided information about good, free stats services and how they are installed in WordPress (the blog software we both use). He thanked me, but professed no real interest in pursuing the matter. However, he has all the information he requires to generate good, directly comparable readership statistics.

Here’s my challenge. Why not publish real readership data, Richard? Why not supply data to Ken Perrott so that we can see where your readership numbers fall in the New Zealand blogosphere? My suspicion is that you will be nowhere near Sciblogs’ visitor numbers, and I’d be surprised if you were able to do better than Hot Topic. But I’m prepared to be proved wrong. I’ll send you a bottle of Limestone Hills’ finest pinot noir if I am…

[Update: Treadgold responds in a rather over the top manner. Trouble is, a meaningless stat is still meaningless, whoever it applies to. He would be well advised to stop digging, I suspect. Meanwhile, my pinot is drinking rather well…]