November 20, 2008
Google Analytics is probably the most popular web analytics tool in use today. It is certainly the best tool you can get for the price -- it's free and worth every penny. Unfortunately, like every web analytics tool, it is not 100 percent accurate. There are reasons why it is impossible to build perfectly accurate web analytics software, which I have covered previously in "Things that throw your stats," so we should not blame Google for this. However, Google Analytics is different from other products in that it has been intentionally designed by Google to be inaccurate over and above the normal inaccuracies that are inevitable. These inaccuracies are so glaring that most people are getting a very false picture of what is happening in their sites. This article will explain where these inaccuracies lie and provide methods to correct for them.



Visits and bounces

Firstly, we must remember that Google did not create Google Analytics. The company bought a product called Urchin and rebranded it. After about a year, it then released an upgraded version. Urchin had contained a very big mistake in its calculations, which Google fixed in the upgrade. The problem lay with counting visits. According to all the web analytics standards, a "visit" only happens when someone reads more than one page on a site. If someone comes to your site, looks at the first page, then leaves, this is not a "visit" -- it is a "bounce." The best way to think of this is in terms of a shop. If someone looks at the store window but moves on, they have bounced. It is only when they enter the store (or site) that you have a visit.



This is a very important definition to grasp because it fundamentally affects how you see the performance of the site. The reason we need to distinguish between bounces and visits is that we can't tell how long someone spent looking at a web page. We can only tell what time they accessed it. We calculate the time spent looking at a page by comparing what time they accessed it with what time they accessed the next page.



For example, if someone accesses my homepage at 2 p.m., then accesses the next page at 2:15 p.m., we assume they spent 15 minutes reading the homepage. If someone only accesses one page -- in other words, if they bounce -- we have no way of knowing how long they spent looking at that one page. So you can only calculate visit duration when someone reads two or more pages.
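The timestamp arithmetic above can be sketched in a few lines of Python (the timestamps here are hypothetical, matching the 2 p.m. example):

```python
from datetime import datetime

def page_durations(timestamps):
    """Given the times at which a visitor requested each page, in order,
    return the seconds spent on each page. The last page has no successor,
    so its duration is unknowable -- which is exactly the problem with a
    bounce, where the *only* page has no successor."""
    return [
        (later - earlier).total_seconds()
        for earlier, later in zip(timestamps, timestamps[1:])
    ]

# A visitor hits the homepage at 2 p.m. and the next page at 2:15 p.m.
visit = [datetime(2008, 11, 20, 14, 0), datetime(2008, 11, 20, 14, 15)]
print(page_durations(visit))  # [900.0] -- 15 minutes on the homepage

# A bounce: a single timestamp yields no duration at all.
print(page_durations([datetime(2008, 11, 20, 14, 0)]))  # []
```

Note that the bounce produces an empty list, not a zero: there is simply no duration to report.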



A key metric when analyzing the performance of a site is the average duration -- how long people are spending in the site. You need to know your overall average duration for the site, and you often want to know the average duration for visits from certain ads or for specific products. It's a very good way of assessing engagement. Average duration is calculated by totaling up all the individual visit durations and dividing this by the total number of visits.








At the time Google purchased Urchin, the software treated (and counted) bounces as zero-duration visits. When calculating the average duration, this meant it produced a number lower than the real one; how inaccurate it was depended on the percentage of a site's arrivals who bounced.

In July 2007, Google changed the calculation of average duration so that it did not include bounces anymore. This was the correct thing to do and meant that its figures for average duration were now accurate. A month later, Google put it back to the old (wrong) way of calculation. In other words, Google intentionally rolled Google Analytics back so that it produced an incorrect average duration. Why? Brett Crosby, senior manager at Google Analytics, explained in a Google blog that it was because people complained the change meant the new (accurate) numbers were out of line with the old (inaccurate) ones. (That blog has since been removed by Google, but you can find it copied on many sites.) In other words, these people considered consistency more important than accuracy, and Google obliged them.



It's been that way ever since -- Google is intentionally and knowingly providing inaccurate numbers because a few people preferred neatness to truth.



Areas of error

Treating bounces as visits doesn't just affect the accuracy of average duration -- it affects any metric based on the number of visits.





Visit count

When you see a figure in your Google Analytics reports for Total Visits, ask yourself what that number represents. If you see Total Visits as the number of people who entered your site, who reacted to the sales pitch, who engaged with your content, who potentially could have bought products, then you are wrong. It is the number of people who arrived at the front door of the site, nothing more.






Conversion rate

The Conversion Rate tells me how successful my site is at selling. It is legitimate to calculate Conversion Rate including bounces, but my personal experience is that it is misleading to do so. I use Conversion Rate to improve my site's sales pitch. People who bounce were never exposed to it, so including them in the calculation means I cannot possibly know whether my sales pitch is working or not.






Exit rate

Google Analytics tells me how many people exited the site from any given page. Like Conversion Rate, this is useful for assessing the sales performance of the page. However, getting someone to enter the site and getting them to stay in it once they have entered are two very different tasks. They have different factors and processes involved, and they have to be measured and improved separately. You can't assess the ability of a page to hold someone in the site and the ability of that page to engage a new arrival in the same number. Including bounces in the Exit Rate makes the Exit Rate metric useless.






AdWords

It is important to bear in mind that this error does not affect the assessment of key metrics for AdWords traffic. You pay for people to come to the site, whether they bounce or not, so cost-per-visitor and ROI for AdWords are not affected.

Correcting Google Analytics

If I have inaccurate metrics that are being used by someone to assess performance, I like to aim for minimal disruption when I correct them. I therefore leave the existing metrics in place and add a few (accurate) ones. I have found the best way to correct for this problem is to add a metric I call "Retained Visits," leaving the existing Total Visits in place. I can then re-calculate my metrics using Retained Visits.



I get my number of Retained Visits by removing the bounces. For example, if Total Visits was 1,000 and the Bounce Rate was 25 percent, I had 250 bounces, so the number of Retained Visits was 750.



Mathematically, we express this as:

RV = TV - (TV * BR)

where:

RV is Retained Visits

TV is Total Visits

BR is Bounce Rate
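The formula can be expressed as a one-line function; here is a sketch in Python, checked against the figures from the example above:

```python
def retained_visits(total_visits, bounce_rate):
    """RV = TV - (TV * BR): strip the bounces out of the visit count.
    bounce_rate is a fraction, e.g. 0.25 for 25 percent."""
    return total_visits - (total_visits * bounce_rate)

# The article's example: 1,000 total visits at a 25 percent bounce rate.
print(retained_visits(1000, 0.25))  # 750.0
```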



Correcting average duration

In order to get the genuine Average Duration, I need to rewind the calculation Google made about halfway. Google Analytics calculated Average Duration by adding up all the individual durations and then dividing that by the number of visits (including bounces). I rewind this by multiplying the Average Duration by the Total Visits. That gives me the number Google had before it factored in bounces, which we can call Total Duration. (Remember that the bounces, with durations of zero, contributed nothing to this number.) I then redo the calculation of average using Retained Visits, thus:

TAD = (AD * TV) / RV

where:

TAD is True Average Duration

AD is Average Duration (as reported by Google Analytics)

TV is Total Visits

RV is Retained Visits
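A minimal Python sketch of the correction, reusing the earlier hypothetical figures (1,000 total visits, 750 retained) and assuming a reported average duration of 3 minutes:

```python
def true_average_duration(avg_duration, total_visits, retained_visits):
    """TAD = (AD * TV) / RV. Multiplying AD by TV rewinds Google's
    division, recovering Total Duration; dividing by RV then averages
    over only the visits that actually have a duration."""
    total_duration = avg_duration * total_visits  # bounces added zero to this
    return total_duration / retained_visits

# Reported average of 3 minutes over 1,000 visits, 750 of them retained:
print(true_average_duration(3.0, 1000, 750))  # 4.0 -- the true average is 4 minutes
```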

It is surprising, even terrifying, how far apart the old and new numbers are, and the difference in the picture this provides. Here is a set of figures from a site that averages around 30 percent bounce rate:







These show that rather than an average engagement of around three minutes, the reality is closer to five -- people are spending almost twice as long on the site as Google Analytics says they are.



You can use Retained Visits to recalculate your Conversion Rate in a similar fashion. While I keep my calculations of AdWords ROI untouched, I also add another one based on Retained Visits. I have found that if you break down the ROI on each advertising campaign (or ad) by Retained Visits, you can often find extremely valuable revenue streams concealed inside a bland total.
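As an illustration of the same correction applied to Conversion Rate (the sales and visit figures here are hypothetical):

```python
def retained_conversion_rate(conversions, total_visits, bounce_rate):
    """Conversion rate over retained visits only: people who bounced
    never saw the sales pitch, so they are excluded from the base."""
    retained = total_visits - (total_visits * bounce_rate)
    return conversions / retained

# 30 sales from 1,000 visits at a 25 percent bounce rate:
print(retained_conversion_rate(30, 1000, 0.25))  # 0.04
```

Over all visits the rate would be 3 percent; over retained visits it is 4 percent -- the figure that actually tells you how well the sales pitch performs on people who saw it.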



Conclusion

It is critical to know how any web metrics package calculates its numbers. You cannot assume, no matter how big the company, that the numbers will be correct. A small variation doesn't make much difference, but you could be getting a completely false view of what is happening in your site. You may need to correct those numbers yourself. I recommend the use of Retained Visits as a good way of correcting for errors in Google Analytics.



The other conclusion we can draw from this is how limited the understanding of web analytics is among practitioners, and what a low priority Google gives Google Analytics. Can you imagine an accountant insisting on using inaccurate numbers in their bookkeeping simply because that's what they had been doing before? How long do you think that business would survive? Can you imagine Google agreeing to downgrade the accuracy of its search results to meet user demand? On second thought, that's exactly what they agreed to do in China, isn't it?



Brandt Dainow is an independent web analytics and marketing consultant working in the U.K. and Ireland.
