The New York Times and Internet metrics firm comScore have partnered on a new study that shows what savvy web surfers have known for some time: we're all being tracked on the web. In this case, though, comScore names names, and it turns out that Yahoo, MySpace, AOL, and Google collect the most data, on average, about visitors' online browsing habits in any given month.

The Times worked with comScore to develop a new metric that counts the "data collection events" that top web sites use to gather data from visitors. The events include the URLs of requested pages, search query strings, videos played, advertising displayed, and ads served on ad networks owned by the companies (but appearing on other sites). Because of the kinds of events counted, a single page view can generate multiple events. The methodology is explained in more detail in a Times blog posting.

What comScore found, using this new metric, is that visitors to the top sites on the 'Net get tracked. A lot.

Yodel if you love tracking

Yahoo, for instance, is far and away the leader in data collection. comScore found that the site collects 2,520 unique pieces of data on an average visitor in a one-month period. That dramatically overshadows the 1,229 events logged at MySpace and the 610 at AOL. Google came in fourth with 578. All of these numbers include the companies' extended advertising networks.

As Louise Story, the Times' writer on the piece, points out, "Not all of this data is useful; not all of it is retained by the companies with access to it; much of it cannot be traced back to individuals." Still, she hopes to "start a conversation" about the sheer volume of data that such firms are aggregating.

The story breaks as numerous other recent stories shine a spotlight on the issue of online privacy and data collection. UK-based Phorm has recently signed deals with three of the largest ISPs in that country that will see it deliver super-targeted advertising based on users' clickstream data.

Cable TV companies here in the US, hamstrung in their attempts to make more money from advertising by a lack of highly targeted data, are working on Project Canoe. Canoe will try to close some of the "data gap" between Internet companies and TV companies by drawing on set-top box data to better target consumers who watch specific shows or channels.

Is this a "race to the bottom" when it comes to data collection, or do you believe the rhetoric about bringing value to consumers by offering them more targeted ads? If you don't yet know what to think, help is on the way: AOL will soon be offering consumer-friendly explanations of its data-targeting practices... explanations that feature adorable, Internet-using penguins.