Updated As the AVG LinkScanner continues to spew fake traffic across the internet, web masters say they've uncovered a reliable means of filtering these rogue hits from their log files.

Bundled with AVG's newest anti-virus engine, AVG 8, and used by roughly 20 million people worldwide, LinkScanner checks search engine results for malware before you on click them. If you type a keyword into Google, for instance, it automatically visits each address that appears on Google's results page.

This has caused an enormous spike in traffic on sites across the web, including The Register, and many webmasters may not realize where these hits are coming from. Hoping to fool malware writers, LinkScanner mimics real live human clicks. At least in part.

When scanning pages, LinkScanner employs the IP addresses of those 20 million people who use the product, and as of last week, it sends the same user agent as Microsoft's IE6 browser.

But there is a way of eliminating this fake traffic from log files - without clipping the real thing. Two separate webmasters have learned - through two different avenues - that an AVG request leaves out at least one HTTP header: the "Accept-Encoding" header, which defines what sort of compression a browser can handle.

Filtering or redirecting traffic according to this header (and the IE6 user agent) should do the trick - though AVG did not respond when asked for confirmation.

AVG has said "There are still ways for concerned web masters to filter LinkScanner requests out of their statistics." But it hasn't given specifics - at least publicly. And any fix may be temporary.

Part of the problem is that AVG won't put its cards on the table. But the company may worry that divulging too much information would feed the black hats. If a web master can identify AVG scans, so can a malware writer. But more on that later.

Meanwhile, web masters continue to complain that LinkScanner forces them to pay for extra bandwidth. But AVG has promised a fix for this problem as well. ®

Bootnote

Many Reg readers have asked why LinkScanner doesn't scans links after you click on them. But it does. It scans both before and after. AVG chief of research Roger Thompson argues that this two-layer approach guards against so-called zero-day exploits. If a site is infected with one piece of malware, there's a good chance it's infected with a second, he says, and if you prevent users from even clicking on a site, you protect them from exploits you see as well as those you don't.

Update

To avoid confusion, we've updated this story to point out that webmasters are filtering traffic by identifying both the missing HTTP header and the IE6 agent. It should also be noted that LinkScanner seems to be using two other user agents. But these are far less prevalent. At the moment, these are the three user agents the product seems to employ:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;1813)

The first is the IE6 agent.