When you visit a web page, you might expect that the code and images from the page will make their journey through the tubes unmolested and unaltered, but according to security researchers, you would also be wrong 1.3 percent of the time.

Researchers from the University of Washington and the International Computer Science Institute wanted to see whether ISPs, enterprise firewalls, or proxy servers commonly made changes to requested HTML while it was "in-flight." Testing 50,171 unique IP addresses around the world, the researchers found that content requested by 657 IP addresses was modified during transit, but the modifications weren't always nefarious.

We've reported before on companies that sell products to ISPs to do exactly this. ISPs can use the devices to insert their own advertisements into web pages or to replace the ads already there (MetroFi does this to support its free WiFi service in the US), or they can simply add notification messages to websites (Rogers in Canada began doing this late last year).

But such interventions, though worrisome, are rare. Only 1.3 percent of the IP addresses tested (657 of 50,171) received pages that had been modified in any way, and 70 percent of those modifications were caused by client proxies installed to deal with pop-ups or to block advertising. The researchers also note that not every alteration is problematic; some cellular operators, for example, will strip extra whitespace from pages or will provide extra compression for images to keep bandwidth usage low and browsing quick.

One caveat, however: the research tells us little about the prevalence of ad blocking in general. Browser extensions, which do much of the ad blocking and may also modify page layouts, were not included in the tests, so only separate proxies that blocked ads generated positive results.

Because altering pages in transit might not be in the publisher's interest (ad blocking) or the user's interest (some of the alterations caused security problems), the researchers devised a method of "web tripwires" to detect any changes. A tripwire is a small piece of additional JavaScript that executes on the client and checks the received HTML for possible alterations.

For example, the tripwire could count the number of script tags on a page, a method that would have caught 90 percent of the modifications found during testing. If the number of tags differs from the number hardwired into the script, it can pop a message into the browser notifying the user that the page has been modified.
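To make the script-counting idea concrete, here is a minimal sketch in TypeScript. It is not the researchers' code: the names `countScriptTags`, `checkTripwire`, and `TripwireResult`, the sample pages, and the `ads.example` URL are all hypothetical and chosen only for illustration. It assumes the publisher knows how many script tags the page was written with and hardwires that number into the check.

```typescript
// Hypothetical helper (not from the paper): count opening <script> tags
// in an HTML string. Closing tags start with "</script" and do not match.
function countScriptTags(html: string): number {
  const matches = html.match(/<script\b/gi);
  return matches === null ? 0 : matches.length;
}

// Hypothetical result shape for the check below.
interface TripwireResult {
  modified: boolean;
  expected: number;
  found: number;
}

// Compare the count seen by the client with the count the publisher hardwired.
function checkTripwire(receivedHtml: string, expectedScriptTags: number): TripwireResult {
  const found = countScriptTags(receivedHtml);
  return { modified: found !== expectedScriptTags, expected: expectedScriptTags, found };
}

// Example: the publisher shipped one script tag; a proxy injected a second.
const original = '<html><head><script src="app.js"></script></head><body>Hi</body></html>';
const tampered = original.replace(
  "</body>",
  '<script src="http://ads.example/inject.js"></script></body>'
);

console.log(checkTripwire(original, 1).modified); // false
console.log(checkTripwire(tampered, 1).modified); // true
```

In a browser, the string passed in would presumably be the page's own markup as received, and a `modified` result would trigger the notification to the user. A count-based check like this only notices changes that add or remove script tags, so alterations that leave the count unchanged would slip past it.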



[Image: The tripwire at work]

For publishers who care about the integrity of their pages, the tripwire method offers a simpler and lower-overhead solution than more extreme measures, such as only allowing access over an encrypted HTTPS connection.

If you want to check whether your browser is receiving web pages as they were created, the researchers have left their validation tool online for public use. A paper based on the research (PDF) will be presented at the NSDI '08 conference on network systems design this week.