I first saw Yahoo's 13 Simple Rules for Speeding Up Your Web Site referenced in a post on Rich Skrenta's blog in May. It looks like there were originally 14 rules; one must have fallen off the list somewhere along the way.

It's solid advice culled from the excellent Yahoo User Interface blog, which will soon be packaged into a similarly excellent book. It's also available as a powerpoint presentation delivered at the Web 2.0 conference.

I've also covered similar ground in my post, Reducing Your Website's Bandwidth Usage.

But before you run off and implement all of Yahoo's solid advice, consider the audience. These are rules from Yahoo, which according to Alexa is one of the top three web properties in the world. And Rich's company, Topix, is no slouch either-- they're in the top 2,000. It's only natural that Rich would be keenly interested in Yahoo's advice on how to scale a website to millions of unique users per day.

To help others implement the rules, Yahoo created a FireBug plugin, YSlow. This plugin evaluates the current page using the 13 rules and provides specific guidance on how to fix any problems it finds. And best of all, the tool rates the page with a score-- a score! There's nothing we love more than boiling down pages and pages of complicated advice to a simple, numeric score. Here's my report card score for yesterday's post.

To understand the scoring, you have to dissect the weighting of the individual rules, as Simone Chiaretta did:

My YSlow score of 73 is respectable, but I've already made some changes to accommodate its myriad demands. To get an idea of how some common websites score, Simone ran YSlow on a number of blogs and recorded the results:

Google: A (99)

Yahoo Developer Network blog : D (66)

Yahoo! User Interface Blog : D (65)

Scott Watermasysk : D (62)

Apple : D (61)

Dave Shea's mezzoblue : D (60)

A List Apart : F (58)

Steve Harman : F (54)

Coding Horror : F (52)

Haacked by Phil : F (36)

Scott Hanselman's Computer Zen : F (29)

YSlow is a convenient tool, but either the web is full of terribly inefficient web pages, or there's something wrong with its scoring. I'll get to that later.

The Stats tab contains a summary of the total size of your downloaded page, along with the footprint with and without browser caching. One of the key findings from Yahoo is that 40 to 60 percent of daily visitors have an empty cache. So it behooves you to optimize the size of everything and not rely on client browser caching to save to you in the common case.

YSlow also breaks down the statistics in much more detail via the Components tab. Here you can see a few key judgment criteria for every resource on your page...

Does this resource have an explicit expiration date?

Is this resource compressed?

Does this resource have an ETag?

... along with the absolute sizes.

YSlow is a useful tool, but it can be dangerous in the wrong hands. Software developers love optimization. Sometimes too much.

There's some good advice here, but there's also a lot of advice that only makes sense if you run a website that gets millions of unique users per day. Do you run a website like that? If so, what are you doing reading this instead of flying your private jet to a Bermuda vacation with your trophy wife? The rest of us ought to be a little more selective about the advice we follow. Avoid the temptation to blindly apply these "top (x) ways to (y)" lists that are so popular on Digg and other social networking sites. Instead, read the advice critically and think about the consequences of implementing that advice.

If you fail to read the Yahoo advice critically, you might make your site slower, as as Phil Haack unfortunately found out. While many of these rules are bread-and-butter HTTP optimization scenarios, it's unfortunate that a few of the highest-weighted rules on Yahoo's list are downright dangerous, if not flat-out wrong for smaller web sites. And when you define "smaller" as "smaller than Yahoo", that's.. well, almost everybody. So let's take a critical look at the most problematic heavily weighted advice on Yahoo's list.

Use a Content Delivery Network (Weight: 10)

If you have to ask how much a formal Content Delivery Network will cost, you can't afford it. It's more effective to think of this as outsourcing the "heavy lifting" on your website-- eg, any large chunks of media or images you serve up -- to external sites that are much better equipped to deal with it. This is one of the most important bits of advice I provided in Reducing Your Website's Bandwidth Usage. And using a CDN, below a reasonably Yahoo-esque traffic volume, can even slow your site down.

Configure ETags (Weight: 11)

ETags are a checksum field served up with each server file so the client can tell if the server resource is different from the cached version the client holds locally. Yahoo recommends turning ETags off because they cause problems on server farms due to the way they are generated with machine-specific markers. So unless you run a server farm, you should ignore this guidance. It'll only make your site perform worse because the client will have a more difficult time determining if its cache is stale or fresh. It is possible for the client to use the existing last-modified date fields to determine whether the cache is stale, but last-modified is a weak validator, whereas Entity Tag (ETag) is a strong validator. Why trade strength for weakness?

Add an Expires Header (Weight: 11)

This isn't bad advice, per se, but it can cause huge problems if you get it wrong. In Microsoft's IIS, for example, the Expires header is always turned off by default, probably for that very reason. By setting an Expires header on HTTP resources, you're telling the client to never check for new versions of that resource-- at least not until the expiration date on the Expires header. When I say never, I mean it -- the browser won't even ask for a new version; it'll just assume its cached version is good to go until the client clears the cache, or the cache reaches the expiration date. Yahoo notes that they change the filename of these resources when they need them refreshed.

All you're really saving here is the cost of the client pinging the server for a new version and getting a 304 not modified header back in the common case that the resource hasn't changed. That's not much overhead.. unless you're Yahoo. Sure, if you have a set of images or scripts that almost never change, definitely exploit client caching and turn on the Cache-Control header. Caching is critical to browser performance; every web developer should have a deep understanding of how HTTP caching works. But only use it in a surgical, limited way for those specific folders or files that can benefit. For anything else, the risk outweighs the benefit. It's certainly not something you want turned on as a blanket default for your entire website.. unless you like changing filenames every time the content changes.

I don't mean to take anything away from Yahoo's excellent guidance. Yahoo's 13 Simple Rules for Speeding Up Your Web Site and the companion FireBug plugin, YSlow, are outstanding resources for the entire internet. By all means, read it. Benefit from it. Implement it. I've been banging away on the benefits of GZip compression for years.

But also realize that Yahoo's problems aren't necessarily your problems. There is no such thing as one-size-fits-all guidance. Strive to understand the advice first, then implement the advice that makes sense for your specific situation.