Call to improve browser caching

Over Christmas break I wrote Santa my browser wishlist. There was one item I neglected to ask for: improvements to the browser disk cache.

In 2007 Tenni Theurer and I ran an experiment to measure browser cache stats from the server side. Tenni’s write up, Browser Cache Usage – Exposed, is the stuff of legend. There she reveals that while 80% of page views were done with a primed cache, 40-60% of unique users hit the site with an empty cache at least once per day. 40-60% seems high, but I’ve heard similar numbers from respected web devs at other major sites.

Why do so many users have an empty cache at least once per day?

I’ve been racking my brain for years trying to answer this question. Here are some answers I’ve come up with:

first time users – Yea, but not 40-60%.

– Yea, but not 40-60%. cleared cache – It’s true: more and more people are likely using anti-virus software that clears the cache between browser sessions. And since we ran that experiment back in 2007 many browsers have added options for clearing the cache frequently (for example, Firefox’s privacy.clearOnShutdown.cache option). But again, this doesn’t account for the 40-60% number.

– It’s true: more and more people are likely using anti-virus software that clears the cache between browser sessions. And since we ran that experiment back in 2007 many browsers have added options for clearing the cache frequently (for example, Firefox’s privacy.clearOnShutdown.cache option). But again, this doesn’t account for the 40-60% number. flawed experiment – It turns out there was a flaw in the experiment (browsers ignore caching headers when an image is in memory), but this would only affect the 80% number, not the 40-60% number. And I expect the impact on the 80% number is small, given the fact that other folks have gotten similar numbers. (In a future blog post I’ll share a new experiment design I’ve been working on.)

– It turns out there was a flaw in the experiment (browsers ignore caching headers when an image is in memory), but this would only affect the 80% number, not the 40-60% number. And I expect the impact on the 80% number is small, given the fact that other folks have gotten similar numbers. (In a future blog post I’ll share a new experiment design I’ve been working on.) resources got evicted – hmmmmm

OK, let’s talk about eviction for a minute. The two biggest influencers for a resource getting evicted are the size of the cache and the eviction algorithm. It turns out, the amount of disk space used for caching hasn’t kept pace with the size of people’s drives and their use of the Web. Here are the default disk cache sizes for the major browsers:

Internet Explorer: 8-50 MB

Firefox: 50 MB

Safari: everything I found said there isn’t a max size setting (???)

Chrome: < 80 MB (varies depending on available disk space)

Opera: 20 MB

Those defaults are too small. My disk drive is 150 GB of which 120 GB is free. I’d gladly give up 5 GB or more to raise the odds of web pages loading faster.

Even with more disk space, the cache is eventually going to fill up. When that happens, cached resources need to be evicted to make room for the new ones. Here’s where eviction algorithms come into play. Most eviction algorithms are LRU-based – the resource that was least recently used is evicted. However, our knowledge of performance pain points has grown dramatically in the last few years. Translating this knowledge into eviction algorithm improvements makes sense. For example, we’re all aware how much costlier it is to download a script than an image. (Scripts block other downloads and rendering.) Scripts, therefore, should be given a higher priority when it comes to caching.

It’s hard to get access to gather browser disk cache stats, so I’m asking people to discover their own settings and share them via the Browser Disk Cache Survey form. I included this in my talks at JSConf and jQueryConf. ~150 folks at those conferences filled out the form. The data shows that 55% of people surveyed have a cache that’s over 90% full. (Caveats: this is a small sample size and the data is self-reported.) It would be great if you would take time to fill out the form. I’ve also started writing instructions for finding your cache settings.

I’m optimistic about the potential speedup that could result from improving browser caching, and fortunately browser vendors seem receptive (for example, the recent Mozilla Caching Summit). I expect we’ll see better default cache sizes and eviction logic in the next major release of each browser. Until then, jack up your defaults as described in the instructions. And please add comments for any browsers I left out or got wrong. Thanks.