In my previous post I was writing about the new cache backend and some of the very first testing.

Now we’ve stepped further and there are significant improvements. I was also able to test on a wider variety of hardware this time.

The most significant difference is a single I/O thread with relatively simple event prioritization. Opening and reading urgent (render-blocking) files is done first, opening and reading lower-priority files comes after that, and writing is performed last. This greatly improves performance when loading from a non-warmed cache, and also first paint time in many scenarios.

The numbers are much more precise than in the first post; my measuring is more systematic and careful by now. Also, I’ve merged gum with the latest mozilla-central code a few times, and that for sure brought some improvements too.

Here are the results. I’m using a 50MB limit for keeping cached content in RAM.

[ complete page load time / first paint time ]

**Old iMac with mechanical HDD**

| Backend | First visit | Warm go to 1) | Cold go to 2) | Reload |
| --- | --- | --- | --- | --- |
| mozilla-central | 7.6s / 1.1s | 560ms / 570ms | 1.8s / 1.7s | 5.9s / 900ms |
| new back-end | 7.6s / 1.1s | 530ms / 540ms | 2.1s / 1.9s** | 6s / 720ms |

**Old Linux box with mechanical 'green' HDD**

| Backend | First visit | Warm go to 1) | Cold go to 2) | Reload |
| --- | --- | --- | --- | --- |
| mozilla-central | 7.3s / 1.2s | 1.4s / 1.4s | 2.4s / 2.4s | 5.1s / 1.2s |
| new back-end | 7.3s / 1.2s or** 9+s / 3.5s | 1.35s / 1.35s | 2.3s / 2.1s | 4.8s / 1.2s |

**Fast Windows 7 box with SSD**

| Backend | First visit | Warm go to 1) | Cold go to 2) | Reload |
| --- | --- | --- | --- | --- |
| mozilla-central | 6.7s / 600ms | 235ms / 240ms | 530ms / 530ms | 4.7s / 540ms |
| new back-end | 6.7s / 600ms | 195ms / 200ms | 620ms / 620ms*** | 4.7s / 540ms |

**Fast Windows 7 box and a slow microSD**

| Backend | First visit | Warm go to 1) | Cold go to 2) | Reload |
| --- | --- | --- | --- | --- |
| mozilla-central | 13.5s / 6s | 600ms / 600ms | 1s / 1s | 7.3s / 1.2s |
| new back-end | 7.3s / 780ms or** 13.7s / 1.1s | 195ms / 200ms | 1.6 or 3.2s* / 460ms*** | 4.8s / 530ms |

To sum up – the most significant changes appear on really slow media. First paint times improve greatly, not to mention the 10000% better UI responsiveness! Still, there is room left for more optimizations. We know what to do:

deliver data in larger chunks; right now we fetch only in 16kB blocks, so larger files (e.g. images) load very slowly

improve the interaction with upper levels by adding some kind of intelligent flood control
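Both ideas can be sketched in a few lines. The helper names below (`DeliveryEvents`, `FloodGate`) are hypothetical, invented for this illustration only; the first shows why the 16kB chunk size hurts large files, the second a naive cap on deliveries in flight toward the main thread.

```cpp
// Illustration only – not Necko code.
#include <cassert>
#include <cstddef>

// Number of delivery round trips needed for a file of `fileSize` bytes when
// data is handed to consumers in `chunkSize`-byte blocks (ceiling division).
constexpr std::size_t DeliveryEvents(std::size_t fileSize,
                                     std::size_t chunkSize) {
  return (fileSize + chunkSize - 1) / chunkSize;
}

// A ~750kB image in 16kB blocks needs 47 round trips; in 256kB blocks only 3.
static_assert(DeliveryEvents(750 * 1024, 16 * 1024) == 47, "");
static_assert(DeliveryEvents(750 * 1024, 256 * 1024) == 3, "");

// A naive flood-control gate: allow at most `limit` deliveries in flight
// toward the main thread; further ones must back off and retry later.
class FloodGate {
  std::size_t mInFlight = 0;
  std::size_t mLimit;

 public:
  explicit FloodGate(std::size_t limit) : mLimit(limit) {}
  bool TryAcquire() {
    if (mInFlight >= mLimit) return false;  // main thread is busy, back off
    ++mInFlight;
    return true;
  }
  void Release() { --mInFlight; }  // called when the consumer has caught up
};
```

Real flood control would of course need to cooperate with the event loop rather than just refuse, but even this simple gate shows the shape of the fix for the floodgate regressions marked ** and *** below.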

1) Open a new tab and navigate to a page when the cache is already pre-warmed, i.e. the data is already fully in RAM.

2) Open a new tab and navigate to a page right after Firefox starts.

* I was testing with my blog’s home page. There are a few large images, ~750kB and ~600kB. Delivering data to upper-level consumers only in 16kB chunks causes this suffering – a ~750kB image alone takes about 47 round trips.

** This is an interesting regression. Sometimes with the new backend we delay first paint and the overall load time. It seems the cache engine is ‘too good’: it opens the floodgate too wide and overwhelms the main thread event queue. This needs more investigation.

*** Here it’s a combination of flooding the main thread with image loads, the slow image data load itself, and the fact that in this case we first paint only after all resources on the page have loaded – which needs to change. This is supported by the fact that cold-load first paint time is significantly faster on the microSD than on the SSD: the slow card apparently simulates flood control for us.