WebRender is a 2D renderer for the web. It started as Servo’s graphics engine, and we are in the process of integrating it into Firefox.

I have been meaning for a while to write about what WebRender is, how it works and how its architecture differs from what Firefox and other browsers currently do. To do that, I first wanted to provide some context and describe how Firefox (and other browser engines) work today, and that grew into a blog post of its own.

In this post we’ll go through a very high-level and simplified overview of what Gecko’s graphics pipeline looks like. We’ll see that there are striking similarities with the architecture of other browsers. That is not to say that all browsers work the same way (the devil is always in the details), but some of these similarities are – in my humble opinion – interesting, so I wanted to mention them.

Current web renderers

Broadly speaking, all major browser engines are designed around similar architectural concepts. In short:

The result of the layout computation is a tree of positioned elements in the page that we call the frame tree (some other engines call it the flow tree).

From the frame tree we generate a mostly flat list of drawing commands that we call the display list.

Portions of the display list are then painted into layers (which you can think of as the layers in a lot of image editing software). Painting is when the browser computes the color of actual pixels: going from information about what’s on the page to images that represent things on the page. This is traditionally done with immediate mode drawing libraries such as Cairo, Skia or Direct2D.

Layers are then combined together into one final image during the compositing phase.
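To make the stages above a bit more concrete, here is a toy sketch in Python. It is not how Gecko is implemented: the `Frame` and `DisplayItem` classes, the `layer` field and the three functions are names I made up for illustration, "pixels" are single characters, and assigning a layer directly on a frame is a shortcut (real engines decide what goes into which layer later, from the display list).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

Pixel = Optional[str]        # None means "transparent"
Surface = List[List[Pixel]]  # rows of pixels


@dataclass
class Frame:
    """A positioned box in the toy frame tree (position relative to its parent)."""
    x: int
    y: int
    width: int
    height: int
    color: str
    layer: int = 0  # hypothetical shortcut: which layer this frame paints into
    children: List["Frame"] = field(default_factory=list)


@dataclass
class DisplayItem:
    """One drawing command in the flat display list (absolute position)."""
    x: int
    y: int
    width: int
    height: int
    color: str
    layer: int


def build_display_list(frame: Frame, origin_x: int = 0, origin_y: int = 0) -> List[DisplayItem]:
    """Flatten the frame tree into drawing commands, parents before children."""
    x, y = origin_x + frame.x, origin_y + frame.y
    items = [DisplayItem(x, y, frame.width, frame.height, frame.color, frame.layer)]
    for child in frame.children:
        items.extend(build_display_list(child, x, y))
    return items


def paint_layers(items: List[DisplayItem], width: int, height: int) -> Dict[int, Surface]:
    """Painting: turn drawing commands into pixels, one surface per layer."""
    layers: Dict[int, Surface] = {}
    for item in items:
        surface = layers.setdefault(item.layer, [[None] * width for _ in range(height)])
        for row in range(max(item.y, 0), min(item.y + item.height, height)):
            for col in range(max(item.x, 0), min(item.x + item.width, width)):
                surface[row][col] = item.color
    return layers


def composite(layers: Dict[int, Surface], width: int, height: int) -> List[str]:
    """Compositing: stack the layers, lowest id first, into one final image."""
    screen = [["."] * width for _ in range(height)]
    for layer_id in sorted(layers):
        for row in range(height):
            for col in range(width):
                pixel = layers[layer_id][row][col]
                if pixel is not None:
                    screen[row][col] = pixel
    return ["".join(row) for row in screen]


page = Frame(0, 0, 6, 3, "a", children=[
    Frame(1, 1, 2, 1, "b"),           # painted into the same layer as the page
    Frame(4, 0, 2, 2, "c", layer=1),  # painted into its own layer
])
display_list = build_display_list(page)
layers = paint_layers(display_list, 6, 3)
screen = composite(layers, 6, 3)
print("\n".join(screen))
```

The point of the sketch is only the shape of the data flow: a tree goes in, a flat list comes out of it, the list is rasterized into a few retained surfaces, and those surfaces are combined at the end.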

In Firefox, layers are painted on the content process while compositing is performed on the compositor process.

Here is a rough (and over-simplified) sketch of Gecko’s graphics pipeline, from the DOM to the screen:

[Figure: Gecko’s graphics pipeline, from the DOM to the screen]

Notable differences and similarities between browser engines

This is the general idea but there are of course variations between browsers:

Display list

Some browsers skip the display list and instead paint layers directly off of the frame tree. This used to be the case for Chromium, for example, although I think they have been moving towards a display-list-like approach as part of their “slimming paint” project.

I find display lists handy because:

Respecting the painting order of elements while traversing the frame tree is hard, and being able to sort the display items helps. Generating the layer tree from the sorted display list is a lot easier than doing it from the frame tree because of complicated interactions between the rules of stacking contexts and z-ordering in the CSS specification (if you want to know more about this, look up the deliciously dreadful name the Chromium folks have given to this problem: the “fundamental compositing bug”).

A display list is a convenient data structure to compute invalidation (figuring out the smallest region of pixels that needs to be repainted when something changes). At least it has worked very well in Firefox, and Chromium’s invalidation design (also part of the slimming paint project) looks very similar.
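As an illustration of the invalidation point, here is a minimal sketch in Python of how one could diff two display lists to find what needs repainting. This is an assumption-laden toy and not Firefox’s or Chromium’s algorithm: `DisplayItem`, its `key` and `style` fields and `invalidated_rect` are hypothetical names, and it returns a single bounding rectangle rather than a true minimal region.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


@dataclass(frozen=True)
class DisplayItem:
    key: str      # hypothetical stable identifier for the item across frames
    bounds: Rect  # area of the page the item draws into
    style: str    # stand-in for everything else that affects its pixels


def union(a: Optional[Rect], b: Rect) -> Rect:
    """Smallest rectangle containing both a and b (a may be None)."""
    if a is None:
        return b
    left, top = min(a[0], b[0]), min(a[1], b[1])
    right = max(a[0] + a[2], b[0] + b[2])
    bottom = max(a[1] + a[3], b[1] + b[3])
    return (left, top, right - left, bottom - top)


def invalidated_rect(old_list: List[DisplayItem], new_list: List[DisplayItem]) -> Optional[Rect]:
    """Bounding rectangle of every item that was added, removed or changed."""
    old_items = {item.key: item for item in old_list}
    new_items = {item.key: item for item in new_list}
    dirty: Optional[Rect] = None
    for key in old_items.keys() | new_items.keys():
        before, after = old_items.get(key), new_items.get(key)
        if before == after:
            continue  # unchanged item: its pixels are still valid
        if before is not None:
            dirty = union(dirty, before.bounds)  # erase where it used to be
        if after is not None:
            dirty = union(dirty, after.bounds)   # draw where it is now
    return dirty


old = [
    DisplayItem("background", (0, 0, 100, 100), "white"),
    DisplayItem("button", (10, 10, 20, 10), "blue"),
    DisplayItem("caret", (50, 50, 1, 10), "black"),
]
new = [
    DisplayItem("background", (0, 0, 100, 100), "white"),
    DisplayItem("button", (10, 10, 20, 10), "red"),   # color changed
    DisplayItem("caret", (60, 50, 1, 10), "black"),   # moved to the right
]
print(invalidated_rect(old, new))
```

Because the display list is flat and each item carries its own bounds, the comparison is a simple lookup per item, which is much of what makes this structure convenient for invalidation.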

Compositing

Some browsers have their own compositor (Firefox, Chromium), while others (Edge, Safari) more closely integrate with the OS’s compositing window manager and delegate compositing layers to the latter. Nonetheless, all of these browsers have a notion of retained layers.

(Chromium appears to be moving towards using DirectComposition on Windows, which indicates they will delegate at least some of the compositing on Windows at some point.)

One thing that all major browsers have in common is the separation between painting and compositing. Rather than paint everything directly into the window, browsers paint into these intermediate surfaces that we call layers.

We have this separation because while painting can be expensive and sometimes hard to do at 60 frames per second, compositing is a relatively simple operation and fairly easy to run on the GPU. Browsers can hope to composite at a solid 60 frames per second and perform painting at a lower frequency if it can’t be done at the full frame rate.

With a compositor, scrolling is only a matter of moving a layer, and various other effects and animations can also be performed by the compositor.

Painting and compositing at different frequencies makes it possible for some of the most important user interactions to stay responsive and smooth even if a page is too complex to fully paint at a high enough frequency. It also prevents long paint times or JavaScript execution from affecting video playback.
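Here is a small Python sketch of why scrolling is cheap once layers are retained. Everything in it is hypothetical (the `Layer` class, its `scroll_y` and `paint_count` fields, and `composite` are made-up names, and rows of text stand in for rows of pixels); it only shows that the compositor can produce a new frame from an already painted layer without painting again.

```python
from typing import List


class Layer:
    """A retained surface: painted rarely, composited every frame."""

    def __init__(self, rows: List[str]) -> None:
        self.rows = rows      # pixels produced by the (expensive) paint step
        self.scroll_y = 0     # offset applied by the compositor
        self.paint_count = 1  # how many times the content had to be painted

    def repaint(self, rows: List[str]) -> None:
        """Only needed when the content itself changes."""
        self.rows = rows
        self.paint_count += 1


def composite(layer: Layer, viewport_height: int) -> List[str]:
    """Cheap per-frame step: copy the visible part of the retained layer."""
    width = len(layer.rows[0])
    frame = []
    for y in range(viewport_height):
        source = layer.scroll_y + y
        if 0 <= source < len(layer.rows):
            frame.append(layer.rows[source])
        else:
            frame.append(" " * width)  # nothing painted there yet
    return frame


content = Layer(["line0", "line1", "line2", "line3"])
first_frame = composite(content, viewport_height=2)

content.scroll_y = 1  # scrolling only moves the layer
second_frame = composite(content, viewport_height=2)

print(first_frame, second_frame, content.paint_count)
```

In this toy, scrolling by one row changes what ends up on screen while `paint_count` stays at 1, which is the property that lets compositing run at the full frame rate while painting happens less often.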

Closing note

We just had a very high-level overview of how Gecko renders web content today and we saw that most browsers have some similarities in their overall architecture. Time to have a look at this from a more historical point of view.

Browser rendering engines were initially designed quite a while back, when computers did not necessarily have a lot of cores or a GPU and websites were pretty simple. Computer hardware evolved, the web, which was initially a platform to present mostly static documents, turned into a real interactive application platform, and as a result browser engines evolved as well.

An example of such evolution is the separation of painting and compositing (which hasn’t always been there). We already mentioned how compositing is an appealing approach when there is a lot of scrolling involved.

With a combination of web content becoming more demanding, computers getting more cores, the decline of Moore’s law and people getting used to browsing the web at 60 frames per second, it later became necessary to move compositing to a separate thread to ensure long paint times or JS execution would not cause the compositor to miss frames. These days we are also moving painting itself off of the main thread where JS and layout are performed.

These are very welcome incremental evolutions that are making browsing the web a lot nicer. Overall, these evolutions have mostly been about keeping the same drawing model while taking it apart and moving the pieces onto several threads.

What if we designed and built a browser engine from scratch today? Some elements of the rendering pipeline would remain similar while some would certainly be done differently. One of the elements common to most browsers that in my opinion is most showing its age is the way we have been doing painting, and this will be the topic of the next post in this series.

Post scriptum: Want to know more about how web browsers work? Have a look at the Gecko overview wiki page.