Doesn’t CSS seem like magic? Well, in this third installment of “URL to Interactive” we’ll look at the journey that your browser goes through to take your CSS from braces to pixels. As a bonus, we’ll also quickly touch on how end-user interaction affects this process. We have a lot of ground to cover, so grab a cup of <insert your favorite drink’s name here>, and let’s get going.


Similar to what we learned about HTML in “Tags to DOM,” once CSS is downloaded by the browser, the CSS parser is spun up to handle any CSS that it encounters. This can be CSS within individual documents, inside of <style> tags, or inline within the style attribute of a DOM element. All the CSS is parsed out and tokenized in accordance with the syntax specification. At the end of this process, we have a data structure with all the selectors, properties, and properties’ respective values.

For example, consider the following CSS:

.fancy-button {
  background: green;
  border: 3px solid red;
  font-size: 1em;
}

That will result in the following data structure for easy utilization later in the process:

| Selector | Property | Value |
| --- | --- | --- |
| .fancy-button | background-color | rgb(0,128,0) |
| .fancy-button | border-width | 3px |
| .fancy-button | border-style | solid |
| .fancy-button | border-color | rgb(255,0,0) |
| .fancy-button | font-size | 1em |

One thing that is worth noting is that the browser exploded the shorthands of background and border into their longhand variants, as shorthands are primarily for developer ergonomics; the browser only deals with the longhands from here on.
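To make the shorthand expansion concrete, here is a minimal TypeScript sketch. The `expandBorder` function and the `Declaration` type are hypothetical names invented for illustration, and the parsing only covers the simple "width style color" form with a numeric width; a real engine follows the full grammar in the specification.

```typescript
// Hypothetical sketch: expand a `border` shorthand value into its longhands.
// Only handles the simple "<width> <style> <color>" form with a numeric width.
type Declaration = { selector: string; property: string; value: string };

const BORDER_STYLES = ["none", "solid", "dashed", "dotted", "double"];

function expandBorder(selector: string, value: string): Declaration[] {
  // Initial values apply to any part the shorthand omits.
  let width = "medium";
  let style = "none";
  let color = "currentcolor";
  for (const part of value.trim().split(/\s+/)) {
    if (/^\d/.test(part)) width = part;
    else if (BORDER_STYLES.indexOf(part) !== -1) style = part;
    else color = part;
  }
  return [
    { selector, property: "border-width", value: width },
    { selector, property: "border-style", value: style },
    { selector, property: "border-color", value: color },
  ];
}
```

Calling `expandBorder(".fancy-button", "3px solid red")` yields three rows, one per longhand, in the same shape as the data structure described above.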

After this is done, the engine continues constructing the DOM tree, which Travis Leithead also covers in “Tags to DOM”; go read that now if you haven’t already. I’ll wait.

Now that we have parsed out all styles within the readily available content, it’s time to do style computation on them. Every value has a standardized computed form that the browser tries to reduce it to. When leaving the computation stage, any dimensional values are reduced to one of three possible outputs: auto, a percentage, or a pixel value. For clarity, let’s take a look at a few examples of what the web developer wrote and what the result will be following computation:

| Web Developer | Computed Value |
| --- | --- |
| font-size: 1em | font-size: 16px |
| width: 50% | width: 50% |
| height: auto | height: auto |
| width: 506.4567894321568px | width: 506.46px |
| line-height: calc(10px + 2em) | line-height: 42px |
| border-color: currentColor | border-color: rgb(0,0,0) |
| height: 50vh | height: 540px |
| display: grid | display: grid |
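A few of these reductions can be sketched in TypeScript. This is only an illustration: `computeLength` and `Context` are hypothetical names, the sketch handles just `px`, `em`, `vh`, percentages, and `auto`, and the context values (a 16px font size and a 1080px-tall viewport) are assumptions chosen to reproduce the numbers in the examples above.

```typescript
// Hypothetical context: values a real engine knows from the element and viewport.
interface Context {
  fontSizePx: number;       // the font size that em units resolve against
  viewportHeightPx: number; // the height that vh units resolve against
}

function computeLength(value: string, ctx: Context): string {
  // auto and percentages survive computation; layout resolves them later.
  if (value === "auto" || value.slice(-1) === "%") return value;
  const match = /^(-?\d*\.?\d+)(px|em|vh)$/.exec(value);
  if (!match) throw new Error(`unsupported value: ${value}`);
  const amount = parseFloat(match[1]);
  const unit = match[2];
  let px: number;
  if (unit === "em") px = amount * ctx.fontSizePx;
  else if (unit === "vh") px = (amount / 100) * ctx.viewportHeightPx;
  else px = amount;
  // Round to two decimals, matching the precision shown in the examples.
  return `${Math.round(px * 100) / 100}px`;
}
```

With `{ fontSizePx: 16, viewportHeightPx: 1080 }`, `1em` becomes `16px` and `50vh` becomes `540px`, while `50%` and `auto` pass through untouched.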

Now that we’ve computed all the values in our data store, it’s time to handle the cascade.

Since the CSS can come from a variety of sources, the browser needs a way to determine which styles should apply to a given element. To do this, the browser uses a formula called specificity, which counts the number of tags, classes, ids, and attribute selectors utilized in the selector, as well as the number of !important declarations present. Styles on an element via the inline style attribute are given a rank that wins over any style from within a <style> block or external style sheet. And if a web developer utilizes !important on a value, the value will win over any CSS no matter its location, unless there is a !important inline as well.





To make this clear, let’s show a few selectors and their resulting specificity scores:

| Selector | Specificity Score |
| --- | --- |
| li | 0 0 0 0 1 |
| li.foo | 0 0 0 1 1 |
| #comment li.foo.bar | 0 0 1 2 1 |
| <li style="color: red"> | 0 1 0 0 0 |
| color: red !important | 1 0 0 0 0 |
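The id, class, and tag counts in these scores can be approximated with a short TypeScript sketch. The `specificity` and `compareSpecificity` functions are hypothetical, and the regular expressions only understand ids, classes, attribute selectors, and tag names; pseudo-classes, pseudo-elements, and the inline and !important columns are deliberately left out.

```typescript
// [ids, classes and attributes, tags]
type Specificity = [number, number, number];

function specificity(selector: string): Specificity {
  const ids = (selector.match(/#[\w-]+/g) ?? []).length;
  const classes = (selector.match(/\.[\w-]+|\[[^\]]*\]/g) ?? []).length;
  // Strip ids, classes, and attributes, then count the remaining tag names.
  const rest = selector.replace(/#[\w-]+|\.[\w-]+|\[[^\]]*\]/g, " ");
  const tags = (rest.match(/[a-zA-Z][\w-]*/g) ?? []).length;
  return [ids, classes, tags];
}

// Positive when `a` is more specific than `b`, negative when less, 0 on a tie.
function compareSpecificity(a: Specificity, b: Specificity): number {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}
```

Comparing column by column from the left is what makes a single id outweigh any number of classes in this model.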

So what does the engine do when the specificity is tied? Given two or more selectors of equal specificity, the winner will be whichever one appears last in the document. In the following example, the div would have a blue background.

div {
  background: red;
}
div {
  background: blue;
}

Let’s expand on our .fancy-button example a little bit:

.fancy-button {
  background: green;
  border: 3px solid red;
  font-size: 1em;
}
div .fancy-button {
  background: yellow;
}

Now the CSS will produce the following data structure. We’ll continue building upon this throughout the article.

| Selector | Property | Value | Specificity Score | Document Order |
| --- | --- | --- | --- | --- |
| .fancy-button | background-color | rgb(0,128,0) | 0 0 0 1 0 | 0 |
| .fancy-button | border-width | 3px | 0 0 0 1 0 | 1 |
| .fancy-button | border-style | solid | 0 0 0 1 0 | 2 |
| .fancy-button | border-color | rgb(255,0,0) | 0 0 0 1 0 | 3 |
| .fancy-button | font-size | 16px | 0 0 0 1 0 | 4 |
| div .fancy-button | background-color | rgb(255,255,0) | 0 0 0 1 1 | 5 |

Understanding origins

In “Server to Client,” Ali Alabbas discusses origins as they relate to browser navigation. In CSS, there are also origins, but they serve different purposes:

user: any styles set globally within the user agent by the user;

author: the web developer’s styles;

and user agent: anything that can utilize and render CSS (to most web developers and users, this is a browser).

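The cascade gives each of these origins a different weight. In the simplified model used here, the user’s preferences carry the most weight, then the author’s styles, and finally the user agent’s defaults. (The specification is more nuanced: for normal declarations, author styles override user styles, and the order reverses for !important declarations.) Let’s expand our dataset a bit further and see what happens when the user sets their browser’s font size to a minimum of 2em: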

| Origin | Selector | Property | Value | Specificity Score | Document Order |
| --- | --- | --- | --- | --- | --- |
| Author | .fancy-button | background-color | rgb(0,128,0) | 0 0 0 1 0 | 0 |
| Author | .fancy-button | border-width | 3px | 0 0 0 1 0 | 1 |
| Author | .fancy-button | border-style | solid | 0 0 0 1 0 | 2 |
| Author | .fancy-button | border-color | rgb(255,0,0) | 0 0 0 1 0 | 3 |
| Author | .fancy-button | font-size | 16px | 0 0 0 1 0 | 4 |
| Author | div .fancy-button | background-color | rgb(255,255,0) | 0 0 0 1 1 | 5 |
| User | * | font-size | 32px | 0 0 0 0 0 | 0 |

Doing the cascade

When the browser has a complete data structure of all declarations from all origins, it will sort them in accordance with specification. First it will sort by origin, then by specificity, and finally, by document order.

| Origin ⬆ | Selector | Property | Value | Specificity Score ⬆ | Document Order ⬇ |
| --- | --- | --- | --- | --- | --- |
| User | * | font-size | 32px | 0 0 0 0 0 | 0 |
| Author | div .fancy-button | background-color | rgb(255,255,0) | 0 0 0 1 1 | 5 |
| Author | .fancy-button | background-color | rgb(0,128,0) | 0 0 0 1 0 | 0 |
| Author | .fancy-button | border-width | 3px | 0 0 0 1 0 | 1 |
| Author | .fancy-button | border-style | solid | 0 0 0 1 0 | 2 |
| Author | .fancy-button | border-color | rgb(255,0,0) | 0 0 0 1 0 | 3 |
| Author | .fancy-button | font-size | 16px | 0 0 0 1 0 | 4 |

This results in the “winning” properties and values for the .fancy-button (the higher up in the table, the better). For example, from the previous table, you’ll note that the user’s browser preference settings take precedence over the web developer’s styles. Now the browser finds all DOM elements that match the denoted selectors, and hangs the resulting computed styles off the matching elements, in this case a div for the .fancy-button:

| Property | Value |
| --- | --- |
| font-size | 32px |
| background-color | rgb(255,255,0) |
| border-width | 3px |
| border-color | rgb(255,0,0) |
| border-style | solid |
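The sort described in this section (origin first, then specificity, then document order) can be sketched in TypeScript. The `Decl` type and the `cascade` function are hypothetical, and `ORIGIN_RANK` encodes this article's simplified ranking in which the user's preferences win; it is an assumption of the sketch, not the full set of rules from the specification.

```typescript
type Origin = "user" | "author" | "user-agent";

interface Decl {
  origin: Origin;
  property: string;
  value: string;
  specificity: [number, number, number]; // ids, classes, tags
  order: number;                         // document order within its origin
}

// Assumption: the article's simplified ranking, where a higher number wins.
const ORIGIN_RANK: Record<Origin, number> = { "user-agent": 0, author: 1, user: 2 };

// Positive when `a` beats `b`.
function compareDecl(a: Decl, b: Decl): number {
  if (a.origin !== b.origin) return ORIGIN_RANK[a.origin] - ORIGIN_RANK[b.origin];
  for (let i = 0; i < 3; i++) {
    if (a.specificity[i] !== b.specificity[i]) return a.specificity[i] - b.specificity[i];
  }
  return a.order - b.order; // later in the document wins a tie
}

// Returns the winning value for each property.
function cascade(decls: Decl[]): Record<string, string> {
  const winners: Record<string, Decl> = {};
  for (const d of decls) {
    const current = winners[d.property];
    if (current === undefined || compareDecl(d, current) > 0) winners[d.property] = d;
  }
  const result: Record<string, string> = {};
  for (const property of Object.keys(winners)) result[property] = winners[property].value;
  return result;
}
```

Fed the declarations from the tables above, this picks the user's `font-size` and the more specific `div .fancy-button` background, which matches the winning values listed for the button.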

If you wish to learn more about how the cascade works, take a look at the official specification.

CSS Object Model

While we’ve done a lot up to this stage, we’re not done yet. Now we need to update the CSS Object Model (CSSOM). The CSSOM resides within document.styleSheets, and we need to update it so that it represents everything that has been parsed and computed up to this point.

Web developers may utilize this information without even realizing it. For example, when calling into getComputedStyle(), the same process denoted above is run, if necessary.
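As a small illustration of reading the CSSOM, the TypeScript sketch below walks a list of style sheets and collects the selectors it finds. `listSelectors`, `RuleLike`, and `SheetLike` are hypothetical names; the two interfaces are minimal stand-ins for the browser's own sheet and rule objects so that the function can also be exercised outside a browser.

```typescript
// Minimal structural types so the sketch also runs outside a browser.
interface RuleLike { cssText: string; selectorText?: string }
interface SheetLike { cssRules: ArrayLike<RuleLike> }

// Collect the selector of every top-level style rule in a list of style sheets.
function listSelectors(sheets: ArrayLike<SheetLike>): string[] {
  const selectors: string[] = [];
  for (let i = 0; i < sheets.length; i++) {
    const rules = sheets[i].cssRules;
    for (let j = 0; j < rules.length; j++) {
      const selector = rules[j].selectorText;
      // Rules without a selector, such as @media or @keyframes, are skipped.
      if (selector !== undefined) selectors.push(selector);
    }
  }
  return selectors;
}
```

In a browser, the intent is to pass document.styleSheets to this function; note that reading cssRules on a cross-origin style sheet can throw, which this sketch does not handle.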

Now that we have a DOM tree with styles applied, it’s time to begin the process of building up a tree for visual purposes. This tree is present in all modern engines and is referred to as the box tree. In order to construct this tree, we traverse down the DOM tree and create zero or more CSS boxes, each having a margin, border, padding and content box.

In this section, we’ll be discussing the following CSS layout concepts:

Formatting context (FC): there are many types of formatting contexts, most of which web developers invoke by changing the display value for an element. Some of the most common formatting contexts are block (block formatting context, or BFC), flex, grid, table-cells, and inline. Some other CSS can force a new formatting context, too, such as position: absolute, using float, or utilizing multi-column.

Containing block: this is the ancestor block that you resolve styles against.

Inline direction: this is the direction in which text is laid out, as dictated by the element’s writing mode. In Latin-based languages this is the horizontal axis, and in CJK languages this is the vertical axis.

Block direction: this behaves exactly the same as the inline direction but is perpendicular to that axis. So, for Latin-based languages this is the vertical axis, and in CJK languages this is the horizontal axis.

Resolving auto

Remember from the computation phase that dimension values can be one of three values: auto, percentage, or pixel. The purpose of layout is to size and position all the boxes in the box tree to get them ready for painting. As a very visual person myself, I find examples can make it easier to understand how the box tree is constructed. To make it easier to follow, I will not be showing the individual CSS boxes, just the principal box. Let’s look at a basic “Hello world” layout using the following code:

<body>
  <p>Hello world</p>
  <style>
    body {
      width: 50px;
    }
  </style>
</body>

The browser starts at the body element. We produce its principal box, which has a width of 50px, and a default height of auto.

Now the browser moves on to the paragraph and produces its principal box, and since paragraphs have a margin by default, this will impact the height of the body, as reflected in the visual.

Now the browser moves onto the text of “Hello world,” which is a text node in the DOM. As such, we produce a line box inside of the layout. Notice that the text has overflowed the body. We’ll handle this in the next step.

Because “world” does not fit and we haven’t changed the overflow property from its default, the engine reports back to its parent where it left off in laying out the text.

Since the parent has received a token that its child wasn’t able to complete the layout of all the content, it clones the line box, which includes all the styles, and passes the information for that box to complete the layout. Once the layout is complete, the browser walks back up the box tree, resolving any auto or percentage-based values that haven’t been resolved. In the image, you can see that the body and the paragraph are now encompassing all of “Hello world” because their height was set to auto.
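The line-breaking walk above can be approximated with a toy TypeScript sketch. `breakIntoLines` and `autoHeightPx` are hypothetical helpers, and the sketch assumes every character has the same width, which real engines do not: they measure actual glyphs and apply far richer line-breaking rules.

```typescript
// Assumption: every character is `charWidthPx` wide; real engines measure glyphs.
function breakIntoLines(text: string, availableWidthPx: number, charWidthPx: number): string[] {
  const lines: string[] = [];
  let current = "";
  for (const word of text.split(/\s+/)) {
    if (word === "") continue;
    const candidate = current === "" ? word : current + " " + word;
    if (current !== "" && candidate.length * charWidthPx > availableWidthPx) {
      lines.push(current); // this line box is full; continue in a new one
      current = word;
    } else {
      current = candidate;
    }
  }
  if (current !== "") lines.push(current);
  return lines;
}

// An auto height can only be resolved once the number of lines is known.
function autoHeightPx(lines: string[], lineHeightPx: number): number {
  return lines.length * lineHeightPx;
}
```

With a 50px-wide body and an assumed 8px per character, “Hello world” ends up on two lines, and only then can the auto height of the paragraph and body be resolved.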

Dealing with floats

Now let’s get a little bit more complex. We’ll take a normal layout where we have a button that says “Share It,” and float it to the left of a paragraph of Latin text. The float itself is what is considered to be a “shrink-to-fit” context. It is called “shrink-to-fit” because the box will shrink down around its content if the dimensions are auto. Float boxes are one type of box that matches this layout type, but there are many others, such as absolutely positioned boxes (including position: fixed elements) and table cells with auto-based sizing.

Here is the code for our button scenario:

<article>
  <button>SHARE IT</button>
  <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam pellentesq</p>
</article>
<style>
  article {
    min-width: 400px;
    max-width: 800px;
    background: rgb(191, 191, 191);
    padding: 5px;
  }
  button {
    float: left;
    background: rgb(210, 32, 79);
    padding: 3px 10px;
    border: 2px solid black;
    margin: 5px;
  }
  p {
    margin: 0;
  }
</style>

The process starts off by following the same pattern as our “Hello world” example, so I’m going to skip to where we begin handling the floated button.

Since a float creates a new block formatting context (BFC) and is a shrink-to-fit context, the browser does a specific type of layout called content measure. In this mode, it looks identical to the other layout but with an important difference, which is that it is done in infinite space. What the browser does during this phase is lay out the tree of the BFC in both its largest and smallest widths. In this case, it is laying out a button with text, so its narrowest size, including all other CSS boxes, will be the size of the longest word. At its widest, it will be all of the text on one line, with the addition of the CSS boxes. Note: The color of the buttons here is not literal. It is for illustrative purposes only.

Now that we know that the minimum width is 86px, and the maximum width is 115px, we pass this information back to the parent box for it to decide the width and to place the button appropriately. In this scenario, there is space to fit the float at max size so that is how the button is laid out.
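The parent's decision can be sketched with the shrink-to-fit formula from CSS 2.1, which clamps the available width between the two measured widths. In this TypeScript sketch, `contentMeasure` is a hypothetical helper that assumes fixed-width characters plus a fixed allowance for padding, border, and margin; only `shrinkToFit` reflects the formula itself.

```typescript
// Assumption: fixed-width characters and a fixed `extraPx` for padding, border, and margin.
function contentMeasure(text: string, charWidthPx: number, extraPx: number) {
  const words = text.split(/\s+/).filter((w) => w !== "");
  let longest = 0;
  for (const w of words) longest = Math.max(longest, w.length);
  return {
    minPx: longest * charWidthPx + extraPx,                 // narrowest: the longest word
    maxPx: words.join(" ").length * charWidthPx + extraPx,  // widest: all text on one line
  };
}

// CSS 2.1 shrink-to-fit: min(max(minimum width, available width), preferred width).
function shrinkToFit(minPx: number, maxPx: number, availablePx: number): number {
  return Math.min(Math.max(minPx, availablePx), maxPx);
}
```

Using the widths from the walkthrough, `shrinkToFit(86, 115, 400)` returns 115: there is room for the float at its maximum size, so that is how the button is laid out.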

In order to ensure that the browser adheres to the standard and the content wraps around the float, the browser changes the geometry of the article BFC. This geometry is passed to the paragraph to use during its layout.

From here the browser follows the same layout process as it did in our first example—but it ensures that any inline content’s inline and block starting positions are outside of the constraint space taken up by the float.

As the browser continues walking down the tree and cloning nodes, it moves past the block position of the constraint space. This allows the final line of text (as well as the one before it) to begin at the start of the content box in the inline direction. And then the browser walks back up the tree, resolving auto and percentage values as necessary.

Understanding fragmentation

One final aspect to touch on for how layout works is fragmentation. If you’ve ever printed a web page or used CSS Multi-column, then you’ve taken advantage of fragmentation. Fragmentation is the logic of breaking content apart to fit it into a different geometry. Let’s take a look at the same example utilizing CSS Multi-column:

<body>
  <div>
    <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Cras nibh orci, tincidunt eget enim et, pellentesque condimentum risus. Aenean sollicitudin risus velit, quis tempor leo malesuada vel. Donec consequat aliquet mauris. Vestibulum ante ipsum primis in faucibus </p>
  </div>
  <style>
    body {
      columns: 2;
      column-fill: auto;
      height: 300px;
    }
  </style>
</body>

Once the browser reaches the multicol formatting context box, it sees that it has a set number of columns.

It follows a similar cloning model from before, and creates a fragmentainer with the correct dimensions to adhere to the author’s desire for their columns.

The browser then lays out as many lines as possible by following the same pattern as before. Then the browser creates another fragmentainer and continues the layout to completion.
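Filling one fragmentainer and then starting the next can be sketched as chunking already laid-out lines by how many fit in a column. The `fragment` function below is a hypothetical TypeScript illustration that assumes a uniform line height and ignores break rules such as widows and orphans.

```typescript
// Split laid-out lines across fragmentainers (columns) of a fixed height.
function fragment(lines: string[], columnHeightPx: number, lineHeightPx: number): string[][] {
  // At least one line per column, so the loop below always makes progress.
  const linesPerColumn = Math.max(1, Math.floor(columnHeightPx / lineHeightPx));
  const columns: string[][] = [];
  for (let i = 0; i < lines.length; i += linesPerColumn) {
    columns.push(lines.slice(i, i + linesPerColumn)); // one fragmentainer per chunk
  }
  return columns;
}
```

For example, five lines at an assumed 100px line height in a 300px-tall column produce one fragmentainer with three lines and a second with the remaining two.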

OK, so let’s recap where we are at this point. We’ve taken all the CSS content, parsed it, cascaded it onto the DOM tree, and completed layout. But we haven’t applied color, borders, shadows, and similar design treatments to the layout. Adding these is known as painting.

Painting is roughly standardized by CSS, and to put it concisely (you can read the full breakdown in CSS 2.2 Appendix E), you paint in the following order:

background;

border;

and content.

So if we take our “SHARE IT” button from earlier and follow this process, it will look something like this:





Once this is completed, it is converted to a bitmap. That’s right—ultimately every layout element (even text) becomes an image under the hood.

Concerning the z-index

Now, most of our websites don’t consist of a single element. Moreover, we often want to have certain elements appear on top of other elements. To accomplish this, we can harness the power of the z-index to superimpose one element over another. This may feel like how we work with layers in our design software, but the only layers that exist are within the browser’s compositor. It might seem as though we’re creating new layers using z-index, but we’re not. So what are we doing?

What we’re doing is creating a new stacking context. Creating a new stacking context effectively changes the order in which you paint elements. Let’s look at an example:

<body>
  <div id="one">
    Item 1
  </div>
  <div id="two">
    Item 2
  </div>
  <style>
    body {
      background: lightgray;
    }
    div {
      width: 300px;
      height: 300px;
      position: absolute;
      background: white;
      z-index: 2;
    }
    #two {
      background: green;
      z-index: 1;
    }
  </style>
</body>

Without z-index utilization, the document above would be painted in document order, which would place “Item 2” on top of “Item 1.” But because of the z-index, the painting order is changed. Let’s step through each phase, similar to how we stepped through our earlier layouts.

The browser starts with the root box; we paint in the background.

The browser then traverses, out of document order, to the lower-level stacking context (which in this case is “Item 2”) and begins to paint that element following the same rules from above.

Then it traverses to the next highest stacking context (which in this case is “Item 1”) and paints it according to the order defined in CSS 2.2.

The z-index has no bearing on color, just which element is visible to users, and hence, which text and color is visible.
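The change in painting order can be sketched as a sort: lower z-index values paint first, and document order breaks ties. The `Box` type and `paintOrder` function are hypothetical, and this TypeScript sketch covers only sibling stacking contexts like the two in the example; nested stacking contexts and the other painting layers in CSS 2.2 Appendix E are out of scope.

```typescript
interface Box { id: string; zIndex: number; documentOrder: number }

// Returns box ids in the order they are painted (last painted ends up on top).
function paintOrder(boxes: Box[]): string[] {
  return boxes
    .slice() // copy so the input stays in document order
    .sort((a, b) => a.zIndex - b.zIndex || a.documentOrder - b.documentOrder)
    .map((box) => box.id);
}
```

For the example above, “two” (z-index 1) paints before “one” (z-index 2), so “Item 1” ends up on top even though it comes first in the document.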

At this stage, we have a minimum of a single bitmap that is passed from painting to the compositor. The compositor’s job is to create a layer, or layers, and render the bitmap(s) to the screen for the end user to see.

A reasonable question to ask at this point is, “Why would any site need more than one bitmap or compositor layer?” Well, with the examples that we’ve looked at thus far, we really wouldn’t. But let’s look at an example that’s a little bit more complex. Let’s say that in a hypothetical world, the Office team wants to bring Clippy back online, and they want to draw attention to Clippy by having him pulsate via a CSS transform.

The code for animating Clippy could look something like this:

<div class="clippy"></div>
<style>
  .clippy {
    width: 100px;
    height: 100px;
    animation: pulse 1s infinite;
    background: url(clippy.svg);
  }
  @keyframes pulse {
    from {
      transform: scale(1, 1);
    }
    to {
      transform: scale(2, 2);
    }
  }
</style>

When the browser reads that the web developer wants to animate Clippy on an infinite loop, it has two options:

It can go back to the repaint stage for every frame of the animation, and produce a new bitmap to send back to the compositor.

Or it can produce two different bitmaps, and allow the compositor to do the animation itself on only the layer that has this animation applied.

In most circumstances, the browser will choose option two and produce the following (I have purposefully simplified the number of layers Word Online would produce for this example):





Then it will re-compose the Clippy bitmap in the correct position and handle the pulsating animation. This is a great win for performance as in many engines the compositor is on its own thread, and this allows the main thread to be unblocked. If the browser were to choose option one above, it would have to block on every frame to accomplish the same result, which would negatively impact performance and responsiveness for the end user.
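One way to picture the choice between the two options is to ask which stage of the pipeline a changed property forces the browser to rerun. The TypeScript sketch below is purely illustrative: the `STAGE_FOR_PROPERTY` table is a small, assumed sample (with transform and opacity treated as compositor-only, which is common but engine-dependent), and real engines make this decision internally with far more inputs.

```typescript
type Stage = "layout" | "paint" | "composite";

// Assumption: a small, illustrative table; real engines decide this internally.
const STAGE_FOR_PROPERTY: Record<string, Stage> = {
  width: "layout",
  height: "layout",
  "font-size": "layout",
  background: "paint",
  color: "paint",
  transform: "composite",
  opacity: "composite",
};

// Ordered from cheapest to most expensive stage to rerun.
const STAGE_ORDER: Stage[] = ["composite", "paint", "layout"];

// The stage to restart from is the most expensive one any changed property needs.
function stageToRerun(changedProperties: string[]): Stage {
  let worst = 0;
  for (const property of changedProperties) {
    const known = STAGE_FOR_PROPERTY[property];
    const stage: Stage = known === undefined ? "layout" : known; // unknown: assume the worst
    worst = Math.max(worst, STAGE_ORDER.indexOf(stage));
  }
  return STAGE_ORDER[worst];
}
```

Under these assumptions, Clippy's transform-only animation maps to the composite stage, which is why it can run without going back through layout and paint on every frame.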





Creating the illusion of interactivity

As we’ve just learned, we took all the styles and the DOM, and produced an image that we rendered to the end user. So how does the browser create the illusion of interactivity? Let’s take a look at an example using our handy “SHARE IT” button:

button {
  float: left;
  background: rgb(210, 32, 79);
  padding: 3px 10px;
  border: 2px solid black;
}
button:hover {
  background: teal;
  color: black;
}

All we’ve added here is a pseudo-class that tells the browser to change the button’s background and text color when the user hovers over the button. This raises the question, how does the browser handle this?

The browser constantly tracks a variety of inputs, and while those inputs are moving it goes through a process called hit testing. For this example, the process looks like this:





1. The user moves the mouse over the button.
2. The browser fires an event that the mouse has been moved and goes into the hit testing algorithm, which essentially asks the question, “What box(es) is the mouse touching?”
3. The algorithm returns the box that is linked to our “SHARE IT” button.
4. The browser asks the question, “Is there anything I should do since a mouse is hovering over you?”
5. It quickly runs style/cascade for this box and its children and determines that, yes, there is a :hover pseudo-class with paint-only style adjustments inside of the declaration block.
6. It hangs those styles off of the DOM element (as we learned in the cascade phase), which is the button in this case.
7. It skips past layout and goes directly to painting a new bitmap.
8. The new bitmap is passed off to the compositor and then to the user.
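The hit testing question (“What box(es) is the mouse touching?”) can be sketched as a point-in-rectangle search over boxes in paint order. The `HitBox` type and `hitTest` function are hypothetical, the coordinates in the usage note are made up, and this TypeScript sketch ignores transforms, clipping, and other details a real engine has to consider.

```typescript
interface HitBox { id: string; x: number; y: number; width: number; height: number }

// `boxes` is assumed to be in paint order, so the last match is the topmost box.
function hitTest(boxes: HitBox[], pointX: number, pointY: number): string | null {
  for (let i = boxes.length - 1; i >= 0; i--) {
    const b = boxes[i];
    const inside =
      pointX >= b.x && pointX < b.x + b.width &&
      pointY >= b.y && pointY < b.y + b.height;
    if (inside) return b.id;
  }
  return null; // the pointer is not over any box
}
```

Given an article box at (0, 0) sized 400 by 100 and a button box at (5, 5) sized 115 by 30, a pointer at (10, 10) resolves to the button, while a pointer at (300, 50) resolves to the article.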

To the user, this effectively creates the perception of interactivity, even though the browser is just swapping a red image for a teal one.

Hopefully this has removed some of the mystery from how CSS goes from the braces you’ve written to rendered pixels in your browser.

In this leg of our journey, we discussed how CSS is parsed, how values are computed, and how the cascade actually works. Then we dove into a discussion of layout, painting, and composition.

Now stay tuned for the final installment of this series, where one of the designers of the JavaScript language itself will discuss how browsers compile and execute our JavaScript.