The disembodied head of Matt Brubeck over at limpet.net has been producing an excellent series of blog posts chronicling the exercise of creating a toy layout engine for a web browser. Matt works for Mozilla on Servo, which is about as close to the bright center of the universe as any open source project is these days. He has taken on this exercise to bolster his own mental model of how layout engines work – so that he can more effectively contribute to Servo – and, apparently, to motivate others to share the journey.

Do go read Matt’s first post in the series. He does a fabulous job of explaining the goal and its motivation, and of setting expectations for the scope of the effort. Even if you are not interested in doing this particular exercise, you might find that the way that he thinks about setting up the challenge will help you in other pursuits.

Matt has chosen Rust as the programming language for this exercise, largely because Servo is written in it. 2014 has been an exciting year for emerging programming languages, and Rust may be the single most exhilarating one. However, I have been on a bit of a Swift kick[1] as of late, so I chose to use this as an opportunity both to learn how browser sausage is made and to practice some Swift. My own toy project is called Crow.

I was going to put a litany of caveats here, but I decided to spare you. You’re welcome.

First Step: The Central Model

The central data model for a browser is a tree of nodes representing the HTML document. Of course there is a detailed specification that spells out what a real implementation needs to be like. This is just a learning exercise, so simplicity is a higher priority than completeness.

This might be a good time to open Matt’s article in another browser window so you can compare his Rust implementation to the Swift below.

First we need a Node type so we can build the DOM tree. I’m going to use let bindings wherever I can, because I have grown to appreciate immutability. Each node has a NodeType, and can also have any number of children.

    public struct Node {
        // data common to all nodes:
        public let children: [Node]
        // data specific to each node type:
        public let nodeType: NodeType
    }

Just as in Matt’s toy project, the NodeType is represented as an enum. There are two possible cases here: either the node represents an HTML element, complete with a tag name and a set of attributes, or it represents a string of characters between HTML tags.

    public enum NodeType {
        case Element(ElementData)
        case Text(String)
    }
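To see what the two cases buy us, here is a minimal sketch of pattern matching on a NodeType with a switch. The describe function is a hypothetical helper of mine, not part of Crow or of Matt’s code, and the ElementData struct is borrowed from the definition that follows in this post so that the snippet compiles on its own. Current Swift style prefers lowercase case names; the sketch keeps the capitalized spelling used here.

```swift
// Types repeated from the post so the sketch is self-contained.
public typealias AttrMap = [String: String]

public struct ElementData {
    public let tagName: String
    public let attributes: AttrMap
}

public enum NodeType {
    case Element(ElementData)
    case Text(String)
}

// Hypothetical helper: summarize a node type in one line.
// The switch must handle both cases, so the compiler catches any we forget.
func describe(_ nodeType: NodeType) -> String {
    switch nodeType {
    case .Element(let data):
        return "<\(data.tagName)> with \(data.attributes.count) attribute(s)"
    case .Text(let text):
        return "text: \"\(text)\""
    }
}

let heading = NodeType.Element(ElementData(tagName: "h1", attributes: ["class": "title"]))
let greeting = NodeType.Text("Hello")

print(describe(heading))
print(describe(greeting))
```

Because each case carries its own payload, there is no way to ask a text node for its tag name by accident; the data only exists inside the matching branch.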

When a NodeType represents an HTML element, it needs a data structure to hold the tag name and the attributes. The attributes are represented as a Dictionary<String,String>. Like Matt, we’ll make a type alias for that:

    public struct ElementData {
        public let tagName: String
        public let attributes: AttrMap
    }

    public typealias AttrMap = [String:String]
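Since the attributes are just a dictionary, looking one up is a plain subscript that returns an optional. The sketch below shows one way that might be wrapped; the id and classes properties are hypothetical conveniences I made up for illustration, and they assume that class names are separated by single spaces.

```swift
// Types repeated from the post so the sketch is self-contained.
public typealias AttrMap = [String: String]

public struct ElementData {
    public let tagName: String
    public let attributes: AttrMap
}

// Hypothetical conveniences, not part of the code in this post.
extension ElementData {
    // The value of the id attribute, or nil when the element has none.
    var id: String? {
        return attributes["id"]
    }

    // The space-separated names in the class attribute, as a set.
    var classes: Set<String> {
        guard let value = attributes["class"] else { return [] }
        return Set(value.split(separator: " ").map { String($0) })
    }
}

let link = ElementData(tagName: "a", attributes: ["id": "home", "class": "nav active"])
let plain = ElementData(tagName: "p", attributes: [:])

print(link.id ?? "no id")
print(link.classes.sorted())
print(plain.id ?? "no id")
```

A missing attribute surfaces as nil (or an empty set) rather than a crash, which suits a toy engine that will be fed sloppy markup.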

Finally, Matt offers a pair of Node constructor functions for making instances. Swift has special syntax for constructors (it calls them initializers and declares them with init), and it allows us to define them separately from the struct in an extension block.

    extension Node {
        public init(data: String) {
            self.children = []
            self.nodeType = .Text(data)
        }

        public init(name: String, attrs: AttrMap, children: [Node]) {
            self.children = children
            let data = ElementData(tagName: name, attributes: attrs)
            self.nodeType = .Element(data)
        }
    }
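To check that the pieces fit together, here is a small sketch that builds a tree by hand with the two initializers and prints it as an indented outline. The outline function and the sample document are assumptions of mine for illustration only; everything above them is the code from this post, repeated so the snippet runs on its own.

```swift
// Types and initializers repeated from the post so the sketch is self-contained.
public typealias AttrMap = [String: String]

public struct ElementData {
    public let tagName: String
    public let attributes: AttrMap
}

public enum NodeType {
    case Element(ElementData)
    case Text(String)
}

public struct Node {
    public let children: [Node]
    public let nodeType: NodeType
}

extension Node {
    public init(data: String) {
        self.children = []
        self.nodeType = .Text(data)
    }

    public init(name: String, attrs: AttrMap, children: [Node]) {
        self.children = children
        let data = ElementData(tagName: name, attributes: attrs)
        self.nodeType = .Element(data)
    }
}

// Hypothetical helper: render the tree one node per line,
// indenting two spaces for each level of depth.
func outline(_ node: Node, depth: Int = 0) -> String {
    let indent = String(repeating: "  ", count: depth)
    var lines: [String]
    switch node.nodeType {
    case .Element(let data):
        lines = [indent + "<" + data.tagName + ">"]
    case .Text(let text):
        lines = [indent + "\"" + text + "\""]
    }
    // Recurse into the children, one level deeper.
    for child in node.children {
        lines.append(outline(child, depth: depth + 1))
    }
    return lines.joined(separator: "\n")
}

// A made-up sample document.
let tree = Node(name: "html", attrs: [:], children: [
    Node(name: "body", attrs: ["class": "main"], children: [
        Node(name: "h1", attrs: [:], children: [Node(data: "Title")]),
        Node(data: "Hello, world!")
    ])
])

print(outline(tree))
```

Because every property is a let, the tree has to be assembled from the leaves up, as the nested initializer calls do here. That is a trade-off worth keeping in mind once a parser has to build the tree incrementally.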

That’s it for the humble beginnings. Matt offers some ideas for further exercise in his post – and some links to other resources around the web – so you might want to check those out.

Next time I’ll take on the simple HTML parser that builds an actual tree out of the pieces above.

[1] If I had a nickel for every time my dad offered me a motivational “swift kick”, I’d have several nickels.