As we saw in the first post of this series, our browser behaves as follows:

function browser() { while (true) { handleEvents(); // Let's make this faster updateDisplay(); } }

As we discussed, the key to making the browser smooth is to make handleEvents() do less. One way of doing this is to go multi-process.

Going multi-process

Chrome is multi-process. Internet Explorer 4 was multi-process and so is Internet Explorer 8+ do (don’t ask me where IE 5, 6, 7 went). Well, Firefox OS is multi-process, too and Firefox for Android used to be multi-process until we canceled this approach due to Android-specific issues. For the moment, Firefox Desktop is only slightly multi-process, although we are heading further in this direction with project electrolysis (e10s, for short).

In a classical browser (i.e. not FirefoxOS, not the Firefox Web Runtime), going multi-process means running the user interface and system-style code in one process (the “parent process”) and running code specific to the web or to a single tab in another process (the “child process”). Whether all tabs share a process or each tab is a separate process, or even each iframe is a process, is an additional design choice that, I believe, is still open to discussion in Firefox. In FirefoxOS and in the Firefox Web Runtime (part of Firefox Desktop and Firefox for Android), going multi-process generally means one process per (web) application.

Since code is separated between processes, each handleEvents() has less to do and will therefore, at least in theory, execute faster. Additionally, this is better for security, insofar as a compromised web-specific process affords an attacker less control than compromising the full process. Finally, this gives the possibility to crash a child process if necessary, without having to crash the whole browser.

Coding for multi-process

In the Firefox web browser, the multi-process architecture is called e10s and looks as follows:

function parent() { while (true) { handleEvents(); // Some of your code here updateDisplay(); // Just the ui } } function child() { while (true) { handleEvents(); // Some of your code here updateDisplay(); // Just some web } } parent() ||| child()

The parent process and the child process are not totally independent. Very often, they need to communicate. For instance, when the user browses in a tab, the parent needs to change the history menu displayed by the parent process. Similarly, every few seconds, Firefox saves its state to provide quick recovery in case of crash – the parent asks each tab for its information and, once all replies have arrived, gathers them into one data structure and saves them all.

For this purpose, parent and the child can send messages to each other through the Message Manager. A Message Manager can let a child process communicate with a single parent process and can let a parent process communicate with one or more children processes:

// Sender-side messageManager.sendAsyncMessage("MessageTopic", {data}) // Receiver-side messageManager.addMesageListener("MessageTopic", this); // ... receiveMessage: function(message) { switch (message.name) { case "MessageTopic": // do something with message.data // ... break; } }

Additionally, code executed in the parent process can inject code in the child process using the Message Manager, as follows:

messageManager.loadFrameScript("resource://path/to/script.js", true);

Once injected, the code behaves as any (privileged) code in the child process.

As you may see, communications are purely asynchronous – we do not wish the Message Manager to stop a process and wait until another process is done with it tasks, as this would totally defeat the purpose of multi-processing. There is an exception, called the Cross Process Object Wrapper, which I am not going to cover, as this mechanism is meant to be used only during a transition phase.

Limitations

It is tempting to see multi-process architecture as a silver bullet that semi-magically makes Firefox (or its competitors) fast and smooth. There are, however, quite clear limitations to the model.

Firstly, going multi-process has a cost. As demonstrated by Chrome, each process consumes lots of memory. Each process needs to load its libraries, its JavaScript VM, each script must be JIT-ed for each process, each process needs its communication channgel towards the GPU etc. Optimizing this is possible, as demonstrated by FirefoxOS (which runs nicely with 256 Mb) but is a challenge.

Similarly, starting a multi-process browser can be much slower than starting a single-process browser. Between the instant the user launches the browser and the instant it becomes actually usable, many things need to happen: launching the parent process, which in turn launches the children processes, setting up the communication channels, JIT compiling all the scripts that need compilation, etc. The same cost appears when shutting down the processes.

Also, using several processes brings about a risk of contention on resources. Two processes may need to access the disk cache at the same time, or the cookies, or the session storage, or the GPU or the audio device. All of this needs to be managed carefully and can, in certain cases, slow down considerably both processes.

Also, some APIs are synchronous by specifications. If, for some reason, a child process needs to access the DOM of another child process – as may happen in the case of iframes – both child processes need to become synchronous. During the operation, they both behave as a single process, with just extremely slower DOM operations.

And finally, going multi-process will of course not make a tab magically responsive if this tab itself is the source of the slowdown – in other words, multi-process it not very useful for games.

Refactoring for multi-process

Many APIs, both in Firefox itself and in add-ons, are not e10s-compliant yet. The task of refactoring Firefox APIs into something e10s-compliant is in progress and can be followed here. Let’s see what needs to be done to refactor an API for multi-process.

Firstly, this does not apply to all APIs. APIs that access web content for non-web content need to be converted to e10s-style – an example is Page Info, which needs to access web content (the list of links and images from that page) for the purpose of non-web content (the Page Info button and dialog). As multi-process communications is asynchronous, this means that such APIs must be asynchronous already or must be made asynchronous if they are not, and that code that calls these APIs needs to be made asynchronous if it is not asynchronous already, which in itself is already a major task. We will cover making things asynchronous in another entry of this series.

Once we have decided to make an asynchronous API e10s-compliant, the following step is to determine which part of the implementation needs to reside in a child process and which part in the parent process. Typically, anything that touches the web content must reside in the child process. As a rule of thumb, we generally consider that the parent process is more performance-critical than children-processes, so if you have code that could reside either in the child process or in the parent process, and if placing that code in the child process will not cause duplication of work or very expensive communications, it is a good idea to move it to the child process. This is, of course, a rule of thumb, and nothing replaces testing and benchmarking.

The next step is to define a communication protocol. Messages exchanged between the parent process and children processes all need a name. If are working on feature Foo, by conventions, the name of your messages should start with “Foo:”. Recall that message sending is asynchronous, so if you need a message to receive an answer, you will need two messages: one for sending the request (“Foo:GetState”) and one for replying once the operation is complete (“Foo:State”). Messages can carry arbitrary data in the form of a JavaScript structure (i.e. any object that can be converted to/from JSON without loss). If necessary, these structures may be used to attach unique identifiers to messages, so as to easily match a reply to its request – this feature is not built into the message manager but may easily be implemented on top of it. Also, do not forget to take into account communication timeouts – recall that a process may fail to reply because it has crashed or been killed for any reason.

The last step is to actually write the code. Code executed by the parent process typically goes into some .js file loaded from XUL (e.g. browser.js) or a .jsm module, as usual. Code executed by a child process goes into its own file, typically a .js, and must be injected into the child process during initialization by using window.messageManager.loadFrameScript (to inject in all children process) or browser.messageManager.loadFrameScript (to inject in a specific child process).

That’s it! In a future blog entry, I will write more about common patterns for writing or refactoring asynchronous code, which comes in very handy for code that uses your new API.

Contributing to e10s

The ongoing e10s refactoring of Firefox is a considerable task. To make it happen, the best way is to contribute to coding or testing.

What’s next?

In the next blog entry, I will demonstrate how to make front-end and add-on code multi-threaded.