A research team led by Microsoft's Helen Wang recently published a report about an experimental browser prototype called "Gazelle" that uses processes to isolate page content elements originating from different domains. It builds on the concept of multiprocess browsing but uses more fine-grained isolation to expand on the security advantages that are already delivered by existing multiprocess browsing models. But is it an operating system, Microsoft Research's analogue to Google's Chrome OS? Not quite.

Wang's characterization of Gazelle as a "multi-principal operating system" for the Web has been widely misinterpreted by the press. Although Gazelle's architecture is loosely modeled on the underlying concepts of operating system design, it is not actually an operating system, it's not intended to replace Windows, and it won't compete with Chrome OS. It is a browser prototype that runs on Windows Vista, is coded in C#, and has a conventional user interface that is built with .NET's WinForms framework.

Multiprocess browsing, which is supported in Google's Chrome Web browser and recent versions of Internet Explorer, uses separate operating system processes to isolate the rendering of individual pages. As we have recently discussed in our coverage of multiprocess browsing, this approach generally boosts security and stability. It prevents a rendering bug that affects a specific page or plugin from tanking the whole browser.

Multiprocess browsing is advantageous, but it does have some downsides. Processes tend to generate a lot of resource overhead, especially on Windows. To minimize that impact, Chrome and IE aim for a process count that balances resource efficiency against stability. For example, if you have multiple tabs open that show different pages from the same website, the browser might put them all into one process.
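That grouping decision can be sketched as a simple mapping from a page's URL to a process key. This is an illustrative toy in Python, not the actual heuristic Chrome or IE uses; the function name and the "per-site"/"per-tab" labels are made up for the example:

```python
from urllib.parse import urlsplit

def process_key(url: str, model: str = "per-site") -> str:
    """Pick the renderer process a page should land in.

    "per-site" groups pages from the same host into one process
    (a rough stand-in for what Chrome and IE do to save resources);
    "per-tab" gives every page its own process.
    """
    host = urlsplit(url).hostname or ""
    if model == "per-site":
        return f"site:{host}"
    return f"tab:{url}"

# Two pages from the same site share one renderer process...
assert process_key("http://example.com/a") == process_key("http://example.com/b")
# ...but a page from another site gets its own.
assert process_key("http://example.com/a") != process_key("http://other.org/")
```

Real browsers weigh more than the hostname (registrable domain, scheme, process caps), but the trade-off is the same: fewer keys means fewer processes and less overhead.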

The Gazelle project casts aside that balance and aims to maximize security and stability by using more processes. Instead of just using a separate process for each site or tab, it uses separate processes for individual page content elements that originate from different domains. For example, if a page contains an iframe from another domain, the iframe will be managed and rendered in its own process, separate from the rest of the page.
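The rule can be illustrated with a toy Python sketch that spawns one OS process per origin found in a page. In the real prototype each process hosts a Trident renderer producing a bitmap; here the renderer is a stand-in, and all names (`render_principal`, `load_page`) are invented for the example:

```python
from multiprocessing import Process, Queue
from urllib.parse import urlsplit

def render_principal(origin: str, out: Queue) -> None:
    # Stand-in for a real renderer: in Gazelle this would be a
    # Trident instance producing a bitmap; here we just report back.
    out.put((origin, f"<rendered content for {origin}>"))

def load_page(main_url: str, iframe_urls: list) -> dict:
    # Gazelle-style rule: the top-level page and every iframe from a
    # different origin each get their own renderer process.
    origins = {urlsplit(u).hostname for u in [main_url, *iframe_urls]}
    out = Queue()
    procs = [Process(target=render_principal, args=(o, out)) for o in origins]
    for p in procs:
        p.start()
    results = dict(out.get() for _ in procs)
    for p in procs:
        p.join()
    return results

if __name__ == "__main__":
    page = load_page("http://news.example/", ["http://ads.example/frame"])
    print(sorted(page))  # two isolated principals, one process each
```

A crash or compromise in one of these processes leaves the others, and the page that embeds them, untouched; that is the whole point of the finer-grained split.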

Gazelle uses interprocess communication (IPC) to facilitate interaction between the various pieces of the browser. All communication is mediated by the browser's "kernel," a central component that is also responsible for handling the browser's user interface. All network and filesystem operations, such as loading and caching, are handled directly by the kernel. Gazelle's approach can theoretically maximize the security of browsing by preventing individual content items (referred to as "principals" in the research paper) from interfering with the operation of others and from accessing the system hardware.

How it works

The research paper provides technical insight into how the prototype was implemented. It's largely a .NET application that uses Internet Explorer's "Trident" rendering engine. It does so by using headless instances of the System.Windows.Forms.WebBrowser control, a managed-code wrapper for the ActiveX WebBrowser component. (This should be familiar to those of you who have made .NET GUI applications with Visual Studio.)

When a page loads, Gazelle instantiates a separate WebBrowser control in a new process for each iframe and other similar foreign content elements. The code running in these processes uses the IViewObject COM interface to generate bitmap images of the rendered content from the WebBrowser controls. These bitmaps are then passed back to the browser's "kernel" component. Gazelle's IPC mechanism uses an XML message format and uses named pipes to transmit the data.
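The kernel-mediated XML messaging can be sketched in a few lines of Python. The real prototype is C# and moves messages over Windows named pipes; this toy version just shows the shape of the idea, and every element name, attribute, and the `fetch` policy are invented for the example:

```python
import xml.etree.ElementTree as ET

def make_message(sender: str, call: str, payload: str) -> bytes:
    """Serialize a renderer-to-kernel request as XML. Gazelle's IPC
    uses an XML message format; the field names here are made up."""
    msg = ET.Element("message", sender=sender, call=call)
    msg.text = payload
    return ET.tostring(msg)

class BrowserKernel:
    """Toy stand-in for Gazelle's kernel: all network and filesystem
    requests flow through it, so it can enforce policy per principal."""
    def __init__(self):
        self.allowed = {"fetch"}  # the only call renderers may make

    def handle(self, raw: bytes) -> str:
        msg = ET.fromstring(raw)
        if msg.get("call") not in self.allowed:
            return "denied"
        # A real kernel would perform the network fetch itself and
        # stream the response bytes back over the pipe.
        return f"fetched {msg.text} for {msg.get('sender')}"

kernel = BrowserKernel()
print(kernel.handle(make_message("ads.example", "fetch", "http://ads.example/x")))
# fetched http://ads.example/x for ads.example
print(kernel.handle(make_message("ads.example", "read_file", "C:\\secrets")))
# denied
```

Because renderers can only talk to the kernel, not to each other or to the OS, the kernel is a single choke point where cross-principal policy can be enforced.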

The browser kernel is responsible for composing all of the individual bitmaps from the various processes to generate a rendered page image. It tracks the positioning, dimensions, and stacking order of the page elements that are being rendered in separate processes so that it can assemble the bitmaps accurately as it builds the final page view. The completed image is displayed to the user in a WinForms PictureBox control on a tab in the browser's user interface.
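The compositing step amounts to painting each process's bitmap onto a shared canvas at its tracked position, in stacking order. Here is a minimal Python sketch using character grids in place of real bitmaps; the function and its layer format are invented for illustration:

```python
def composite(canvas_w, canvas_h, layers):
    """Assemble per-process 'bitmaps' into one page image, the way
    Gazelle's kernel composes renderer output. Each layer is
    (x, y, z, rows), where rows is a list of equal-length strings;
    higher z is painted later, so it ends up on top."""
    canvas = [["." for _ in range(canvas_w)] for _ in range(canvas_h)]
    for x, y, z, rows in sorted(layers, key=lambda layer: layer[2]):
        for dy, row in enumerate(rows):
            for dx, ch in enumerate(row):
                if 0 <= y + dy < canvas_h and 0 <= x + dx < canvas_w:
                    canvas[y + dy][x + dx] = ch
    return ["".join(r) for r in canvas]

page = (0, 0, 0, ["PPPP", "PPPP", "PPPP"])   # top-level page bitmap
frame = (1, 1, 1, ["FF", "FF"])              # cross-origin iframe, on top
for line in composite(4, 3, [page, frame]):
    print(line)
# PPPP
# PFFP
# PFFP
```

The real kernel does this with pixel bitmaps handed back over IPC, then hands the finished image to the UI for display.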

It's clear that a lot of careful thought went into developing the underlying architectural concepts, but it seems like the implementation itself was thrown together in Visual Studio using relatively standard .NET components. This strongly reflects the fact that Gazelle is a research project and not a product.

The researchers ran into several problems while creating their prototype. They were unable to find a practical way to use the WebBrowser APIs to intercept all of the network operations. As a workaround, they configured the WebBrowser controls to use a local Web proxy which they connected to the Gazelle kernel. They also had some problems with page layout, particularly in cases where they were rendering individual off-site images in separate processes. For simplicity, they decided to allow page processes to handle off-site images rather than isolating them in their own processes.
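The proxy workaround is easy to illustrate: point the renderer at a local HTTP proxy whose handler forwards every request into the kernel. This Python sketch is only an analogy for the C# prototype's arrangement, and `KernelProxy` and `kernel_fetch` are names invented for the example:

```python
import http.server
import threading
import urllib.request

def kernel_fetch(url: str) -> bytes:
    # Stand-in for Gazelle's kernel: here it just echoes the URL, but
    # the real kernel would apply policy and perform the network I/O.
    return f"kernel fetched {url}".encode()

class KernelProxy(http.server.BaseHTTPRequestHandler):
    """Local proxy the renderer is configured to use; every request
    it receives is handed to the kernel instead of the network."""
    def do_GET(self):
        body = kernel_fetch(self.path)
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), KernelProxy)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]
with urllib.request.urlopen(f"http://127.0.0.1:{port}/page.html") as resp:
    print(resp.read().decode())  # kernel fetched /page.html
server.shutdown()
```

It's an indirect way to get interposition, but it means the renderer never touches the network directly even though its APIs can't be hooked.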

The Gazelle project is currently limited in scope. The researchers were primarily interested in exploring the architectural implications of an extreme approach to multiprocess browsing; they weren't trying to actually deliver the hypothetical security benefits it could someday offer. It's important to note that they don't have code-level sandboxing yet, which means that the underlying platform would still be vulnerable if one of the rendering processes were compromised.

"Our interposition layer ensures that our Trident components are never trusted with sensitive operations, such as network access or display rendering. However, if a Trident renderer is compromised, it could bypass our interposition hooks and compromise other principals using the underlying OS's APIs," the report explains. "To prevent this, we are in the process of implementing an OS-level sandboxing mechanism, which would prevent Trident from directly accessing sensitive OS APIs. The feasibility of such a browser sandbox has already been established in Xax and Native Client."

Native Client is a project that Google launched last year to build a browser plugin that enables secure execution of native code in Web browsers. Microsoft's Gazelle researchers acknowledge that Native Client's sandboxing model is viable and could potentially be applied to some of the same problems that they are trying to solve.

Another obvious limitation of Gazelle is that its process-heavy architecture necessarily introduces additional latency and memory overhead. The researchers provide some tables in their paper to show how it compares to Internet Explorer. Loading Google (no, not Bing) in a new tab in Gazelle took 939 milliseconds and increased the browser's memory footprint by 16MB. By comparison, the same operation in Internet Explorer 7 took 499 milliseconds and added only 1.4MB.

Although Gazelle will necessarily always use more resources than a browser that spawns processes less aggressively, the researchers believe that they could eventually improve the performance and memory footprint to the point where it would be acceptable for regular usage. The prototype is basically unoptimized, so there may still be room for meaningful performance improvements.

Gazelle is a fascinating experiment that provides real insight into where multiprocess browsing could potentially go in the future. As computing resources become cheaper, the security benefits of more pervasive compartmentalization will increasingly outweigh the added latency and memory overhead. As that occurs, it's possible that mainstream browsers could start to look more like Gazelle.

Further reading