Responses to the benchmark of multiple versions of OpenOffice.org varied, but two were common: oversimplifying the results and holding unrealistic expectations. To put that data into perspective, here is a benchmark of Microsoft Word 95 through 2007. That's over a decade of releases, each of which has definitely become fatter. Before you read the benchmark results below, do you think Word has become slower or faster over the years?

Benchmark test environment

The modest test machine is about three years old.

Operating system: Windows XP SP3 (clean install)

CPU: AMD Athlon XP 3000+

RAM: 768 MB, DDR 333 (PC 2700)

HDD: Maxtor 6Y080L0, IDE, 7200 RPM, 80 GB

Video: Via VT8378 S3 Unichrome IGP at 1024x768

Though this is the same hardware as the OpenOffice.org benchmark, the results are not an apples-to-apples comparison between Microsoft Word and OpenOffice.org Writer because of differences in the test documents, operating system, and test procedure. Check back later for a direct comparison.

The Microsoft dilemma and the problem of quick starters

Benchmarking Microsoft Office poses a dilemma. Microsoft Office 95 through Office XP install quick starters under various names in the Startup folder. These include Fast Start and Microsoft Office Startup Assistant (OSA). Their purpose is to improve startup performance by preloading Office into memory because hard drives are slow. Office 2003 and 2007 no longer load the OSA by default, likely because Windows XP introduced a prefetcher at the operating system level.

Quick starters don't really avoid any work. They simply hide the appearance of work by performing it before it is requested. To start the application, the operating system has to read the same amount of data from the hard drive (consuming disk I/O resources) and process it (consuming CPU resources). Being proactive is commendable, but there are drawbacks.

First, quick starters use precious resources regardless of demand. If you don't use the preloaded application, resources that could have improved other parts of the system are wasted.

Speaking of precious resources, you have probably experienced the second problem many times: the sluggish feeling of logging into a Windows XP session with many quick starters and system tray icons firing off simultaneously. Whether you notice them or not (because some load silently in the background), you may be waiting for Adobe Acrobat Reader Speed Launch, Quicktime, Winamp Agent, iTunes Helper, and other quick starters and services to load. It can easily be 30 seconds before the session is usable.

A third problem with quick starters is that they may be ineffective. Say you start Windows, browse the web, open some PDFs, and then listen to some music. As you load programs and access stored data, the operating system will likely swap the preloaded application from RAM to the page file on the hard disk. Then, when you start the preloaded application, the operating system has to read it back from the hard disk anyway.

In conclusion, quick starters are imperfect. Once we all have low-latency, low-seek-time solid-state drives, we can forget about quick starters, but SSDs are only starting to enter the mainstream.

The goals of this benchmark include being realistic, being fair, being consistent, and measuring the amount of work as it relates to the user experience. If the quick starters were enabled, that would give Word 95 through XP an advantage over 2003 and 2007. Given the above problems and goals, I disabled Fast Start and OSA to better distinguish cold starts versus warm starts. I left Windows Prefetcher on its default settings. In the future when I benchmark OpenOffice.org on Windows, I will generally benchmark OpenOffice.org with its Quickstarter disabled too.

Benchmark test procedure

Much like the OpenOffice.org benchmark, this Microsoft Word benchmark uses automation to precisely measure the duration of a series of common operations: starting the application, opening a document, scrolling through it from top to bottom, saving the document, and finally closing both the document and the application. Automation is much more precise than a human with a stopwatch, and with the small durations in these tests, automation is necessary. Each of the five tests is repeated for 10 iterations. Before each set of 10 iterations, the system reboots. The purpose of rebooting is to measure cold start performance, where information is not yet cached in fast memory. A reboot marks a pass, and there are 15 passes. That means for each version of Word, there are 150 iterations. Multiplying by 5 tests and by 6 versions of Microsoft Word yields 4500 total measurements collected for this article.
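The pass-and-iteration structure can be sketched in Python. This is only an illustration of the loop structure and the measurement count, not the automation actually used for this article. The names `time_operation`, `run_pass`, and `placeholder_operations` are hypothetical, the operations are no-op stand-ins, and the reboot between passes is left as a comment.

```python
import time

VERSIONS = ["95", "97", "2000", "XP", "2003", "2007"]
TESTS = ["start", "open", "scroll", "save", "close"]
PASSES = 15        # each pass begins with a reboot
ITERATIONS = 10    # iterations per pass (assumption: only the first is cold)


def time_operation(operation):
    """Return the wall-clock duration of one call, in seconds."""
    begin = time.perf_counter()
    operation()
    return time.perf_counter() - begin


def run_pass(operations, iterations=ITERATIONS):
    """Time every test once per iteration.

    Returns a list of (test name, iteration index, seconds) rows.
    A real harness would reboot the machine before calling this.
    """
    rows = []
    for iteration in range(iterations):
        for name, operation in operations.items():
            rows.append((name, iteration, time_operation(operation)))
    return rows


# Hypothetical stand-ins: real automation would drive Word here.
placeholder_operations = {name: (lambda: None) for name in TESTS}

rows = run_pass(placeholder_operations)
total_measurements = len(VERSIONS) * PASSES * len(rows)
print(len(rows), total_measurements)
```

One pass produces 50 rows (5 tests times 10 iterations), and 6 versions times 15 passes times 50 rows gives the 4500 measurements mentioned above.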

Benchmark results

A critical metric for the feeling of overall speed is starting an application on a cold start. The average cold start duration worsened 90% from Word 2003 to Word 2007. That's nearly double. (In all the boxplots below, smaller is better.)

Notice the three outliers depicted as dots in the Word 2007 column. That is the Windows XP Prefetcher progressively optimizing startup performance. The second cold start was slightly longer than the first, but the third took about half as long. The fourth start took about one-fourth of the first duration.

Warm starts deteriorated 247% from Word 2003 to Word 2007, from 0.13 seconds to 0.45 seconds. Also notice that for Word 2007 the difference between cold and warm starts is 1.57 seconds - 0.45 seconds = 1.12 seconds. That difference implies that most of the cold start time (71%) is spent waiting on the hard drive. To be fair, an application start under two seconds is quick.
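For readers who want to check the percentages, here is a minimal Python sketch of the arithmetic. The function names `percent_change` and `cold_only_share` are made up for illustration, and the inputs are the rounded averages quoted in the text, so the results can differ by a point from figures computed on the unrounded measurements.

```python
def percent_change(old, new):
    """Relative change from old to new, as a percentage of old."""
    return (new - old) / old * 100.0


def cold_only_share(cold, warm):
    """Fraction of a cold start that disappears once data is cached."""
    return (cold - warm) / cold


# Rounded averages quoted in the text, in seconds.
warm_2003, warm_2007, cold_2007 = 0.13, 0.45, 1.57

print(f"warm start change: {percent_change(warm_2003, warm_2007):.0f}%")
print(f"cold-only share:   {cold_only_share(cold_2007, warm_2007):.0%}")
```

With these rounded inputs the warm start change comes out at about 246% and the cold-only share at about 71%.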

Document tests were done with a special set of documents that use a wide variety of word processor features but are still backwards compatible with Word 95. Each version of Word used its native format, for a total of three test documents: Word 95, Word 97/2000/XP/2003, and Word 2007.

The Word 97 format introduced multibyte encoding to support a wider variety of human languages. That made documents bigger and increased disk I/O (which is slow), contributing to performance losses of 74% cold and 18% warm from Word 95 to 97. Word 2007 introduced the compressed .docx format, which is smaller on disk (so less disk I/O), but uncompressing and parsing XML requires much more CPU time. That contributed to performance degradation of 305% cold and 160% warm from Word 2003 to 2007.

Scrolling may measure screen painting, loading dynamic objects, and OLE automation. Again, Word 2007 is slower.

The times for exporting the document are all relatively fast, but performance still declined 114% cold and 58% warm from Word 2003 to 2007.

The times to close the document and application are all reasonably quick, but guess what? Going from Word 2003 to Word 2007, it took 82% longer on a cold start and 75% longer on a warm start.

What was your guess at the beginning: would Word 2007 be faster or slower? Well, here are all the tests rolled up together.

Word 2003 burned 7.05 seconds to run the gamut after a cold start and 6.17 seconds after a warm start. Word 2007 consumed 16.12 and 12.13 seconds (respectively).
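Those totals are easier to compare as a ratio. A minimal Python sketch, using only the four totals quoted above (the helper name `slowdown_factor` is made up for illustration):

```python
def slowdown_factor(old_seconds, new_seconds):
    """How many times longer the new version takes than the old one."""
    return new_seconds / old_seconds


# Total seconds for all five tests: (Word 2003, Word 2007).
totals = {"cold": (7.05, 16.12), "warm": (6.17, 12.13)}

for kind, (word_2003, word_2007) in totals.items():
    print(f"{kind}: {slowdown_factor(word_2003, word_2007):.2f}x")
```

That works out to roughly 2.3 times as long after a cold start and roughly 2.0 times as long after a warm start.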

Benchmark conclusion

Despite all the tweaks, tricks, hacks, and clever engineering, Microsoft's latest version of Word is a step back in performance. That's generally true of every new major release of any software. That's Wirth's law. When Windows 95 came out, people jokingly exaggerated that "95" was the number of floppies Windows 95 shipped on. Though Windows 95 actually shipped on far fewer floppies, the point is that Windows 95 was a huge, slower beast compared to its predecessors. (Does anyone remember Windows 3.1, OS/2 Warp, and DESQview?) There were similar complaints about XP, and Vista's performance was the subject of the infamous "Vista Capable" class action lawsuit against Microsoft.

All that to say, whether you are upgrading to the latest Microsoft Office or OpenOffice.org on the same hardware year after year, someday something somewhere will have to give: you'll need to stop upgrading your software or start upgrading your hardware.

Related articles

This article is the fifth in an ongoing series on performance.