You want to hire a new programmer and you have the perfect candidate in mind: your old college roommate, Guillaume Portes. Unfortunately, you can’t just go out and offer him the job. That would get you in trouble with your corporate HR policies, which require that you first create a job description, advertise the position, interview and rate candidates, and choose the most qualified person. So much paperwork! But you really want Guillaume and only Guillaume.

So what can you do?

The solution is simple. Create a job description that is tailored specifically to your friend’s background and skills. The longer and more specific you make the job description, the fewer candidates will be eligible. Ideally you would write a job description that no one else in the world could possibly match. Don’t describe the job requirements. Describe the person you want. That’s the trick.

So you end up with something like this:

5 years experience with Java, J2EE and web development, PHP, XSLT

Fluency in French and Corsican

Experience with the Llama farming industry

Mole on left shoulder

Sister named Bridgette

Although this technique may be familiar, in practice it is usually not taken to this extreme. Corporate policies, employment law and common sense usually prevent one from making entirely irrational hiring decisions or discriminating against other applicants for things unrelated to the legitimate requirements of the job.

But evidently in the realm of standards there are no practical limits to the application of this technique. It is quite possible to write a standard that allows only a single implementation. By focusing entirely on the capabilities of a single application and documenting it in infuriatingly useless detail, you can easily create a “Standard of One”.

Of course, this raises the question of what is essential and what is not. This really needs to be determined by domain analysis, requirements gathering and consensus building. Let’s just say that anyone who says that a single existing implementation is all one needs to look at is missing the point. The art of specification is to generalize and simplify. Generalizing allows you to do more with less, meeting more needs with fewer constraints.

Let’s take a simplified example. You are writing a specification for a file format for a very simple drawing program, ShapeMaster 2007. It can draw circles and squares, and they can have solid or dotted lines. That’s all it does. Let’s consider two different ways of specifying a file format.

In the first case, we’ll simply dump out what ShapeMaster does in the most literal way possible. Since it allows only two possible shapes and only two possible line styles, and we’re not considering any other use, the file format will look like this:

<document>
  <shape iscircle="true" isdotted="false"/>
  <shape iscircle="false" isdotted="true"/>
</document>

This format is very specific and very accurate, but it lacks generality, extensibility and flexibility. Although it may be useful for ShapeMaster 2007, it will hardly be useful for anyone else, unless they merely want to create data for ShapeMaster 2007. It is not a portable, cross-application, open format. It is a narrowly defined, single-application format. It may be in XML. It may even be reviewed by a standards committee. But it is, by its nature, closed and inflexible.

How could this have been done in a way which works for ShapeMaster 2007 but also is more flexible, extensible and considerate of the needs of different applications? One possibility is to generalize and simplify:

<document>
  <shape type="circle" lineStyle="solid"/>
  <shape type="square" lineStyle="dotted"/>
</document>

Rather than hard-code the specific behavior of ShapeMaster, generalize it. Make the required specific behavior be a special case of something more general. In this way we solve the requirements of ShapeMaster 2007, but also accommodate the needs of other applications, such as OpenShape, ShapePerfect and others. For example, it can easily accommodate additional shapes and line styles:

<document>
  <shape type="circle" lineStyle="solid"/>
  <shape type="square" lineStyle="dotted"/>
  <shape type="triangle" lineStyle="dashed"/>
</document>
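To make the "special case of something more general" point concrete, here is a minimal Python sketch. It is only an illustration: the function names `literal_to_general` and `read_shapes` are made up for this example, and neither ShapeMaster nor its file format exists outside this post. It converts the literal boolean-flag format into the generalized one, then shows that a reader for the generalized format handles a new shape without any change.

```python
import xml.etree.ElementTree as ET

# The literal, ShapeMaster-only format from the first example.
LITERAL = """<document>
  <shape iscircle="true" isdotted="false"/>
  <shape iscircle="false" isdotted="true"/>
</document>"""


def literal_to_general(xml_text):
    """Rewrite the boolean-flag format as the type/lineStyle format.

    Every literal document maps cleanly onto the general format,
    which is what makes the literal one a special case.
    """
    src = ET.fromstring(xml_text)
    dst = ET.Element("document")
    for shape in src.findall("shape"):
        ET.SubElement(dst, "shape", {
            "type": "circle" if shape.get("iscircle") == "true" else "square",
            "lineStyle": "dotted" if shape.get("isdotted") == "true" else "solid",
        })
    return dst


def read_shapes(document):
    """Return (type, lineStyle) pairs; knows nothing about specific shapes."""
    return [(s.get("type"), s.get("lineStyle")) for s in document.findall("shape")]


general = literal_to_general(LITERAL)

# Another application adds a shape ShapeMaster never heard of.
# Neither the format nor the reader has to change.
ET.SubElement(general, "shape", {"type": "triangle", "lineStyle": "dashed"})

print(read_shapes(general))
```

The reverse conversion is not possible in general: the boolean format has nowhere to put a triangle or a dashed line, which is exactly the inflexibility described above.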

This is a running criticism I have of Microsoft’s Office Open XML (OOXML). It has been narrowly crafted to accommodate a single vendor’s applications. Its extreme length (over 6,000 pages) stems from it having detailed every wart of MS Office in an inextensible, inflexible manner. This is not a specification; this is a DNA sequence.

The ShapeMaster example given above is very similar to how OOXML handles “Art Page Borders” in a tedious, inflexible way, where a more general solution would have been both more flexible and far easier to specify and implement. I’ve written on this in more detail elsewhere.

Here are some other examples of where the OOXML “Standard” has bloated its specification with features that no one but Microsoft will be able to interpret:

2.15.3.6 autoSpaceLikeWord95 (Emulate Word 95 Full-Width Character Spacing) This element specifies that applications shall emulate the behavior of a previously existing word processing application (Microsoft Word 95) when determining the spacing between full-width East Asian characters in a document’s content. [Guidance: To faithfully replicate this behavior, applications must imitate the behavior of that application, which involves many possible behaviors and cannot be faithfully placed into narrative for this Office Open XML Standard. If applications wish to match this behavior, they must utilize and duplicate the output of those applications. It is recommended that applications not intentionally replicate this behavior as it was deprecated due to issues with its output, and is maintained only for compatibility with existing documents from that application. end guidance]

(This example and the following examples were brought to my attention by this post from Ben at Genii.)

What should we make of that? Not only must an interoperable OOXML application support Word 12’s style of spacing, but it must also support a different way of doing it in Word 95. And by the way, Microsoft is not going to tell you how it was done in Word 95, even though they are the only ones in a position to do so.

Similarly, we have:

2.15.3.26 footnoteLayoutLikeWW8 (Emulate Word 6.x/95/97 Footnote Placement) This element specifies that applications shall emulate the behavior of a previously existing word processing application (Microsoft Word 6.x/95/97) when determining the placement of the contents of footnotes relative to the page on which the footnote reference occurs. This emulation typically involves some and/or all of the footnote being inappropriately placed on the page following the footnote reference. [Guidance: To faithfully replicate this behavior, applications must imitate the behavior of that application, which involves many possible behaviors and cannot be faithfully placed into narrative for this Office Open XML Standard. If applications wish to match this behavior, they must utilize and duplicate the output of those applications. It is recommended that applications not intentionally replicate this behavior as it was deprecated due to issues with its output, and is maintained only for compatibility with existing documents from that application. end guidance]

Again, in order to support OOXML fully, and provide support for all those legacy documents, we need to divine the behavior of exactly how Word 6.x “inappropriately” placed footnotes. The “Standard” is no help in telling us how to do this. In fact it recommends that we don’t even try. However, Microsoft continues to claim that the benefit of OOXML and the reason why it deserves ISO approval is that it is the only format that is 100% backwards compatible with the billions of legacy documents. But how can this be true if the specification merely enumerates compatibility attributes like this without defining them? Does the specification really specify what it claims to specify?

The fact that this and other legacy features are dismissed in the specification as “deprecated” is no defense. If a document contains this element, what is a consuming application to do? If you ignore it, the document will not be formatted correctly. It is that simple. Deprecated doesn’t mean “not important” or “ignorable”. It just means that new documents authored in Office 2007 will not have it. But billions of legacy documents, when converted to OOXML format, may very well have them. How well will a competing word processor do in the market if it cannot handle these legacy tags?

So I’d argue that these legacy tags are some of the most important ones in the specification. But they remain undefined, and by this ruse Microsoft has arranged things so that their lock on legacy documents persists even after those documents are converted to OOXML. We are ruled by the dead hand of the past.
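The dilemma facing a consuming application can be sketched in a few lines of Python. This is a hedged illustration, not a description of any real implementation. It assumes that these compatibility elements appear as children of a `w:compat` element in the document's settings part, under the WordprocessingML namespace shown below, and that a `w:val` of `0`, `false` or `off` switches a flag off. The sample fragment and the function name `unsupported_compat_flags` are invented for this example.

```python
import xml.etree.ElementTree as ET

# Assumed WordprocessingML namespace URI.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical settings fragment, as a converted legacy document might carry.
SETTINGS = f"""<w:settings xmlns:w="{W}">
  <w:compat>
    <w:footnoteLayoutLikeWW8/>
    <w:autoSpaceLikeWord95/>
    <w:suppressTopSpacingWP/>
  </w:compat>
</w:settings>"""

# Elements quoted in this post whose behavior the specification
# names but does not define.
UNDEFINED_BEHAVIORS = {
    "autoSpaceLikeWord95",
    "footnoteLayoutLikeWW8",
    "mwSmallCaps",
    "suppressTopSpacingWP",
}

# Assumed spellings for a switched-off flag.
OFF_VALUES = {"0", "false", "off"}


def unsupported_compat_flags(settings_xml):
    """List active compatibility flags that cannot be honored from the text alone."""
    root = ET.fromstring(settings_xml)
    compat = root.find(f"{{{W}}}compat")
    if compat is None:
        return []
    found = []
    for child in compat:
        name = child.tag.rpartition("}")[2]  # strip the "{namespace}" prefix
        if child.get(f"{{{W}}}val", "true") in OFF_VALUES:
            continue  # flag present but explicitly switched off
        if name in UNDEFINED_BEHAVIORS:
            found.append(name)
    return sorted(found)


for flag in unsupported_compat_flags(SETTINGS):
    print(f"cannot render faithfully: {flag} is undefined in the specification")
```

Detecting the flags is the easy part. All a competing application can do next is warn the user or guess, because the text it would need in order to implement the behavior is not in the specification.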

Let’s go back even further in time to Word 5.0:

2.15.3.32 mwSmallCaps (Emulate Word 5.x for the Macintosh Small Caps Formatting) This element specifies that applications shall emulate the behavior of a previously existing word processing application (Microsoft Word 5.x for the Macintosh) when determining the resulting formatting when the smallCaps element (§2.3.2.31) is applied to runs of text within this WordprocessingML document. This emulation typically results in small caps which are smaller than typical small caps at most font sizes. [Guidance: To faithfully replicate this behavior, applications must imitate the behavior of that application, which involves many possible behaviors and cannot be faithfully placed into narrative for this Office Open XML Standard. If applications wish to match this behavior, they must utilize and duplicate the output of those applications. It is recommended that applications not intentionally replicate this behavior as it was deprecated due to issues with its output, and is maintained only for compatibility with existing documents from that application. end guidance]

You’ll need to take my word for it that “This emulation typically results in small caps which are smaller than typical small caps at most font sizes” falls well short of the level of specificity and determinism that is typical of ISO specifications.

Further:

2.15.3.51 suppressTopSpacingWP (Emulate WordPerfect 5.x Line Spacing) This element specifies that applications shall emulate the behavior of a previously existing word processing application (WordPerfect 5.x) when determining the resulting spacing between lines in a paragraph using the spacing element (§2.3.1.33). This emulation typically results in line spacing which is reduced from its normal size. [Guidance: To faithfully replicate this behavior, applications must imitate the behavior of that application, which involves many possible behaviors and cannot be faithfully placed into narrative for this Office Open XML Standard. If applications wish to match this behavior, they must utilize and duplicate the output of those applications. It is recommended that applications not intentionally replicate this behavior as it was deprecated due to issues with its output, and is maintained only for compatibility with existing documents from that application. end guidance]

So not only must an interoperable OOXML implementation first acquire and reverse-engineer a 14-year-old version of Microsoft Word, it must also do the same thing with a 16-year-old version of WordPerfect. Good luck.

My tolerance for cutting and pasting examples goes only so far, so suffice it for me to merely list some other examples of this pattern:

lineWrapLikeWord6 (Emulate Word 6.0 Line Wrapping for East Asian Text)

mwSmallCaps (Emulate Word 5.x for Macintosh Small Caps Formatting)

shapeLayoutLikeWW8 (Emulate Word 97 Text Wrapping Around Floating Objects)

truncateFontHeightsLikeWP6 (Emulate WordPerfect 6.x Font Height Calculation)

useWord2002TableStyleRules (Emulate Word 2002 Table Style Rules)

useWord97LineBreakRules (Emulate Word 97 East Asian Line Breaking)

wpJustification (Emulate WordPerfect 6.x Paragraph Justification)

This is the way to craft a job description so you hire only the person you earmarked in advance. With requirements like the above, no others need apply.

As I’ve stated before, if this were just a Microsoft specification that they put up on MSDN for their customers to use, this would be par for the course, and not worth my attention. But this is different. Microsoft has started calling this a Standard, and has submitted this format to ISO for approval as an International Standard. It must be judged by those greater expectations.

Update:

1/14/2007: This post was featured on Slashdot on 1/4/07, where you can go for additional comments and debate. I’ve summarized the comments and provided some additional analysis here.

2/16/2007: Fixed some typos and tightened up some of the phrases.