If you follow commits closely, via source-changes@ or otherwise, you may already know that mandoc has grown another useful feature. Ingo Schwarze sent us this very nicely formatted article about the new mandoc to markdown converter:

I just committed a new mandoc(1) output formatter to OpenBSD-current, for converting manual pages written in the mdoc(7) markup language to markdown. The point is that in some contexts, documentation authors are required by third-party policies to provide markdown versions of their documentation. This new output mode allows them to maintain only one copy of their documentation in the well-known, simple, and high-quality mdoc(7) language while still providing markdown versions for the purposes where those are required, which may for example include pasting them into wikis. Thanks to Reyk@ Flöter (OpenBSD) and to Vsevolod@ Stakhov (FreeBSD) for suggesting such an output mode, and to Kristaps@ Dzonsons (bsd.lv) for contributing several ideas to this writeup.

The reason for providing this output mode is not that I consider markdown a good, or even a half-decent, markup language. Quite to the contrary, I hereby officially declare it the shittiest markup language I have seen so far. Basically, it hasn't any strong point whatsoever, but the downsides are numerous, scary, and cover practically every relevant aspect:

Lack of expressiveness:

Markdown is pitifully weak and powerless even by its own standard, which is: make formatting easy for anything that can be expressed in a plain-text email. For example, it doesn't provide any syntax for definition lists (<dl> in HTML, .Bl -tag in mdoc(7), .TP in man(7)) even though such lists can easily be written in a plain-text email.

Context sensitivity:

The syntax and semantics are extremely context sensitive. Almost every token can take completely different meanings depending on where it appears.

Ambiguity:

The syntax for emphasis by enclosing text in asterisks or underscores is terribly ill-designed because it gives rise to no end of ambiguity: not just the classic example of long_var_name, but also confusion about start and end tags. For example, **bold***italic* works as expected, but if you add another **bold**, as in **bold***italic***bold**, it may become <strong>bold<strong><em>italic</em></strong>bold</strong>, at least with some markdown compilers.
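The long_var_name problem is easy to reproduce. The following is a minimal Python sketch, not the code of any real markdown compiler: naive_emphasis is a hypothetical helper that wraps every pair of underscores in <em> tags without any notion of word boundaries, which is roughly the simplest reading of the emphasis rule.

```python
import re


def naive_emphasis(text: str) -> str:
    """Wrap every _..._ pair in <em> tags (hypothetical, deliberately naive)."""
    # Lazy match between two underscores; no check for word boundaries.
    return re.sub(r"_(.+?)_", r"<em>\1</em>", text)


# Intended use: emphasis around a whole word.
print(naive_emphasis("an _emphasized_ word"))

# Unintended result: the identifier is torn apart.
print(naive_emphasis("call long_var_name here"))
```

The second call turns the identifier into long<em>var</em>name. Any rule that avoids this has to look at the characters around each underscore, which is one source of the context sensitivity described earlier.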

Mixup of semantic and presentational markup:

You can't switch off filling (which is a presentational manipulation) without getting <code> tags (which is semantic markup). You can't get indentation (presentational!) without either <code> or <blockquote> (both semantic). Admittedly, early versions of HTML had similar problems. For example, <i> was originally designed to be presentational; in HTML 5, it is now properly semantic, and the presentational aspects are relegated to CSS, where they belong. Kristaps summed this up succinctly: "HTML 5 is (kinda) semantic; markdown is not." In theory, HTML code generated from markdown input could be improved if parser maintainers chose to generate HTML output that is less encumbered with unintended semantic connotations. But Kristaps tells me parser maintainers rarely do that, for two reasons. Through inertia, most CSS files for markdown-generated HTML now expect these cruddy HTML constructs. And so do some tools that check the output of markdown-to-HTML converters for "correctness", checking that the emitted tags agree with tradition rather than whether they make sense semantically.

Lack of independence:

Markdown is not at all a self-contained language. It allows embedding arbitrary HTML code, both at the block and at the flow level. That makes writing any parser for it very hard because you basically have to include a full HTML parser and then add context-sensitive complications on top of it. You also have to worry about all the security caveats of HTML. For example, HTML allows embedding JavaScript, so you get to implement a JavaScript interpreter as well, and to secure it. Fortunately, I did not have to implement a markdown parser: mandoc(1) only needs to write markdown, not read it. Reading markdown code is the job of lowdown(1).

So far, so bad: you get all the downsides of HTML for sure. But you get almost none of the benefits of HTML because markdown imposes lots of arbitrary and crippling restrictions on how you can use HTML. For example, inside unfilled text, you can use neither named or numbered character references, nor flow-level elements like <em>, nor even native markdown formatting instructions like **. You can't use any block-level HTML elements inside any text that is to be indented. You can't use any kind of markdown formatting inside block-level HTML elements. As an example, even if you are willing to write definition lists in HTML syntax, their list items cannot contain nested markdown lists or displays, nor can the items of markdown lists contain definition lists. While markdown list elements can contain paragraph breaks, that no longer works when the list as a whole is indented. In that case, a paragraph break terminates the list. And so on and so forth, no end of traps here...

Of course, you can work around such nesting restrictions by writing all parent and child elements of the HTML block you want to nest in HTML rather than in markdown syntax, even if markdown syntax exists for these parent and child elements when they appear in isolation. But that mostly defeats the purpose of the whole exercise, making you wonder why you ever chose markdown over HTML in the first place.

In addition, markdown was originally intended for autogenerating exactly one target language: HTML. Having only one target language in mind when designing a new meta-language is obviously already a bad idea, but choosing HTML as a target language is even worse, because HTML is notoriously difficult to translate into other formats. So even leaving the many design failures listed above aside, the basic approach of mainly targeting HTML already curtails most of the potential benefit of inventing a simpler markup language.

Syntax inspired by Whitespace:

A line break without a paragraph break requires whitespace at the end of the preceding line, and the number of trailing blanks is semantically significant: there must be at least two. So a line ending in "foo" followed by one blank and a line ending in "foo" followed by two blanks have different meanings.
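To make the rule concrete, here is a small Python sketch that tests whether a source line requests a line break. It assumes the rule exactly as stated above, two or more trailing blanks, and ignores everything else a real parser would have to consider; has_hard_break is a hypothetical name, not part of any markdown implementation.

```python
def has_hard_break(line: str) -> bool:
    """Return True if the line ends with at least two blanks."""
    trailing = len(line) - len(line.rstrip(" "))
    return trailing >= 2


# repr() is needed because the three samples are
# indistinguishable once printed or rendered.
for sample in ("foo", "foo ", "foo  "):
    print(repr(sample), has_hard_break(sample))
```

Only the last sample, with two trailing blanks, counts as a line break. Editors that strip trailing whitespace on save will silently change the meaning of such a document.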

Lack of standardization:

The most official reference manual for markdown is the original one written by John Gruber in 2004. It has been unmaintained since then and leaves various ambiguities, so different parsers tend to parse the same input somewhat differently in detail. In a language starved for features, that's particularly unfortunate: you usually can't avoid the ambiguities by using an alternative syntax, because usually there aren't any alternatives at all.

Lack of extensibility: