I thought it would be fun to document the smallest possible valid HTML documents for each version, so here goes :)

ISO/IEC 15445:2000, also known as “ISO HTML”: 113 bytes

<!DOCTYPE html PUBLIC"ISO/IEC 15445:2000//DTD HTML//EN"><html><head><title></title></head><body><p></body></html>

The DOCTYPE can also be written as <!DOCTYPE html PUBLIC "ISO/IEC 15445:2000//DTD HyperText Markup Language//EN">, but that obviously requires more characters.

Although it tricks the W3C Validator, the space following PUBLIC can be omitted as long as no system identifier is used.

Start and end tags for <html>, <head>, and <body> are required, as well as a block-level element as body content. The end tag for the <p> element can be omitted, though.
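If you want to double-check the byte counts in this post, here is a minimal Python sketch. The names ISO_HTML and byte_size are mine, purely for illustration. It assumes the document is saved as UTF-8 without a BOM or trailing newline; since these documents are pure ASCII, one character is one byte.

```python
# The ISO HTML document from above, split across two string literals for
# readability. Python joins adjacent literals, so no extra bytes sneak in.
ISO_HTML = (
    '<!DOCTYPE html PUBLIC"ISO/IEC 15445:2000//DTD HTML//EN">'
    '<html><head><title></title></head><body><p></body></html>'
)


def byte_size(document: str) -> int:
    """Return the size of the document in bytes when encoded as UTF-8."""
    return len(document.encode("utf-8"))


print(byte_size(ISO_HTML))  # 113
```

The same helper works for every other document listed here.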

HTML 2.0: 58 bytes

<!DOCTYPE html PUBLIC"-//IETF//DTD HTML 2.0//EN"><title//x

Other than the DOCTYPE, only the <title> element is required, as well as some body content (in this case, the text “x”). The start and end tags for <html>, <head>, and <body> may be omitted. (Browsers automatically create these elements.)

You may have noticed the use of <title// instead of <title></title> here. This is a markup minimization feature of SGML named “SHORTTAG NETENABL IMMEDNET”. NET stands for Null End Tag. Basically, this allows shortening the tags surrounding a text value. The first slash (/) in <title// stands for the NET-enabling “start-tag close” (NESTC), and the second slash stands for the NET. If you wanted to add some content to the <title> element, you could theoretically use <title/Foo/ instead of <title>Foo</title>.
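To make the NET shorthand a bit more concrete, here is a toy Python sketch that rewrites the short form into the long form. It is not an SGML parser: the pattern NET_PATTERN and the function expand_net are hypothetical names of mine, and the pattern only handles an element name followed by plain text (no attributes, no nested markup, no slashes in the content).

```python
import re

# Matches <name/content/ where content contains neither "/" nor "<".
# Deliberately narrow: it illustrates the idea and nothing more.
NET_PATTERN = re.compile(r"<([A-Za-z][A-Za-z0-9]*)/([^/<]*)/")


def expand_net(markup: str) -> str:
    """Rewrite <name/content/ as <name>content</name>."""
    return NET_PATTERN.sub(r"<\1>\2</\1>", markup)


print(expand_net("<title/Foo/"))  # <title>Foo</title>
print(expand_net("<title//x"))    # <title></title>x
```

Note how the trailing x in the second example ends up outside the <title> element, which is exactly why it counts as body content.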

Note that the following version (54 bytes) seems to have the same effect, according to the W3C Validator:

<!DOCTYPE html PUBLIC"-//IETF//DTD HTML//EN"><title//x

HTML 3.2: 63 bytes

<!DOCTYPE html PUBLIC"-//W3C//DTD HTML 3.2 Final//EN"><title//x

Note that the DOCTYPE for HTML 3.2 and older versions doesn’t really have an effect on your document; browsers still enter quirks mode.

HTML 4.0 Strict: 59 bytes

<!DOCTYPE html PUBLIC"-//W3C//DTD HTML 4.0//EN"><title//<p>

In HTML4, the body content must contain a block-level element — just text content won’t do. For that reason, an empty <p> element is used.

HTML 4.01 Transitional: 71 bytes

<!DOCTYPE html PUBLIC"-//W3C//DTD HTML 4.01 Transitional//EN"><title//x

Note that we’re not using the full document type declaration; the system identifier (the URL part that theoretically allows user agents to download the document type definition and any needed entity sets) is optional, so it’s been omitted here.

HTML 4.01 Transitional requires body content, but accepts text content; a block-level element in the <body> isn’t needed.

HTML 4.01 Frameset: 84 bytes

<!DOCTYPE html PUBLIC"-//W3C//DTD HTML 4.01 Frameset//EN"><title//<frameset/<frame>/

The full DOCTYPE is <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">, but the system identifier may be omitted.
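To show which part of the declaration is the public identifier and which part is the optional system identifier, here is a rough Python sketch. DOCTYPE_PATTERN and split_doctype are made-up names, and the regular expression only covers the double-quoted PUBLIC declarations used in this post, not the full SGML grammar.

```python
import re

# name, then PUBLIC, then a quoted public identifier, then an optional
# quoted system identifier. The space after PUBLIC is optional here.
DOCTYPE_PATTERN = re.compile(
    r'<!DOCTYPE\s+(?P<name>\S+)\s+PUBLIC\s*"(?P<public>[^"]*)"'
    r'(?:\s+"(?P<system>[^"]*)")?\s*>',
    re.IGNORECASE,
)


def split_doctype(doctype: str):
    """Return (name, public identifier, system identifier or None)."""
    match = DOCTYPE_PATTERN.fullmatch(doctype)
    if match is None:
        raise ValueError("not a PUBLIC document type declaration")
    return match.group("name"), match.group("public"), match.group("system")


full = ('<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" '
        '"http://www.w3.org/TR/html4/frameset.dtd">')
short = '<!DOCTYPE html PUBLIC"-//W3C//DTD HTML 4.01 Frameset//EN">'

print(split_doctype(full))
print(split_doctype(short))
```

Both calls report the same public identifier; only the full form yields a system identifier, while the short form gives None in that slot.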

As you can see, we’re using the same SGML trick as before (<frameset/<frame>/), only this time we’re actually adding content to the wrapper element.

In HTML 4.01 Frameset, the <frameset> element must have a <frame> child element. XHTML 1.0 Frameset does not have this requirement.

HTML 4.01 Strict: 60 bytes

<!DOCTYPE html PUBLIC"-//W3C//DTD HTML 4.01//EN"><title//<p>

HTML 4.01 + RDFa 1.0: 69 bytes

<!DOCTYPE html PUBLIC"-//W3C//DTD HTML 4.01+RDFa 1.0//EN"><title//<p>

The full DOCTYPE is <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01+RDFa 1.0//EN" "http://www.w3.org/MarkUp/DTD/html401-rdfa-1.dtd">, but the system identifier may be omitted.

HTML 4.01 + RDFa 1.1: 69 bytes

<!DOCTYPE html PUBLIC"-//W3C//DTD HTML 4.01+RDFa 1.1//EN"><title//<p>

The full DOCTYPE is <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01+RDFa 1.1//EN" "http://www.w3.org/MarkUp/DTD/html401-rdfa11-1.dtd">, but the system identifier may be omitted.

XHTML Basic 1.0: 41 bytes

<html><head><title/></head><body/></html>

The DOCTYPE (in this case, <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.0//EN" "http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd">) is optional in all XHTML versions, assuming the document is served with the correct Content-Type: application/xhtml+xml header. (That’s a bold assumption.) Note that the xmlns attribute on the root <html> element isn’t required in this version of XHTML.

Body content is optional, too.

You may notice the use of <title/> here instead of <title></title>. This is the XHTML equivalent of <title// in HTML serializations. Remember when we talked about SGML, and how HTML defined both its NESTC and its NET with a /? The only difference here is that XML defines NESTC with a /, and NET with a > (closing angle bracket).
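A quick way to convince yourself that <title/> and <title></title> mean the same thing to an XML parser is to parse both forms and compare the results. The sketch below uses Python’s standard xml.etree.ElementTree module; SHORT and LONG are just my labels for the two spellings. Keep in mind that this only checks well-formedness and equivalence, not validity against any XHTML DTD.

```python
import xml.etree.ElementTree as ET

# The 41-byte document from above, and the same document spelled out in full.
SHORT = "<html><head><title/></head><body/></html>"
LONG = "<html><head><title></title></head><body></body></html>"

short_tree = ET.fromstring(SHORT)
long_tree = ET.fromstring(LONG)

# Serializing both trees again yields identical bytes, because the parser
# does not remember which spelling of an empty element was used.
print(ET.tostring(short_tree) == ET.tostring(long_tree))  # True
print(len(SHORT.encode("utf-8")))  # 41
```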

XHTML Basic 1.1: 41 bytes

<html><head><title/></head><body/></html>

The DOCTYPE, <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.1//EN" "http://www.w3.org/TR/xhtml-basic/xhtml-basic11.dtd">, is optional; again, assuming the file is served with the correct MIME type.

XHTML 1.0 Transitional: 78 bytes

<html xmlns="http://www.w3.org/1999/xhtml"><head><title/></head><body/></html>

The DOCTYPE, <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">, is optional.

XHTML 1.0 Frameset: 82 bytes

<html xmlns="http://www.w3.org/1999/xhtml"><head><title/></head><frameset/></html>

The DOCTYPE, <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">, is optional.

XHTML 1.0 Strict: 78 bytes

<html xmlns="http://www.w3.org/1999/xhtml"><head><title/></head><body/></html>

The DOCTYPE, <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">, is optional.

XHTML + RDFa 1.1: 78 bytes

<html xmlns="http://www.w3.org/1999/xhtml"><head><title/></head><body/></html>

The DOCTYPE, <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.1//EN" "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-2.dtd">, is optional.

XHTML 1.1: 78 bytes

<html xmlns="http://www.w3.org/1999/xhtml"><head><title/></head><body/></html>

The DOCTYPE, <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">, is optional.

HTML5: 15 bytes

<!DOCTYPE html>

That’s right: there’s no <title> element! When a higher-level protocol provides title information, e.g. in the Subject line of an email when HTML is used as an email authoring format, the <title> element may be omitted.

In all other situations, this is the smallest possible HTML5 document (31 bytes):

<!DOCTYPE html><title>x</title>

Sadly, the SGML trick we used before (<title//) is not allowed in HTML5 anymore. Even if it were, we still couldn’t use it, because HTML5 requires a non-empty content value for the <title> element if it is used. The reasoning behind this is obvious: if you leave the <title> element empty, it means the document doesn’t need a title, in which case you should simply omit the <title> element entirely (as explained above).

Note that body content is not required.
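As a small illustration of the title rule, the Python sketch below pulls the title text out of a document using the standard html.parser module. TitleCollector and title_text are hypothetical helpers of mine. This is a simple tokenizer-level check, not a conformance checker, and it doesn’t build the implied <html>, <head> and <body> elements the way a browser does.

```python
from html.parser import HTMLParser


class TitleCollector(HTMLParser):
    """Collect the text found between <title> and </title>."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.seen_title = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
            self.seen_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def title_text(document: str):
    """Return the title text, or None when there is no <title> element."""
    collector = TitleCollector()
    collector.feed(document)
    collector.close()
    return "".join(collector.parts) if collector.seen_title else None


print(title_text("<!DOCTYPE html>"))                  # None
print(title_text("<!DOCTYPE html><title>x</title>"))  # x
```

A result of None corresponds to the 15-byte case (no title at all), while a non-empty string corresponds to the 31-byte case.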

XHTML5: 44 bytes

<html xmlns="http://www.w3.org/1999/xhtml"/>

XHTML5 doesn’t require a DOCTYPE. Just like in HTML5, there are cases where a <title> element is not needed. Body content is optional, too.
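For what it’s worth, here is a short Python sketch showing what an XML parser makes of the 44-byte document, using the standard xml.etree.ElementTree module. XHTML5 is just my name for the string. Again, this only demonstrates that the document is well-formed and that the root element lands in the XHTML namespace; it says nothing about validity, for which you’d still want validator.nu.

```python
import xml.etree.ElementTree as ET

XHTML5 = '<html xmlns="http://www.w3.org/1999/xhtml"/>'

root = ET.fromstring(XHTML5)

# ElementTree writes namespaced names as {namespace-uri}local-name.
print(root.tag)                     # {http://www.w3.org/1999/xhtml}html
print(len(root))                    # 0 (no child elements: no head, no body)
print(len(XHTML5.encode("utf-8")))  # 44
```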

(Use validator.nu to confirm this; the W3C validator would fall back to XHTML 1.0 Transitional if you tried to validate this.)

Disclaimer

It’s very likely that I missed a possible “optimization”. Please leave a comment if you have any corrections or other feedback!

Update: I’ve set up a repository on GitHub to collect the smallest possible syntactically valid files. Pull requests welcome!