Author’s Note: I’m currently in the process of migrating old blog posts to this new system. That may mean some links, syntax highlighting, and other details are broken or missing temporarily. Sorry for the inconvenience!

So what is HTML anyway?

2016-12-19

We kicked off this blog relaunch by talking about the limitations of plain text files on the web. Sure, there were style issues, but historically one of the biggest problems was the lack of links!

Today's post comes at you not as a text file, but as a true, fancy HTML file! Doesn't it feel nice and shiny? We used an <h1> header tag at the top! And an <a> anchor tag to link to the first post. Cutting-edge stuff, truly!

One key difference you may pick up on, however, has to do with the text formatting. In a text file, you hit the Enter/Return key, and it adds space. With an HTML file, whitespace is "collapsed". This means that while it won't squish words together, it doesn't respect multiple spaces either. In fact, this very paragraph has tons of empty space, which you can see for yourself by viewing this page as a text file.
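To make the collapsing concrete, here's a tiny sketch. The paragraph text is made up for illustration, but the behavior is the browser default: every run of spaces and line breaks inside the paragraph is treated as a single space.

```html
<!-- Hypothetical snippet: the source is full of extra spaces and line breaks. -->
<p>
  This    sentence   has
  lots of      extra

  space in    the source.
</p>
<!-- With default styling, the browser displays one tidy paragraph:
     "This sentence has lots of extra space in the source." -->
```

Even the blank line in the middle doesn't start a new paragraph. It's just more whitespace to collapse.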

You'll also note that the font is quite different. When viewing a text file, the browser will use what's known as a *monospace* font. This means that each character takes up the same width. The "w" character is just as wide as an "i". This lets us be particular about character alignment. Let me force HTML to do that for us:

iiiii - See?
mmmmm - It matches up!
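One common way to get that effect is the <pre> element, which by default keeps whitespace exactly as typed and is typically shown in a monospace font. Here's a sketch of how the aligned text above might be marked up. Whether this page actually used <pre> or some other technique is an assumption on my part:

```html
<!-- Assumed markup: <pre> preserves spaces and line breaks as written. -->
<pre>
iiiii - See?
mmmmm - It matches up!
</pre>
```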

By default, though, HTML files don't use a monospace font:

iiiii - See?

mmmmm - Not so much this time!

So by moving to this cool HTML thing, we've actually *lost* some expressivity! Being able to link things is cool, but even our cool ASCII art won't render too well with HTML.

 _   _ _                 _
| | | | |__         ___ | |__
| | | | '_ \ _____ / _ \| '_ \
| |_| | | | |_____| (_) | | | |
 \___/|_| |_|      \___/|_| |_|

Or, as they say in plain HTML:

_ _ _ _ | | | | |__ ___ | |__ | | | | '_ \ _____ / _ \| '_ \ | |_| | | | |_____| (_) | | | | \___/|_| |_| \___/|_| |_|

Nah, that's not entirely fair. I can add <br> tags to force it to keep the "newline" characters:

_ _ _ _
| | | | |__ ___ | |__
| | | | '_ \ _____ / _ \| '_ \
| |_| | | | |_____| (_) | | | |
\___/|_| |_| \___/|_| |_|
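For the curious, here's roughly what that markup might look like. This is a sketch, not necessarily the exact source of this page, and the spacing of the art is my reconstruction. Each <br> forces a line break, but the runs of spaces inside each line are still collapsed, which is why the letters refuse to line up:

```html
<!-- Sketch: <br> keeps the line breaks, but runs of spaces still collapse. -->
<p>
   _   _ _                 _<br>
  | | | | |__         ___ | |__<br>
  | | | | '_ \ _____ / _ \| '_ \<br>
  | |_| | | | |_____| (_) | | | |<br>
   \___/|_| |_|      \___/|_| |_|
</p>
```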





But that lack of a monospaced font, though - we'd better hurry up and figure out why we should care about HTML, then!

HyperText Markup Language

HTML stands for HyperText Markup Language. That sounds an awful lot like a markup language for hypertext. I'm not sure what those terms really mean just yet, but let's assume that our hunch is correct and try to answer the first part: what the heck is a markup language?

According to Wikipedia, it's a "system for annotating a document in a way that is syntactically distinguishable from the text." Think about the red pen marks your English teacher used to scrawl on your essays: they give information *about* the document, in a way that's not going to be confused with the essay itself. Only in this case, there's that word "syntactically". Syntax has to do with the *arrangement* of the words. So to put that all together, a markup language is a way of arranging words to *describe* parts of a document.
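As a small taste of what "syntactically distinguishable" looks like in HTML, here's a sketch with an invented sentence. The bits in angle brackets are the annotations; everything else is the document's actual text:

```html
<!-- Tags such as <h1>, <p>, and <em> describe the text they wrap. -->
<h1>My Essay</h1>
<p>This sentence is ordinary text, but <em>these words</em> are marked for emphasis.</p>
```

The angle brackets are what keep the notes from being confused with the essay, much like the red ink.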

Okay, so what's HyperText?

Nobody knows!

Okay, I lied. Hypertext just means text that can have links. Links to other pages, or links to more information in the text itself. The prefix gives it away a little bit: "hyper" means "beyond", so hypertext is "beyond text". Think of it as Text: The Next Generation.
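Both kinds of link can be sketched with the <a> anchor tag. The URL and the id below are placeholders I made up for illustration, not real addresses from this blog:

```html
<!-- Hypothetical URL and id, for illustration only. -->
<p>
  Read the <a href="https://example.com/first-post.html">first post</a> on another page,
  or jump to <a href="#more-info">a note further down this page</a>.
</p>

<p id="more-info">This paragraph is the target of the in-page link.</p>
```

The first link points at a different page entirely, while the second uses a # followed by an id to point at another spot in the same document.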

So HTML is pretty straightforward:

Text

That can have links

And you can put certain words in certain places to annotate it

I don't know about you, but to me "annotate" is a bulky word. It calls to mind "annotated guides" and that sort of thing. It makes it seem fancier than it is. Annotations are just notes. We can make notes about parts of our document.

In the next blog post, we'll look at some of those systems for making notes, and see if we can beat the formatting and readability of our first text file. All progress is good progress, right?

