Building a sound, intuitive message editor for use everywhere in Intercom proved to be a much more challenging, time-consuming, and expensive task than we ever imagined.

I explained how we overcame those hurdles and the importance of this investment at a recent Intercom event, On Product. What follows is an edited transcript of my talk, along with a handful of relevant presentation slides.

Up until mid-2014, Intercom used a simple text area to create new messages. But this text area was just that, for text only – no formatting, no images, and no rich media. We supported a mixture of markdown and HTML to get around this, but there were problems: it was tricky to compose, you needed to be somewhat technical, and it was very easy to make mistakes. It was also difficult to render consistently on all devices. Mobile applications, for example, need a WebView to render HTML.

We can build it better

Around this time, we kicked off what we thought would be a small project to build a better editor. To start, we defined a few guiding principles:

Everyone should be able to create beautiful messages.

Messages should be composed of simple constructs; constraining options would help users craft effective and consistent messages.

Messages should look great on all platforms, so no HTML.

We then came up with a simple data format we call “Blocks”. These have a flat structure and support a limited set of types, including headers, paragraphs, images, buttons, and videos.

Blocks are represented as JSON. We would use this format for both database storage and client application APIs.
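
To make the format concrete, here is a minimal sketch of what a Blocks payload might look like, written in TypeScript. The field names (text, url, linkUrl) and the summarize helper are assumptions for illustration; all the talk specifies is a flat structure and a limited set of block types.

```typescript
// Hypothetical shapes: the field names below are illustrative assumptions,
// not the actual Blocks schema.
type Block =
  | { type: "heading"; text: string }
  | { type: "paragraph"; text: string }
  | { type: "image"; url: string }
  | { type: "button"; text: string; linkUrl: string }
  | { type: "video"; url: string };

// A message is a flat, ordered list of blocks, with no nesting.
const message: Block[] = [
  { type: "heading", text: "Welcome aboard" },
  { type: "paragraph", text: "Thanks for signing up." },
  { type: "image", url: "https://example.com/hero.png" },
  { type: "button", text: "Get started", linkUrl: "https://example.com/start" },
];

// The same list serializes to JSON for database storage or a client API.
const json: string = JSON.stringify(message);

// Each client can render natively by switching on `type`, with no HTML involved.
function summarize(block: Block): string {
  switch (block.type) {
    case "heading":
    case "paragraph":
      return `${block.type}: ${block.text}`;
    case "image":
    case "video":
      return `${block.type}: ${block.url}`;
    case "button":
      return `button: ${block.text} -> ${block.linkUrl}`;
  }
}
```

Because the list is flat and the set of types is closed, a mobile client only needs one native view per block type rather than a WebView.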

Our initial prototype used ContentEditable as the input surface. This is very straightforward; you add the contenteditable attribute to an HTML element and it gains superpowers. You can paste into it, and it supports lists, along with bold and italic formatting. You also get undo/redo support and keyboard shortcuts.

It does, however, produce HTML. But we’d be fine as long as we could convert that HTML into Blocks format – or so we thought. We kicked off a six-week project to build this new editor but soon realized there are a great many quirks with ContentEditable. For example, when a user presses the return key in an empty ContentEditable element:

With Firefox you get a <br> tag. IE gives you two paragraph tags. With Chrome it’s two <div> and two <br> tags. It turns out there are no real standards with ContentEditable. As we built our editor we discovered many more weird cases. We’d find a quirk, create a workaround, and find another quirk. It was rinse and repeat.
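
To give a feel for what one of those workarounds might involve, here is a rough TypeScript sketch that normalizes the three outputs into paragraph blocks. It rests on several assumptions: the exact markup strings are guesses (the talk names the tags each browser emits, not how they are serialized), all three are treated as meaning "two empty lines", and htmlToParagraphs is a hypothetical helper rather than anything from the real editor.

```typescript
type ParagraphBlock = { type: "paragraph"; text: string };

// Assumed serializations of "return pressed in an empty element".
const firefoxHtml = "<br>";
const ieHtml = "<p></p><p></p>";
const chromeHtml = "<div><br></div><div><br></div>";

// Hypothetical normalizer; assumes the input contains no literal newlines.
function htmlToParagraphs(html: string): ParagraphBlock[] {
  const text = html
    // A <br> right before a closing block tag only props the block open.
    .replace(/<br\s*\/?>(?=\s*<\/(p|div)>)/gi, "")
    // A closing block tag terminates a line...
    .replace(/<\/(p|div)>/gi, "\n")
    // ...so the final terminator does not start a new line.
    .replace(/\n$/, "")
    // Any remaining <br> separates two lines.
    .replace(/<br\s*\/?>/gi, "\n")
    // Drop whatever tags are left (opening <p>, <div>, and so on).
    .replace(/<[^>]+>/g, "");

  return text.split("\n").map((line): ParagraphBlock => ({
    type: "paragraph",
    text: line,
  }));
}

const normalized = [firefoxHtml, ieHtml, chromeHtml].map(htmlToParagraphs);
```

Under these assumptions, all three inputs collapse to the same pair of empty paragraph blocks. A production editor would more likely walk the DOM than run regular expressions over markup, and every new quirk would add another rule to a function like this.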