One of the presentations I was most disappointed to miss at OSCON was Evan Henshaw-Plath and Kellan Elliott-McCrea’s talk entitled “Beyond REST? Building Data Services with XMPP PubSub.” Not because I’ve somehow given up on the REST architectural approach that we’ve been arguing on behalf of since, oh, forever. No, I’m fascinated by XMPP because of the difficulties some startups – Twitter being the canonical example – have had scaling in certain dimensions. Dimensions that might be in XMPP’s sweet spot.

But also because we’re generally different-tools-for-different-jobs kind of folks. And XMPP is most certainly a different tool, as Kellan and Rabble’s presentation reminds us. Consider the proposition, for instance, of AMQP internally to AtomPub/XMPP externally.

No, my interest in XMPP is not a binary, mutually exclusive REST-or-XMPP choice, but rather a recognition that a technology best known for instant messaging might be poised for a role in more than just instant messaging.

Nor is this some unique obsession of mine; smarter folks like Dion (Google/Ajaxian) and Matt (WordPress) were similarly intrigued by Rabble and Kellan’s deck. As was, most interestingly, Joshua Schachter of del.icio.us and Yahoo fame. It was his post, in fact, that triggered the interview that follows.

In it, Joshua pushes back on XMPP as the solution to the publish/subscribe architectural dilemma, proposing as an alternative a simple HTTP-based callback system that would retrieve fragments of a syndicated format (RSS being the example). Creatively monikered PIMP (PIMP Is Mostly Push). It’s sort of an HTTP-based alternative to this, in fact. The discussion that ensued in the comments to the post was really fascinating. At least to me.

Enough so that I decided to ping a friend of mine who knows a bit about XMPP with a few questions. This friend, for the sake of full disclosure, also happens to work for a RedMonk client (Jabber), so feel free to read into that whatever you like.

Thus we have a Q&A, finally, that is not with myself. That’s right, we’ve actually got a second party for you this time. The interviewee, in this case, is none other than the CTO of Jabber, Joe Hildebrand (second from the right here). I hope the below contributes to the discussion that we’re seeing now; if it does, thank Joe. If it doesn’t, blame me. One last quick note: the questions may seem oddly redundant in light of the above intro; that is because they were written first. I chose not to edit my questions so as to not affect Joe’s answers.

SOG: For the folks out there that might not be intimately familiar with XMPP, could you explain it and its value in a few sentences?

JH: The Extensible Messaging and Presence Protocol (XMPP) is a standardized protocol for:

Streaming chunks of XML at Internet scale

Exchanging information about the availability of endpoints (presence)

Addressing messages to those endpoints
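[ed: To make those three pieces concrete, here is a minimal sketch of my own, not anything Joe supplied. It uses only Python's standard library to build the two most common "chunks of XML": a presence stanza and an addressed message stanza. The addresses are made up, and a real client would send these over an authenticated stream rather than just printing them.]

```python
import xml.etree.ElementTree as ET
from typing import Optional


def presence_stanza(sender: str, show: Optional[str] = None) -> str:
    """Build a <presence/> stanza announcing that `sender` is available."""
    stanza = ET.Element("presence", {"from": sender})
    if show is not None:
        # Optional availability detail, e.g. "away" or "dnd".
        ET.SubElement(stanza, "show").text = show
    return ET.tostring(stanza, encoding="unicode")


def message_stanza(sender: str, recipient: str, body: str) -> str:
    """Build a <message/> stanza addressed from one endpoint to another."""
    stanza = ET.Element(
        "message", {"from": sender, "to": recipient, "type": "chat"}
    )
    ET.SubElement(stanza, "body").text = body
    return ET.tostring(stanza, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    print(presence_stanza("juliet@example.com/balcony", show="away"))
    print(message_stanza("juliet@example.com/balcony",
                         "romeo@example.net", "Art thou not Romeo?"))
```

Each stanza is a small, self-contained XML element, which is what makes it practical to stream many of them over one long-lived connection.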

Originally known as the “Jabber Protocol”, XMPP was designed by Jeremie Miller (http://en.wikipedia.org/wiki/Jeremie_Miller) in 1998 in reaction to his perceived need for bringing together all of the proprietary IM services. Instead of writing a multi-protocol client, he decided to put all of the interoperability on a server where the protocol definitions could be updated centrally. He needed a protocol to get to that server, and since XML was hot at the time, I now have to deal with XML namespace madness. 🙂

XMPP has been standardized by the IETF in two main RFCs. RFC 3920 (http://www.xmpp.org/rfcs/rfc3920.html) specifies the lower-level streaming mechanism, addressing, framing, authentication, encryption, and internationalization. RFC 3921 (http://www.xmpp.org/rfcs/rfc3921.html) specifies how you build a presence and messaging system on top of the streaming layer. Both of these RFCs are in the final stages of incorporating implementation experience at the moment; the “bis” drafts with all of the clarifications and improvements can be found at xmpp.org (http://www.xmpp.org/internet-drafts/draft-saintandre-rfc3920bis-06.html, http://www.xmpp.org/internet-drafts/draft-saintandre-rfc3921bis-06.html).

The RFCs give you enough protocol for a basic IM service, but one of the key goals of XMPP is extensibility. With a very small number of delivery semantics, you can solve *many* different kinds of problems, many of which are not IM-related at all. A good number of these extensions have been standardized at the XMPP Standards Foundation as XMPP Extension Protocols (XEPs) (http://www.xmpp.org/extensions/). The process for adding and modifying these standardized extensions is completely open; we encourage anyone with comments or contributions to join the appropriate mailing list (http://mail.jabber.org/mailman/listinfo/standards).

However, keep in mind that many folks create their own protocol extensions. There is no requirement for your extension to be public, documented, or open in any way, unless you want to be able to interoperate easily with software you didn’t write. That’s a business decision you can make; XMPP software tends to be really good at ignoring extensions it doesn’t understand, so you won’t break the network by innovating.
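[ed: A rough sketch of what "ignoring extensions it doesn't understand" can look like in practice. This is my own illustration: the `urn:example:*` namespaces, the `geo` element, and the handler table are all invented, and real libraries dispatch in their own ways. The idea is simply to route each child element by its XML namespace and skip anything unrecognized.]

```python
import xml.etree.ElementTree as ET

# A message carrying the standard <body/>, a private (made-up) extension,
# and a second extension this receiver has never heard of.
SAMPLE = """<message xmlns='jabber:client' from='a@example.com' to='b@example.com'>
  <body>hello</body>
  <geo xmlns='urn:example:geo' lat='1.0' lon='2.0'/>
  <mystery xmlns='urn:example:unknown'/>
</message>"""


def split_tag(tag):
    """Split ElementTree's '{namespace}local' tag form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


# Hypothetical handler table: namespace -> function taking the child element.
HANDLERS = {
    "jabber:client": lambda el: "core:" + split_tag(el.tag)[1],
    "urn:example:geo": lambda el: "geo:%s,%s" % (el.get("lat"), el.get("lon")),
}


def handle_stanza(xml_text, handlers):
    """Dispatch each child by namespace; silently skip unknown namespaces."""
    results = []
    for child in ET.fromstring(xml_text):
        namespace, _local = split_tag(child.tag)
        handler = handlers.get(namespace)
        if handler is None:
            continue  # unknown extension: ignore it rather than fail
        results.append(handler(child))
    return results


if __name__ == "__main__":
    print(handle_stanza(SAMPLE, HANDLERS))
```

The unknown `mystery` element produces no result and no error, which is the property that lets private extensions coexist with software that has never seen them.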

[ed: that was a few sentences? ;)]

SOG: Interest in XMPP seems to be peaking at the moment, from IBM’s Sam Ruby making it one of his long bets to Evan Henshaw-Plath and Kellan Elliott-McCrea’s presentation at OSCON last week. What is it about XMPP that’s making it, like Hansel, so hot right now?

JH: Many people are coming to the conclusion that HTTP polling doesn’t work. It doesn’t scale, and it can’t deal with high-flow feeds without missing data that scrolls off the end of your feed during the interval between polls. I would say that Twitter is a pretty extreme example of this problem, but there are lots of other sites that are starting to want to allow third parties to get access to their information with short latency. As an example, imagine you wanted to have a site that did cool visualization of all of the changes to Wikipedia, as the changes were made.
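[ed: The "scrolls off the end of your feed" problem is easy to simulate. The sketch below is mine, with invented numbers: a feed that only keeps its most recent `feed_window` items, and a client that polls once for every `items_per_poll` items published. Anything pushed out of the window between two polls is never seen.]

```python
from collections import deque


def simulate_polling(total_items, items_per_poll, feed_window):
    """Return how many published items a polling client never sees.

    total_items     -- number of items the publisher emits
    items_per_poll  -- items published between consecutive polls
    feed_window     -- how many recent items the feed retains
    """
    feed = deque(maxlen=feed_window)  # old items fall off the far end
    seen = set()
    for item_id in range(total_items):
        feed.append(item_id)
        # The client polls after every `items_per_poll`-th publication.
        if (item_id + 1) % items_per_poll == 0:
            seen.update(feed)
    return total_items - len(seen)


if __name__ == "__main__":
    # Busy feed: 50 new items arrive between polls, but only 20 are retained.
    print(simulate_polling(total_items=1000, items_per_poll=50, feed_window=20))
    # Quiet feed: 10 new items between polls, 20 retained, nothing is lost.
    print(simulate_polling(total_items=1000, items_per_poll=10, feed_window=20))
```

With these made-up parameters the busy feed loses 600 of 1000 items and the quiet feed loses none. The only fixes within polling are a bigger feed or more frequent polls, and both cost the publisher more.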

Since there are more sites interested in sharing information than ever before, we’re seeing many people come to this same conclusion at the same time, looking for an open, standards-based, widely-implemented solution to the problem. Ergo: XMPP buzz.

SOG: Are there any current obvious problems or issues with web services or applications that you see on a daily basis that you think could be ameliorated via the usage of XMPP?

JH: Let’s dig into the scale issue. If you want to make HTTP work for these sorts of solutions, there are a couple of paths:

Poll very frequently. This adds a *horrific* amount of load to your web server.

Keep a connection open to the HTTP server with a request that only gets answered when there is new data (sometimes known as the “Comet” approach). Many web servers are not optimized for keeping large numbers of sockets open for days. Unless you are using something like BOSH (XEP-0206: http://www.xmpp.org/extensions/xep-0206.html), writing the server-side of this can be difficult, particularly when it comes to addressing the correct endpoint.

Use some sort of HTTP callback mechanism, where you give the sender the URL of a service to call when there is new information. This requires the receiver to be hosted on an externally-reachable web server, for your server to have some large amount of uptime, and for the sender to be able to send large numbers of updates frequently. To send these updates, you need to figure out which connection you’re going to send on if you’re already connected, or do the TCP three-way handshake and SSL negotiation if you’re not connected. At the end of all of that, you still have to layer on some sort of authentication and/or authorization system for the sender and receiver to have any idea who the other end is.
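[ed: A back-of-the-envelope calculation for the first option, frequent polling. The figures are hypothetical and chosen only to show the shape of the problem: polling load depends on how many clients there are and how often they ask, while push load depends on how much actually changes.]

```python
def polling_requests_per_second(clients, poll_interval_s):
    """Requests/second hitting the server when every client polls on a timer."""
    return clients / poll_interval_s


def push_deliveries_per_second(clients, updates_per_client_per_hour):
    """Deliveries/second when the server only sends when there is an update."""
    return clients * updates_per_client_per_hour / 3600.0


if __name__ == "__main__":
    clients = 100_000  # hypothetical subscriber count
    polled = polling_requests_per_second(clients, poll_interval_s=10)
    pushed = push_deliveries_per_second(clients, updates_per_client_per_hour=6)
    print("polling: %.0f requests/s" % polled)
    print("push:    %.1f deliveries/s" % pushed)
```

Under these assumptions, polling costs 10,000 requests per second whether or not anything changed, against roughly 167 deliveries per second for push. The comparison ignores the cost of holding connections open, which is the point of the second option above.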

If you want to use XMPP, it’s more straightforward:

Have an XMPP server on both the sender and receiver side (this might be collapsed down to one server if you like)

Receiver registers interest in notifications. This might be a presence subscription, a XEP-0060 (http://www.xmpp.org/extensions/xep-0060.html) subscription, a web page, or whatever.

Sender sends, potentially only when the other side is available.

Receiver receives, whenever they are available.

Encryption, authentication, authorization and the like are all handled by the XMPP layer. Both endpoints know the address/identity of the other side, and can rely on the XMPP layer to do whatever system policy says to enforce those.
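[ed: For the "receiver registers interest" step, this is roughly what a XEP-0060 subscription request looks like on the wire. The sketch is mine and builds the stanza with Python's standard library; the JIDs, node name, and request id are invented, and a real client would let its XMPP library build and send this over an authenticated stream.]

```python
import xml.etree.ElementTree as ET

PUBSUB_NS = "http://jabber.org/protocol/pubsub"


def subscribe_request(from_jid, service_jid, node, request_id):
    """Build an <iq type='set'/> asking a pubsub service for a subscription."""
    iq = ET.Element("iq", {
        "type": "set",
        "from": from_jid,
        "to": service_jid,
        "id": request_id,
    })
    # Writing xmlns as a plain attribute is a small trick to make ElementTree
    # emit a default-namespace declaration instead of an 'ns0:' prefix.
    pubsub = ET.SubElement(iq, "pubsub", {"xmlns": PUBSUB_NS})
    # Subscribe the bare JID (the address without its '/resource' part).
    bare_jid = from_jid.split("/", 1)[0]
    ET.SubElement(pubsub, "subscribe", {"node": node, "jid": bare_jid})
    return ET.tostring(iq, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical subscriber, service, and node names.
    print(subscribe_request("reader@example.net/laptop",
                            "pubsub.example.com",
                            "wikipedia_changes",
                            "sub1"))
```

Once the service accepts the subscription, new items published to the node arrive as ordinary message stanzas addressed to the subscriber, with no polling involved.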

Yes, getting your XMPP server to scale requires a good deal of effort, but we sell one of those, so I know it’s possible. 🙂

SOG: Speaking of, did you get a chance to check out that presentation, entitled “Beyond REST? Building data services with XMPP?” If so, what did you think?

JH: I saw the deck, but didn’t see the presentation in person, unfortunately. I got to spend some time with Kellan and Rabble at the XMPP Summit at OSCON. It’s really cool to see folks from the HTTP community get involved in the XMPP community. Great ideas come out of the clash of cultures.

I think the folks that say “It’s just a socket, get over it!” don’t quite understand all of what XMPP has to offer. It’s a socket with framing and Layer-7, identity-based routing, where the other side can go up and down without interrupting the flow of information.
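[ed: A toy model of "the other side can go up and down without interrupting the flow." This is my own in-memory sketch, not how any particular XMPP server works: the `ToyRouter` class and its methods are invented, and what a real server does with traffic for an unavailable endpoint depends on the stanza type and on server policy. It only illustrates routing by identity rather than by connection.]

```python
from collections import defaultdict, deque


class ToyRouter:
    """Routes payloads by address; holds them while the address is offline."""

    def __init__(self):
        self._online = {}                  # address -> delivery callback
        self._queued = defaultdict(deque)  # address -> payloads awaiting delivery

    def send(self, to, payload):
        deliver = self._online.get(to)
        if deliver is not None:
            deliver(payload)               # recipient is available right now
        else:
            self._queued[to].append(payload)

    def come_online(self, address, deliver):
        """Register availability and flush anything queued, oldest first."""
        self._online[address] = deliver
        pending = self._queued.pop(address, deque())
        while pending:
            deliver(pending.popleft())

    def go_offline(self, address):
        self._online.pop(address, None)


if __name__ == "__main__":
    router = ToyRouter()
    inbox = []
    router.send("reader@example.net", "update 1")   # queued: nobody home yet
    router.come_online("reader@example.net", inbox.append)
    router.send("reader@example.net", "update 2")   # delivered immediately
    router.go_offline("reader@example.net")
    router.send("reader@example.net", "update 3")   # queued again
    print(inbox)
```

The sender never tracks sockets or retries. It addresses an identity, and the routing layer decides when delivery can happen.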

SOG: As a counterpoint to that presentation, have you had a chance to check out Joshua Schachter’s pushback, “beyond rest,” in which he proposes an HTTP callback system that would fall between full polling and persistent pub/sub?

JH: Yeah. It’s totally possible to make an HTTP callback system work – after all, there are a couple of WS-* standards that talk about how to do this. I assert that it will be a lot more difficult than people expect to make it scale and to get the security right.

SOG: One of the reasons that Joshua’s post resonated with many of the developers we work with is the perceived complexity of XMPP. The spec is big and heavy – bloated, even, some argue – with many namespaces that are irrelevant to many use cases [see Russell Beattie for a counterpoint]. How do you think developers should address this? Libraries? Subsetting?

JH: If you look at RFC 3920 there isn’t really anything superfluous. Authentication is more… comprehensive than some would like because the IETF pushed us into a dependency on SASL (like IMAP, POP3, and SMTP now use). 3921 has a couple of superfluous things that have been removed in 3921bis, like session establishment and privacy lists. I think when people say “the spec”, they are mostly complaining about the list of XEPs. Today there are 247 of them, some of which have been deprecated.

Are there too many XEPs? Possibly. But that’s one of the consequences of an open process. Could we do a better job of helping people get started? Absolutely. One of the things we talked about at the Summit was how to give developers a gentle introduction. Probably such a “Dive into XMPP” intro should cover the building-block XEPs, like 4 (Data Forms), 30 (Service Discovery), 45 (Multi-User Chat), 50 (Ad-Hoc Commands), 60 (Publish/Subscribe), 115 (Entity Capabilities), and 163 (Personal Eventing). We do try hard to layer correctly, reuse where possible, and make clients as easy as we can to write.

A couple of the XEPs are too large to get your hands around easily as a newbie; 45 and 60 are my prime targets here. One of the reasons they are so large, however, is that every possible error condition that has been found is in the spec, along with an example of each success case and error case. As a developer, I love lots of examples. I copy them from the spec and insert them into my code as comments. That being said, we’re talking about trying to pull some of the more advanced features out of some of these specs, so that there can be a kernel that is easier to learn and implement on its own.

If people want to complain about particular specs, the place to do it is on the standards list. No fair just writing a troll on your blog and hoping someone will wander by to fix your problem.

On the topic of libraries, there is probably an existing library for XMPP for almost any programming language, operating system, or environment (http://www.jabber.org/libraries has some of those). Please think carefully before embarking on writing your own — writing a *good* library can take a while, mostly because writing multi-threaded asynchronous network code is difficult no matter what the protocol. If you use a good library, you can be up and running quite quickly.

SOG: What do you think would help to accelerate the adoption of XMPP, whether that’s in the commercial marketplace or with individual developers?

JH: The first step is realizing that you have a problem that XMPP can solve: it’s not just for IM. We’re making inroads on that with the current hype cycle. The next is helping people to get up and running more quickly so that their first experiences are positive. Libraries, docs, chat rooms, consulting services, and books might all make that process easier.

Things you might not know about that will help:

Jabber Inc. will be announcing two commercially-supported client libraries (for .Net and AJAX) in the next couple of weeks.

Docs and gentle introductions were an action item out of the Summit. If you’d like to help, contact stpeter.

Join the jdev chatroom (xmpp:[email protected]?join) from any XMPP server that has federation enabled. Ask questions about protocol, implementations, libraries, or whatever there.

Consulting services: your Jabber Inc. salesperson can get you access to services from installation/configuration through solution design all the way to implementation of custom solutions, depending upon what you need. I’m sure that there are others in the community that would be willing to offer you expertise as well.

There are several books already on the market, most of which are out of date in a variety of ways. There are rumors that a new one is on the way, though.

In all, XMPP has been around for almost 10 years. It is open, standardized, robust, widely-implemented, and straightforward to understand compared to most other protocols in this space. Yes, it’s different than HTTP, or anything else a web developer might have run into. That’s because it solves a problem that isn’t easy to solve in the traditional HTTP world: bi-directional, asynchronous, short-latency structured communications.