Save the Internet with rev="canonical"

10 Apr 2009

Related: A rev="canonical" HTTP Header

Slashdot: Note that rev="canonical" (reverse link) and rel="canonical" (forward link) indicate the same relationship in opposite directions. Also, be careful not to make the assumption that shorter URLs are always better. Obviously, I prefer the URL I'm using, but if you require a shorter one, please use http://tr.im/revcanonical. (I use rev="canonical" to indicate this preference, which is what this post is all about.) For more information about my obsession with URLs, see URL Vanity and URLs Can Be Beautiful. Thanks for reading!

There's a new proposal ("URL shortening that doesn't hurt the Internet") floating around for using rev="canonical" to help put a stop to the URL-shortening madness. It sounds like a pretty good idea, and based on some discussions on IRC this morning, I think a more thorough explanation would be helpful. I'm going to try.

The premise is pretty simple. To avoid the great linkrot apocalypse, we can specify short URLs for our own pages, so that compliant services (adoption is still low, because the idea is pretty fresh) use our short URLs instead of replacements from TinyURL.com (or some other third party).

This is easiest to explain with an example. I have an article about CSRF located at the following URL:

http://shiflett.org/articles/cross-site-request-forgeries

I happen to think this URL is beautiful. :-) Unfortunately, it is sure to get mangled into some garbage URL if you try to talk about it on Twitter, because it's not very short. I really hate when that happens. What can I do?

If rev="canonical" gains momentum and support, I can offer my own short URL for people who need one. Perhaps I decide the following is an acceptable alternative:

http://shiflett.org/csrf

Here are some clear advantages this URL has over any TinyURL.com replacement:

The URL is mine. If it goes away, it's my fault. (Ma.gnolia reminds us of the potential for data loss when relying on third parties.)

The URL has meaning. Both the domain (shiflett.org) and the path (csrf) are meaningful.

Because the URL has meaning, visitors who click the link know where they're going.

I can search for links to my content; they're not hidden behind an indefinite number of short URLs.

There are other advantages, but these are the few I can think of quickly.

With rev="canonical", I can indicate my preferred short URL for the canonical one. I just have to hope the idea catches on.

First, I need to make sure my short URL redirects to the canonical URL. I can do this with PHP:

<?php header('Location: http://shiflett.org/articles/cross-site-request-forgeries', TRUE, 301); ?>

This results in a 301 (permanent) redirect, which is what I want. (Thanks to Vanessa's comment, I have learned that this is interpreted the same as rel="canonical".)
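For readers not using PHP, here is a rough sketch of the same permanent redirect as a minimal Python WSGI app. The SHORT_URLS mapping and the app function are illustrative assumptions, not part of the proposal; only the 301 status and the Location header come from the PHP example above.

```python
# Hypothetical mapping from short paths to canonical URLs (illustrative only).
SHORT_URLS = {
    "/csrf": "http://shiflett.org/articles/cross-site-request-forgeries",
}


def app(environ, start_response):
    """Minimal WSGI app: answer known short paths with a 301 redirect."""
    target = SHORT_URLS.get(environ.get("PATH_INFO", ""))
    if target is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not found\n"]
    # 301 marks the move as permanent, matching the PHP header() call.
    start_response("301 Moved Permanently", [("Location", target)])
    return [b""]
```

Any WSGI server could host this; the point is only that the short URL answers with a 301 and a Location header naming the canonical URL.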

With my short URL redirecting to the canonical one, I just need to add rev="canonical" to the canonical (long) URL:

<link rev="canonical" href="http://shiflett.org/csrf" />

If Twitter adopts this, then whenever someone uses the canonical URL, Twitter will replace it with my preferred short URL instead of some TinyURL.com garbage. Wouldn't that be nice?
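To make "adopting this" a little more concrete, here is a minimal sketch of how a compliant service might look for an author's preferred short URL in a page it has already fetched. The class and function names are hypothetical, and this is not how Twitter or any other service actually does it. It only uses Python's standard html.parser module.

```python
from html.parser import HTMLParser


class RevCanonicalFinder(HTMLParser):
    """Collects href values of <link rev="canonical"> elements."""

    def __init__(self):
        super().__init__()
        self.short_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # Treat rev as a space-separated token list, compared case-insensitively.
        tokens = (attributes.get("rev") or "").lower().split()
        if "canonical" in tokens and attributes.get("href"):
            self.short_urls.append(attributes["href"])


def find_short_url(html):
    """Return the first author-preferred short URL in the page, or None."""
    finder = RevCanonicalFinder()
    finder.feed(html)
    return finder.short_urls[0] if finder.short_urls else None


page = '<html><head><link rev="canonical" href="http://shiflett.org/csrf" /></head></html>'
print(find_short_url(page))  # http://shiflett.org/csrf
```

A service would fall back to its usual shortener when find_short_url returns None.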

There is some confusion between rev="canonical" and rel="alternate shorter". The former means the current URL is the canonical equivalent of the URL in the href attribute. (Thus, it is the opposite of rel="canonical".) The latter indicates the same thing but also means the URL in the href attribute is shorter. In practice, all you really need is rev="canonical", as indicated by Dopplr's support:

<link rev="canonical" href="http://dplr.it/brooklyn" /><!-- http://revcanonical.appspot.com/ -->

There is a tool you can use to test Dopplr's implementation, test my example, or test your own.
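As a rough illustration of the difference between rev="canonical", rel="canonical", and rel="alternate shorter", the sketch below describes what each says about the URL in the href attribute. The function name is hypothetical, and treating "alternate shorter" as two space-separated tokens is an assumption based on how HTML link types are normally written.

```python
def describe_link(attrs):
    """Describe what a <link> element says about the URL in its href.

    attrs is a dict of the element's attributes,
    e.g. {"rev": "canonical", "href": "http://dplr.it/brooklyn"}.
    """
    rel = set((attrs.get("rel") or "").lower().split())
    rev = set((attrs.get("rev") or "").lower().split())
    if "canonical" in rev:
        # Reverse link: the current page is canonical, href is its equivalent.
        return "this page is the canonical URL for href"
    if "canonical" in rel:
        # Forward link: href is canonical, the current page is its equivalent.
        return "href is the canonical URL for this page"
    if {"alternate", "shorter"} <= rel:
        return "href is a shorter alternate for this page"
    return "no canonical or shortening relationship"


print(describe_link({"rev": "canonical", "href": "http://dplr.it/brooklyn"}))
# this page is the canonical URL for href
```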

I like to give credit where credit is due, so I asked Kellan Elliott-McCrea (@kellan) to tell us about the idea's history:

The idea emerged in conversation between myself, Les Orchard, and Kevin Marks. (Rafe Colburn suggested something similar about 2 years ago.) Niall Kennedy and Shawn Medero provided useful comments. I just documented and wrote the code.

rev="canonical" is already supported by Dopplr, PHP.net, Ars Technica, and Flickr. Let's hope Twitter jumps on the bandwagon soon!