The rise of Twitter and other microblogging systems with constrained character counts has led to renewed interest in Web services that shorten URLs. Support for these services is often integrated into desktop client applications so that users can take advantage of the functionality without having to open a browser window.

Most desktop clients, however, make users jump through a few extra hoops in order to shorten a URL. For example, Seesmic makes users click a toolbar button and then paste the link into a popup dialog. Gwibber, my microblogging client for Linux, avoids the extra step by automatically shortening URLs when they are pasted directly into the message textbox. This seems to be a popular feature and I've been asked to explain how it works on several occasions. In this tutorial, I'll show you how to intercept and manipulate text as it is being pasted into a GTK+ textbox.

My first instinct when I started implementing this feature was to use the GTK+ entry widget's paste-clipboard signal, which is emitted when the user pastes content into a text entry field. The callback function would extract text from the clipboard using something like gtk.Clipboard().wait_for_text(), detect if it's a URL, and then act accordingly. There are a few problems with this approach, but the most significant issue is that the paste-clipboard signal doesn't get emitted when the user pastes from the selection clipboard. It will detect a standard Ctrl+V operation, but not a middle-click paste.

In practice, it's actually easier to intercept paste events in a GTK+ entry widget by trapping the insert-text signal. It will work consistently regardless of how the user puts content into the textbox. Another advantage of using insert-text is that it eliminates the need to manually extract the text content from the clipboard because it provides a string with the inserted text as a parameter to the callback function.

The following example shows how to bind a callback to the signal and print the intercepted text to the console:

import gtk

w = gtk.Window()
w.connect("destroy", gtk.main_quit)

def on_insert_text(entry, text, tlen, pos):
    # text holds whatever is being inserted into the entry
    print text

e = gtk.Entry()
e.connect("insert-text", on_insert_text)
w.add(e)
w.show_all()
gtk.main()

Now that we can get the text as it is being inserted into the entry, the rest is easy. We need to analyze the text to determine if it is a URL. For the purposes of our demo, we can do this with a regular expression:

re.match("^https?://[^ ]+", text)

Note that we include the "s?" to make sure we properly handle https URLs in addition to conventional http URLs. Before we shorten the URL, we will also need to check the length to make sure that it really needs shortening. If it's 20 characters or fewer, then we can just let it go through as-is. We can do that with the third callback parameter, which contains the length of the inserted string.

def on_insert_text(entry, text, tlen, pos):
    if re.match("^https?://[^ ]+", text) and tlen > 20:
        # shorten the URL
        pass
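To see how the two conditions behave together, here is a minimal sketch that pulls the check out into a standalone helper so it can be tried without GTK+. The function name needs_shortening and its min_length parameter are made up for illustration; the pattern and the 20-character threshold are the ones used above.

```python
import re

# Same pattern as in the callback: "http://" or "https://" followed by
# at least one non-space character.
URL_PATTERN = re.compile(r"^https?://[^ ]+")


def needs_shortening(text, min_length=20):
    """Return True if text starts with an http(s) URL and is longer than min_length.

    Hypothetical helper for illustration; it is not part of Gwibber or GTK+.
    """
    return bool(URL_PATTERN.match(text)) and len(text) > min_length


samples = [
    "http://is.gd/abc",                         # a URL, but already short
    "https://example.com/some/long/path.html",  # a long https URL
    "just some pasted text that is not a URL",  # long, but not a URL
]
for sample in samples:
    print("%s -> %s" % (sample, needs_shortening(sample)))
```

Because re.match only anchors the start of the string, a paste that begins with a URL and continues with other text also passes this check.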

There are a multitude of different shortening services and each one has its own distinct API. I tend to favor is.gd, because it's got particularly short URLs and its API is very easy to use. All we have to do is call is.gd/api.php and pass the full URL as the value of the longurl key and it will spit back out a shortened version:

apiurl = "http://is.gd/api.php?" + urllib.urlencode(dict(longurl=text))
shorturl = urllib2.urlopen(apiurl).read()

We have to use urllib.urlencode on the full URL so that characters such as "?" and "&" in it aren't mistaken for parts of the is.gd request's own query string. Then we use urllib2.urlopen to request the shortened version from is.gd. It will return the shortened text in the body of the response, so we can pull it out with the read() method.
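The effect of the encoding step is easier to see in isolation. The sketch below only builds the request URL and never contacts is.gd. The helper name build_api_url is hypothetical, and the try/except import is an assumption added so that the snippet also runs on Python 3, where urlencode lives in urllib.parse.

```python
try:
    from urllib import urlencode          # Python 2, as used in this article
except ImportError:
    from urllib.parse import urlencode    # Python 3 location of the same function

API_ENDPOINT = "http://is.gd/api.php?"    # endpoint named in the article


def build_api_url(long_url):
    """Return the request URL with long_url percent-encoded as the longurl value.

    Hypothetical helper for illustration; it performs no network access.
    """
    return API_ENDPOINT + urlencode({"longurl": long_url})


# The long URL carries its own query string. Encoding keeps its "&" and "="
# from being read as separators in the is.gd request.
print(build_api_url("http://example.com/search?q=gtk&page=2"))
```

The printed request URL contains a single longurl parameter, with the reserved characters of the original URL replaced by percent escapes.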

The next step is inserting the shortened URL into the textbox after we have retrieved it. To do that, we use the gtk.Editable.insert_text method, which requires two arguments: the text to insert and the position at which to insert it. We want to insert it at the current position. The insert-text callback provides the position as a parameter, but unfortunately the Python bindings don't handle it properly. It's stored in a gpointer, which makes it a bit of a pain to get the data out. Instead of using that callback parameter, we can just use the Editable.get_position method.

def on_insert_text(entry, text, tlen, pos):
    if re.match("^https?://[^ ]+", text) and tlen > 20:
        apiurl = "http://is.gd/api.php?" + urllib.urlencode(dict(longurl=text))
        shorturl = urllib2.urlopen(apiurl).read()
        entry.insert_text(shorturl, entry.get_position())

This works, but it also inserts the original URL and not just the shortened version. In order to prevent the insert-text signal from completing and inserting the originally pasted text, we have to kill the signal emission. You can do this with the stop_emission method.

entry.stop_emission("insert-text")

Now the only other problem with our example implementation of this feature is that the text cursor remains at the point where the text is inserted. This is inconsistent with how text pasting is generally supposed to work. The program needs to move the text cursor position to the end of the short URL in the entry after it is inserted. This is done with Editable.set_position, but it doesn't work if you do it during the insert-text signal handler. To make it work, we can just use gobject.idle_add to make it run when the signal handler is finished and returns control to the GTK+ main loop.

gobject.idle_add(entry.set_position, entry.get_position() + len(shorturl))
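The arithmetic behind that call can be checked without a running GTK+ main loop. The following sketch models the entry's contents as a plain string. The function insert_at is a hypothetical stand-in for what insert_text and set_position accomplish together, not a GTK+ function, and the short URL is made up.

```python
def insert_at(contents, position, new_text):
    """Insert new_text into contents at position.

    Returns the new contents and the cursor position just past the
    inserted text, which is position + len(new_text). Hypothetical helper
    that models the entry as a plain string.
    """
    new_contents = contents[:position] + new_text + contents[position:]
    return new_contents, position + len(new_text)


contents = "Reading:  (via a friend)"
cursor = 9                          # the cursor sits between the two spaces
short_url = "http://is.gd/abc"      # made-up short URL for illustration

contents, cursor = insert_at(contents, cursor, short_url)
print(contents)
print(cursor)
```

The new cursor position is the old position plus the length of the short URL, which is the same value that the idle callback passes to set_position.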

The following is the full code of the example:

import gtk, gobject
import urllib, urllib2, re

w = gtk.Window()
w.connect("destroy", gtk.main_quit)

def on_insert_text(entry, text, tlen, pos):
    if re.match("^https?://[^ ]+", text) and tlen > 20:
        # Ask is.gd for a short version of the pasted URL
        apiurl = "http://is.gd/api.php?" + urllib.urlencode(dict(longurl=text))
        shorturl = urllib2.urlopen(apiurl).read()
        # Insert the short URL at the cursor, then move the cursor past it
        # once control returns to the main loop
        entry.insert_text(shorturl, entry.get_position())
        gobject.idle_add(entry.set_position, entry.get_position() + len(shorturl))
        # Keep the original long URL from being inserted as well
        entry.stop_emission("insert-text")

e = gtk.Entry()
e.connect("insert-text", on_insert_text)
w.add(e)
w.show_all()
gtk.main()

This is a pretty simple implementation and there are a lot of ways that it could be embellished to make it more functional. For example, you could add support for rev=canonical shortening so that it will automatically use the site operator's preferred short URL instead of is.gd in cases where one is specified. It would also be wise to add proper error handling, like a try/except block that will cause the original full URL to be inserted in the event that the URL shortening service fails.
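As a rough sketch of the error-handling idea, the wrapper below falls back to the original URL when shortening fails. The names shorten_or_original, shortener, and the two fake shorteners are hypothetical. In the example program, the shortener argument would be a function wrapping the urllib2.urlopen call. Treating a reply that doesn't look like a URL as a failure is an added assumption, not documented is.gd behavior.

```python
def shorten_or_original(long_url, shortener):
    """Return shortener(long_url), or long_url itself if shortening fails.

    shortener is any callable that takes the long URL and returns the short
    one. A raised exception or a reply that does not look like a URL is
    treated as a failure (the second check is an assumption).
    """
    try:
        short_url = shortener(long_url).strip()
    except Exception:
        # Network errors, timeouts, service errors: keep the pasted text.
        return long_url
    if not short_url.startswith("http"):
        return long_url
    return short_url


def fake_working_shortener(url):
    return "http://is.gd/abc\n"              # made-up response body


def fake_broken_shortener(url):
    raise IOError("service unreachable")     # simulates a network failure


long_url = "http://example.com/a/very/long/path"
print(shorten_or_original(long_url, fake_working_shortener))
print(shorten_or_original(long_url, fake_broken_shortener))
```

Passing the shortener in as an argument keeps the fallback logic separate from the network call, so it can be exercised with stand-ins like the two fake functions here.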