Python String Functions

What use are python function decorators? Here is something fun: we can use them for speedy string templates. What follows is a further development of the templet python string-template idea. I find these techniques useful when building HTML and XML pages.

Here is a @stringfunction function decorator:

    from templet import stringfunction

    @stringfunction
    def poem(jumper, jumpee="moon"):
        "The $jumper jumped over the $jumpee."

The @stringfunction decorator transforms "poem" into the following function:

    def poem(jumper, jumpee="moon"):
        out = []
        out.append("The ")
        out.append(str(jumper))
        out.append(" jumped over the ")
        out.append(str(jumpee))
        out.append(".")
        return ''.join(out)

The decorator saves quite a bit of typing! In this implementation of @stringfunction, handy ${...} and ${{...}} syntax allows us to embed arbitrary python code, so we can make our templates as powerful as we need them to be. If we want to introduce complicated logic, our code can use "out.append(text)" to build up the string by hand.
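To make the ${...} and ${{...}} forms concrete, here is a hand-written sketch of roughly what the decorator would generate for a small template. The shopping template is a made-up example, not from the templet module, and the expansion is written for illustration rather than copied from real decorator output.

```python
# Hypothetical template, shown as a comment because it needs the templet module:
#
#   @stringfunction
#   def shopping(items):
#       r"""
#       Buy ${len(items)} things:
#       ${{
#         for item in items:
#           out.append(' - ' + item + '\n')
#       }}
#       """
#
# Roughly the function the decorator would produce from it:
def shopping(items):
    out = []
    out.append('Buy ')
    out.append(str(len(items)))          # ${...}: expression, passed through str()
    out.append(' things:\n')
    for item in items:                   # ${{...}}: statements pasted in as-is
        out.append(' - ' + item + '\n')
    return ''.join(out)

print(shopping(['eggs', 'milk']))
```

The point to notice is that ${...} contributes a value while ${{...}} contributes statements, which is why the latter has to call out.append itself.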

The templates we get this way have all their parameters nicely declared, so they are easy to call and misspellings get caught. These function templates are also far faster than the most popular template solutions, benchmarking more than four times faster than Django templates, Python's string.Template templates, or the v1 templet classes. And we can use them by importing just one small Python module, with no hacking of the import path mechanism or other major surgery.

Does Python Need Standard Templates?

One of the reasons for Ruby's popularity is that there is a simple, natural-feeling templating system that is part of the Ruby standard library. ERB goes beyond simple substitution, and gives you a way to embed Ruby in a string.

Python's built-in %-string interpolation and string.Template don't go far enough, because they provide no clean solution for composition and control flow. The first time you find yourself going through the parameters of a long and complicated %, trying to figure out which ones need to be cgi.escaped and which ones don't, you will know what I mean. In 2007, we know that sometimes you really do want some code inside your strings, and that model-view religion should not be an orthodoxy. In the Python world, we have Django templates, Cheetah, and other solutions that have tried to fill the gap.

But, to my taste, none of these systems feels minimal, simple, or particularly natural for Python. When I am doing weekend hacking and need a quick HTML UI, I don't really want to introduce a big templating package or a new language with unfamiliar syntax like {{ var|foo:"bar" }}. I just want to be able to put some Python in my strings.

A Longer Example

The base-class based template approach I previously discussed is very natural for building HTML UI, but its use of exec (and unbound variables) during template expansion makes it slightly slower than other template solutions.

The decorator approach is much faster. It is also almost as nice for large templates.

Here is a longer example that shows @stringfunction being used in a slightly less trivial context, with a little bit of code and composition:

    from templet import stringfunction
    import cgi

    # An example bit of data to display
    class FavoritesInfo:
        def __init__(self, user):
            self.user = user
            self.favorites = []
        def addFavorite(self, url, description=None):
            self.favorites.append((url, description))

    @stringfunction
    def favoritesPage(info):
        r"""
        <html>
        <head><title>${info.user}'s Favorites</title></head>
        <body>
        <h1>${cgi.escape(info.user)}'s Favorites</h1>
        <ol>
        ${{
          for fav in info.favorites:
            out.append('<li>')
            out.append(buildLink(fav[0], fav[1]))
        }}
        </ol>
        </body>
        </html>
        """

    @stringfunction
    def buildLink(url, description):
        """<a href="${cgi.escape(url, quote='"')}">$description</a>"""

    info = FavoritesInfo('Dave')
    info.addFavorite('http://popurls.com/', 'PopURLs')
    info.addFavorite('http://news.google.com/', 'Google News')
    print favoritesPage(info)

How it Works

The @stringfunction decorator is a subversion of python's documentation strings. The decorator transforms a function's __doc__ string into executable python code, and then returns a function with the same signature as the original function, but with the code replaced by the generated code.
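The docstring trick can be tried without any code generation. The sketch below is not the templet implementation: it is a deliberately simpler stand-in (the name docstring_template is made up) that hands the function's __doc__ to the standard string.Template at call time, written for Python 3. It only handles plain $name substitution, and it skips the speed advantage that comes from generating code.

```python
import functools
import inspect
import string

def docstring_template(func):
    """Hypothetical decorator: treat func.__doc__ as a string.Template."""
    template = string.Template(func.__doc__)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Bind the call to the declared parameters, so a misspelled
        # keyword argument raises TypeError just as a normal call would.
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()
        return template.substitute(bound.arguments)

    return wrapper

@docstring_template
def poem(jumper, jumpee="moon"):
    "The $jumper jumped over the $jumpee."

print(poem("cow"))
```

Because the function body is nothing but a string literal, Python stores it in __doc__, and the decorator is free to replace the function with whatever it builds from that string.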

The templet module code below supports @stringfunction and a similar @unicodefunction decorator. It clocks in at a brief 64 lines of code.

The full version here also includes unit tests and more docs, as well as including support for the StringTemplate and UnicodeTemplate base classes that I have discussed previously.

    import sys, re, inspect

    class _TemplateBuilder(object):
        __pattern = re.compile(r"""\$(         # Directives begin with a $
              \$                             | # $$ is an escape for $
              [^\S\n]*\n                     | # $\n is a line continuation
              [_a-z][_a-z0-9]*               | # $simple Python identifier
              \{(?!\{)[^\}]*\}               | # ${...} expression to eval
              \{\{.*?\}\}                    | # ${{...}} multiline code to exec
              <[_a-z][_a-z0-9]*>             | # $<sub_template> method call
            )(?:(?:(?<=\}\})|(?<=>))[^\S\n]*\n)? # eat some trailing newlines
          """, re.IGNORECASE | re.VERBOSE | re.DOTALL)

        def __init__(self, constpat, emitpat, callpat=None):
            self.constpat, self.emitpat, self.callpat = constpat, emitpat, callpat

        def __realign(self, str, spaces=''):
            """Removes any leading empty columns of spaces and an initial empty line"""
            lines = str.splitlines()
            if lines and not lines[0].strip(): del lines[0]
            lspace = [len(l) - len(l.lstrip()) for l in lines if l.lstrip()]
            margin = len(lspace) and min(lspace)
            return '\n'.join((spaces + l[margin:]) for l in lines)

        def build(self, template, filename, s=''):
            code = []
            for i, part in enumerate(self.__pattern.split(self.__realign(template))):
                if i % 2 == 0:
                    if part: code.append(s + self.constpat % repr(part))
                else:
                    if not part or (part.startswith('<') and self.callpat is None):
                        raise SyntaxError('Unescaped $ in ' + filename)
                    elif part.endswith('\n'): continue
                    elif part == '$': code.append(s + self.emitpat % '"$"')
                    elif part.startswith('{{'): code.append(self.__realign(part[2:-2], s))
                    elif part.startswith('{'): code.append(s + self.emitpat % part[1:-1])
                    elif part.startswith('<'): code.append(s + self.callpat % part[1:-1])
                    else: code.append(s + self.emitpat % part)
            return '\n'.join(code)

    def _templatefunction(func, listname, stringtype):
        globals, locals = sys.modules[func.__module__].__dict__, {}
        if '__file__' not in globals: filename = '<%s>' % func.__name__
        else: filename = '%s: <%s>' % (globals['__file__'], func.__name__)
        builder = _TemplateBuilder('%s.append(%%s)' % listname,
                                   '%s.append(%s(%%s))' % (listname, stringtype))
        args = inspect.getargspec(func)
        code = [
            'def %s%s:' % (func.__name__, inspect.formatargspec(*args)),
            ' %s = []' % listname,
            builder.build(func.__doc__, filename, ' '),
            ' return "".join(%s)' % listname]
        code = compile('\n'.join(code), filename, 'exec')
        exec code in globals, locals
        return locals[func.__name__]

    def stringfunction(func):
        """Function attribute for string template functions"""
        return _templatefunction(func, listname='out', stringtype='str')

    def unicodefunction(func):
        """Function attribute for unicode template functions"""
        return _templatefunction(func, listname='out', stringtype='unicode')
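The build method leans on one property of re.split: when the pattern contains a capturing group, the captured text is kept in the result, so literal text lands at even indices and directives at odd ones. Here is a small Python 3 sketch of that idea, using a cut-down pattern that handles only $$, $identifier, and ${expr}. The names DIRECTIVE and split_template are mine, not part of templet.

```python
import re

# Cut-down directive pattern (an assumption for illustration):
# $$ escape, $identifier, and ${expression} only.
DIRECTIVE = re.compile(r"\$(\$|[_a-z][_a-z0-9]*|\{[^\}]*\})", re.IGNORECASE)

def split_template(template):
    """Return (kind, text) pairs for the literal and expression parts."""
    pairs = []
    for i, part in enumerate(DIRECTIVE.split(template)):
        if i % 2 == 0:
            # Even indices hold the text between directives.
            if part:
                pairs.append(("literal", part))
        elif part == "$":
            # $$ stands for a literal dollar sign.
            pairs.append(("literal", "$"))
        elif part.startswith("{"):
            # ${...}: strip the braces to get the expression source.
            pairs.append(("expression", part[1:-1]))
        else:
            # $name: the identifier is itself the expression.
            pairs.append(("expression", part))
    return pairs

for kind, text in split_template("Hi $name, that costs $$5 or ${price * 2}."):
    print(kind, repr(text))
```

In the real builder each literal becomes an out.append of the repr'd text and each expression becomes an out.append(str(...)) line, which is all the code generation amounts to.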

An explanation of _templatefunction, which drives the code generation:



First it grabs the global namespace within which the original function was defined. We need that because if you write code that accesses another symbol in that module (e.g., an imported module name, or another function in that module), we want to resolve it in that namespace. It also makes a temporary (empty) local namespace for the generated code.

Then it generates a pseudo-filename to use when generating error messages and so on. This includes the original module's filename, if we can find it.

Then we generate code. We begin by generating a "def" statement exactly like the original. The trick is to use inspect.getargspec(func) and inspect.formatargspec(*args) to grab and reproduce the signature for the function. The body of the code is generated by the _TemplateBuilder class. That class just rips apart templates using regular expressions, realigns spaces, and emits appropriate "out.append" code. In our generated function, we bracket this code with "out = []" and "return ''.join(out)".

Finally, we use "exec" to define the generated function, and then we grab the generated function out of the temporary namespace to return it.
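For readers on Python 3, the same steps can be sketched with a few substitutions. This is a sketch under stated assumptions, not a port of templet: inspect.getargspec and inspect.formatargspec were removed in Python 3.11, so it uses str(inspect.signature(func)) instead; it takes the global namespace from func.__globals__ rather than looking the module up in sys.modules; exec is called as a function; and it handles only plain $name substitution. The name stringfunction3 is made up.

```python
import inspect
import re

def stringfunction3(func):
    """Hypothetical Python 3 sketch: supports only $name substitution."""
    # Step 1: the namespace the original function was defined in, plus an
    # empty temporary namespace to receive the generated function.
    module_globals = func.__globals__
    namespace = {}

    # Step 2: a pseudo-filename for error messages.
    filename = '<%s>' % func.__name__

    # Step 3: reproduce the "def" line, then emit out.append calls.
    # str(signature) renders defaults with repr(), so this only works
    # for defaults whose repr is valid Python source.
    lines = ['def %s%s:' % (func.__name__, inspect.signature(func)),
             ' out = []']
    parts = re.split(r'\$([_a-zA-Z][_a-zA-Z0-9]*)', func.__doc__)
    for i, part in enumerate(parts):
        if i % 2 == 0:
            if part:
                lines.append(' out.append(%r)' % part)
        else:
            lines.append(' out.append(str(%s))' % part)
    lines.append(' return "".join(out)')

    # Step 4: define the generated function and fish it out.
    code = compile('\n'.join(lines), filename, 'exec')
    exec(code, module_globals, namespace)
    return namespace[func.__name__]

@stringfunction3
def poem(jumper, jumpee="moon"):
    "The $jumper jumped over the $jumpee."

print(poem("cow"))
```

The generated function resolves names through the original module's globals, which is what lets a template call other functions or imported modules defined alongside it.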



Is this Pretty or Ugly?

The @stringfunction decorator produces string functions that are somewhat reminiscent of PTL-style templates, allowing templates to be declared with normal python function syntax. Yet it does it in a much simpler way, without introducing a whole new python compiler, using just a single small module.

Part of the reason @stringfunction is so easy to implement is that it subverts Python's documentation strings for its own tricky ends. To me it feels like, for string template functions, a string that appears as the function body makes "more sense" as a template string than as a documentation string, and that the template string is a fine self-documenting description anyway. But maybe misusing documentation strings like this is distasteful.

For the pythonistas out there: does this strike you as the right way to do string templates in Python? It is certainly fast and simple. I think it is pretty readable, too.

Posted by David at March 11, 2007 01:37 PM

