Computing Thoughts

Python Decorators III: A Decorator-Based Build System

by Bruce Eckel

October 26, 2008



Summary

Most build systems start out with dependencies, then realize they need language features and eventually discover they should have started with language design.


I've used make for many years. I only used ant because it produced faster Java builds. But both build systems started out thinking the problem was simple, and only later discovered that you really need a programming language to solve the build problem. By then it was too late. As a result you have to jump through annoying hoops to get things done.

There have been efforts to create build systems on top of languages. Rake is a fairly successful domain-specific language (DSL) built atop Ruby. And a number of projects have been created with Python.

For years I've wanted a system that was just a thin veneer on Python, so you get some support for dependencies but effectively everything else is Python. This way, you don't need to shift back and forth between Python and some other language; it's less of a mental distraction.

It turns out that decorators are perfect for this purpose. The design I present here is just a first cut, but it's easy to add new features and I've already started using it as the build system for The Python Book, so I'll probably need to add more features. Most importantly, I know I'll be able to do anything that I want, which is not always true with make or ant (yes, you can extend ant but the cost of entry is often not worth the benefit).

While the rest of the book has a Creative Commons Attribution-Share Alike license, this program only has a Creative Commons Attribution license, because I'd like people to be able to use it under any circumstances. Obviously, it would be ideal if you contributed any improvements back to the project, but this is not a prerequisite for using or modifying the code.

Syntax

The most important and convenient thing provided by a build system is dependencies. You tell it what depends on what, and how to update those dependencies. Taken together, this is called a rule, so the decorator will also be called rule. The first argument of the decorator is the target (the thing that needs to be updated) and the remaining arguments are the dependencies. If the target is out of date with the dependencies, the function code is run to bring it up to date.

Here's a simple example that shows the basic syntax:

    @rule("file1.txt")
    def file1():
        "File doesn't exist; run rule"
        file("file1.txt", 'w')

The name of the rule is file1 because that's the function name. In this case, the target is "file1.txt" and there are no dependencies, so the rule only checks to see whether file1.txt exists, and if it doesn't it runs the function code, which brings it up to date.

Note the use of the docstring; this is captured by the build system and describes the rule on the command line when you say build help (or anything else the builder doesn't understand).

The @rule decorators only affect the functions they are attached to, so you can easily mix regular code with rules in the same build file.

Here's a function that updates the date stamp on a file, or creates the file if it doesn't exist:

    def touchOrCreate(f): # Ordinary function
        "Bring file up to date; creates it if it doesn't exist"
        if os.path.exists(f):
            os.utime(f, None)
        else:
            file(f, 'w')

A more typical rule is one that associates a target file with one or more dependent files:

    @rule("target1.txt","dependency1.txt","dependency2.txt","dependency3.txt")
    def target1():
        "Brings target1.txt up to date with its dependencies"
        touchOrCreate("target1.txt")

This build system also allows multiple targets, by putting the targets in a list:

    @rule(["target1.txt", "target2.txt"], "dependency1.txt", "dependency2.txt")
    def multipleBoth():
        "Multiple targets and dependencies"
        [touchOrCreate(f) for f in ["target1.txt", "target2.txt"]]

If there is no target or dependencies, the rule is always executed:

    @rule()
    def clean():
        "Remove all created files"
        [os.remove(f) for f in allFiles if os.path.exists(f)]

The allFiles list is defined in the full example, shown later.

You can write rules that depend on other rules:

    @rule(None, target1, target2)
    def target3():
        "Always brings target1 and target2 up to date"
        print target3 # Prints the rule's docstring

Since None is the target, there's nothing to compare to, but in the process of checking the rules target1 and target2, those are both brought up to date. This is especially useful when writing "all" rules, as you will see in the example.
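
The listings in this article use Python 2 syntax, including the file() builtin, which Python 3 removed. As a rough sketch only, assuming Python 3 and the standard pathlib module, the touchOrCreate() helper could be written like this (the name touch_or_create is hypothetical and not part of builder.py):

```python
from pathlib import Path


def touch_or_create(path):
    """Update the file's timestamp, creating the file if it doesn't exist."""
    # Path.touch() creates a missing file; with exist_ok=True it instead
    # updates the modification time of a file that is already there.
    Path(path).touch(exist_ok=True)
```

Either way the effect is the same: afterward the file exists and is newer than anything that was modified before the call.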

Builder Code

By using decorators and a few appropriate design patterns, the code becomes quite succinct. Note that the __main__ code creates an example build.py file (containing the examples that you see above and more), and the first time you run a build it creates a build.bat file for Windows and a build command file for Unix/Linux/Cygwin. A complete explanation follows the code:

    # builder.py
    import sys, os, stat
    """
    Adds build rules atop Python, to replace make, etc.
    by Bruce Eckel
    License: Creative Commons with Attribution.
    """

    def reportError(msg):
        print >> sys.stderr, "Error:", msg
        sys.exit(1)

    class Dependency(object):
        "Created by the decorator to represent a single dependency relation"

        changed = True
        unchanged = False

        @staticmethod
        def show(flag):
            if flag: return "Updated"
            return "Unchanged"

        def __init__(self, target, dependency):
            self.target = target
            self.dependency = dependency

        def __str__(self):
            return "target: %s, dependency: %s" % (self.target, self.dependency)

        @staticmethod
        def create(target, dependency): # Simple Factory
            if target == None:
                return NoTarget(dependency)
            if type(target) == str: # String means file name
                if dependency == None:
                    return FileToNone(target, None)
                if type(dependency) == str:
                    return FileToFile(target, dependency)
                if type(dependency) == Dependency:
                    return FileToDependency(target, dependency)
            reportError("No match found in create() for target: %s, dependency: %s"
                % (target, dependency))

        def updated(self):
            """
            Call to determine whether this is up to date.
            Returns 'changed' if it had to update itself.
            """
            assert False, "Must override Dependency.updated() in derived class"

    class NoTarget(Dependency): # Always call updated() on dependency
        def __init__(self, dependency):
            Dependency.__init__(self, None, dependency)
        def updated(self):
            if not self.dependency:
                return Dependency.changed # (None, None) -> always run rule
            return self.dependency.updated() # Must be a Dependency or subclass

    class FileToNone(Dependency): # Run rule if file doesn't exist
        def updated(self):
            if not os.path.exists(self.target):
                return Dependency.changed
            return Dependency.unchanged

    class FileToFile(Dependency): # Compare file datestamps
        def updated(self):
            if not os.path.exists(self.dependency):
                reportError("%s does not exist" % self.dependency)
            if not os.path.exists(self.target):
                return Dependency.changed # If it doesn't exist it needs to be made
            if os.path.getmtime(self.dependency) > os.path.getmtime(self.target):
                return Dependency.changed
            return Dependency.unchanged

    class FileToDependency(Dependency): # Update if dependency object has changed
        def updated(self):
            if self.dependency.updated():
                return Dependency.changed
            if not os.path.exists(self.target):
                return Dependency.changed # If it doesn't exist it needs to be made
            return Dependency.unchanged

    class rule(object):
        """
        Decorator that turns a function into a build rule. First file or object in
        decorator arglist is the target, remainder are dependencies.
        """
        rules = []
        default = None

        class _Rule(object):
            """
            Command pattern. name, dependencies, ruleUpdater and description are
            all injected by class rule.
            """

            def updated(self):
                if Dependency.changed in [d.updated() for d in self.dependencies]:
                    self.ruleUpdater()
                    return Dependency.changed
                return Dependency.unchanged

            def __str__(self): return self.description

        def __init__(self, *decoratorArgs):
            """
            This constructor is called first when the decorated function is
            defined, and captures the arguments passed to the decorator itself.
            (Note Builder pattern)
            """
            self._rule = rule._Rule()
            decoratorArgs = list(decoratorArgs)
            if decoratorArgs:
                if len(decoratorArgs) == 1:
                    decoratorArgs.append(None)
                target = decoratorArgs.pop(0)
                if type(target) != list:
                    target = [target]
                self._rule.dependencies = [Dependency.create(targ, dep)
                    for targ in target for dep in decoratorArgs]
            else: # No arguments
                self._rule.dependencies = [Dependency.create(None, None)]

        def __call__(self, func):
            """
            This is called right after the constructor, and is passed the function
            object being decorated. The returned _rule object replaces the original
            function.
            """
            if func.__name__ in [r.name for r in rule.rules]:
                reportError("@rule name %s must be unique" % func.__name__)
            self._rule.name = func.__name__
            self._rule.description = func.__doc__ or ""
            self._rule.ruleUpdater = func
            rule.rules.append(self._rule)
            return self._rule # This is substituted as the decorated function

        @staticmethod
        def update(x):
            if x == 0:
                if rule.default:
                    return rule.default.updated()
                else:
                    return rule.rules[0].updated()
            # Look up by name
            for r in rule.rules:
                if x == r.name:
                    return r.updated()
            raise KeyError

        @staticmethod
        def main():
            """
            Produce command-line behavior
            """
            if len(sys.argv) == 1:
                print Dependency.show(rule.update(0))
            try:
                for arg in sys.argv[1:]:
                    print Dependency.show(rule.update(arg))
            except KeyError:
                print "Available rules are:\n"
                for r in rule.rules:
                    if r == rule.default:
                        newline = " (Default if no rule is specified)\n"
                    else:
                        newline = "\n"
                    print "%s:%s\t%s\n" % (r.name, newline, r)
                print "(Multiple targets will be updated in order)"
            # Create "build" commands for Windows and Unix:
            if not os.path.exists("build.bat"):
                file("build.bat", 'w').write("python build.py %1 %2 %3 %4 %5 %6 %7")
            if not os.path.exists("build"):
                # Unless you can detect cygwin independently of Windows
                file("build", 'w').write("python build.py $*")
                os.chmod("build", stat.S_IEXEC)

    ############### Test/Usage Examples ###############

    if __name__ == "__main__":
        if not os.path.exists("build.py"):
            file("build.py", 'w').write('''\
    # Use cases: both test code and usage examples
    from builder import rule
    import os

    @rule("file1.txt")
    def file1():
        "File doesn't exist; run rule"
        file("file1.txt", 'w')

    def touchOrCreate(f): # Ordinary function
        "Bring file up to date; creates it if it doesn't exist"
        if os.path.exists(f):
            os.utime(f, None)
        else:
            file(f, 'w')

    dependencies = ["dependency1.txt", "dependency2.txt",
                    "dependency3.txt", "dependency4.txt"]

    targets = ["file1.txt", "target1.txt", "target2.txt"]

    allFiles = targets + dependencies

    @rule(allFiles)
    def multipleTargets():
        "Multiple files don't exist; run rule"
        [file(f, 'w') for f in allFiles if not os.path.exists(f)]

    @rule(["target1.txt", "target2.txt"], "dependency1.txt", "dependency2.txt")
    def multipleBoth():
        "Multiple targets and dependencies"
        [touchOrCreate(f) for f in ["target1.txt", "target2.txt"]]

    @rule("target1.txt","dependency1.txt","dependency2.txt","dependency3.txt")
    def target1():
        "Brings target1.txt up to date with its dependencies"
        touchOrCreate("target1.txt")

    @rule()
    def updateDependency():
        "Updates the timestamp on all dependency.* files"
        [touchOrCreate(f) for f in allFiles if f.startswith("dependency")]

    @rule()
    def clean():
        "Remove all created files"
        [os.remove(f) for f in allFiles if os.path.exists(f)]

    @rule()
    def cleanTargets():
        "Remove all target files"
        [os.remove(f) for f in targets if os.path.exists(f)]

    @rule("target2.txt", "dependency2.txt", "dependency4.txt")
    def target2():
        "Brings target2.txt up to date with its dependencies, or creates it"
        touchOrCreate("target2.txt")

    @rule(None, target1, target2)
    def target3():
        "Always brings target1 and target2 up to date"
        print target3

    @rule(None, clean, file1, multipleTargets, multipleBoth, target1,
          updateDependency, target2, target3)
    def all():
        "Brings everything up to date"
        print all

    rule.default = all
    rule.main() # Does the build, handles command-line arguments
    ''')

(The contents of the triple-quoted string are written to build.py starting at the left margin; they appear indented here only because the whole listing is indented.)

The first group of classes manages dependencies between different types of objects. The base class contains some common code, including the constructor, which you'll note is automatically called if it is not explicitly redefined in a derived class (a nice, code-saving feature in Python).

Classes derived from Dependency manage particular types of dependency relationships, and redefine the updated() method to decide whether the target should be brought up to date with the dependency. This is an example of the Template Method design pattern, where updated() is the template method and _Rule is the context.

If you want to create a new type of dependency -- say, the addition of wildcards on dependencies and/or targets -- you define new Dependency subclasses. You'll see that the rest of the code doesn't require changes, which is a positive indicator for the design (future changes are isolated).

Dependency.create() is what I call a Simple Factory Method, because all it does is localize the creation of all the subtypes of Dependency. Note that forward referencing is not a problem here as it is in some languages, so the full implementation of Factory Method given in GoF is unnecessary and would only add complexity (this doesn't mean there aren't cases that justify the full-fledged Factory Method).

Note that in FileToDependency we could assert that self.dependency is a subtype of Dependency, but this type check happens (in effect) when updated() is called.

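
To make the wildcard idea concrete, here is a minimal sketch of what such a Dependency subclass might look like. It is written for Python 3 and is not part of builder.py: the class name FileToGlob is hypothetical, the small Dependency base class is only a stand-in so the example runs on its own, and Dependency.create() would presumably still need a branch that decides when to return the new subclass.

```python
import glob
import os


class Dependency:
    """Minimal stand-in for the base class in builder.py (assumption)."""
    changed = True
    unchanged = False

    def __init__(self, target, dependency):
        self.target = target
        self.dependency = dependency

    def updated(self):
        raise NotImplementedError("Override updated() in a derived class")


class FileToGlob(Dependency):
    """Hypothetical subclass: the dependency is a wildcard pattern."""

    def updated(self):
        if not os.path.exists(self.target):
            return Dependency.changed  # A missing target must be made
        target_time = os.path.getmtime(self.target)
        # The target is stale if any file matching the pattern is newer
        for name in glob.glob(self.dependency):
            if os.path.getmtime(name) > target_time:
                return Dependency.changed
        return Dependency.unchanged
```

The comparison is the same datestamp test that FileToFile performs, applied to every file the pattern matches. Unlike FileToFile, this sketch treats a pattern that matches nothing as unchanged rather than reporting an error; that is a design choice made here for brevity.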
The rule Decorator

The rule decorator uses the Builder design pattern, which makes sense because the creation of a rule happens in two steps: the constructor captures the decorator arguments, and the __call__() method captures the function.

The Builder product is a _Rule object, which, like the Dependency classes, contains an updated() method. Each _Rule object contains a list of dependencies and a ruleUpdater() method which is called if any of the dependencies is out of date. The _Rule also contains a name (which is the decorated function name) and a description (the decorated function's docstring). (The _Rule object is an example of the Command pattern).

What's unusual about _Rule is that you don't see any code in the class which initializes dependencies, ruleUpdater(), name, and description. These are initialized by rule during the Builder process, using Injection. The typical alternative to this is to create setter methods, but since _Rule is nested inside rule, rule effectively "owns" _Rule and Injection seems much more straightforward.

The rule constructor first creates the product _Rule object, then handles the decorator arguments. It converts decoratorArgs to a list because we need it to be modifiable, and decoratorArgs comes in as a tuple. If there is only one argument it means the user has only specified the target and no dependencies. Because Dependency.create() requires two arguments, we append None to the list.

The target is always the first argument, so pop(0) pulls it off and the remainder of the list is dependencies. To accommodate the possibility that the target is a list, single targets are turned into lists.

Now Dependency.create() is called for each possible target-dependency combination, and the resulting list is injected into the _Rule object. For the special case when there are no arguments, a None-to-None Dependency is created.

Notice that the only thing the rule constructor does is sort out the arguments; it has no knowledge of particular relationships. This keeps special knowledge within the Dependency hierarchy, so adding a new Dependency is isolated within that hierarchy.

A similar guideline is followed for the __call__() method, which captures the decorated function. We keep the _Rule object in a static list called rules, and the first thing to check is whether any of the rule names are duplicated. Then we capture and inject the name, documentation string, and the function itself.

Note that the Builder "product", the _Rule object, is returned as the result of rule.__call__(), which means that this object -- which doesn't have a __call__() method -- is substituted for the decorated function. This is a slightly unusual use of decorators; normally the decorated function is called directly, but in this case it is only ever called via the _Rule object.
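
The two-step construction is easier to see with the dependency handling stripped out. The following is a simplified Python 3 sketch, not the builder.py code itself: it stores the raw decorator arguments in a hypothetical args attribute instead of building Dependency objects, and it skips the duplicate-name check.

```python
class rule:
    """Sketch of a decorator-with-arguments acting as a Builder."""
    rules = []  # Every product created so far, in definition order

    class _Rule:
        """The product; all of its attributes are injected by rule."""
        def __str__(self):
            return self.description

    def __init__(self, *decorator_args):
        # Step 1: runs when @rule(...) is evaluated; captures the arguments
        self._rule = rule._Rule()
        self._rule.args = decorator_args

    def __call__(self, func):
        # Step 2: runs immediately afterward; captures the function
        self._rule.name = func.__name__
        self._rule.description = func.__doc__ or ""
        self._rule.rule_updater = func
        rule.rules.append(self._rule)
        return self._rule  # This object replaces the decorated function


@rule("file1.txt")
def file1():
    "File doesn't exist; run rule"
    return "ran file1"
```

After the definition, the name file1 refers to a _Rule object rather than a function: str(file1) gives the docstring, and the original body can only be reached through file1.rule_updater().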

Running a Build

The static method main() in rule manages the build process, using the helper method update(). If you provide no command-line arguments, main() passes 0 to update(), which calls the default rule if one has been set; otherwise it calls the first rule that was defined. If you provide command-line arguments, it passes each one (in order) to update().

If you give it an argument that doesn't name a rule (typically help is reserved for this), it prints each of the rules along with their docstrings.

Finally, it checks whether the build.bat and build command files exist, and creates whichever one is missing.

The build.py produced when you run builder.py the first time can act as a starting point for your build file.
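
The dispatch logic can be sketched on its own in Python 3. This is an illustration under simplifying assumptions, not the code from builder.py: rules are represented as (name, callable) pairs, the argument list is passed in explicitly instead of being read from sys.argv, and results are returned rather than printed. The function and parameter names are hypothetical.

```python
def update(rules, default, key):
    """Run one rule: key 0 means 'the default', otherwise look up by name.

    rules is a list of (name, callable) pairs standing in for rule.rules;
    default is a callable or None.
    """
    if key == 0:
        if default:
            return default()
        return rules[0][1]()  # Fall back to the first rule defined
    for name, action in rules:
        if key == name:
            return action()
    raise KeyError(key)


def main(rules, default, argv):
    """Return the list of results, or help text if a name is unknown."""
    if not argv:
        return [update(rules, default, 0)]
    try:
        return [update(rules, default, arg) for arg in argv]
    except KeyError:
        return ["Available rules are: " + ", ".join(n for n, _ in rules)]
```

As in the original, rules named before an unrecognized argument have already run by the time the help text is produced; this sketch simply discards their results.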

Improvements

As it stands, this system only satisfies the basic needs; it doesn't have, for example, all the features that make does when it comes to manipulating dependencies. On the other hand, because it's built atop a full-powered programming language, you can do anything else you need quite easily. If you find yourself writing the same code over and over, you can modify rule() to reduce the duplicated effort. If you have permission, please submit such modifications back for possible inclusion.

Next

In the last installment of this series (chapter), we'll look at class decorators and whether you can decorate an object.

About the Blogger

Bruce Eckel (www.BruceEckel.com) provides development assistance in Python with user interfaces in Flex. He is the author of Thinking in Java (Prentice-Hall, 1998, 2nd Edition, 2000, 3rd Edition, 2003, 4th Edition, 2005), the Hands-On Java Seminar CD ROM (available on the Web site), Thinking in C++ (PH 1995; 2nd edition 2000, Volume 2 with Chuck Allison, 2003), C++ Inside & Out (Osborne/McGraw-Hill 1993), among others. He's given hundreds of presentations throughout the world, published over 150 articles in numerous magazines, was a founding member of the ANSI/ISO C++ committee and speaks regularly at conferences.

This weblog entry is Copyright © 2008 Bruce Eckel. All rights reserved.