I have always been fascinated by the idea of plugins - user-developed modules that are not part of the core application, but that nevertheless allow extending the application's capabilities. Many applications above a certain size allow some level of customization by users. There are many different approaches and many names for it (extensions, scripting interface, modules, components); I'll simply say "plugins" from now on.

The fun thing about plugins is that they cross application and language domains. You can find plugin infrastructures for everything from IDEs to web servers to games. Plugins can be developed in language X to extend an application mainly written in language Y, for a wide variety of X and Y.

My plan is to explore the design space of plugin infrastructures, looking at various implementation strategies and existing solutions in well-known applications. But for that, I need to first describe some basic terms and concepts - a common language that will let us reason about plugins.

Example - plugins for a Python application

I'll start with an example, by presenting a simple application and a plugin infrastructure for it. Both the application and plugins will be coded in Python 3.

Let's start by introducing the task. The example is a small but functional part of some kind of a publishing system, let's say a blogging engine. It's the part that turns marked-up text into HTML. To borrow from reST, the supported markup is:

before markup :role:`text` after markup

Here "role" defines the markup type, and "text" is the text to which the markup is applied. Sample roles (again, from reST interpreted roles) are code, math and superscript.

Now, where do plugins come in here? The idea is to let the core application do the text parsing, leaving the specific role implementation to plugins. In other words, I'd like to enable plugin writers to easily add roles to the application. This is what the idea of plugins is all about: instead of hard-coding the application's functionality, let users extend it. Power users love customizing applications for their specific needs, and may improve your application beyond your original intentions. From your point of view, it's like getting work done for free - a win-win situation.

Anyway, there are myriad ways to implement plugins in Python. I like the following approach:

    class IPluginRegistry(type):
        plugins = []

        def __init__(cls, name, bases, attrs):
            if name != 'IPlugin':
                IPluginRegistry.plugins.append(cls)


    class IPlugin(object, metaclass=IPluginRegistry):
        def __init__(self, post=None, db=None):
            """ Initialize the plugin. Optionally provide the db.Post that is
                being processed and the db.DB it belongs to.
            """
            self.post = post
            self.db = db

        # Plugin classes inherit from IPlugin. The methods below can be
        # implemented to provide services.

        def get_role_hook(self, role_name):
            """ Return a function accepting role contents.
                The function will be called with a single argument - the role
                contents, and should return what the role gets replaced with.
                None if the plugin doesn't provide a hook for this role.
            """
            return None

A plugin is a class that inherits from IPlugin. Some metaclass trickery makes sure that by the very act of inheriting from it, the plugin registers itself in the system.
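To see the registration at work, here is a minimal self-contained sketch. It repeats a trimmed-down version of the registry shown above; the Upper and Lower plugin classes are hypothetical, made up purely for illustration.

```python
class IPluginRegistry(type):
    plugins = []

    def __init__(cls, name, bases, attrs):
        super().__init__(name, bases, attrs)
        # The base class itself is not a plugin, so skip it.
        if name != 'IPlugin':
            IPluginRegistry.plugins.append(cls)


class IPlugin(metaclass=IPluginRegistry):
    pass


# Hypothetical plugins: merely defining the classes triggers registration,
# because the metaclass __init__ runs for every new subclass.
class Upper(IPlugin):
    pass


class Lower(IPlugin):
    pass


registered = [cls.__name__ for cls in IPluginRegistry.plugins]
print(registered)
```

No explicit registration call appears anywhere: the class statement alone is enough, which is what makes this approach convenient for plugin authors.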

The get_role_hook method is an example of a hook. A hook is something an application exposes, and plugins can attach to. By attaching to a hook (in our case - implementing the get_role_hook method), the plugin can let the application know it wants to participate in the relevant task. Here, a plugin implementing the hook will get called by the application to find out which roles it supports.

Here is a sample plugin:

    class TtFormatter(IPlugin):
        """ Acts on the 'tt' role, placing the contents inside <tt> tags.
        """
        def get_role_hook(self, role_name):
            return self._tt_hook if role_name == 'tt' else None

        def _tt_hook(self, contents):
            return '<tt>' + contents + '</tt>'

It implements the following transformation:

text :tt:`in tt tag` here

to:

text <tt>in tt tag</tt> here

As you can see, I chose to let the hook return a function. This is useful since it gives the application an immediate indication of whether the plugin supports some role at all (if it returns None, it doesn't). The application can also cache the function returned by plugins for more efficient invocation later. There are, of course, many variations on this theme. For example, the plugin could return a list of all the roles it supports.
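The caching idea can be sketched as follows. This is only one possible arrangement, not something the application above is required to do: RoleHookCache is a hypothetical helper, and the TtFormatter here is a stand-in copy without the IPlugin base so that the snippet runs on its own.

```python
class TtFormatter:
    """Stand-in for the sample plugin; the IPlugin base is omitted here."""
    def get_role_hook(self, role_name):
        return self._tt_hook if role_name == 'tt' else None

    def _tt_hook(self, contents):
        return '<tt>' + contents + '</tt>'


class RoleHookCache:
    """Hypothetical helper: remembers which hook handles each role."""
    def __init__(self, plugins):
        self._plugins = plugins
        self._cache = {}

    def lookup(self, role_name):
        # Ask the plugins only the first time a role is seen; afterwards
        # the stored hook (or None) is returned directly.
        if role_name not in self._cache:
            hook = None
            for p in self._plugins:
                hook = p.get_role_hook(role_name)
                if hook:
                    break
            self._cache[role_name] = hook
        return self._cache[role_name]


cache = RoleHookCache([TtFormatter()])
tt_hook = cache.lookup('tt')
print(tt_hook('in tt tag'))
print(cache.lookup('math'))
```

Because a missing hook is stored as None, roles that no plugin handles are also answered from the cache on later lookups.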

Now it would be interesting to see how plugins are discovered, i.e. how does the application know which plugins are present in the system? Again, Python's dynamism lets us easily implement a very flexible discovery scheme:

    import imp
    import os


    def discover_plugins(dirs):
        """ Discover the plugin classes contained in Python files, given a
            list of directory names to scan. Return a list of plugin classes.
        """
        for dir in dirs:
            for filename in os.listdir(dir):
                modname, ext = os.path.splitext(filename)
                if ext == '.py':
                    file, path, descr = imp.find_module(modname, [dir])
                    if file:
                        try:
                            # Loading the module registers the plugin in
                            # IPluginRegistry
                            mod = imp.load_module(modname, file, path, descr)
                        finally:
                            # find_module leaves closing the file to us
                            file.close()
        return IPluginRegistry.plugins

This function can be used by the application to find and load plugins. It gets a list of directories in which to look for Python modules. Each module is loaded, which executes the class definitions within it. Those classes that inherit from IPlugin get registered with IPluginRegistry, which can then be queried.
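The imp module used above has been deprecated since Python 3.4 and was removed in Python 3.12. On newer interpreters the same loading step can be sketched with importlib instead. The function name load_modules_from_dirs and the throwaway sample_plugin.py file are assumptions made for this illustration. The sketch returns the loaded modules rather than the registry so that it stays self-contained.

```python
import importlib.util
import os
import tempfile


def load_modules_from_dirs(dirs):
    """Import every .py file found in dirs and return the module objects."""
    modules = []
    for dirname in dirs:
        for filename in sorted(os.listdir(dirname)):
            modname, ext = os.path.splitext(filename)
            if ext != '.py':
                continue
            path = os.path.join(dirname, filename)
            spec = importlib.util.spec_from_file_location(modname, path)
            module = importlib.util.module_from_spec(spec)
            # Executing the module runs its class definitions, which is
            # what would register plugins with a metaclass-based registry.
            spec.loader.exec_module(module)
            modules.append(module)
    return modules


# Demo with a temporary directory holding one hypothetical plugin file.
with tempfile.TemporaryDirectory() as plugin_dir:
    with open(os.path.join(plugin_dir, 'sample_plugin.py'), 'w') as f:
        f.write('GREETING = "hello from a plugin"\n')
    loaded = load_modules_from_dirs([plugin_dir])
    loaded_names = [m.__name__ for m in loaded]
    greeting = loaded[0].GREETING

print(loaded_names, greeting)
```

Note that this sketch does not add the loaded modules to sys.modules; an application that wants plugins to be importable by name afterwards would have to do that itself.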

You will notice that the constructor of IPlugin takes two optional arguments - post and db. For plugins that have more than just the most basic capabilities, the application should also expose an API of its own that lets plugins query and manipulate it. The post and db arguments do that - each plugin gets a Post object that represents the blog post it operates upon, as well as a DB object that represents the main blog database.

To see how these can be used by a plugin, let's add another hook to IPlugin:

    def get_contents_hook(self):
        """ Return a function accepting full document contents.
            The function will be called with a single argument - the document
            contents (after paragraph splitting and role processing), and
            should return the transformed contents.
            None if the plugin doesn't provide a contents hook.
        """
        return None

This hook allows plugins to register functions that transform the whole contents of a post, not just text marked up with roles. Here's a sample plugin that uses it:

    import re


    class Narcissist(IPlugin):
        def __init__(self, post, db):
            super().__init__(post, db)
            self.repl = '<b>I ({0})</b>'.format(self.post.author)

        def get_contents_hook(self):
            return self._contents_hook

        def _contents_hook(self, contents):
            return re.sub(r'\bI\b', self.repl, contents)

As its name suggests, this is a plugin for users with narcissistic tendencies. It finds all the occurrences of "I" in the text, adds the author name in parens and puts it in bold. The idea here is to show how the post object passed to the plugin can be used to access information from the application. Exposing such details to plugins makes the infrastructure extremely flexible.
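Here is a small sketch of the plugin in action. The Post object is replaced by a hypothetical stand-in that has only an author attribute (the real Post class is not shown here), and the plugin is copied without its IPlugin base so the snippet runs on its own.

```python
import re
from types import SimpleNamespace


class Narcissist:
    """Copy of the plugin above, minus the IPlugin base class."""
    def __init__(self, post, db):
        self.post = post
        self.db = db
        self.repl = '<b>I ({0})</b>'.format(self.post.author)

    def get_contents_hook(self):
        return self._contents_hook

    def _contents_hook(self, contents):
        return re.sub(r'\bI\b', self.repl, contents)


# Hypothetical stand-in for the application's Post object; only the
# `author` attribute is needed by this plugin.
post = SimpleNamespace(author='Alice')

plugin = Narcissist(post, db=None)
hook = plugin.get_contents_hook()
result = hook('Yesterday I wrote a plugin.')
print(result)  # Yesterday <b>I (Alice)</b> wrote a plugin.
```

The plugin never needs to know where the author name comes from; it simply reads it off the object the application handed over.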

Finally, let's see how the application actually uses the plugins. Here's a simple htmlize function that gets post and db objects, as well as a list of plugins. It does its own transformation of the post contents by enclosing all paragraphs in <p>...</p> tags and then hands the job over to the plugins, first running the role-specific hooks and then the whole-contents hooks:

    import re
    from collections import namedtuple

    RoleMatch = namedtuple('RoleMatch', 'name contents')


    def htmlize(post, db, plugins=[]):
        """ Convert the contents of post to HTML, letting the given plugins
            act on roles first and on the whole contents afterwards.
        """
        contents = post.contents

        # Plugins are classes - we need to instantiate them to get objects.
        plugins = [P(post, db) for P in plugins]

        # Split the contents to paragraphs
        paragraphs = re.split(r'\n\n+', contents)
        for i, p in enumerate(paragraphs):
            paragraphs[i] = '<p>' + p.replace('\n', ' ') + '</p>'

        contents = '\n\n'.join(paragraphs)

        # Find roles in the contents. Create a list of parts, where each
        # part is either text that has no roles in it, or a RoleMatch
        # object. ROLE_REGEX is defined elsewhere; it captures the role
        # name in group 1 and the role contents in group 2.
        pos = 0
        parts = []
        while True:
            match = ROLE_REGEX.search(contents, pos)
            if match is None:
                parts.append(contents[pos:])
                break
            parts.append(contents[pos:match.start()])
            parts.append(RoleMatch(match.group(1), match.group(2)))
            pos = match.end()

        # Ask plugins to act on roles
        for i, part in enumerate(parts):
            if isinstance(part, RoleMatch):
                parts[i] = _plugin_replace_role(
                    part.name, part.contents, plugins)

        # Build full contents back again, and ask plugins to act on
        # contents.
        contents = ''.join(parts)
        for p in plugins:
            contents_hook = p.get_contents_hook()
            if contents_hook:
                contents = contents_hook(contents)

        return contents


    def _plugin_replace_role(name, contents, plugins):
        """ The first plugin that handles this role is used.
        """
        for p in plugins:
            role_hook = p.get_role_hook(name)
            if role_hook:
                return role_hook(contents)
        # If no plugin handling this role is found, return its original form
        return ':{0}:`{1}`'.format(name, contents)
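The role-splitting step in the middle of htmlize can be tried in isolation. The sketch below extracts it into a hypothetical split_roles helper. Since the definition of ROLE_REGEX is not shown in the text, the pattern used here is an assumption: it was chosen to match the :role:`text` markup and to provide the two groups the code relies on.

```python
import re
from collections import namedtuple

RoleMatch = namedtuple('RoleMatch', 'name contents')

# Assumed pattern (not taken from the original application): group 1 is
# the role name, group 2 is the text between the backticks.
ROLE_REGEX = re.compile(r':(\w+):`([^`]*)`')


def split_roles(contents):
    """Split contents into plain-text strings and RoleMatch objects."""
    pos = 0
    parts = []
    while True:
        match = ROLE_REGEX.search(contents, pos)
        if match is None:
            # No more roles: the rest of the text is a single plain part.
            parts.append(contents[pos:])
            break
        parts.append(contents[pos:match.start()])
        parts.append(RoleMatch(match.group(1), match.group(2)))
        pos = match.end()
    return parts


parts = split_roles('text :tt:`in tt tag` here')
print(parts)
# ['text ', RoleMatch(name='tt', contents='in tt tag'), ' here']
```

Each RoleMatch in the resulting list is what gets handed to the plugins' role hooks, while the plain strings pass through unchanged.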