Posted: Mon, 9 July 2007

I've been working (part-time) on an existing Rails application for the last month or so, getting comfortable with it, modifying some sections to support needed functionality, and beginning an all-out, carpet-bombing "shock and awe" assault on its test suite. However, a significant portion of the system is (or rather was, since I've already gutted a lot of it) a chunk of rather messy and unpleasant auto-generated code.

I've always been a bit iffy about code generation tools, ever since my first encounters with them whilst working on some $DEITY-awful Access application years ago. (First encounters with anything Microsoft usually have that effect on me, though.) Since then, I've never found any reason to change my mind, and dealing with this particular chunk of auto-generated code has pushed me over the edge into "wild-eyed extremist" territory.

Please note that when I say "code generator", I'm talking here about programs or wizards or whatever that create a block of code that you're then expected to tweak to suit your conditions. If it's just a mechanical transformation of an input format to an output format, where a human is never expected to maintain the output by hand, that's not code generation in my book -- that's just a compiler being used for its intended purpose. I like compilers. I'm fairly sure you do too. Code generators, though? EVIL.

Consider what you get out of your generator: a big blob of code in your application that you then have to understand and maintain. It's like you gave a commit bit to some random person who dropped a 5000-line change into your tree and then ran away into the night, cackling.

If there are bugs in the generated code (and remember, there are always bugs in software), then when they get fixed upstream you're up the creek unless you either haven't modified the code at all (and thus can regenerate) or are willing to apply the necessary patches to your modified version, often by hand. Hand-patching is error-prone, and if you've modified the code at the same place that has to be patched, things can get unpleasant.

In the case of the generator I'm dealing with at the moment, I'm not even blessed with it being decent code. It's ugly code, in every possible sense of the word. There's inconsistent formatting, poor commenting, it uses some totally whack coding constructs[1], the hard-coded text strings are ungrammatical, it's got very few (and poorly covering) test cases, the methods are huge and unwieldy, and it's got security bugs (not so hot for something that's supposed to be guarding your application). In short, it looks like some quick-and-dirty internal hack that escaped one night and is now wreaking havoc on the world.

There are probably some really good code generators out there. I've been poking around a bit, and there are some written for Rails by people who I know produce good code. However, no matter how good the author of the code generator is, the very best you can hope for is solid, clean code that is the equivalent of some portion of your application having been coded by a member of your team whom neither you nor anybody else has ever met, who didn't follow your team's coding conventions, and who has never looked at any other part of your application to make sure that everything hung together in a consistent manner. Which you now have to maintain.

I've read a lot of books, articles, blog posts, and mailing list messages, and participated in no shortage of "why software sucks" discussions with other programmers, but I've never heard anyone advocate that the way to clean code, minimal bugs, and happy programmers was to drop a big chunk of unknown code right in the middle of things.

Before you start charging up the e-mail-delivered tasers, ready to fry my delicate braincells for the twin crimes of Xenocodophobia and a vicious case of Not Invented Here syndrome, though, please understand that I'm not against code reuse in the right place. That right place is a library, component, or other abstract, documented, fairly static, and presumably-replaceable chunk of code. By using a library, you get a relatively simple interface to somebody else's complexity. This is a good thing, for all the reasons that everyone should already be able to recite: a comprehensible interface, a minimal learning curve, and so on. These are things you don't get with generated code because, pretty much by definition, the entire chunk of code is your "interface".

There are some times, though, when what you want to do can't be done in a library. Either the thing has too many tentacles to be easily encapsulated behind an abstracted interface, or you need to modify its behaviour in too many places to get away with a few simple callbacks. Hence, it might be tempting to fire up the code generator to get yourself out of a hole quickly.

Don't do it, though. Don't succumb to the temptation of saving yourself a little bit of time by firing up a code generator. What you're doing there is just making pain for yourself, by inheriting a big chunk of code that you don't fully understand, that probably doesn't solve your problem particularly well, and that you (or whichever poor slob gets the job after you) will have to maintain in perpetuity -- most likely by gradually rewriting large pieces of it so that it fits your application better.

The output of a code generator is likely to be the least maintainable code in your tree -- either by virtue of the fact that it's crap code, or at the very least because you just don't understand it. The general principle of good codebase hygiene is that you get rid of unmaintainable code quickly, so it doesn't infect everything else (a la the "broken windows theory"). With that in mind, why would you ever consider deliberately introducing a huge chunk of unmaintainable code into your application?

The closest I can come to a defence of code generators is something like Rails' scaffolding, where the controller and views for a set of simple CRUD operations on a table are auto-generated. The key word there, though, is scaffolding. Its very name suggests that it's temporary, to be replaced with something more robust and useful as soon as is reasonable. Besides, the default scaffolding is ugly as sin, which encourages you to replace it, and as of Rails 1.1 (I think) there's a 'scaffold' method which creates the scaffolding output at runtime, so you never, ever need to generate scaffolding code any more.
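To make the "at runtime" idea a bit more concrete, here's a minimal plain-Ruby sketch of the general technique: a class-level macro that defines methods when the class is loaded, instead of writing their source into your tree. This is emphatically not how Rails implements its scaffolding; the module name RuntimeScaffold, the generated method names, and the in-memory hash standing in for a table are all made up for illustration.

```ruby
# Hypothetical illustration only: NOT Rails' scaffold implementation.
module RuntimeScaffold
  # Class-level macro that defines simple CRUD-style methods at runtime,
  # so no generated source ever lands in the tree.
  def scaffold(name)
    store = {}    # in-memory stand-in for a database table (assumption)
    next_id = 0   # naive id counter, shared via the closures below

    define_method("create_#{name}") do |attrs|
      next_id += 1
      store[next_id] = attrs
      next_id
    end

    define_method("show_#{name}") { |id| store[id] }

    define_method("update_#{name}") do |id, attrs|
      store[id] = store.fetch(id).merge(attrs)
    end

    define_method("destroy_#{name}") { |id| store.delete(id) }
  end
end

class PostsController
  extend RuntimeScaffold
  scaffold :post   # one line to maintain, not a blob of generated code
end

controller = PostsController.new
id = controller.create_post({ title: "Hello" })
controller.update_post(id, { title: "Hello again" })
controller.show_post(id)
controller.destroy_post(id)
```

The point of the sketch is that the only thing you own is the one-line declaration; if the macro gets fixed upstream, you pick up the fix without hand-patching anything.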

Anyone planning on producing any significant piece of their application through the use of code generators is, in my book, committing a crime against software decency.

1. My favourite whack code construct thus far has been this gem:

    case somevar
    when 17
      do_something_here()
    end
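Presumably what makes this whack is that a case statement with exactly one branch is just a long-winded if. A small sketch of the comparison follows; do_something_here is stubbed out, and with_case and with_if are names invented for the example.

```ruby
# Hypothetical stand-in for the method named in the footnote.
def do_something_here
  :did_something
end

# The generated style: a case statement with a single branch.
def with_case(somevar)
  case somevar
  when 17
    do_something_here
  end
end

# The conventional equivalent for an Integer comparison.
# (case uses ===, which for Integers behaves like ==.)
def with_if(somevar)
  do_something_here if somevar == 17
end

with_case(17) == with_if(17)   # both call do_something_here
with_case(3) == with_if(3)     # both fall through and return nil
```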