Java Build Tools: Ant vs. Maven

TRANSLATIONS: Spanish (gracias, José!)

UPDATED 2010-01-06: linked to demonstration of the “10 minute mvn clean build” problem, and added notes about: slow build times, excessive memory use, bad test result output, untrusted repository artifacts, and external configuration files.

UPDATED 2010-02-21: linked to Spanish translation (courtesy of José Manuel Prieto)

The best build tool is the one you write yourself. Every project’s build process is unique, and often individual projects need to be built multiple different ways. It is impossible for tool authors to anticipate every build’s requirements, and foolhardy to try (Apache developers: take note). The best any tool can do is provide a flexible library of reusable tasks that can easily be adapted to your needs, but even that is insufficient. Off-the-shelf tasks never suit your project perfectly. You will waste countless hours struggling to make those tasks do exactly what you need, only to give up and write a plugin instead. Writing your own custom build tool is quick and easy, and requires less maintenance than you fear. Don’t be afraid: builds should fit your project, not the other way around.

If you don’t want to write your own build tool, then you should use Rake. Rake is the best existing build tool for Java projects. Rake provides a bunch of standard methods to perform common build tasks, and anything else can be quickly implemented in Ruby. Writing build scripts in a real programming language gives Rake a huge advantage over other tools. There are other advantages, too, but none are as important.

So, you should write custom build tools for your projects. If you don’t want to, then you should switch to Rake. If you can’t switch, you should lobby for the right to switch. If politics drives technology decisions, if you will never be allowed to switch, then quit your job or leave the project.

If you lack the courage to quit, then use Ant. Ant is the second best existing build tool for Java projects. Although inferior to Rake, Ant is still a great build tool. Ant is mature and stable, it is fast, and it comes with a rich library of tasks. Ant makes it possible (but not at all easy) to script rich, complex build processes custom-tailored to your project.

So, write your own build tool, or else switch to Rake, or fight to switch to Rake, or quit and go some place where you can use Rake. And if all else fails, use Ant until you can find a new job somewhere else that uses Rake.

That’s it! Those are the only choices I can recommend! Because you never, ever, under any circumstances want to use Maven!

Maven builds are an infinite cycle of despair that will slowly drag you into the deepest, darkest pits of hell (where Maven itself was forged). You will initially only spend ten minutes getting Maven up and running, and might even be happy with it for a while. But as your project evolves, and your build configuration grows, the basic pom.xml that you started with will prove inadequate. You will slowly add more configuration to get things working the way you need, but there’s only so much you can configure in Maven. Soon, you will encounter Maven’s low glass ceiling for the first time. By “encounter,” I mean “smash your head painfully against.” By “for the first time,” I mean “you will do this repeatedly and often in the future.” Eventually, you’ll figure out some convoluted pom.xml hackery to work around your immediate issue. You might even be happy with Maven again for a while… until another limitation rears its ugly little head. It’s a lot like some tragic Greek myth, only you are the damned soul and the eternity of suffering is your build process.

Seriously. Maven is a horrible implementation of bad ideas. I believe someone, somewhere had (perhaps still has) a vision for Maven that was sensible, if not seductive. But the actual implementation of Maven lacks any trace of such vision. In fact, everything in Maven is so bad that it serves as a valuable example of how not to build software. You know your build is awesome when it works the opposite of Maven.

Consider the test results output from Maven’s Surefire plugin. Everything seems fine as long as all of your tests are passing, but Surefire reports are a nightmare to debug when things go wrong! The only information logged to the console is the name of the failing test class. You must manually cross-reference that name with a log file written in the target/surefire-reports/ directory, but those logs are written one per test class! So, if multiple test classes fail, you must separately check multiple log files. It seems like a minor thing, but it quickly adds up to a major annoyance and productivity sink.

Maven advocates claim their tool embraces the principle of Convention Over Configuration; Maven advocates are liars. The only convention Maven supports is: compile, run unit tests, package .jar file. Getting Maven to do anything else requires configuring the conventions. Want to package a .war file? Configure it. Want to run your application from the command line? Configure it. Want to run acceptance tests or functional tests or performance tests with your build, too? You can configure it, but it involves not running your unit tests, or not running them during the conventional unit test phase of your build process, or… Want to generate code coverage metrics for your project? You can configure that, too, but your tests will run twice (or only once, but not during the conventional unit test phase), and sometimes it reports 0% code coverage despite the comprehensive test suite.

Speaking of configuration, Maven has the worst configuration syntax since Sendmail: alternating normal form XML. As a consequence, Maven configuration is verbose, difficult to read and difficult to write. Things you can do in one or two lines of Ruby or XML with Rake or Ant require six, seven, eight lines of pom.xml configuration (assuming it’s even possible with Maven).

There’s nothing consistent about Maven’s configuration, either. Some things are configured as classpath references to .properties files bundled in .jar files configured as dependencies, some things are configured as absolute or relative paths to files on disk, and some things are configured as system properties in the JVM running Maven. And some of those absolute paths are portable across projects because Maven knows how to correctly resolve them, but some are not. And sometimes Maven is smart enough to recursively build projects in the correct order, but sometimes it’s not.

And some things aren’t even configured in the pom! Some things, like Maven repositories, servers, and authentication credentials, are configured in settings.xml. It is perfectly reasonable to want to keep users’ passwords out of pom.xml files which will be checked into the project’s version control repository. But Maven’s solution is terrible: all this configuration goes in a settings.xml file that lives outside of any project’s directory. You can’t directly share any of this configuration between your desktop and laptop, or with other developers, or with your project’s build servers. But it is automatically shared with every single Maven project you work with, and potentially every single Maven project every user on that machine works with. When a new developer joins your project, they must manually merge the necessary configuration into their existing settings.xml. When a new agent is added to your build server farm, the necessary configuration is manually merged into its existing settings.xml. Ditto for when you migrate to a new machine. And when any of this configuration needs to be updated, it must be manually updated on every single machine! This was a solved problem before Maven came along, too: properties files. Project teams can put generic configuration like this in a properties file which is checked in to version control, and individual developers can override that information in local properties files which are not checked in to version control.
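To make the properties-file approach concrete, here is a minimal Java sketch (assuming Java 11 or later). Everything named in it is hypothetical and invented for illustration: the LayeredConfig class, the build.properties and build.local.properties file names, and the repo.url and repo.user keys. The idea is only that shared values come from a checked-in file, and any key present in the untracked local file wins.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

public class LayeredConfig {

    // Loads the shared (checked-in) file, then layers an optional local file on top.
    // Both paths are supplied by the caller; the names used in main() are assumptions.
    static Properties load(Path shared, Path local) throws IOException {
        Properties defaults = new Properties();
        try (Reader in = Files.newBufferedReader(shared)) {
            defaults.load(in);
        }
        // A Properties object built on `defaults` falls back to it for missing keys.
        Properties merged = new Properties(defaults);
        if (Files.exists(local)) {
            try (Reader in = Files.newBufferedReader(local)) {
                merged.load(in);
            }
        }
        return merged;
    }

    public static void main(String[] args) throws IOException {
        // Self-contained demo: write both files into a temporary directory.
        Path dir = Files.createTempDirectory("layered-config");
        Path shared = dir.resolve("build.properties");       // would be checked in
        Path local = dir.resolve("build.local.properties");  // would be ignored by version control
        Files.writeString(shared, "repo.url=http://repo.example.invalid/\nrepo.user=build\n");
        Files.writeString(local, "repo.user=alice\n");

        Properties config = load(shared, local);
        System.out.println("repo.url  = " + config.getProperty("repo.url"));
        System.out.println("repo.user = " + config.getProperty("repo.user"));
    }
}
```

In the demo, repo.url comes from the shared file and repo.user comes from the local override. If the local file is missing, the shared values are used unchanged, so a new developer or build agent needs nothing beyond a checkout.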

All this stuff in Maven — the conventions, the configuration, the process — is governed by “The Maven Way”. Unfortunately, “The Maven Way” is undocumented. You can catch fleeting glimpses of it by trawling the Maven documentation, searching the Google, or buying books written by Maven developers. The other way you encounter “The Maven Way” is by tripping over (or smashing against) its invisible boundaries. Maven was not built to be flexible, and it does not support every possible build process. Maven was built for Apache projects, and assumes every project’s build process mirrors Apache’s own. That’s great news for open-source library developers who volunteer on their own time and to whom “release” means “upload a new .zip file to your website for others to manually find, download, and add to their own projects.” It sucks for everyone else. While Rake and Ant can accommodate every build process, Maven can’t; it is possible, and in fact quite likely, that Maven just doesn’t support the way you want to build your software.

And Maven’s dependency management is completely, entirely, irrevocably broken. Actually, I take that back; Maven’s strategy of downloading ibiblio to the user’s home directory and then dumping everything on the classpath is incredibly stupid and wrong and should never be confused with “dependency management.” I recently worked on a Maven project which produced a 51 MB .war file; by switching to Ant with hand-rolled dependency management, we shrank that .war file down to 17 MB. Hrmmm… 51 – 17 = 34 = 17 × 2, or: 2/3 of the original bulk was useless crap Maven dumped on us.

Extraneous dependencies don’t just eat up disk space, they eat up precious RAM, too! Maven is an all-around memory hog. Relatively simple projects, with only a parent pom and a few sub-modules, require extensive JVM memory tuning with all those fancy JAVA_OPTS settings you typically only see on production servers. Things are even worse if your Maven build is integrated with your IDE. It’s common to set your JVM’s max heap size to several hundred megabytes, the max permgen size to a few hundred megabytes, and enable permgen sweeping so classes themselves are garbage collected. And all this just to build your project, or work with Maven in your IDE!

Funny story: on that same project I once endured a ten minute “mvn clean” build because Maven thought it needed yet more crap in order to “rm -rf ./target/” (see a similar example: http://gist.github.com/267553). Actually, there’s nothing funny about that story; trust me: you don’t want a build tool which automatically downloads unresolved dependencies before cleaning out your build output directories. You don’t want a build tool which automatically downloads unresolved dependencies, PERIOD! Automatically downloading unresolved dependencies makes your build process nondeterministic! Good ol’ nondeterminism: loads of fun in school, not so fun at work!
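For comparison, cleaning really is just a recursive delete. Here is a minimal Java sketch of what a hand-written clean step might look like, assuming Java 11 or later and a build whose output lives in a single directory. The Clean class and deleteRecursively method are names made up for this example; nothing in it touches the network.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class Clean {

    // Deletes `root` and everything beneath it; does nothing if it does not exist.
    static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        List<Path> ordered;
        try (Stream<Path> paths = Files.walk(root)) {
            // Reverse order puts children before their parent directories.
            ordered = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path path : ordered) {
            Files.delete(path);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java Clean <output-directory>");
            return;
        }
        deleteRecursively(Path.of(args[0]));
    }
}
```

Running it with target as the argument does the same job as “rm -rf ./target/”, and its behavior depends only on what is on the local disk.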

And all that unnecessary, unwanted network chatter takes time. You pay a performance penalty for Maven’s broken dependency management on every build. Ten minute clean builds are horrible, but adding an extra minute to every build is even worse! I estimate the average additional overhead of Maven is about one minute per build, based on the fact that the one time I switched from Maven to Ant the average build time dropped from two and a half minutes to one and a half. Similarly, the one time I switched from Ant to Maven the average build time increased from two minutes to three.

You have no control over, and limited visibility into, the dependencies specified by your dependencies. Builds will break because different copies of Maven will download different artifacts at different times; your local build will break again in the future when the dependencies of your dependencies accidentally release new, non-backwards compatible changes without remembering to bump their version number. Those are just the innocent failures, too; the far more likely scenario is your project depends on a specific version of some other project which in turn depends on the LATEST version of some other project, so you still get hosed even when downstream providers do remember to bump versions! Every release of every dependency’s dependencies becomes a new opportunity to waste several hours tracking down strange build failures.

But Maven is even worse than that: not only does Maven automatically resolve your project’s dependencies, it automatically resolves its own plugins’ dependencies, too! So now not only do you have to worry about separate instances of Maven accidentally downloading incompatible artifacts (or the same instance downloading incompatible artifacts at different times), you also have to worry about your build tool itself behaving differently across different machines at different times!

Maven’s broken dependency management is also a gaping security hole, since it is currently impossible in Maven to determine where artifacts originally came from and whether or not they were tampered with. Artifacts are automatically checksummed when they are uploaded to a repository, and Maven automatically verifies that checksum when it downloads the artifact, but Maven implicitly trusts the checksum on the repository it downloaded the artifact from. The current extent of Maven artifact security is that the Maven developers control who has write access to the authoritative repository at ibiblio. But there is no way of knowing if the repository you download all your dependencies from was poisoned, there is no way of knowing if your local repository cache was poisoned, and there is no way of knowing which repository artifacts in your local repository cache came from or who uploaded them there.
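To make the distinction concrete: a checksum only tells you something if the expected value comes from a source you trust independently of the repository serving the file. Here is a minimal Java sketch of that idea, comparing a downloaded file against a digest the project has recorded itself (for example, in its own version control). The PinnedChecksum class, its method names, and the notion of a “pinned” digest are assumptions for illustration, not a Maven feature, and SHA-256 is simply my choice of algorithm.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class PinnedChecksum {

    // Computes the SHA-256 digest of a file as lowercase hex.
    static String sha256Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // True only if the file matches a digest the caller obtained from a trusted,
    // separate source (hypothetically, a value checked in with the project).
    static boolean matchesPinned(Path file, String pinnedHex)
            throws IOException, NoSuchAlgorithmException {
        return sha256Hex(file).equalsIgnoreCase(pinnedHex.trim());
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: java PinnedChecksum <artifact> <expected-sha256-hex>");
            return;
        }
        boolean ok = matchesPinned(Path.of(args[0]), args[1]);
        System.out.println(ok ? "OK" : "MISMATCH");
    }
}
```

A check like this catches tampering in a remote repository or a local cache, provided the pinned value itself was recorded from a copy you had reason to trust. It still says nothing about who originally published the artifact.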

These problems are not caused by careless developers, and are not solved by using repository managers to lock down every artifact Maven needs. Maven is broken and wrong if it assumes humans never make mistakes. Maven is broken and wrong if it requires users to explicitly specify every version of every dependency, and every dependency’s dependencies, to reduce the likelihood of downloading incompatible artifacts. Maven is broken and wrong if it requires a third-party tool to prevent it connecting to the big, bad internets and automatically downloading random crap. Maven is broken and wrong if it thinks nothing of slowing down every build by connecting to the network and checking every dependency for any updates, and automatically downloading them. Maven is broken and wrong if it behaves differently on my laptop at the office and at home. Maven is broken and wrong if it requires an internet connection to delete a directory. Maven is broken and wrong.

Save yourself.