Somebody noticed that Lucene doesn’t use Lucene to index its own site; it presents a Google site search like everybody else. And not just that: its sub-project Nutch, which is built specifically for web search, doesn’t use itself either. Natch.

We’ve heard this story before, and the defendants have a common response: we would love to use our wonderful web software for our own web site, but—damn thee, cruel fates!—it’s not the right tool for this highly particular job.

How is it that some fancy-pants framework is always the right tool for an abstract job and PHP is the right tool for a real job?

It seems like someone’s selling a load of crap. But in this case, no: Lucene is not a load of crap. It’s a very down-to-earth project that does something people need and does it well enough to be respected, even outside the Java community. Lucene is just fine, thank you.

So if the project produces a free product that is worth using for individual sites, why don’t they use it for their own site again? Whatever things a personal site search engine is good for (triggered updates, customized results display, not feeding daddy Googlebucks) apply just as well to Lucene’s home site. Aren’t they at least a little enthusiastic about showing their own software to the world? Probably they are, but the culture of large-scale Java projects, and Apache specifically, is against it. Random comment:

Apache.org runs the Apache httpd server and serves static pages. Allowing every sub-project to run its own software on that server would be pretty insane. Look at how many there are.

If it’s “insane” to let programmers of Apache projects run their own software on Apache servers, that doesn’t say much for their revered vetting process for projects and programmers. Yes, system administration isn’t the same as programming—got it—but any project for the web ought to have someone competent in that as well. It matters.

Java deployment is treated as some kind of Apollo mission, to be undertaken only with careful planning and a healthy budget. Open source projects generally have neither of those, so Java projects are left writing code for a void. Good projects get hooked up with a patron company that pays for practical, internal applications to develop alongside the public framework. Bad projects build elaborately useless public frameworks and scoff at wasting their precious brains on mere applications programming. That everyone is scared of Java deployment plays right into their lazy, conceited plans.

And it’s not the case that every project with a complicated deployment reverts to static HTML or PHP for its own webfront. The Hibernate blog runs on Seam. This weblog is rendered using various technologies it hyperventilates over. And freaking Smalltalk programmers (what could possibly be harder to deploy?) host their own site, their “CMS” if you must, for the Seaside web framework. The site isn’t always fast, it’s probably not always running, but at least the framework has the courage to show itself.

This isn’t 1999. You don’t need a server dedicated to one Java application; a virtualized server is perfectly adequate. Take one meaty server and create virtuals for every project to use, or not use, as suits its disposition. Certainly the exposed site should be an Apache web server, but there’s this little thing called mod_proxy_ajp. Perhaps the Apache project has heard about it? (If not, it’s discussed in the updated Databinder deployment tutorial.) You can isolate software.
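
For the skeptical, here is a rough sketch of what that front end could look like. It assumes mod_proxy and mod_proxy_ajp are loaded, and that a project's servlet container listens for AJP on port 8009 (the conventional default) inside its own virtual machine. The host name, the internal address, and the /search path are all made up for illustration; nothing here describes how apache.org is actually configured.

```apache
# Hypothetical front-end virtual host; all names and addresses are placeholders.
<VirtualHost *:80>
    ServerName project.example.org

    # Static pages keep coming straight from httpd, as they do today.
    DocumentRoot "/var/www/project"

    # Only the search application is handed to the project's own
    # virtual machine over AJP. If that machine falls over, the
    # static site stays up.
    ProxyPass "/search" "ajp://10.0.0.12:8009/search"
</VirtualHost>
```

The point of proxying a single path is that the damage a misbehaving application can do is limited to that path and that virtual machine.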

Anyway, eating other people’s dog food is disgusting.