Java’s loose threads

Waffle is horrified that thread-blocking network operations are de rigueur in Java and Android:

I’d like to hear from any Android developers whether there’s any architectural reason to actually do a new thread. Conventional wisdom and past experience, including several operating systems with millions of users, says you don’t have to do it. Are there any constraints I’m missing, or is it that the APIs (or Java itself) are so inept that most people can’t bother getting asynchrony right anyway?

He’s right of course, but I feel compelled to offer a vain defense of Java.

Back when they were designing the platform, language, and libraries, using a virtual machine for what aspired to be a mainstream software platform was bold. Sun’s engineers were faced with more tough problems than they could solve in the first few releases.

In terms of I/O their biggest concern was (apparently) that naive coders would read huge amounts of data into memory. So they designed input and output stream classes and used them throughout the standard libraries. Java does more than any other language I’ve used to try to force you to use streams. It’s caused plenty of griping about how much code it takes to read in a file (which you know to be small) compared to platforms that allow you to easily read inputs into strings.
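To make the griping concrete, here is roughly what reading a small input into a string looks like when streams are the only tool on offer. This is a sketch in present-day Java rather than period code, and the class and method names are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class StreamRead {
    // Hypothetical helper: drains a stream into a String, one buffer at a time.
    // Every call to read() may block the calling thread until bytes arrive.
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {   // -1 signals end of stream
            out.write(buf, 0, n);
        }
        // Assumes the input is UTF-8 text that comfortably fits in memory.
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

The loop is the price of the design: memory use is bounded by the buffer until you choose to accumulate, but the thread sits inside read() for as long as the other end takes.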

When they were designing these libraries, they had to write them for a VM that did not expose the lower level interfaces required for non-blocking I/O. It wouldn’t do that for several major revisions, as it turned out. So blocking I/O was baked into streams which were baked into everything.

For the sake of argument, imagine that the original platform designers’ main concern was not memory consumption by a small number of I/O processes but, instead, thread overhead by a great number of I/O processes. Perhaps they could have baked a very limited non-blocking interface for network I/O into the 1.0 VM. Their library interfaces might have accepted objects with ‘connectionDidFinishLoading’ callbacks. (These would have been tedious to create before inner and anonymous classes.) They might have not spent so much time writing stream classes and integrating them throughout the standard library.

In that alternate history, Java 1.0 would have suffered a more egregious scaling problem: the inability to work with data of significant size.

This is of course a false dilemma. You can have non-blocking network interfaces that don’t allocate the internet to RAM, and there are many. Java’s NIO interface is one. What I am suggesting is that the original constraints and priorities of the platform designers led them to build things a certain way, which was good enough at the time but difficult to evolve. It would be easier for us now if they had supported non-blocking I/O from the beginning, at whatever cost.
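For a feel of what the NIO style looks like, here is a minimal sketch of a readiness-driven read that reports its result through a callback. Several things in it are assumptions made for illustration: the LoadListener interface and the load method are hypothetical (the callback is merely named after the delegate method mentioned earlier), and a java.nio Pipe stands in for a network socket so the example runs without a server. A real client would register a SocketChannel with the selector in the same way.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

class NonBlockingSketch {
    // Hypothetical callback type, not part of any real library.
    interface LoadListener {
        void connectionDidFinishLoading(String body);
    }

    // Reads everything the "remote" end sends, in small chunks, then calls back.
    static void load(String payload, LoadListener listener) throws IOException {
        Pipe pipe = Pipe.open();
        try (Selector selector = Selector.open();
             Pipe.SourceChannel source = pipe.source()) {
            source.configureBlocking(false);
            source.register(selector, SelectionKey.OP_READ);

            // Fake remote peer: write a small payload, then close to signal end of input.
            try (Pipe.SinkChannel sink = pipe.sink()) {
                ByteBuffer out = ByteBuffer.wrap(payload.getBytes(StandardCharsets.UTF_8));
                while (out.hasRemaining()) {
                    sink.write(out);
                }
            }

            ByteArrayOutputStream body = new ByteArrayOutputStream();
            ByteBuffer buf = ByteBuffer.allocate(16);   // deliberately tiny
            boolean eof = false;
            while (!eof) {
                selector.select();                 // wait until some channel is ready
                selector.selectedKeys().clear();   // only one channel is registered here
                int n;
                while ((n = source.read(buf)) > 0) {   // returns 0 when nothing is ready
                    buf.flip();
                    body.write(buf.array(), 0, buf.limit());
                    buf.clear();
                }
                eof = (n == -1);
            }
            listener.connectionDidFinishLoading(
                new String(body.toByteArray(), StandardCharsets.UTF_8));
        }
    }
}
```

The one thread that waits in select() could be watching thousands of channels instead of one, which is the whole appeal. This sketch still collects the body in memory before calling back, but only because it chooses to; the read loop could just as well hand each chunk to the listener as it arrives.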

But sadly, Java’s designers failed to predict that high-frequency, high-latency, small-payload web APIs would be so wildly popular almost 20 years later. Maybe AppKit’s designers at NeXT did anticipate that (to the joy of iPhone users today), or maybe they just had different constraints and priorities. I don’t think anyone was being particularly daft.

The Java user community (which, lawyers aside, includes Android) failed to pick up the NIO ball and make it happen when the opportunity finally presented itself. I’m culpable here myself. Dispatch uses the default, blocking Apache Http Components client. I wanted Dispatch’s foundation to be reliable and capable, so I chose the most conventional and highly regarded Java HTTP client around. And in the grand tradition of Java, DefaultHttpClient goes to great lengths to promote streams but is ambivalent about blocking threads.

If I’d known then what I know now, I would have used Http Components’ less conventional NIO extensions module. It would have been more work at the time but less work in the long run. But hilariously, doing so would have resulted in Dispatch being incompatible with Android, the platform that prompted this story. There is a cost to bucking convention, even when convention is wrong.

The good news is that Dispatch’s interface has always been callback-by-default, and a few versions back it was supplemented with an async interface (one that, cover your ears, uses threads), so it’s already able to return futures. When non-blocking is integrated, client code should be able to stay as it is, or opt in to non-blocking and deal with the fact that Http#apply will return immediately.
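Here is a rough Java sketch of why a future-returning interface makes that swap painless. It is not Dispatch’s API (Dispatch is a Scala library); blockingGet, async, and demo are hypothetical names, and blockingGet only pretends to do HTTP.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

class FutureSketch {
    // Hypothetical stand-in for a blocking HTTP call; no network is involved.
    static String blockingGet(String url) {
        return "response for " + url;
    }

    // Thread-backed async: a pool thread blocks so that the caller does not.
    static <T> CompletableFuture<T> async(ExecutorService pool, String url,
                                          Function<String, T> handler) {
        return CompletableFuture.supplyAsync(() -> handler.apply(blockingGet(url)), pool);
    }

    // Example caller: it only ever sees the future, never the threads behind it.
    static String demo() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            CompletableFuture<String> future =
                async(pool, "/status", body -> body.toUpperCase());
            return future.join();   // a real client would chain callbacks instead
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the caller depends only on the returned future, an implementation that completes the same future from a selector loop, with no thread parked per request, could replace async without the calling code changing.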

I guess I know what I’m doing this weekend.