65K messages/sec

The Netty redesign of riemann-java-client made it possible to expose an end-to-end asynchronous API for writes, which dramatically improves throughput for messages with a small number of events. By introducing a small queue of pipelined write promises, riemann-clojure-client can now push 65K events per second, as individual messages, over a single TCP socket. That works out to about 120 Mbps of sustained traffic.
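To make the pipelining idea concrete, here is a minimal sketch in plain Java of a bounded queue of in-flight write promises. It is not the actual riemann-clojure-client code: the `PipelinedWriter` class, the `sendAsync` function, and the `maxInFlight` limit are hypothetical names chosen for illustration, and `CompletableFuture` stands in for whatever promise type the real client uses. The writer keeps sending without waiting for each acknowledgement, and blocks on the oldest outstanding promise only once the queue is full.

```java
import java.util.ArrayDeque;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical sketch: keeps up to maxInFlight writes outstanding on one connection.
// Not thread-safe; intended to be driven by a single writer thread.
final class PipelinedWriter<M, R> {
    // Assumed async transport: sends a message, returns a promise of the server's ack.
    private final Function<M, CompletableFuture<R>> sendAsync;
    private final int maxInFlight;
    private final ArrayDeque<CompletableFuture<R>> inFlight = new ArrayDeque<>();
    private long acked = 0;

    PipelinedWriter(Function<M, CompletableFuture<R>> sendAsync, int maxInFlight) {
        if (maxInFlight < 1) {
            throw new IllegalArgumentException("maxInFlight must be at least 1");
        }
        this.sendAsync = sendAsync;
        this.maxInFlight = maxInFlight;
    }

    // Send a message. If the pipeline is full, wait for the oldest write first.
    void write(M message) {
        if (inFlight.size() >= maxInFlight) {
            awaitOldest();
        }
        inFlight.addLast(sendAsync.apply(message));
    }

    // Wait for every outstanding write, then return the total number acknowledged.
    long flush() {
        while (!inFlight.isEmpty()) {
            awaitOldest();
        }
        return acked;
    }

    private void awaitOldest() {
        // join() rethrows a failed write as a CompletionException.
        inFlight.removeFirst().join();
        acked++;
    }
}
```

With `maxInFlight` set to 1 this degenerates to a synchronous write-then-wait loop. Larger values let the sender overlap network round trips, which is where the gain for small messages comes from.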

I'm really happy about the bulk throughput too: three threads sharing a single socket, sending messages of 100 events each, can push around 185-200K events/sec, at over 200 Mbps. Earlier tests needed 10 sockets and hundreds of threads to reach that throughput.

This isn't a particularly useful feature as far as clients go; it's unlikely most users will want to push this much from a single client. It is critical, however, for optimizing Riemann's server performance. The server, running the bulk test, consumes about 115% CPU on my 2.5GHz Q8300. I believe this puts a million events/sec within reach for production hardware, though at that throughput CAS contention in the streams may become a limiting factor. If I can find a box (and network) powerful enough to test, I'd love to give it a shot!

This is the last major improvement for Riemann 0.2.0. I'll be focusing on packaging and documentation tomorrow. :)