I recently read a great blog post by the SoundCloud team about how their software architecture evolved.

http://backstage.soundcloud.com/2012/08/evolution-of-soundclouds-architecture/

In the section "Load distribution and a little queue theory", Sean Treadway talks about queueing theory and how to make better use of a queue.

He wrote:

We wanted a system that never queued, but if it did queue, the wait time in the queue was minimal. Taking the M/M/c model to the extreme, we asked ourselves “how can we make c as large as possible?” To do this, we needed to make sure that a single Rails application server never received more than one request at a time. We added HAProxy into our infrastructure, configuring each backend with a maximum connection count of 1 and added our backend processes across all hosts, to get that wonderful M/M/c reduction in resident wait time by queuing the HTTP request until any backend process on any host becomes available.

Apparently, they are using HAProxy in front of Rails application servers (maybe Mongrel or Thin). So, as I understand it, HAProxy enqueues incoming requests and only dispatches one to a Mongrel/Thin process when that process is free.
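To convince myself of the M/M/c argument, here is a minimal sketch (my own, not SoundCloud's code) that uses the standard Erlang C formula to compare the mean queueing delay of one shared queue feeding c workers against c workers that each have their own queue. The function names and the example rates are hypothetical, and it assumes Poisson arrivals and exponential service times, which real HTTP traffic only approximates.

```python
from math import factorial


def erlang_c(c, arrival_rate, service_rate):
    """Probability that an arriving request has to wait in an M/M/c queue."""
    a = arrival_rate / service_rate  # offered load in Erlangs
    rho = a / c                      # utilization per worker
    if rho >= 1:
        raise ValueError("unstable queue: utilization must be below 1")
    tail = (a ** c / factorial(c)) / (1 - rho)
    head = sum(a ** k / factorial(k) for k in range(c))
    return tail / (head + tail)


def mean_wait_shared(c, arrival_rate, service_rate):
    """Mean time in queue when c workers pull from a single shared queue."""
    p_wait = erlang_c(c, arrival_rate, service_rate)
    return p_wait / (c * service_rate - arrival_rate)


def mean_wait_split(c, arrival_rate, service_rate):
    """Mean time in queue when traffic is split evenly over c private queues.

    Each worker then behaves like an independent M/M/1 queue that
    receives 1/c of the arrivals.
    """
    return mean_wait_shared(1, arrival_rate / c, service_rate)


if __name__ == "__main__":
    # Hypothetical numbers: 4 workers, 3.2 requests/s, 1 request/s per worker.
    c, lam, mu = 4, 3.2, 1.0
    print("shared queue:", mean_wait_shared(c, lam, mu))
    print("split queues:", mean_wait_split(c, lam, mu))
```

At the same utilization, the shared queue should come out with a noticeably lower mean wait than the split queues, which is the effect the quote describes.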

I could be completely wrong ;), but doesn't Apache + Passenger do the same thing? There is one queue (Apache handling the incoming requests) and c workers (the child processes).