Or, even better, why your web framework should not adopt a CGI-based API.

For the past few years I have been studying and observing the development of different emerging languages closely with a special focus on web frameworks/servers. Unfortunately, most of the new web frameworks are following the Rack/WSGI specification which may be a mistake depending on the platform you are targeting (particularly true for Erlang and Node.js which have very strong streaming foundations and is by default part of their stack).

This blog post is an attempt to detail the limitations in the Rack/CGI-based APIs that the Rails Core Team has found while working with the streaming feature that shipped with Rails 3.1 and why we need better abstractions in the long term.

Case in study

The use case we have in mind here is streaming. In Rails, we have focused on streaming as a way to quickly return the head of the HTML page to the browser, so the browser can start downloading assets (like javascript and stylesheet) while the server is generating the rest of the page. There is a great entry on Rails weblog about streaming in general and a Railscast if you want to focus on how to use it in your Rails applications. However, streaming is not only limited to HTML responses and can also be useful in API endpoints, for example, to stream search results as they pop-up or to synchronize with mobile devices.

The Rack specification in a nutshell

In Rack, the response is made by an array with three elements: [status, headers, body]

The body can be any object that responds to the method each . This means streaming can be done by passing an object that will, for example, lazily read a file and stream chunks when each is called.

A Rack application is any object that implements the method call and receives an environment hash/dictionary with the request information. When I said above that most new web frameworks are following the Rack specification, is because they are introducing an API similar to this one just described.

The issue

In order to understand the issue, we will consider three entities: the client, the server and the application. The client (for example a browser) sends a request to the server which forwards it to an application. In this case, the server and the application are communicating via the Rack API.

The issue in streaming cases is that, sending a response back from the application does not mean the application finished processing. For example, consider a middleware (a middleware is an object that sits in between the server and our application) that opens up a connection to the database for the duration of the request and cleans it afterwards:

def call(env) connection = DB.checkout_connection env["db.connection"] = connection @app.call(env) ensure DB.checkin_connection connection end

Without streaming, it would work as follow:

The server receives a request and passes it down the stack The request reaches the middleware The middleware checks out the connection The application is invoked, renders a view accessing the database using the connection and returns the rendered view as a string The middleware checks the connection back in The response is sent back to the client

With streaming, this would happen:

The server receives a request and passes it down the stack The request reaches the middleware The middleware checks out the connection The app is called but does not render anything. Instead it returns a lazy object as response body that will stream the HTML page in chunks as the `each` method is called The middleware checks the connection back in Back in the server, we have received the lazy body and will start to stream it While streaming the body, since the body is lazily calculated, now is the time it must access the database. But, since the middleware already checked the connection back in, our code will fail with a “not connected” exception

The first reaction to solve this issue is to ensure that all streaming happens inside the application, i.e. the application would have a mechanism to stream the response back and only when it is done it would return the Rack response back. However, if the application does this, any middleware that desires to modify the header or the response body won’t be able to do so because the response was already streamed from inside the application.

Our work-around for Rails was to create proxies that wrap the response body:

def call(env) connection = DB.checkout_connection env["db.connection"] = connection response = @app.call(env) ResponseProxy.new(response).on_close do DB.checkin_connection connection end end

However, this is inefficient and extremely limited (not all middleware can be converted to such approach). In order for streaming to be successful, the underlying server API needs to acknowledge that the headers and the response body can be sent at different times. Not only that, it needs to provide proper callbacks around the response lifecycle (before sending headers, when the response is closed, on each stream, etc).

The trade-off here is that this can no longer be achieved with an easy API as Rack’s. In general, we would like to have a response objects that provides several life-cycle hooks. For example, the middleware above could be rewritten as:

def call(request, response) connection = DB.checkout_connection request.env["db.connection"] = connection response.on_close { DB.checkin_connection(connection) } @app.call(request, response) end

The Java Servlet specification is a good example of how request and response objects could be designed to provide such hooks.

Other middleware

In the example above I focused on the database connection middleware but this limitation exists, in one way or the other, in the majority of middleware in a stack. For example, a middleware that rescues any exception inside the application to render a 500 page also needs to be adapted. Other middleware simply won’t work. For instance, Rails ships with a middleware that provides an ETag header based on the body which has to be disabled when streaming.

Looking back

Does this mean moving to Rack was a mistake? Not at all. Rack appeared when the web development Ruby community was fragmented and the simplicity of the Rack API made it possible to unify the different web frameworks and web servers available. Looking back, I would take the standardization provided by Rack any day regardless of the limitations it brings. Now that we have a standard, we are already working on addressing such issues, which leads us to…

Looking forward

Streaming will become more and more important. While working with HTML streaming requires special attention both technically and also in terms of usability, as outlined in Rails’ documentation, API endpoints could benefit from it with basically no extra cost. Not only that, features in HTML5 like server-sent events could easily be built on top of streaming without requiring a specific end-point in your application to handle them.

While CGI was originally streaming friendly, the abstractions we built on top of it (like middleware) are not. I believe web frameworks should be moving towards better connection/socket abstractions and away from the old CGI-based APIs, which served us well but it is finally time for us to let it go.

PS: Thanks to Aaron Patterson (who has also written about this issue in his blog), Yehuda Katz, Konstantin Haase and James Tucker for early review and feedback.

F.A.Q.

This section was added after the blog post was released based on some common questions.

Q: Isn’t it a bad idea to mix both streaming and non-streaming behavior in the same stack?

That depends on the stack. This is definitely not an issue with Erlang and Node.js since both stacks are streaming based. In Ruby, I believe a threaded jRuby or Thin will allow you to get away with keeping a socket open waiting for responses, but it will probably turn out to be a bad idea with other servers since the process holding the socket won’t be able to respond to any other request.

Q: Is there a need to do everything streaming based when a request/response would be fine?

No, there is no need. The point of the blog post is not to advocate for streaming only frameworks, but simply state that a Rack API may severely limit your streaming capability in case your platform supports it. Personally, I would like to be able to choose and mix both, if my stack allows me to do so.