There’s a dark side to Ruby web applications.

One thing is clear – real time web applications in Ruby are hurting.

We can argue a lot about who it is that allowed themselves to succumb to the dark side. Some argue that Rails are derailing us all and others that socket.io, Faye and node.js are eating away at out sanity, design, performance and feet… but arguing won’t help.

As I pointed out in an issue I submitted in Rack’s GitHub repo, Rack’s design forces a performance penalty on realtime connections and I think I found a solution to work around the penalty…

If you read nicely to the end (or skip to it) I’ll show you the secret of The Priceless Websocket.

If you don’t know what Rack is, you might want to read Gaurav Chande’s blog post about it, or not.

Basically, Rack is the way Ruby applications (Rails / Sinatra etc’) connect to web servers such as Puma, Thin or (my very own) Iodine. HTTP requests are forwarded to Ruby applications using a Rack interface and the web applications use the same Rack interface to send back the responses.

How is Rack hurting Websockets connections?

Well, not intentionally… but the road to… well, you know.

A bit of background

When Rack started out, it was a wonderful blessing. It was back when smart phones where stupid, Ruby was young and the year was (I think) 2007 or 2006.

At the time, the cloud was a floating thing in the sky and no one imagined Facebook might one day use our phone’s camera and microphone to collect information about us while we talk to our neighbors (does Facebook spy on us? their privacy statement reads like Chinese to me).

Rack adopted a lot of it’s models and requirements from what was a thriving technology at the time (may it rest in peace), CGI.

If you don’t know what CGI is, I’m happy for you.

Basically it means that Rack decided that they would provide the Ruby application with information about the request, but it would not provide a response object, rather, the Ruby application’s return value is the response.

I think you can see the problem from here…

This was probably the biggest mistake Rack ever made, but it took a number of years before we started to realize that fact and voices like José Valim started to warn us about Why our web framework should not adopt Rack API.

The design meant that Rack forced the server to wait for Ruby to return a response and only then any data was sent back through the connection.

No streaming, no persistence – a request goes into Ruby and a response is whatever comes out.

Fast Forward: Back To The Now

The CGI based Rack API seems to be here to stay.

Whatever warnings or rebellions we had died in the hard light of practicality and the DRY Ruby culture.

As big a mistake as it might have been (or maybe it wasn’t a mistake, who cares at this point?), we have a decade of code, gems and community support that requires and implements Rack’s design.

But we can still have Websockets, can’t we?

When the Rack team realized the issue was big, very big, an interim (now permanent?) solution came up – connection hijacking .

This was a small piece of Rack API that said “we know Rack enforces limitations, so why don’t we get out of your way? Here, take the raw IO (the TCP/IP socket) and do whatever you want”.

Now we have Websockets (and SSE and HTTP streaming), but at a price – a high price.

Also, this API ( hijacking ) is starting to break apart. HTTP2 compliant servers can’t (or shouldn’t) really support hijacking without running HTTP2 connections to the ground. This means no streaming / SSE on HTTP2 connections, and HTTP2 push seems far away…

The Price We Pay

Here is how Websockets are implemented today and a good hint at the price we pay:

Applications hijack the TCP/IP socket from the web server. Read: We tell the professionals to f#@k 0ff and we take control of the low(er)-level IO.

the TCP/IP socket from the web server. Application run their own, separate, IO handling for the hijacked connections: Read: We run duplicate code and duplicate concern management. The server’s IO handling & our new beautiful idea of how networking should work both run at the same time… …I really hope you know what you’re doing with that IO thingy. Is it supposed to turn purple when you code it this way?

Applications roll out a Websocket protocol parser… often inside the Ruby GIL/GVL (if you don’t know this one, ignore it, just think “bad”). Read: We ditch concurrency in favor of a “lock and wait” design, choosing a slow tool when we have an existing fast tool that was giving us both concurrency and speed just a moment ago.

In practice solutions vary between manageable, terrible and tragic.

No matter how good a solution you find, there is always the price of code duplication and running two different IO polling concerns. This is an immediate cause for memory bloats, excessive context switches and other unwieldy slowdowns.

However, manageable solutions implement a low level IO layer written by network programming savvy developers. Such solutions often use the fresh nio4r gem or the famous (infamous?) EventMachine.

On the other side of the spectrum, you’ll find solutions that pay the ultimate price, using any or all of the following: a thread per Websocket design (god forbid), the select system call (why does everyone ignore the fact that it’s limited to 1024 connections?), blocking write calls until all the data was sent…

… My first implementation was this kind of monstrosity that paid the ultimate price in performance. I copied the design from well meaning tutorials on the internet…

You wouldn’t believe the amount of misguided tutorials on network programming.

It’s worst then I’m really telling you. If you think I’m just ranting, go read through the ActionCable code and see for your self.

The Priceless Websocket

Rack’s API, which forces us to pay such a big price for real time applications, can be easily adjusted to support “priceless” (read: native) websocket connections.

The idea is quite simple – since the Rack response can’t be adjusted without breaking existing code and throwing away a decade’s worth of middleware… why not use Rack’s request object (the famous env ) to implement a solution?

In other words, what if everything we had to do to upgrade from HTTP to Websockets was something like this env[upgrade.websockets] = MyCallbacks.new(env) …?

What can we gain? Well, just for a start:

We can unify the IO polling for all IO objects (HTTP, Websocket, no need for hijack ). This is bigger then you think.

). This is bigger then you think. We can parse websocket data before entering the GIL, winning some of our concurrency back. This also means we can better utilize multi-core CPUs.

We don’t need to know anything about network programming – let the people programming servers do what they do best and let the people who write web applications focus on their application logic.

I’ve got a POC in me pocket

I have a Proof Of Concept in my POCket for you, if you’re running Ruby MRI 2.2+ with OS X (or maybe Linux).

Yes, I know, I tested this proof of concept on one machine (mine) and it requires a Unix OS and Ruby C extension support (so it’s limited to Ruby MRI, I guess), but it works, it’s sweet and it provides a Rack bridge between a C Websocket server and a Ruby application.

The QAD (quick and dirty) rundown

The server pushes the HTTP request to the Ruby application. The Ruby application can’t use the Rack response to ask for a Websocket upgrade (remember, we have middleware, years of code and protective layers that prevent us from doing anything that isn’t a valid HTTP response)… So…

…The Ruby application updates the request (the Rack env Hash) with a Websocket Callback Object – it’s so simple you will cry when you see it.

We’re done. The server simply uses the wonderful qualities of Ruby Meta-Programming to add Websocket functionality to the Websocket Callback Object and we’re off to the real-time arena.

We Want Code, We Want Code…

Here’s my Proof Of Concept, written out as a simple Rack application. If you’re like me – too lazy to copy and paste the code – you can download it from here.

The proof of concept is the ugliest chatroom application I could imagine to write. It uses nothing except Rack and the Iodine web server.

Iodine is a Rack server written in C and it implements the suggested solution using upgrade.websocket and a feature check using upgrade.websocket? (the feature check is only available for upgrade requests).

I’m hopeful that after reading and running the code, you will help me push upgrade.websocket into the Rack standard.

The gemfile

Every Ruby application seems to start with a gemfile these days. They are usually full with goodies…

…but I’ll just put in the server we’re using – it isn’t publicly released, so we need a github link to it. I think that’s all we need really, since it references Rack as a dependency.

Save this as gemfile :

<br /># The Iodine Server gem 'iodine', '~> 0.2.0'

The Rack Application

Now you’ll see the simple magic of a native websocket implementation – no Ruby parser, no Ruby IO management, it’s all provided by the server, no hijacking necessary.

The code is fairly simple, so I’ll add comments as we go.

Rack applications, as a convention, reside in files named config.ru . Save this to config.ru :

<br /># The Rack Application container module MyRackApplication # Rack applications use the `call` callback to handle HTTP requests. def self.call(env) # if upgrading... if env['HTTP_UPGRADE'.freeze] =~ /websocket/i # We can assign a class or an instance that implements callbacks. # We will assign an object, passing it the request information (`env`) env['upgrade.websocket'.freeze] = MyWebsocket.new(env) # Rack responses must be a 3 item array # [status, {http: :headers}, ["response body"]] return [0, {}, []] end # a semi-regualr HTTP response out = File.open File.expand_path('../index.html', __FILE__) [200, { 'X-Sendfile' => File.expand_path('../index.html', __FILE__), 'Content-Length' => out.size }, out] end end # The Websocket Callback Object class MyWebsocket # this is optional, but I wanted the object to have the nickname provided in # the HTTP request def initialize(env) # we need to change the ASCI Rack encoding to UTF-8, # otherwise everything with the nickname will be a binary "blob" in the # Javascript layer @nickname = env['PATH_INFO'][1..-1].force_encoding 'UTF-8' end # A classic websocket callback, called when the connection is opened and # linked to this object def on_open puts 'We have a websocket connection' end # A classic websocket callback, called when the connection is closed # (after disconnection). def on_close puts "Bye Bye... #{count} connections left..." end # A server-side niceness, called when the server if shutting down, # to gracefully disconnect (before disconnection). def on_shutdown write 'The server is shutting down, goodbye.' end def on_message(data) puts "got message: #{data} encoded as #{data.encoding}" # data is a temporary string, it's buffer cleared as soon as we return. # So we make a copy with the desired format. tmp = "#{@nickname}: #{data}" # The `write` method was added by the server and writes to the current # connection write tmp puts "broadcasting #{tmp.bytesize} bytes with encoding #{tmp.encoding}" # `each` was added by the server and excludes this connection # (each except self). each { |h| h.write tmp } end end # `run` is a Rack API command, telling Rack where the `call(env)` callback is located. run MyRackApplication

What does the application do?

The application checks if the request is a websocket upgrade request.

If the request is a websocket upgrade request, the application sets the Websocket Callback Object in the env hash (technically the request data Hash) and sends back an empty response (we could have set cookies or headers if we wanted.

If the request is not a websocket upgrade request, it sends the browser side client file index.html (we’ll get to it in a bit).

That’s it.

The Websocket Callback Object is quite easy to decipher, it basically answers the on_open , on_message(data) , on_shutdown and on_close callbacks. The on_shutdown is the least common, it’s a server side callback for graceful disconnections and it’s called before a disconnection.

I like the snake-case Ruby convention and I thought it would server the names of the callbacks well (instead of the JavaScript way of onclose and onmessage which always annoys me).

There were a number of design decisions I won’t go into here, but most oddities – such as each == “each except self” – have good reasons to them, such as performance and common use patterns.

The Html Client

Every web application needs a browser side client. I know you can write one yourself, but here, you can copy my ugly a$$ version – save it as index.html :

<!DOCTYPE html> <html> <head> <a href="https://ajax.googleapis.com/ajax/libs/jquery/2.1.4/jquery.min.js">https://ajax.googleapis.com/ajax/libs/jquery/2.1.4/jquery.min.js</a> ws = NaN handle = '' function onsubmit(e) { e.preventDefault(); if($('#text')[0].value == '') {return false} if(ws && ws.readyState == 1) { ws.send($('#text')[0].value); $('#text')[0].value = ''; } else { handle = $('#text')[0].value var url = (window.location.protocol.match(/https/) ? 'wss' : 'ws') + '://' + window.document.location.host + '/' + $('#text')[0].value ws = new WebSocket(url) ws.onopen = function(e) { output("<b>Connected :-)</b>"); $('#text')[0].value = ''; $('#text')[0].placeholder = 'your message'; } ws.onclose = function(e) { output("<b>Disonnected :-/</b>") $('#text')[0].value = ''; $('#text')[0].placeholder = 'nickname'; $('#text')[0].value = handle } ws.onmessage = function(e) { output(e.data); } } return false; } function output(data) { $('#output').append("<li>" + data + "</li>") $('#output').animate({ scrollTop: $('#output')[0].scrollHeight }, "slow"); } <style> html, body {width:100%; height: 100%; background-color: #ddd; color: #111;} h3, form {text-align: center;} input {background-color: #fff; color: #111; padding: 0.3em;} </style> </head><body> <h3>The Ugly Chatroom POC</h3> <form id='form'> <input type='text' id='text' name='text' placeholder='nickname'></input> <input type='submit' value='send'></input> </form> $('#form')[0].onsubmit = onsubmit <ul id='output'></ul> </body> </html>

(yes, I’m an expert at CSS and I couldn’t care less about the design for this one)

Running the server

To run the application we just wrote, we need to run two commands from the terminal (in the folder where we put the application files).

Install Iodine and any required gems using*:

bundler install

Run the application (single threaded mode) using:

bundler exec iodine -www . -p 3000

As well as our dynamic web application, this will start a static HTTP file server in the current folder (the -www option pointing at . ), so make sure the folder only has the application there – or cat pictures, the internet loves cat pictures.

Now visit localhost:3000 and see what we’ve got.

A nice experiment is to run the server using multi threads ( -t # ) and multi processes ( -w # ). You’ll notice that memory barriers for forked processes prevent websocket broadcasting from reaching websockets connected through a different process. Maybe try using a number of open browser windows to grab a few different processes.

bundler exec iodine -www ./ -t 16 -w 4

You can benchmark against Puma or whatever so make sure the HTTP service isn’t affected by this added server side requirement. It’s true that the server added a review to the request as well as the original review for the response, but it doesn’t seem to induce a performance hit.

* If you got funny errors while trying to compile Iodine, it might not be supported on your system. Make sure you’re running a Linux / OS X / BSD operating system with the latest clang or gcc compilers. Neither Solaris nor Windows are supported at the moment (they have very different IO routines).

A few pointers (not C pointers)

You may have noticed that the on_message(data) provides a very temporary data string. The C layer will immediately recycle the memory for the next incoming message (not waiting for the garbage collector).

This is perfect for the common JSON messages that are often parsed and discarded and it enhances Websocket performance, but t’s less comfortable for this chatroom demo where we need to create a new string before broadcasting.

Try benchmarking Iodine against other servers that don’t provide native websockets and meta-programming facilities (i.e. against Puma or Thin), I think you’ll be surprised, especially when using more then a single worker process and a few extra threads.

To run the application with Puma: bundler exec puma -w 4 -p 3000

To run the application with Iodine: bundler exec iodine -t 16 -w 4 -p 3000

Benchmark using ab: ab -n 100000 -c 20 -k http://127.0.0.1:3000/

Benchmark using wrk: wrk -c20 -d4 -t4 http://localhost:3000/

How many concurrent connections can they handle? How much memory do they use?

You may say I’m a dreamer…

…but I really hope I’m not the only one.

Ruby is a beautiful, beautiful language. I’m sad to see so many people complain about how hard and non-performant it is to write real-time web applications in Ruby.

I’ve opened an issue ar Rack’s GitHub repo trying to explain the upside for this type of addition to the Rack Specification, but it’s been a few weeks and no one from the team seems to have looked at it.

If you like the idea, please visit the issue and add your voice.

Update

I’m thankful for all the positive feedback and attention this blog post received. It sadly brought to sharp relief the slowness with things are (not) moving with regards to the Rack issue… Perhaps the right thing would be to open issues with the other servers (Puma, Thin, Unicorn etc’) and ask them to implement the upgrade.websocket workflow.

I edited the post to change the proposed rack.websocket name I used before to upgrade.websocket , since this will allows future protocols to be placed under the same “namespace”… upgrade.tcp anyone?

Update2

When I first published this article, Iodine was still in development.

I tested Iodine quite a lot and I think I worked out most of the issues that might have affected it and I think it could be used for production.

Iodine had been released as a gem.

Also, I fixed some changes that WordPress introduced independently (their new editor is terrible).