Discussion on Reddit

I will continue to expand the features in the pagelet_rails gem. Since it offers a new way of composing pages in Rails, I've found there are many new things we can do. In this post, I will focus on parallel rendering. Yes, that's right, parallel rendering in Rails. Although it may sound advanced, the concepts behind it are extremely simple. This post, however, is only relevant to web page rendering.

Update 2: This post was featured in Ruby Weekly #318

Background

When we want to run things in parallel with Ruby, we have only two options: threads and processes.

Threads don't give us much in MRI because of the Global Interpreter Lock (GIL), although they do have an advantage in applications with a lot of IO calls. Web servers like Puma and Passenger take advantage of that to serve multiple requests concurrently.
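That IO advantage is easy to see in a few lines. This is a minimal sketch, not a benchmark; the 100ms sleep stands in for any blocking IO call:

```ruby
require "benchmark"

# Three threads each simulate a 100ms IO call. MRI's GIL only serialises
# Ruby bytecode; sleep (like a blocking socket read) releases the lock,
# so the waits overlap and the total wall time stays close to 100ms
# rather than 300ms.
elapsed = Benchmark.realtime do
  threads = 3.times.map { Thread.new { sleep 0.1 } }
  threads.each(&:join)
end

puts format("elapsed: %.2fs", elapsed)
```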

Processes don't have the GIL problem, but they do suffer from high memory usage. It's common to run multiple instances of the application and distribute incoming requests between them. The advantage is that processes can easily scale horizontally across multiple machines.
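A minimal sketch of that model, using Process.fork and a pipe per child to collect results (the work here, a simple sum, is just a stand-in for real CPU-bound rendering; fork is not available on all Ruby implementations):

```ruby
# Each forked child runs CPU-bound work in its own process, free of the
# GIL, and writes its result back to the parent over a pipe.
pipes = 3.times.map do |i|
  reader, writer = IO.pipe
  fork do
    reader.close
    writer.puts((1..100_000).sum + i) # CPU-bound work, unique per child
    writer.close
  end
  writer.close # the parent only reads
  reader
end

results = pipes.map { |reader| value = reader.read.to_i; reader.close; value }
Process.waitall
puts results.inspect # => [5000050000, 5000050001, 5000050002]
```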

In this post, I will focus on processes and leave the threads discussion for a future post.

What is a pagelet?

Since I use that word a lot, it's important to explain what I mean. By "pagelet" I mean a part of the web page together with any data required to render it, including its view template. If you've used the cells gem, you can think of "pagelet" as a synonym for "cell". Please refer to the gem documentation for more details.

Ajax

The first and easiest way to get parallel rendering is to use Ajax. Instead of fetching data and rendering all of it in one request, we can render an empty page without any data. Then we can render each individual portion of data within its own Ajax request and place the result into the document.

```erb
<body>
  <%= pagelet :pagelets_account_info, remote: :ajax %>
</body>
```

Whilst this is the easiest way, it's not the most efficient, for several reasons:

1. The request is initiated by the browser, creating a large delay between the initial request and the Ajax request.
2. Network overhead is multiplied for each request; the slower the visitor's connection, the slower the response becomes.

Despite these negatives, I believe Ajax is still a good option for its simplicity.

Server Side Includes

Most of those negatives are related to network overhead and can be eliminated with Server Side Includes, or SSI for short. The way it works: instead of the browser initiating requests and paying an extra connection delay, the requests are sent by the web server (like nginx or Apache). Web servers are usually located close to your web application, which makes the connection between the web server and the web application fast.

Server Side Includes have existed for some 19 years, but have seen little use in the last 15. The reason is that SSI became obsolete when it was replaced by scripting languages like PHP and ASP in the early 2000s. Since it's such an old technology, it is supported across the major web servers. There is little reason to use SSI today, except for special cases where we can benefit from it, as I will demonstrate below.

This is a simple example of including one page inside another:

```html
<body>
  <!--#include virtual="/account_info.html" -->
</body>
```

There is however a small setup required to get SSI working.

You will need:

1. The requests to go through a web server like nginx.
2. The Server Side Includes feature enabled in the web server.
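As a rough sketch (the port and the catch-all location are assumptions for illustration, not a production setup), an nginx configuration that enables SSI and proxies everything to the Rails application server might look like this:

```nginx
server {
  listen 80;

  location / {
    ssi on;                           # process <!--#include --> directives
    proxy_pass http://127.0.0.1:3000; # forward to the Rails app server
  }
}
```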

Once you've got SSI working, you can start using the pagelet_rails gem with the SSI rendering mode.

```erb
<body>
  <%= pagelet :pagelets_account_info, remote: :ssi %>
</body>
```

Why Server Side Includes?

Since Server Side Includes is an old technology, why am I talking about it? Because SSI lets us distribute request processing across multiple Ruby processes and achieve real parallel execution. SSI compensates for Ruby's concurrency limitations. This is a good example of synergy: old technology + slow Ruby = fast response times.

Benchmark

I've run some basic benchmarks to get an idea of the performance improvements. While it's impossible to predict how big the performance boost will be for a particular page, the goal is to give you an idea of what to expect.

All benchmarks were done with the Apache Bench tool. The aggregated data consists of 100 benchmarks, each with 50 requests. The 100 benchmarks differ in 3 ways:

1. Number of pagelets

Each pagelet represents the data which could be fetched and rendered independently. Values are from 1 to 10.

2. Delay for each pagelet

This simulates the time spent retrieving the data from a database. It's assumed that we are not bound by the CPU. Values are 0, 10, 20, 50, 100 and 200ms.

3. Rendering mode

The three rendering modes:

view partial

pagelets inline

pagelets SSI (Server Side Include)

Benchmark Results

Labels on the horizontal axis, for example 7 x 50ms, mean 7 pagelets with a 50ms delay each. The results are sorted by size.

![](image (1).png)

It's very unlikely that pages will have more than 5 pagelets, so let's eliminate the rest.

![](image (2).png)

A few observations:

view partial is always faster than pagelets inline. The higher the number of pagelets, the bigger the difference between them.

pagelets SSI is the slowest for a single pagelet.

Increasing the delay increases the response times for view partial and pagelets inline in proportion to the number of pagelets, whereas the pagelets SSI response time grows linearly with the delay alone.
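This serial-versus-parallel pattern can be sketched with a toy model. The 10ms SSI overhead below is an assumption for illustration, not a measured value:

```ruby
# Toy model of the benchmark: inline rendering waits for each pagelet's
# delay in turn, while SSI sub-requests run in parallel, so only the
# slowest delay plus a fixed per-request overhead matters.
SSI_OVERHEAD_MS = 10 # assumed sub-request overhead, for illustration only

def inline_time_ms(pagelet_count, delay_ms)
  pagelet_count * delay_ms # delays add up serially
end

def ssi_time_ms(_pagelet_count, delay_ms)
  delay_ms + SSI_OVERHEAD_MS # parallel: independent of pagelet count
end

puts inline_time_ms(7, 50) # => 350
puts ssi_time_ms(7, 50)    # => 60
```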



Let's sort them by delay:

![](image (3).png)

Interestingly, pagelets SSI shows no difference in response time regardless of how many pagelets are rendered with the same delay. This is because they are processed in parallel, so the slowest pagelet always determines the response time.

With a 0ms delay, both pagelet modes are much slower, as there is extra rendering overhead.



Let's eliminate size 1 and the 0ms delay, as we have concluded there is no benefit there. Also, let's sort by view partial response time and switch the chart to lines for better visibility.

![](image (4).png)

With increasing load, the rate of growth for pagelets SSI is significantly smaller than for the others.



Conclusion

While the benchmark does not show the exact benefits you would get in a real application, it clearly shows the scaling factor of using Server Side Includes. The more pagelets you have, or the slower they are, the bigger the overall performance improvement.

Because of the extra overhead of the pagelets, your overall throughput (requests per second) will theoretically drop slightly. However, it results in significant improvements in response time for pages with 2 or more pagelets. In a real application there could also be additional overhead to verify the logged-in user for each pagelet.

Finally, rendering in parallel with threads is another possibility, but I haven't managed to get that working yet. The good news is that once Ruby 3 is released with the "Guilds" feature, it will be possible to replace Server Side Includes rendering with Guilds with minimal changes to the application.


Big thanks to Alexia McDonald for editing this post.

Update 1

Removed a point about Ajax request concurrency based on a comment by nateberkopec, as the original information was incorrect and the updated version was considered irrelevant.