9 million hits/day with 120 megs RAM

Wednesday, 31 August 2011

Here’s a quick summary if you haven’t time to read the whole thing:

Solaris 5.11 (virtual: Joyent SmartMachine)

PHP 5.3.6 with PHP-FPM: 4 instances running, 10meg APC cache

nginx 0.8.53

Pax 1.0 (my silly self-coded website software… and yes, oops there’s already software with that name)

120 megs of RAM used

Load tested using blitz.io: 9 million+ daily hit capability

The point: I’m not doing anything exotic. I’m doing this as a hobby. This type of performance should be the rule, not the exception, for small websites. Many sites need some improvement to get to that point.

History

My site has only been linked to by John Gruber’s Daring Fireball twice. In 2007, I wrote a piece about Wozniak’s Prius’ top speed. In 2009, I wrote about the sad state of statistical analysis in tech journalism.

He even liked my site’s design! Geek excitement! Sorry. Anyhow…

While Mr. Gruber’s site does tend to crash those he links, my server was thankfully spared the full onslaught of the Daring Fireball audience — the topics I addressed were minor, transient little additions to the dialog between Mr. Gruber and his readers. So, I survived those bursts of traffic. But early this year, I got to thinking: what if my muse humored me and I actually produced something popular? Could my server get the required number of pages onto people’s screens without melting or exploding?

So, in January, I began to refocus my coding efforts on the software powering this website.

Thank You, Shopify

My first goal was to get my PHP execution time down into the realm of Daring Fireball’s. If you pull up the markup on DF’s front page, you’ll notice a commented bit about how long it took to produce:

<!-- 0.0003 seconds -->

After checking this for about 9 months, I can tell you this almost always reads that number: 300 microseconds. This is about one third the time a camera flash illuminates. That’s, well, pretty quick. When I started, my software was taking about 0.25 seconds (250 000 microseconds) to produce the front page of my website. I needed to improve performance by over 800x.
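As a quick check on that figure, here is the arithmetic in JavaScript. The variable names are made up for illustration; the two timings are the ones quoted above.

```javascript
// Timings quoted in the text, in seconds.
const startSeconds = 0.25;    // front page render time at the start
const targetSeconds = 0.0003; // time reported in Daring Fireball's markup

// Ratio of the two render times, i.e. the speedup required.
const requiredSpeedup = startSeconds / targetSeconds;

console.log(Math.round(requiredSpeedup)); // 833
```

So "over 800x" works out to roughly 833x.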

I had already written a nice little PHP class to cache using either APC or memcached, but I had been stymied by how to expire things correctly. Since I do this as a hobby and am therefore not steeped in the best practices of caching, Tobias Lütke’s article The Secret to Memcached hit me like a FREAKIN’ THUNDERBOLT:

At the beginning of each request we load a shop object which we pick depending on the incoming host name. We use the fact that we always load this shop model anyways and add versioning to it. This version column is incremented every time we want to sweep all caches.

AHA. And it works beautifully. Whenever anything in the DB is updated, I update the cache by incrementing that version number; because it is incorporated into all cache ids, all cache ids change. Expired cache items are never explicitly marked as such, they are simply no longer accessed and rotated out when the cache fills up.

Of course, in retrospect, it makes sense to let the cache itself manage rotating out expired items, but it took me a while to realize that. And of course, you don’t understand something until you think it’s obvious. Anyhow, requests that come in while the version number is being updated still load the stale version. Once the update lands, every request produces new cache ids, because the incremented version number is hashed into them.
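The versioning trick can be sketched in a few lines. This is a minimal illustration in JavaScript, not the PHP class that runs this site: the `VersionedCache` class and its method names are hypothetical, a plain `Map` stands in for APC or memcached, the version lives in memory rather than in the database, and the version is simply prefixed to the key instead of being hashed in.

```javascript
// Hypothetical in-memory stand-in for APC or memcached.
class VersionedCache {
  constructor() {
    this.store = new Map(); // cache id -> cached value
    this.version = 1;       // bumped whenever the database changes
  }

  // Every cache id includes the current version number.
  key(name) {
    return `v${this.version}:${name}`;
  }

  // Return the cached value, or build and cache it on a miss.
  fetch(name, build) {
    const id = this.key(name);
    if (!this.store.has(id)) {
      this.store.set(id, build());
    }
    return this.store.get(id);
  }

  // "Sweep" everything: old ids are simply never requested again.
  // A real cache would evict them as it fills up; this Map never evicts.
  invalidateAll() {
    this.version += 1;
  }
}

const cache = new VersionedCache();
let renders = 0;
const renderFrontPage = () => `front page, render #${++renders}`;

const first = cache.fetch("front", renderFrontPage);  // miss: renders the page
const second = cache.fetch("front", renderFrontPage); // hit: reuses the first render
cache.invalidateAll();                                // e.g. a new comment was saved
const third = cache.fetch("front", renderFrontPage);  // miss under the new version
```

Nothing is ever deleted explicitly. After `invalidateAll()`, the old `v1:front` entry is still sitting in the store, but no request asks for it again.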

My blog is quite light on input… and traffic, so worrying about cache stampedes is a bit much right now. After a few weeks running, PHP’s APC gives a nice hit/miss ratio:

This caching (I’m using APC right now) got my page load times down to about 170 microseconds for most pages, and 400 microseconds for the front page, which takes some time to set a cookie or two. The reason for those cookies follows.

Faking Dynamic Features Using Inline Caching

The title of this section could be yet another preposterous acronym: FDFUIC. What it means is, caching can be aggressive but still deliver dynamic features to your visitor.

This was the challenge: I wanted to give each user a personal update on what was added to my website since they last visited. But I wanted to cache only one version of the front page… and serve it to everyone. These two goals seem mutually exclusive. They aren’t. Here’s the solution (scalability notes after the implementation):

1. PHP: set a cookie recording the time of the user’s visit. Set it to expire in X days.

2. PHP: when assembling the front page, prepend it with all the comments (hidden from view) left on the site in the past X days. My X value for this site is 60 days.

3. PHP: add date information in the ISO8601 format to each hidden comment:

<time datetime="2011-08-27T19:15:38Z" pubdate style="display:none">

4. JavaScript: load the cookie saved in (1).

5. JS: inspect all comment nodes. If the <time> of a node is after the cookie time, then change the style to make that comment visible.

6. JS: discard the comments with <time>s before the cookie time.

7. JS: check the other items on the front page (continue through the DOM and check each <article> node). Mark nodes with a red dot if their <time>s are after the cookie time.
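The JavaScript half of the steps above boils down to one comparison per node. Here is a rough sketch of that logic, kept free of the DOM so it runs anywhere. The function name `splitByLastVisit` and the plain objects standing in for nodes are assumptions for illustration, not the code this site actually ships.

```javascript
// Hypothetical helper: items are plain objects standing in for DOM nodes,
// each carrying the ISO 8601 string from its <time datetime="..."> attribute.
function splitByLastVisit(items, lastVisitIso) {
  const lastVisit = Date.parse(lastVisitIso);
  const fresh = [];
  const seen = [];
  for (const item of items) {
    // Anything stamped after the cookie time is new to this visitor.
    if (Date.parse(item.datetime) > lastVisit) {
      fresh.push(item);
    } else {
      seen.push(item);
    }
  }
  return { fresh, seen };
}

const comments = [
  { id: "c1", datetime: "2011-08-20T10:00:00Z" },
  { id: "c2", datetime: "2011-08-27T19:15:38Z" },
];

// Pretend the cookie says the last visit was on 25 August 2011.
const result = splitByLastVisit(comments, "2011-08-25T00:00:00Z");
console.log(result.fresh.map((c) => c.id)); // only c2 is new
console.log(result.seen.map((c) => c.id));  // c1 was already seen
```

In the browser, the `fresh` comments would have their `display:none` removed and the `seen` ones would be discarded. The same comparison decides which `<article>` nodes get a red dot. Note that in this sketch a missing or unparsable cookie value yields `NaN`, so nothing is treated as new; that is a choice made here, not something the steps above specify.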

Interesting tidbit here. I actually used a modified version of John Resig’s “Pretty Date” code snippet, one he put together to live update time on nodes in a Twitter clone he was thinking about. The final function I ended up with is available here.

An image follows to explain how it all comes together:

So, if you want to show someone what is new since they have been gone, your first instinct may be to do it dynamically. My point here: for small sites, that’s not always the best solution. Here, we use the fact that the site is small to our advantage: we can easily prepend 60 days’ worth of comments, but we don’t have too much spare processing power or RAM to dynamically assemble the front page for every user AND maintain robust performance.

Scalability note: if you have a higher traffic website, perhaps you should only set the expiration time to 5 days. Then you won’t be prepending your front page with a lot of unnecessary data/comments (from the other 55 days). If the user visits less frequently than every 5 days, well, then they have a lot to catch up on anyway, and you might as well not overload them with new stuff.
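To make the trade-off concrete, here is a small sketch of the server-side selection with the window size as a parameter. It is written in JavaScript for illustration (the site's server side is PHP), and the function name `commentsToPrepend` and the sample data are hypothetical.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical server-side filter: keep only comments from the last `windowDays` days.
function commentsToPrepend(allComments, nowIso, windowDays) {
  const cutoff = Date.parse(nowIso) - windowDays * MS_PER_DAY;
  return allComments.filter((c) => Date.parse(c.datetime) >= cutoff);
}

const all = [
  { id: "old", datetime: "2011-07-10T12:00:00Z" },    // roughly 51 days before "now"
  { id: "recent", datetime: "2011-08-29T08:30:00Z" }, // under 2 days before "now"
];
const now = "2011-08-31T00:00:00Z";

const wide = commentsToPrepend(all, now, 60);  // both comments make the cut
const narrow = commentsToPrepend(all, now, 5); // only "recent" makes the cut
```

Whatever window is chosen, the cookie's expiry should match it, so that a returning visitor's cookie never points further back than the comments the page still carries.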

Trial Run

This past May, Hacker News picked up a long piece I wrote about my efforts to improve infinite scroll. I was thrilled that it was pretty popular. I was thrilled my server didn’t melt! However, it came close.

I checked my running processes and found that Apache’s MaxClients parameter was not at all the right fit for my little 256-megs-of-RAM server.

nginx, PHP-FPM

After a few days of research, I installed nginx and PHP-FPM. Unlike the Apache client explosion that happened under load, I get much better control over processes with this set-up. PHP-FPM is set to a max_children of 6 and as I write this has 4 processes running.

nginx, of course, is a beast (in the best possible way: rock solid, low memory usage).

A little tidbit about how PHP & nginx communicate: instead of using a TCP port (with the corresponding overhead), nginx is communicating with PHP over a Unix socket. The relevant parts of the config files are as follows.

PHP-FPM:

listen = /tmp/php5-fpm.sock

nginx’s PHP location block:

fastcgi_pass unix:/tmp/php5-fpm.sock;

Fast fast fast.

Memory Use

With a little prstat -Z -s size (remember, this is Solaris), my RSS is currently at 115megs. I’ve run the following rush at blitz.io:

--pattern 1-250:60 -T 4000 -r california http://tumbledry.org/

Yes, the timeout is increased. Give me a break: I can’t make miracles!
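For a sense of scale, the headline number can be converted into an average rate. This is only a unit conversion of the 9 million figure; how blitz.io derives its daily estimate from a rush is not covered here, and the function name is made up.

```javascript
const SECONDS_PER_DAY = 24 * 60 * 60; // 86,400

// Sustained average rate needed to reach a given daily total.
function hitsPerSecond(hitsPerDay) {
  return hitsPerDay / SECONDS_PER_DAY;
}

const rate = hitsPerSecond(9000000);
console.log(rate.toFixed(1)); // 104.2
```

In other words, 9 million hits a day is a little over 100 hits every second, around the clock.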

I never knew servers could be this efficient. I have a lot to learn.
