The Kick-Ass Guide to Deploying Nodejs Web Apps in Production

You just spent the last couple of weeks coding up a great new Nodejs web app. You finally got all the bugs worked out (at least the ones you can see) and are ready to ship and have your code shine in front of the whole world. Then it hits you: how do you go from running your Nodejs web app on your laptop to putting it up on the cold, hard internet?

No thread, no service, no problem?

So what are the problems? First, unlike PHP, where you simply upload your code to the server and hit refresh, a Nodejs app should run as a service (I'll explain why later on). That means there needs to be some way to convert your app into a service and then start it.

Second, apps sometimes crash. Yes, I know this is hard to believe but it does happen. This is less of a problem in an environment like PHP, where each request is handled by its own process or thread; if one request crashes, that user can simply reload the page and everyone else carries on unaffected. Not so with Nodejs. Nodejs, you see, is single threaded. This means each and every user is teetering and balancing on this single, tenuous thread. Normally, this isn't a problem (and even has its advantages) but all it takes is one bone-head to crash the thread and all of us are thrown into the darkness of the abyss.

Lastly, there is the issue of getting code from your development system to the production server. Of course, you could just FTP the code but then remember that node apps are deployed as services which means each time you copy new code over, you need to stop and restart the service to take advantage of the new code. Instead of fiddling around with FTP and services, wouldn't it be much better to be able to deploy changes to production with a single push of a button and see the changes in seconds?

Your mission, should you accept it

The goal of this article is to show how to deploy updates to a production Nodejs site in a manner that is simple, fail-safe and guarantees minimal downtime. When pushing code to production, my goal is to reduce the possibility of error as much as is humanly possible. Once I have a process that works, I write a script to automate it; scripting most of the process ensures that I don't, in some hurried moment, forget to run a command and screw up my deployment.

Now, however careful you are, sooner or later something is going to go wrong. When that happens, you need to be able to revert quickly to a previous working state. This is why you should be using version control. You are using version control, right? Since we are using version control to maintain our code, it makes sense to use it to push code to the production server as well.

Who this guide is for

You should be able to follow this guide easily if you have at least a basic knowledge of Linux, Git and nginx (you should at least know enough to install this software). Also, it is going to help if you are not religious (by which I mean, you care more about doing stuff that works than doing things the One True Way™).

Prerequisites

You will need the following setup

Production server

Development server

Both servers should have the following software installed

Nodejs

Git

nginx

Server side up

On the production server, set up two directories, one for the production code and the other for the production repository

mkdir -p /srv/www/mysite.com       # production code
mkdir -p /srv/repo/mysite.com.git  # production git repository

initialize the repository as a bare repository i.e. no source files, just version control

cd /srv/repo/mysite.com.git
git init --bare

create or modify the post-receive hooks of the repository

cd /srv/repo/mysite.com.git/hooks
touch post-receive
nano post-receive

post-receive

#!/bin/sh

error_exit () {
    echo "$1" 1>&2
    exit 1
}

git --work-tree=/srv/www/mysite.com --git-dir=/srv/repo/mysite.com.git checkout -f
cd /srv/www/mysite.com || error_exit "error changing directory! now exiting..."
npm install || error_exit "error running npm install! now exiting..."

make post-receive executable

chmod +x post-receive

So, quick explanation on what is happening. The post-receive hook script runs every time the Git repo is updated. Hence, whenever code is pushed to the Git repository, the post-receive script copies it over to the production code directory. Next, the script runs npm to install or update any modules that may be needed.
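You can watch this whole push-to-deploy loop work in miniature with a hedged sketch that runs anywhere, using temporary directories as stand-ins for /srv/repo and /srv/www (it assumes a reasonably recent Git that supports `git init -b`, and it skips the npm install step):

```shell
# Miniature of the server-side setup: a bare repo whose post-receive
# hook checks pushed code out into a "www" directory
set -e
base=$(mktemp -d)
mkdir -p "$base/www"
git init -q --bare -b master "$base/repo.git"

# the hook mirrors the one above, minus npm install
cat > "$base/repo.git/hooks/post-receive" <<EOF
#!/bin/sh
git --work-tree=$base/www --git-dir=$base/repo.git checkout -f
EOF
chmod +x "$base/repo.git/hooks/post-receive"

# a stand-in for the development machine pushes its first commit
git init -q -b master "$base/dev" && cd "$base/dev"
git config user.email demo@example.com
git config user.name demo
echo "console.log('hello');" > app.js
git add -A && git commit -qm "first deploy"
git push -q "$base/repo.git" master

cat "$base/www/app.js"    # the hook has deployed the code
```

The moment the push lands, the hook fires and the working copy in "www" is updated; that is the entire deployment mechanism in one screenful.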

Local development server

Now let us move over to the development server. Initialize your Git installation

git config --global user.name "my_name"
git config --global user.email "my_email"

Start tracking your project with Git

cd /sites/myproject
git init

Instruct Git to ignore the node_modules directory, log files and any files specific to your local installation. In order for Git to ignore files, you list them in the .gitignore file at the root of your project (create one if necessary). Note that paths in .gitignore are relative to the repository root.

/sites/myproject/.gitignore

node_modules/

NOTE: If you have a directory, say "/logs", that contains all the log files and you instruct Git to ignore all the files in it by including the following in .gitignore

logs/*

Git will ignore all the files in the logs directory but will also ignore the logs directory itself. This is because Git tracks files, not directories. If a directory is empty (as far as Git is concerned), it will be ignored. In order to maintain your directory structure, place another .gitignore file inside the logs directory.

/sites/myproject/logs/.gitignore

*
!.gitignore

This tells Git to ignore every file in this directory except for any files named .gitignore. Since there is at least one file being tracked by Git in this directory (the .gitignore file), the folder structure will be maintained in code commits.
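You can watch this trick work in a scratch repo (a hedged sketch; the directory and file names are illustrative):

```shell
# Demonstrates keeping an otherwise-ignored logs directory in the repo
set -e
scratch=$(mktemp -d) && cd "$scratch"
git init -q
git config user.email demo@example.com
git config user.name demo

mkdir logs
echo "app started" > logs/app.log
printf '*\n!.gitignore\n' > logs/.gitignore

git add -A
git status --porcelain    # lists logs/.gitignore; logs/app.log is ignored
```

The status output shows exactly one staged file under logs/, which is enough to keep the directory alive in every clone.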

Version control

Now we need to set up a remote repository to keep our code. This way if your development system gets run over by a bus, your code and, most importantly, your version control history is safe. For this, we will use GitHub (just because all the cool kids are using it). There are lots of other equally good alternatives, for example BitBucket, so just pick the one that you prefer.

Let's tell Git where to find the remote repository

git remote add origin git@github.com:your_github_name/mysite.git

# let's push all the code now
git add -A
git commit -m 'my first commit'
git push -u origin --all    # pushes up the repo and its refs for the first time
git push -u origin --tags   # pushes up any tags

Now your project is in the GitHub repository for safe keeping and Git is keeping track of any code changes to the development system.

Since we are going to use Git to push code to the production server, we need to tell Git where to find the production server repository

# make sure to replace "my_user_name" with your ssh user name for
# the production server and "1.1.1.1" with the IP address or hostname
# of your production server
git remote add production my_user_name@1.1.1.1:/srv/repo/mysite.com.git

Now, push your code to the production server

git push production master

Great! The code is on the production server but now what?

Call to service (we are back to the production server for this section)

Remember I said we need to run node apps as services because we want to be able to restart the app in the event of a crash? Why not just use a node module like forever-monitor to handle restarts? This is a trick question because you have to ask yourself what happens when forever-monitor crashes (and you know it will)? Do you get another system to restart forever-monitor? What restarts that system? Unless you believe in turtles all the way down, it is clear that down this path, madness lies.

The solution, of course, is to deploy your node app as a service. This way you can control it with the system manager just like any other Linux service. We are going to use Upstart to manage our node app.

Now, those of you who know about the Great System Manager Wars just did a spit take when I mentioned Upstart. For those who don't know (or, more likely, were busy getting on with, like, their lives): over the last couple of years, a great debate raged about which system management tool would replace the ageing init.d system on Linux. In the red corner was Upstart (championed mainly by Canonical, the company behind the popular Ubuntu distribution) and in the blue corner was Systemd (championed by pretty much everyone else). The battle was bitter and lots of mean things were said by both sides but eventually, Systemd was declared the winner. So why am I still using Upstart?

I use Upstart for the best of reasons: it works. Also, my servers (which run Ubuntu 14.04) come with Upstart as the default system manager. I could spend an hour switching over to Systemd but, given previous experience, I would likely screw something up horribly and spend several hours fixing the problem. Even worse, I could screw something up subtly and have intermittent problems with my setup for the next couple of weeks. And after all this hassle, I would still end up exactly where I am right now: with a system manager that stops, starts and restarts my node process. So no, I am not going to switch any time soon. If your distribution has Systemd running, feel free to use it instead of Upstart.

To create a service using Upstart, create a file mysite.com.conf

cd /etc/init/
touch mysite.com.conf
nano mysite.com.conf

mysite.com.conf

# Upstart job definition for mysite.com
description "mysite.com"
author "my_name"

# start the service when system starts
start on runlevel [2345]
# stop the service when system is shutdown
stop on runlevel [06]

# prepare the environment
# Create directories for logging and process management
# Change ownership to the user running the process
pre-start script
    mkdir -p /var/opt/node
    mkdir -p /var/opt/node/log
    mkdir -p /var/opt/node/run
    chown -R my_name:my_name /var/opt/node
end script

# if the process quits unexpectedly, trigger a respawn
respawn

env NODE_ENV=production
env PORT=3000

# start the process
exec start-stop-daemon --start --chuid my_name --make-pidfile --pidfile /var/opt/node/run/mysite.com.pid --exec /usr/bin/node -- /srv/www/mysite.com/app.js >> /var/opt/node/log/mysite.com.log 2>&1

Now, you have a service, mysite.com, that can be controlled from the command line

start mysite.com     # start service
stop mysite.com      # stop service
restart mysite.com   # stop and then start service
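For readers on a Systemd distribution, a roughly equivalent unit file might look like the following. This is a hedged sketch only (the unit name, user and paths simply mirror the Upstart example above); save it as something like /etc/systemd/system/mysite.com.service and manage it with systemctl start/stop/restart.

```ini
# Rough Systemd counterpart to the Upstart job above (illustrative)
[Unit]
Description=mysite.com

[Service]
User=my_name
Environment=NODE_ENV=production PORT=3000
ExecStart=/usr/bin/node /srv/www/mysite.com/app.js
# like Upstart's "respawn": restart the app if it crashes
Restart=on-failure

[Install]
WantedBy=multi-user.target
```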

All is good so far but if you remember, the post-receive script for our production repository dutifully copies the code across to the production folder. However, a Nodejs app loads its code into memory when it starts, so changes to the code on disk are not reflected in the running application until the app is restarted. So, let's update the post-receive script to restart the service after any new code is committed

cd /srv/repo/mysite.com.git/hooks
nano post-receive

post-receive

#!/bin/sh

error_exit () {
    echo "$1" 1>&2
    exit 1
}

git --work-tree=/srv/www/mysite.com --git-dir=/srv/repo/mysite.com.git checkout -f
cd /srv/www/mysite.com || error_exit "error changing directory! now exiting..."
npm install || error_exit "error running npm install! now exiting..."

# restart service
/sbin/restart mysite.com

Depending on your setup, you may need to run the "restart" command under sudo. In order to have that run with no issues in a script, you should edit the sudoers file to grant the user account running the script the ability to run sudo on this particular command (and this command ONLY) without requiring a password.

NOTE: While this is normally a safe thing to do, if you do it wrong, you can introduce significant security holes into your setup.

To edit sudoers, use visudo

sudo visudo

/etc/sudoers

#
# This file MUST be edited with the 'visudo' command as root.
#
# Please consider adding local content in /etc/sudoers.d/ instead of
# directly modifying this file.
#
# See the man page for details on how to write a sudoers file.
#
Defaults        env_reset
Defaults        mail_badpass
Defaults        secure_path="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

# Host alias specification

# User alias specification
User_Alias      WEB_APP_MANAGER = my_name

# Cmnd alias specification

# User privilege specification
root    ALL=(ALL:ALL) ALL

# Members of the admin group may gain root privileges
%admin  ALL=(ALL) ALL

# Allow members of group sudo to execute any command
%sudo   ALL=(ALL:ALL) ALL

# Allow web app managers to start, stop and restart web app services
# without requiring a password
# needed for managing services in git hooks post-receive
WEB_APP_MANAGER ALL = NOPASSWD: /sbin/start mysite.com, /sbin/stop mysite.com, /sbin/restart mysite.com

# See sudoers(5) for more information on "#include" directives:
#includedir /etc/sudoers.d

Now, the post-receive script will no longer require a password to run the sudo command to stop, start or restart the node app service.

Gentlemen, start your nginx

Now comes the choice of web server to use. Nodejs is a peculiar programming environment in that it actually comes with its own built-in web server. All it takes is a few lines of code and Nodejs handles all of your web traffic

// set up nodejs web server for example.com
var http = require('http')
  , express = require('express')
  ;

var app = express();

// process http requests here
app.get('/', function(req, res){
  res.send("Wasssup!");
});

http.createServer(app).listen(80, 'example.com');

Since nodejs kindly does all this web server-ing for us, why bother with nginx, another web server? Here's why

Nodejs is a programming environment that has a web server built in; nginx is purely a web server. Just from that description alone, which one do you think is going to be a better web server? The web server that is bolted on, or the one that is carefully maintained by a large and growing community of people passionate about serving web pages?

Also, nginx can be configured to serve all the static content (css, js, images, etc) without having to pass the request on to Nodejs at all. Remember what I said earlier about Nodejs being single threaded and all that? Well, I just saved you a bunch of requests tying up resources on that thread. You're welcome.

More importantly, running Nodejs behind an nginx proxy allows you the flexibility to run multiple Nodejs applications from a single server or IP address. Let me explain.

Let's go back to the Nodejs http server I configured earlier

// set up nodejs web server for example.com
var http = require('http')
  , express = require('express')
  ;

var app = express();

// process http requests here
app.get('/', function(req, res){
  res.send("Wasssup!");
});

http.createServer(app).listen(80, 'example.com');

Any requests made to example.com will be processed by this server (and display an annoying catch phrase from a turn of the century beer commercial). What if we want to place another node app on this same server? Port 80 is already taken so we would need to use some other port, say 8080.

// a second express app for secondSite.com
var app2 = express();

app2.get('/', function(req, res){
  res.send("Just do it!");
});

http.createServer(app2).listen(8080, 'secondSite.com');

Now, users would have to access our new website with a domain like "secondSite.com:8080". It's hard enough getting people to remember the address of a website; imagine having to convince them to also remember the port number.
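This is exactly the problem an nginx reverse proxy solves: both sites share port 80, and nginx routes each request by hostname to whichever internal port that app listens on. A minimal sketch of the idea (the server names and backend ports are illustrative; a full, production-ready configuration follows later in this guide):

```nginx
# Two Nodejs apps behind one IP address: nginx listens on port 80 for
# both hostnames and proxies by server_name to each app's internal port
server {
    listen 80;
    server_name mysite.com;

    location / {
        proxy_pass http://127.0.0.1:3000;
    }
}

server {
    listen 80;
    server_name secondSite.com;

    location / {
        proxy_pass http://127.0.0.1:3001;
    }
}
```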

Pride and Privileges

By the way, there is one last but very important reason you should not be using the node web server for production. The standard port for http requests is 80 (i.e. a request made to "example.com" is the same as targeting "example.com:80"). However, public facing node apps should not use ports below 1024. Ports below 1024 are privileged, meaning only processes running with root privileges can bind to them. If your node app wants to use these ports, you must run it as a user with root access. This means that if anyone is able to compromise your website in a way that allows them to run server commands, those commands will run as root. Not a good scenario (actually, this is a very bad, no good scenario).

Now that we have all agreed nginx is the way to go (you could also use Apache but I find it is not as flexible as nginx), let's get set up.

The nginx configuration is in /etc/nginx/nginx.conf

user  nginx;
worker_processes  2;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

events {
    worker_connections  1024;
}

http {
    proxy_cache_path  /var/cache/nginx levels=1:2 keys_zone=one:8m max_size=3000m inactive=600m;
    proxy_temp_path   /var/tmp;

    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    keepalive_timeout  65;

    # compress http responses
    gzip  on;
    gzip_http_version 1.1;
    gzip_comp_level 6;
    gzip_vary on;
    gzip_min_length 1000;
    gzip_proxied any;
    gzip_types text/plain text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript;
    gzip_buffers 16 8k;

    #include /etc/nginx/conf.d/*.conf;

    # extend nginx configuration
    include /etc/nginx/sites-enabled/*;
}

The first section provides some basic configuration information

user  nginx;
worker_processes  2;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

events {
    worker_connections  1024;
}

This says the nginx process runs under the user "nginx" and there are two worker processes (roughly, the number of worker processes should be equal to the number of cores on your server).

It defines the location of some log files and limits each worker process to 1024 concurrent connections.

The next section defines nginx behavior for http requests

http {
    # proxy settings
    proxy_cache_path  /var/cache/nginx levels=1:2 keys_zone=one:8m max_size=3000m inactive=600m;
    proxy_temp_path   /var/tmp;

    # mime type processing
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile        on;
    keepalive_timeout  65;

    # compress responses
    gzip  on;
    gzip_http_version 1.1;
    gzip_comp_level 6;
    gzip_vary on;
    gzip_min_length 1000;
    gzip_proxied any;
    gzip_types text/plain text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript;
    gzip_buffers 16 8k;

    #include /etc/nginx/conf.d/*.conf;

    # extend nginx configuration
    include /etc/nginx/sites-enabled/*;
}

The first two lines instruct nginx how to cache proxied responses for performance purposes. nginx caching is actually off by default; you need to specify the proxy_cache directive in the relevant server sections as required.

The next two lines define how nginx handles mime types and set the defaults.

# proxy settings
proxy_cache_path  /var/cache/nginx levels=1:2 keys_zone=one:8m max_size=3000m inactive=600m;
proxy_temp_path   /var/tmp;

# mime type processing
include       /etc/nginx/mime.types;
default_type  application/octet-stream;
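As an aside, actually switching the cache on for a site might look something like this inside that site's server configuration. This is an illustrative fragment only: the zone name "one" comes from the proxy_cache_path line above, and the upstream name is a placeholder for whatever backend your site proxies to.

```nginx
location / {
    proxy_cache one;                 # use the cache zone declared in nginx.conf
    proxy_cache_valid 200 302 10m;   # cache successful responses for 10 minutes
    proxy_cache_valid 404 1m;        # cache not-found responses briefly
    proxy_pass http://app_mysite.com/;
}
```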

The next two lines describe the log file and format

log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                  '$status $body_bytes_sent "$http_referer" '
                  '"$http_user_agent" "$http_x_forwarded_for"';

access_log  /var/log/nginx/access.log  main;

and ... the next two lines

sendfile        on;
keepalive_timeout  65;

sendfile tells nginx to use the kernel's sendfile() system call, which copies files straight from disk to the network socket without a round trip through user space and so speeds up serving static files. One caveat: if you are running nginx in a virtual environment (e.g. Virtual Box), you should turn it off, otherwise you will have all sorts of strange problems with stale files.

keepalive_timeout governs the period of time before nginx closes the connection with the client.

This is followed by instructions to compress responses to save bandwidth and reduce page load time

# compress responses
gzip  on;
gzip_http_version 1.1;
gzip_comp_level 6;
gzip_vary on;
gzip_min_length 1000;
gzip_proxied any;
gzip_types text/plain text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript;
gzip_buffers 16 8k;

The last lines instruct nginx where to find more configuration files.

#include /etc/nginx/conf.d/*.conf;

# extend nginx configuration
include /etc/nginx/sites-enabled/*;

This is for convenience and extensibility. Each website served by nginx can have a separate configuration file instead of having them all crammed into one humongous, hard to edit configuration file. This way, you can also add, edit and remove each site's configuration without risk of messing up any of the others.

To make a configuration file for your web app, create a configuration file in /etc/nginx/sites-available

cd /etc/nginx/sites-available
touch mysite.com.conf
nano mysite.com.conf

/etc/nginx/sites-available/mysite.com.conf

upstream app_mysite.com {
    server 127.0.0.1:3000;
}

server {
    listen 80;
    server_name mysite.com;

    # serve static assets from public folder
    location ~ ^/(images/|img/|javascript/|js/|css/|stylesheets/|flash/|media/|static/|robots.txt|humans.txt) {
        root /srv/www/mysite.com;
        try_files /public$uri =404;
        access_log off;
        expires max;
    }

    location / {
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_set_header X-NginX-Proxy true;

        proxy_pass http://app_mysite.com/;
        proxy_redirect off;

        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }

    access_log /srv/www/mysite.com/logs/access.log;
    error_log /srv/www/mysite.com/logs/error.log;
}

Let's break this down. Ignore the first section for the moment and jump to the server section

listen 80;
server_name mysite.com;

This tells nginx that everything in this section pertains to requests on port 80 addressed to hostname "mysite.com". If there is another server block with a different port number or hostname that matches the request, nginx will use the configuration settings specified in that server block instead of this one.

# serve static assets from public folder
location ~ ^/(images/|img/|javascript/|js/|css/|stylesheets/|flash/|media/|static/|robots.txt|humans.txt) {
    root /srv/www/mysite.com;
    try_files /public$uri =404;
    access_log off;
    expires max;
}

This instructs nginx to serve any requests for static files (as defined by any of the following paths: "/images/", "/img/", "/javascript/", "/js/", "/css/", "/stylesheets/", "/flash/", "/media/", "/static/", "/robots.txt", "/humans.txt") by finding the matching files under "/srv/www/mysite.com/public". This way, Nodejs is freed from having to deal with any requests for static files.

location / {
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header Host $http_host;
    proxy_set_header X-NginX-Proxy true;

    proxy_pass http://app_mysite.com/;
    proxy_redirect off;

    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
}

This sets up nginx to proxy requests to the destination specified in the upstream section, "app_mysite.com".

upstream app_mysite.com {
    server 127.0.0.1:3000;
}

This section instructs nginx where to proxy the request. In this case, any request for "mysite.com:80" received by nginx is sent to "127.0.0.1:3000". Of course, your nodejs app should be listening on "127.0.0.1:3000" so it can deliver the response.

// set up nodejs web server to run behind nginx proxy
var http = require('http')
  , express = require('express')
  ;

var app = express();

/* enable the app to run behind a proxy e.g. nginx.
 * comment this out if using the node web server
 * to serve requests directly
 */
app.enable('trust proxy');

// process http requests here
app.get('/', function(req, res){
  res.send("Wasssup!");
});

http.createServer(app).listen(3000, '127.0.0.1');

To make this configuration available to nginx, we need to symlink this file to the sites-enabled directory

ln -s /etc/nginx/sites-available/mysite.com.conf /etc/nginx/sites-enabled/mysite.com.conf

For the configuration to take effect, reload nginx configurations

nginx -s reload

That's it. Your site is now running on the production server and pushing new code is simple

On the development server

cd /sites/myproject
git add -A
git commit -m 'I just added this cool new feature'
git push origin master
git push production master

Finally, since I am too lazy to write five commands each time I want to push code to production, I simply write a script, "deploy_to_production" to do it for me.

deploy_to_production.sh

#!/bin/sh

error_exit () {
    echo "$1" 1>&2
    exit 1
}

echo -n "Enter commit message: "
read commit_message

cd /sites/myproject || error_exit "error changing directory! now exiting..."
git add -A
git commit -m "$commit_message"
git push origin master
git push production master

Now, deploying code to production is just one command

./deploy_to_production.sh

This will commit the code changes, push them to the remote repository and production server and then restart the production node app in order to take advantage of the change.

Of course, in a real scenario, you would want to set up a "beta" or test site to which you will first deploy and test changes before pushing to the actual production site. You should simply follow these instructions to set up the "beta" site first, push changes to it and then, once you've ascertained that all is well, push code to the actual production site.