Having a development environment set up with a proper provisioning tool is

crucial to improving your workflow. Once you’ve got your virtual machine set

up and ready to

go, you need to have some way of ensuring that it’s set up with the software

you need.

(If you’d like, you can go and clone the

companion repository and

play along as we go.)

For this, my tool of choice is Puppet. Puppet is a bit different from other

provisioning systems in that it’s declarative rather than imperative. What do I

mean by that?

Declarative vs Imperative

Let’s say you’re writing your own provisioning tool from scratch. Most likely,

you’re going to be installing packages such as nginx. With your own provisioning

tool, you might just run apt-get (or your package manager of choice) to

install it:

apt-get install nginx

But wait, you don’t want to run this if you’ve already got it set up, so you’re

going to need to check that it’s not already installed, and upgrade it instead

if so.

    if which nginx > /dev/null; then
        apt-get install --only-upgrade nginx
    else
        apt-get install nginx
    fi

This is relatively easy for basic things like this, but for more complicated

tools, you may have to work this all out yourself.

This is an example of an imperative tool. You say what you want done, and the

tool goes and does it for you. There is a problem though: to be thorough, you

also need to check that it has actually been done.

However, with a declarative tool like Puppet, you simply say how you want your

system to look, and Puppet will work out what to do, and how to transition

between states. This means that you can avoid a lot of boilerplate and checking,

and instead Puppet can work it all out for you.

For the above example, we’d instead have something like the following:

package {'nginx': ensure => latest }

This says to Puppet: make sure the nginx package is installed and up-to-date. It

knows how to handle any transitions between states rather than requiring you to

work this out. I personally prefer Puppet because it makes sense to me to

describe how your system should look rather than writing separate

installation/upgrading/etc routines.
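To make the contrast concrete, here is a minimal Ruby sketch of the decision a declarative tool makes on your behalf. It is a toy model and not how Puppet is actually implemented: the `plan_package` function and the `:absent`, `:outdated` and `:latest` state names are invented for illustration.

```ruby
# Toy model of declarative provisioning (hypothetical, not Puppet internals).
# Given the current and desired state of a package, work out the transition.
def plan_package(current, desired)
  return :nothing if current == desired # already in the desired state

  case desired
  when :latest
    # Install from scratch, or upgrade an existing install
    current == :absent ? :install : :upgrade
  when :absent
    :remove
  else
    raise ArgumentError, "unknown desired state: #{desired.inspect}"
  end
end

puts plan_package(:absent, :latest)
puts plan_package(:outdated, :latest)
puts plan_package(:latest, :latest)
```

You only ever state the desired state; the branching that the shell snippet spelled out by hand lives inside the tool.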

(To WordPress plugin developers, this is also the same approach that WordPress

takes internally with database schema changes. It specifies what the database

should look like, and dbDelta() takes care of transitions.)

Getting It Working

So, now that we know what Puppet is going to give us, how do we get it set up?

Usually, you’d have to go and ensure that you install Puppet on your machine,

but thankfully, Vagrant makes it easy for us. Simply set your provisioning tool

to Puppet and point it at your main manifest file:

    config.vm.provision :puppet do |puppet|
      puppet.manifests_path = "manifests"
      puppet.manifest_file  = "site.pp"
      puppet.module_path    = "modules"
      #puppet.options = '--verbose --debug'
    end

What exactly is a manifest? A manifest is a file that tells Puppet what you’d

like your system to look like. Puppet also has a feature called modules that add

functionality for your manifests to use, and I’ll touch on that in a bit, but

just trust this configuration for now.

I’m going to assume you’re using WordPress with nginx and PHP-FPM. These

concepts are applicable to everyone, so if you’re not, just follow along

for now.

First off, we need to install the nginx and php5-fpm packages. The following

should be placed into manifests/site.pp :

    package {'nginx': ensure => latest }
    package {'php5-fpm': ensure => latest }

Each of these declarations is called a resource. Resources are the basic

building block of everything in Puppet, and they declare the state of a certain

object. In this case, we’ve declared that we want the state of the nginx and

php5-fpm packages to be ‘latest’ (that is, installed and up-to-date).

The part before the braces is called the “type”. There are a huge number of

built-in types in

Puppet and we’ll also add some of our own later. The first part inside the

braces is called the namevar and must be unique within the type; that is, you can

only have one package {'nginx': } in your entire project. The part after the

colon is called the attributes of the resource.
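As a rough illustration of that anatomy, the following Ruby sketch models a resource as a type, a namevar and a hash of attributes, and refuses a second declaration with the same type and namevar. The `Resource` struct and `Catalog` class are hypothetical names for this example only; Puppet's real catalog is considerably more involved.

```ruby
# Hypothetical model of a resource: a type, a title (the namevar) and attributes.
Resource = Struct.new(:type, :title, :attributes)

class Catalog
  def initialize
    @resources = {}
  end

  # Add a resource, refusing a second one with the same type and title.
  def declare(type, title, attributes = {})
    key = [type, title]
    if @resources.key?(key)
      raise ArgumentError, "duplicate declaration: #{type} '#{title}'"
    end
    @resources[key] = Resource.new(type, title, attributes)
  end
end

catalog = Catalog.new
catalog.declare('package', 'nginx', { 'ensure' => 'latest' })
catalog.declare('package', 'php5-fpm', { 'ensure' => 'latest' }) # different title: fine

begin
  catalog.declare('package', 'nginx', { 'ensure' => 'present' }) # same type and title
rescue ArgumentError => e
  puts e.message
end
```

Note that uniqueness is per type: a `file` and a `package` could share a namevar without colliding in this model.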

Next up, let’s set up your MySQL database. Setting up MySQL is a slightly more

complicated task, since it involves many steps (installing, setting

configuration, importing schemas, etc), so we’ll want to use a module instead.

Modules are reusable pieces for manifests. They’re more powerful than normal

manifests, as they can include custom Ruby code that interacts with Puppet, as

well as powerful templates. These can be complicated to create, but they’re

super simple to use.

Puppet Labs (the people behind Puppet itself) publish the canonical MySQL

module, which is what we’ll be

working with here. We’ll want to clone this into our modules directory, which we

set previously in our Vagrantfile.

    $ mkdir modules
    $ cd modules
    $ git clone git@github.com:puppetlabs/puppetlabs-mysql.git mysql

Now, to use the module, we can go ahead and use the class. I personally don’t

care about the client, so we’ll just install the server:

    class { 'mysql::server':
        config_hash => { 'root_password' => 'password' }
    }

(You’ll obviously want to change ‘password’ here to something slightly

more secure.)

MySQL isn’t much use to us without the PHP extensions, so we’ll go ahead and get

those as well.

    class { 'mysql::php':
        require => Package['php5-fpm'],
    }

Notice there’s a new parameter we’re using here, called require . This tells

Puppet that we’re going to need PHP installed first. Why do we need to do this?

Rearranging Puppets

Puppet is a big fan of being as efficient as possible, so it doesn’t treat the order in which you wrote your resources as a set of dependencies. For example, while we’re working on installing MySQL, it can go and start setting up our nginx configuration. That’s a problem whenever one step relies on another having already happened.

To solve this, Puppet has the concept of dependencies. If any step depends on a previous one, you have to specify this dependency explicitly. Puppet splits running into two parts: first, it compiles the resources to work out your dependencies, then it applies the resources in an order that respects the dependencies you’ve specified.

There are two ways of doing this in Puppet: you can specify require or

before on individual resources, or you can specify the dependencies all

at once.

    # Individual style
    class { 'mysql::php':
        require => Package['php5-fpm'],
    }

    # Waterfall style
    Package['php5-fpm'] -> Class['mysql::php']

I personally find that the require style is nicer to maintain, since you can

see at a glance what each resource depends on. I avoid before for the same

reason, but these are stylistic choices and it’s entirely up to you as to which

you use.
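Whichever style you choose, the end result is a dependency graph that gets sorted before anything is applied. The Ruby sketch below shows the idea using the standard library’s TSort module. The `DependencyGraph` class is a hypothetical stand-in, and I’m not claiming this is how Puppet orders resources internally; it only demonstrates that explicit dependencies are enough to derive a safe order.

```ruby
require 'tsort'

# Hypothetical stand-in for a compiled catalog: maps each resource
# reference to the list of references it requires.
class DependencyGraph
  include TSort

  def initialize(requires)
    @requires = requires
  end

  # TSort hook: iterate over every node in the graph.
  def tsort_each_node(&block)
    @requires.each_key(&block)
  end

  # TSort hook: iterate over the nodes that must come before `node`.
  def tsort_each_child(node, &block)
    @requires.fetch(node, []).each(&block)
  end
end

requires = {
  "Class['mysql::php']" => ["Package['php5-fpm']"],
  "Package['php5-fpm']" => [],
  "Package['nginx']"    => []
}

# Dependencies always come out before the things that require them.
puts DependencyGraph.new(requires).tsort
```

Resources with no relationship between them (nginx and PHP here) can land anywhere relative to each other, which is exactly why anything that matters has to be declared. A circular dependency makes `tsort` raise `TSort::Cyclic`.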

You may have noticed a small subtlety here: the dependencies use a different

cased version of the original, with the namevar in square brackets. For example,

if I declare package {'nginx': } , I refer to this later as Package['nginx'] .

This looks somewhat strange when you’re starting out, but you’ll quickly get used to it.

(We’ll get to namespaced resources soon such as mysql::db {'mydb': } , and the

same rule applies here to each part of the name, so this would become

Mysql::Db['mydb'] .)
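If it helps, the capitalisation rule can be written down as a tiny Ruby helper. `resource_reference` is a hypothetical name for this example and not part of Puppet; it simply applies the rule described above to each part of the type name.

```ruby
# Build the reference form of a declared resource, e.g.
# mysql::db {'mydb': } is referred to as Mysql::Db['mydb'].
# (Illustrative helper only; not a Puppet API.)
def resource_reference(type, title)
  # Capitalise every namespace segment of the type name
  capitalized = type.split('::').map(&:capitalize).join('::')
  "#{capitalized}['#{title}']"
end

puts resource_reference('package', 'nginx')
puts resource_reference('mysql::db', 'mydb')
```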

Important note: Don’t declare your resources with a capitalised type, as this actually sets the default attributes for every resource of that type. Avoid this unless you’re sure you know what you’re doing.

Setting Up Our Configuration

We’ve now got nginx, PHP, MySQL and the MySQL extensions installed, so we’re now

ready to start configuring them to our liking. Now would be a great time to try

vagrant up and watch Puppet run for the first time!

Let’s now go and set up both our server directories and the nginx configuration

for them. We’ll use the file type for both of these.

    file { '/var/www/vagrant.local':
        ensure => directory
    }
    file { '/etc/nginx/sites-available/vagrant.local':
        source => "file:///vagrant/vagrant.local.nginx.conf"
    }
    file { '/etc/nginx/sites-enabled/vagrant.local':
        ensure => link,
        target => '/etc/nginx/sites-available/vagrant.local'
    }

And the nginx configuration for reference, which should be saved to

vagrant.local.nginx.conf next to your Vagrantfile:

    server {
        listen 80;
        server_name vagrant.local;
        root /var/www/vagrant.local;

        location / {
            try_files $uri $uri/ /index.php$is_args$args;
        }

        location ~ \.php {
            try_files $uri =404;
            fastcgi_split_path_info ^(.+\.php)(/.+)$;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            fastcgi_pass unix:/var/run/php5-fpm.sock;
            fastcgi_index index.php;
            include /etc/nginx/fastcgi_params;
        }
    }

(This is not the best way to do this in Puppet, but we’ll come back to that.)

Next up, let’s configure MySQL. There’s a mysql::db type provided by the MySQL

module we’re using, so we’ll use that. This works the same way as the file and

package types that we’ve already used, but obviously takes some different

parameters:

    mysql::db {'wordpress':
        user     => 'root',
        password => 'password',
        host     => 'localhost',
        grant    => ['all'],
        require  => Class['mysql::server']
    }

Let’s Talk About Types, Baby

You’ll notice that we’ve used two different syntaxes above for the MySQL parts:

    class {'mysql::php': }
    mysql::db {'wordpress': }

The differences here are in how these are defined in the module: mysql::php is

defined as a class, whereas mysql::db is a defined type. These reflect fundamental

differences in what you’re dealing with behind the resource. Things that you

have one of, like system-wide packages, are defined as classes. There’s only one

of these per-system; you can only really install MySQL’s PHP bindings once.

On the other hand, types can be reused for many resources. You can have more

than one database, so this is set up as a reusable type. The same is true for

nginx sites, WordPress installations, and so on.

You’ll use both classes and types all the time, so understanding when each is

used is key.

Moving to Modules

nginx and MySQL are both set up with our settings now, but it’s not really in a

very reusable pattern yet. Our nginx configuration is completely hardcoded for

the site, which means we can’t duplicate this if we want to set up another site

(for example, a staging subdomain).

We’ve used the MySQL module already, but all of our resources are in our

manifests directory at the moment. The manifests directory is more for the

specific machine you’re working on, whereas the modules directory is where our

reusable components should live.

So how do we create a module? First up, we’ll need the right structure. Modules

are essentially self-contained reusable parts, so there’s a certain structure

we use:

modules/<name>/ – The module’s full directory

modules/<name>/manifests/ – Manifests for the module, basically the same as your normal manifests directory

modules/<name>/templates/ – Templates for the module, written in ERB

modules/<name>/lib/ – Ruby code to provide functionality for your manifests

(I’m going to use ‘myproject’ as the module’s name here, but replace that with

your own!)

First up, we’ll create our first module manifest. For this first one, we’ll use

the special filename init.pp in the manifests directory. Before, we used

the names mysql::php and mysql::db , but the MySQL module also supplies a

mysql class. Puppet maps a::b to modules/a/manifests/b.pp , but a class

called a maps to modules/a/manifests/init.pp .
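The lookup rule is mechanical enough to sketch in a few lines of Ruby. `manifest_path` is a hypothetical helper written for this post, not something Puppet exposes, and the `modules` default simply matches the module path we set in the Vagrantfile.

```ruby
# Map a Puppet class or defined type name to the manifest file that
# should contain it. (Illustrative helper only; not a Puppet API.)
def manifest_path(name, module_dir = 'modules')
  mod, *rest = name.split('::')
  # A bare module name lives in init.pp; a::b lives in b.pp
  file = rest.empty? ? 'init.pp' : "#{rest.join('/')}.pp"
  File.join(module_dir, mod, 'manifests', file)
end

puts manifest_path('mysql')
puts manifest_path('mysql::db')
puts manifest_path('myproject::site')
```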

Here’s what our init.pp should look like:

    class myproject {
        if ! defined(Package['nginx']) {
            package {'nginx':
                ensure => latest
            }
        }
        if ! defined(Package['php5-fpm']) {
            package {'php5-fpm':
                ensure => latest
            }
        }
    }

(We’ve wrapped these in defined() calls. It’s important to note that while Puppet is declarative, this is a compile-time check, so it only sees resources that have already been declared at that point in compilation. If you’re making redistributable modules, you’ll always want to use this, as you can’t declare the same resource twice, and users should be able to declare these in their own manifests.)
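Because the check happens at compile time, the order of declarations matters. The following Ruby sketch is a toy model of that behaviour; `ToyCatalog` and its methods are invented for this example and only approximate what Puppet does.

```ruby
# Toy model (hypothetical): a catalog that remembers which resource
# references have been declared so far during compilation.
class ToyCatalog
  def initialize
    @declared = []
  end

  def declared?(ref)
    @declared.include?(ref)
  end

  # Plain declaration: fails if the resource already exists.
  def declare(ref)
    raise ArgumentError, "duplicate declaration: #{ref}" if declared?(ref)
    @declared << ref
  end

  # Guarded declaration, like wrapping the resource in `if ! defined(...)`.
  def declare_unless_defined(ref)
    declare(ref) unless declared?(ref)
  end
end

# Plain first, guarded second: the guard sees the resource and skips it.
catalog = ToyCatalog.new
catalog.declare("Package['nginx']")
catalog.declare_unless_defined("Package['nginx']")

# Guarded first, plain second: the guard saw nothing, so the plain one collides.
catalog = ToyCatalog.new
catalog.declare_unless_defined("Package['nginx']")
begin
  catalog.declare("Package['nginx']")
rescue ArgumentError => e
  puts e.message
end
```

In other words, the guard protects the module’s own declaration; it can’t protect an unguarded declaration that is evaluated later.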

Next, we want to set up a reusable type for our site-specific resources. To do

this in a reusable way, we also need to take in some parameters. There’s one

special variable passed in automatically, the $title variable, which

represents the namevar. Try to avoid using this directly, but you can use this

as a default for your other variables.

Declaring the type looks the same as defining a function in most languages.

We’ll also update some of our definitions from before.

    define myproject::site (
        $name = $title,
        $location,
        $database = 'wordpress',
        $database_user = 'root',
        $database_password = 'password',
        $database_host = 'localhost'
    ) {
        file { $location:
            ensure => directory
        }
        file { "/etc/nginx/sites-available/$name":
            source => "file:///vagrant/vagrant.local.nginx.conf"
        }
        file { "/etc/nginx/sites-enabled/$name":
            ensure => link,
            target => "/etc/nginx/sites-available/$name"
        }
        mysql::db {$database:
            user     => $database_user,
            password => $database_password,
            host     => $database_host,
            grant    => ['all'],
        }
    }

(This should live in modules/myproject/manifests/site.pp )

Now that we have the module set up, let’s go back to our manifest for Vagrant

( manifests/site.pp ). We’re going to completely replace this now with

the following:

    # Although this is declared in myproject, we can declare it here as well for
    # clarity with dependencies
    package {'php5-fpm':
        ensure => latest
    }

    class { 'mysql::php':
        require => [ Class['mysql::server'], Package['php5-fpm'] ],
    }

    class { 'mysql::server':
        config_hash => { 'root_password' => 'password' }
    }

    class {'myproject': }

    myproject::site {'vagrant.local':
        location => '/var/www/vagrant.local',
        require  => [ Class['mysql::server'], Package['php5-fpm'], Class['mysql::php'] ]
    }

Note that we still have the MySQL server setup in the Vagrant manifest, as we

might want to split the database off onto a separate server. It’s up to you to

decide how modular you want to be about this.

There’s one problem still in our site definition: we still have a hardcoded

source for our nginx configuration. Puppet offers a great solution to this in

the form of templates. Instead of pointing the file to a source, we can bring

in a template and substitute variables.

Puppet gives us the template() function to do just that, and automatically

supplies all the variables in scope to be replaced. There’s a great

guide and

tutorial that explain this

further, but most of it is self-evident. The main thing to note is that

the template() function’s template location is in the form <module>/<filename> ,

which maps to modules/<module>/templates/<filename> .

Our file resource should now look like this instead:

    file { "/etc/nginx/sites-available/$name":
        content => template('myproject/site.nginx.conf.erb')
    }

Now, we’ll create our template. Note the lack of hardcoded pieces.

    server {
        listen 80;
        server_name <%= name %>;
        root <%= location %>;

        location / {
            try_files $uri $uri/ /index.php$is_args$args;
        }

        location ~ \.php {
            try_files $uri =404;
            fastcgi_split_path_info ^(.+\.php)(/.+)$;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            fastcgi_pass unix:/var/run/php5-fpm.sock;
            fastcgi_index index.php;
            include /etc/nginx/fastcgi_params;
        }
    }

(This should be saved to modules/myproject/templates/site.nginx.conf.erb )

Our configuration will now be automatically generated, and the name and location

will be imported from the parameters to the defined type.
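You can see the substitution for yourself outside of Puppet, since ERB ships with Ruby. This sketch renders a cut-down version of the template with plain Ruby values standing in for the variables Puppet would supply. The `SITE_TEMPLATE` constant and `render_site` helper are names I’ve made up here, and Puppet’s own template() function handles variable lookup in its own way.

```ruby
require 'erb'

# Same shape as the nginx template above, trimmed down for the example.
SITE_TEMPLATE = <<~'TEMPLATE'
  server {
      listen 80;
      server_name <%= name %>;
      root <%= location %>;
  }
TEMPLATE

# Render the template, with the arguments standing in for Puppet's variables.
def render_site(name, location)
  ERB.new(SITE_TEMPLATE).result_with_hash(name: name, location: location)
end

puts render_site('vagrant.local', '/var/www/vagrant.local')
puts render_site('staging.vagrant.local', '/var/www/staging.vagrant.local')
```

The same template produces a different configuration for each site, which is what lets the defined type be reused.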

If you’d really like to go crazy with this, you can basically parameterise

everything you want to change. Here’s an example from one of mine:

    server {
        listen <%= listen %>;
        server_name <% real_server_name.each do |s_n| -%><%= s_n %> <% end -%>;
        access_log <%= real_access_log %>;
        root <%= root %>;

    <% if listen == '443' %>
        ssl on;
        ssl_certificate <%= real_ssl_certificate %>;
        ssl_certificate_key <%= real_ssl_certificate_key %>;
        ssl_session_timeout <%= ssl_session_timeout %>;
        ssl_protocols SSLv2 SSLv3 TLSv1;
        ssl_ciphers ALL:!ADH:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP;
        ssl_prefer_server_ciphers on;
    <% end -%>

    <% if $front_controller %>
        location / {
            fastcgi_param SCRIPT_FILENAME $document_root/<%= front_controller %>;
    <% else %>
        location / {
            try_files $uri $uri/ /index.php?$args;
            index <%= index %>;
        }

        location ~ \.php$ {
            try_files $uri =404;
            fastcgi_split_path_info ^(.+\.php)(/.+)$;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    <% end -%>
            fastcgi_pass <%= fastcgi_pass %>;
            fastcgi_index index.php;
            include /etc/nginx/fastcgi_params;
        }

        location ~ /\.ht {
            deny all;
        }

    <% if $builds %>
        location /static/builds/ {
            internal;
            alias <%= root %>/data/builds/;
        }
    <% end -%>

    <% if include != '' %>
        <%include.each do |inc| %>
        include <%= inc %>;
        <% end -%>
    <% end -%>
    }

Notifications!

There’s one small problem with our nginx setup. At the moment, our sites won’t

be loaded in by nginx until the next manual restart/reload. Instead, what we

need is a way to tell nginx that we need to reload when the files are updated.

To do this, we’ll first define the nginx service in our init.pp manifest.

    service { 'nginx':
        ensure     => running,
        enable     => true,
        hasrestart => true,
        restart    => '/etc/init.d/nginx reload',
        require    => Package['nginx']
    }

Now, we’ll tell our site type to send a notification to the service when we

should reload. We use the notify metaparameter here, and we’ve already set the

service up above to recognise that as a “reload” command.

    file { "/etc/nginx/sites-available/$name":
        content => template('myproject/site.nginx.conf.erb'),
        notify  => Service['nginx']
    }
    file { "/etc/nginx/sites-enabled/$name":
        ensure => link,
        target => "/etc/nginx/sites-available/$name",
        notify => Service['nginx']
    }

nginx will now be notified that it needs to reload both when we create or update the config and when we actually enable it.

(We need it on the config proper in case we update the configuration in the

future, since the symlink won’t change in that case. The notification relates

specifically to the resource, even if said resource is the link itself.)
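To illustrate why the notification sits on both resources, here’s a small Ruby sketch of the bookkeeping. It’s a toy model under my own assumptions (the `services_to_refresh` function and the hash layout are hypothetical), not Puppet’s implementation: a service is refreshed if at least one resource that notifies it changed during the run.

```ruby
# Toy model (hypothetical): each resource records whether this run changed it
# and which service, if any, it notifies.
def services_to_refresh(resources)
  resources
    .select { |resource| resource[:changed] } # only resources that changed
    .map { |resource| resource[:notify] }     # the service each one notifies
    .compact                                  # drop resources with no notify
    .uniq                                     # one refresh per service
end

# Second run: the template was edited, but the symlink already existed.
resources = [
  { ref: "File['/etc/nginx/sites-available/vagrant.local']", changed: true,  notify: "Service['nginx']" },
  { ref: "File['/etc/nginx/sites-enabled/vagrant.local']",   changed: false, notify: "Service['nginx']" },
  { ref: "File['/var/www/vagrant.local']",                   changed: false, notify: nil }
]

puts services_to_refresh(resources).inspect
```

If only the symlink carried the notify, this run would leave nginx serving the old configuration.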

We should now have a full installation set up and ready to serve from your

Vagrant install. If you haven’t already, boot up your virtual machine:

$ vagrant up

If you change your Puppet manifests, you should reprovision:

$ vagrant provision

Machine vs Application Deployment

It’s easy to misunderstand what should be in your Puppet manifests, and I must admit that I was originally confused as well.

Puppet’s main job is to control machine deployment. This includes things like

installing software, setting up configuration, etc. There’s also the separate

issue of application deployment. Application deployment is all about deploying

new versions of your code.

The part where these two can get conflated is installing your application and

configuring it. For WordPress, you usually want to ensure that WordPress itself

is installed. This is something that is probably outside of your application,

since it’s fairly standard, and it only happens once. You should use Puppet here

for the database configuration, since it knows about the system-wide

configuration which is specific to the machine, not the application.

You probably also want to ensure that certain plugins and themes are enabled.

This is something that should not be handled in Puppet, since it’s part of

your application’s configuration. Instead, you should create a must-use plugin

that ensures these are set up correctly. This ensures that if your app is

updated and rolled out, you don’t have to use Puppet to reprovision your server.

(If you do push this into your Puppet configuration, bear in mind that updating

your application will now involve both deploying the code and reprovisioning the

server.)

Wrapping Up

If you’d like, you can now go and clone the

companion repository and

try running it to test it out.

Hopefully you now have a good understanding of both Vagrant and

Puppet. It’s time to start applying these tools to your workflow and adjusting

them to how you want to use them. Keep in mind that rules are made to be broken,

so you don’t have to follow my advice to the letter. Experiment, and have fun!