In this post, we will explore the use of Ansible, the open source community project sponsored by Red Hat, for automating the provisioning, configuration, deployment to, and testing of resources on the Google Cloud Platform (GCP). We will start by using Ansible to configure and deploy applications to existing GCP compute resources. We will then expand our use of Ansible to provision and configure GCP compute resources using the Ansible/GCP native integration with GCP modules.

Red Hat Ansible

Ansible, purchased by Red Hat in October 2015, seamlessly provides workflow orchestration with configuration management, provisioning, and application deployment in a single platform. Unlike similar tools, Ansible’s workflow automation is agentless, relying on Secure Shell(SSH) and Windows Remote Management (WinRM). Ansible has published a whitepaper on The Benefits of Agentless Architecture.

According to G2 Crowd, Ansible is a clear leader in the Configuration Management Software category, ranked right behind GitLab. Some of Ansible’s main competitors in the category includes GitLab, AWS Config, Puppet, Chef, Codenvy, HashiCorp Terraform, Octopus Deploy, and TeamCity. There are dozens of published articles, comparing Ansible to Puppet, Chef, SaltStack, and more recently, Terraform.

Google Compute Engine

According to Google, Google Compute Engine (GCE) delivers virtual machines (VMs) running in Google’s data centers and on their worldwide fiber network. Compute Engine’s tooling and workflow support enables scaling from single instances to global, load-balanced cloud computing.

Comparable products to GCE in the IaaS category include Amazon Elastic Compute Cloud(EC2), Azure Virtual Machines, IBM Cloud Virtual Servers, and Oracle Compute Cloud Service.

Apache HTTP Server

According to Apache, the Apache HTTP Server (“httpd”) is an open-source HTTP server for modern operating systems including Linux and Windows. The Apache HTTP Server provides a secure, efficient, and extensible server that provides HTTP services in sync with the current HTTP standards. The Apache HTTP Server was launched in 1995 and it has been the most popular web server on the Internet since 1996. We will deploy Apache HTTP Server to GCE VMs, using Ansible.

Demonstration

In this post, we will demonstrate two different workflows with Ansible on GCP. First, we will use Ansible to configure and deploy the Apache HTTP Server to an existing GCE instance.

Provision and configure a GCE VM instance, disk, firewall rule, and external IP, using the Google Cloud ( gcloud ) CLI tool; Deploy and configure the Apache HTTP Server and associated packages, using an Ansible Playbook containing an httpd Ansible Role; Manually test the GCP resources and Apache HTTP Server; Clean up the GCP resources using the gcloud CLI tool;

In the second workflow, we will use Ansible to provision and configure the GCP resources, as well as deploy the Apache HTTP Server the new GCE VM.

Provision and configure a VM instance, disk, VPC global network, subnetwork, firewall rules, and external IP address, using an Ansible Playbook containing an Ansible Role, as opposed to the gcloud CLI tool; Deploy and configure the Apache HTTP Server and associated packages, using an Ansible Playbook containing an httpd Ansible Role; Test the GCP resources and Apache HTTP Server using role-based test tasks; Clean up all the GCP resources using an Ansible Playbook containing an Ansible Role;

Source Code

The source code for this post may be found on the master branch of the ansible-gcp-demo GitHub repository.



https://github.com/garystafford/ansible-gcp-demo.git git clone --branch master --single-branch --depth 1 --no-tags \

The project has the following file structure.

.

├── LICENSE

├── README.md

├── _unused

│ ├── httpd_playbook.yml

├── ansible

│ ├── ansible.cfg

│ ├── group_vars

│ │ └── webservers.yml

│ ├── inventories

│ │ ├── hosts

│ │ └── webservers_gcp.yml

│ ├── playbooks

│ │ ├── 10_webserver_infra.yml

│ │ └── 20_webserver_config.yml

│ ├── roles

│ │ ├── gcpweb

│ │ └── httpd

│ └── site.yml

├── part0_source_creds.sh

├── part1_create_vm.sh

└── part2_clean_up.sh

Source code samples in this post are displayed as GitHub Gists which may not display correctly on all mobile and social media browsers, such as LinkedIn.

Setup New GCP Project

For this demonstration, I have created a new GCP Project containing a new service account and public SSH key. The project’s service account will be used the gcloud CLI tool and Ansible to access and provision compute resources within the project. The SSH key will be used by both tools to SSH into GCE VM within the project. Start by creating a new GCP Project.

Add a new service account to the project on the IAM & admin ⇒ Service accounts tab.

Grant the new service account permission to the ‘Compute Admin’ Role, within the project, using the Role drop-down menu. The principle of least privilege (PoLP) suggests we should limit the service account’s permissions to only the role(s) necessary to provision the required compute resources.

Create a private key for the service account, on the IAM & admin ⇒ Service accounts tab. This private key is different than the SSH key will add to the project, next. This private key contains the credentials for the service account.

Choose the JSON key type.

Download the private key JSON file and place it in a safe location, accessible to Ansible. Be careful not to check this file into source control. Again, this file contains the service account’s credentials used to programmatically access GCP and administer compute resources.

We should now have a service account, associated with the new GCP project, with permissions to the ‘Compute Admin’ role, and a private key which has been downloaded and accessible to Ansible. Note the Email address of the service account, in my case, ansible@ansible-gce-demo.iam.gserviceaccount.com ; you will need to reference this later in your configuration.

Next, create an SSH public/private key pair. The SSH key will be used to programmatically access the GCE VM. Creating a separate key pair allows you to limit its use to just the new GCP project. If compromised, the key pair is easily deleted and replaced in the GCP project and in the Ansible configuration. On a Mac, you can use the following commands to create a new key pair and copy the public key to the clipboard.

ssh-keygen -t rsa -b 4096 -C "ansible"

cat ~/.ssh/ansible.pub | pbcopy

Add your new public key clipboard contents to the project, on the Compute Engine ⇒ Metadata ⇒ SSH Keys tab. Adding the key here means it is usable by any VM in the project unless you explicitly block this option when provisioning a new VM and configure a key specifically for that VM.

Note the name, ansible , associated with the key, you will need to reference this later in your configuration.

Setup Ansible

Although this post is not a primer on Ansible, I will cover a few setup steps I have done to prepare for this demo. On my Mac, I am running Python 3.7, pip 18.1, and Ansible 2.7.6. With Python and pip installed, the easiest way to install Ansible in Mac or Linux is using pip.

pip install ansible

You will also need to install two additional packages in order to gather information about GCP-based hosts using GCE Dynamic Inventory, explained later in the post.

pip install requests google-auth

Ansible Configuration

I created a simple Ansible ansible.cfg file for this project, located in the /ansible/inventories/ sub-directory. The Ansible configuration file contains the location of the project’s roles and inventory, which is explained later. The file also contains two configuration items associated with an SSH key pair, which we just created. If your key is named differently or in a different location, update the file (gist).

Ansible has a complete example of a configuration file parameters on GitHub.

Ansible Environment Variables

To decouple our specific GCP project’s credentials from the Ansible playbooks and roles, Ansible recommends setting those required module parameters as environment variables, as opposed to including them in the playbooks. Additionally, I have set the GCP project name as an environment variable, in order to also decouple it from the playbooks. To set those environment variables, source the part0_source_creds.sh script in the project’s root directory, using the source command (gist).

source ./part0_source_creds.sh

GCP CLI/Ansible Hybrid Workflow

Oftentimes, enterprises employ a mix of DevOps tooling to provision, configure, and deploy to compute resources. In this first workflow, we will use Ansible to configure and deploy a web server to an existing GCE VM, created in advance with the gcloud CLI tool.

Create GCP Resources

First, use the gcloud CLI tool to create a GCE VM and associated resources, including an external IP address and firewall rule for port 80 (HTTP). For simplicity, we will use the existing GCP default Virtual Private Cloud (VPC) network and the default us-east1 subnetwork. Execute the part1_create_vm.sh script in the project’s root directory. The default network should already have port 22 (SSH) open on the firewall. Note the SERVICE_ACCOUNT variable, in the script, is the service account email found on the IAM & admin ⇒ Service accounts tab, shown in the previous section (gist).

The output from the script should look similar to the following. Note the external IP address associated with the VM, you will need to reference this later in the post.

Using the gcloud CLI tool or Google Cloud Console, we should be able to view our newly provisioned resources on GCP. First, our new GCE VM, using the Compute Engine ⇒ VM instances ⇒ Details tab.

Next, examine the Network interface details tab. Here we see details about the network and subnetwork our VM is running within. We see the internal and external IP addresses of the VM. We also see the firewall rules, including our new rule, allowing TCP ingress traffic on port 80.

Lastly, examine the new firewall rule, which will allow TCP traffic on port 80 from any IP address to our VM, located in the default network. Note the other, pre-existing rules controlling access to the default network.

The final GCP architecture looks as follows.

GCE Dynamic Inventory

Two core concepts in Ansible are hosts and inventory. We need an inventory of the hosts on which to run our Ansible playbooks. If we had long-lived hosts, often referred to as ‘pets’, who had long-lived static IP addresses or DNS entries, then we could manually add the hosts to a static hosts file, similar to the example below.

[webservers]

34.73.171.5

34.73.170.97

34.73.172.153



[dbservers]

db1.example.com

db2.example.com

However, given the ephemeral nature of the cloud, where hosts (often referred to as ‘cattle’), IP addresses, and even DNS entries are often short-lived, we will use the Ansible concept of Dynamic Inventory.

If you recall we pip installed two packages, requests and google-auth , during our Ansible setup for use with GCE Dynamic Inventory. According to Ansible, the best way to interact with your GCE VM hosts is to use the gcp_compute inventory plugin. The plugin allows Ansible to dynamically query GCE for the nodes that can be managed. With the gcp_compute inventory plugin, we can also selectively classify the hosts we find into Groups. We will then run playbooks, containing roles, on a group or groups of hosts.

To demonstrate how to dynamically find the new GCE host, and add it to a group, execute the following command, using the Ansible Inventory CLI.

ansible-inventory --graph -i inventories/webservers_gcp.yml

The command calls the webservers_gcp.yml file, which contains logic necessary to associate the GCE hosts with the webservers host group. Ansible’s current documentation is pretty sparse on this subject. Thanks to Matthieu Remy for his great post, How to Use Ansible GCP Compute Inventory Plugin. For this demo, we are only looking for hosts in us-east1-b, which have ‘web-’ in their name. (gist).

The output from the command should look similar to the following. We should observe our new VM, as indicated by its external IP address, is assigned to the part of the webservers group. We will use the power of Dynamic Inventory to apply a playlist to all the hosts within the webservers group.

We can also view details about hosts by modifying the inventory command.

ansible-inventory --list -i inventories/webservers_gcp.yml --yaml

The output from the command should look similar to the following. This particular example was run against an earlier host, with a different external IP address.

Apache HTTP Server Playbook

For our first taste of Ansible on GCP, we will run an Ansible Playbook to install and configure the Apache HTTP Server on the new CentOS-based VM. According to Ansible, Playbooks, which are YAML-based, can declare configurations, they can also orchestrate steps of any manual ordered process, even as different steps must bounce back and forth between sets of machines in particular orders. They can launch tasks synchronously or asynchronously. Playbooks are used to orchestrate tasks, as opposed to using Ansible’s ad-hoc task execution mode.

A playbook can be ‘monolithic’ in nature, containing all the required Variables, Tasks, and Handlers, to achieve the desired outcome. If we wrote a single playbook to deploy and configure our Apache HTTP Server, it might look like the httpd_playbook.yml , playbook, below (gist).

We could run this playbook with the following command to deploy the Apache HTTP Server, but we won’t. Instead, next, we will run a playbook that applies the httpd role.

ansible-playbook \

-i inventories/webservers_gcp.yml \

playbooks/ httpd_playbook.yml

Ansible Roles

According to Ansible, Roles are ways of automatically loading certain vars_files, tasks, and handlers based on a known file structure. Grouping content by roles also allows easy sharing of roles with other users. The usage of roles is preferred as it provides a nice organizational system.

The httpd role is identical in functionality to the httpd_playbook.yml , used in the first workflow. However, the primary parts of the playbook have been decomposed into individual resource files, as described by Ansible. This structure is created using the Ansible Galaxy CLI. Ansible Galaxy is Ansible’s official hub for sharing Ansible content.

ansible-galaxy init httpd

This ansible-galaxy command creates the following structure. I added the files and Jinja2 template, afterward.

.

├── README.md

├── defaults

│ └── main.yml

├── files

│ ├── info.php

│ └── server-status.conf

├── handlers

│ └── main.yml

├── meta

│ └── main.yml

├── tasks

│ └── main.yml

├── templates

│ └── index.html.j2

├── tests

│ ├── inventory

│ └── test.yml

└── vars

└── main.yml

Within the httpd role:

Variables are stored in the defaults/main.yml file;

file; Tasks are stored in the tasks/main.yml file;

file; Handles are stored in the handlers/main.yml file;

file; Files are stored in the files/ sub-directory;

sub-directory; Jinja2 templates are stored in the templates/ sub-directory;

sub-directory; Test are stored in the tests/ sub-directory;

sub-directory; Other sub-directories and files contain metadata about the role;

To apply the httpd role, we will run the 20_webserver_config.yml playbook. Compare this playbook, below, with the previous, monolithic httpd_playbook.yml playbook. All of the logic has now been decomposed across the httpd role’s separate backing files (gist).

We can start by running our playbook using Ansible’s Check Mode (“Dry Run”). When ansible-playbook is run with --check , Ansible will not make any actual changes to the remote systems. According to Ansible, Check mode is just a simulation, and if you have steps that use conditionals that depend on the results of prior commands, it may be less useful for you. However, it is great for one-node-at-time basic configuration management use cases. Execute the following command using Check mode.

ansible-playbook \

-i inventories/webservers_gcp.yml \

playbooks/20_webserver_config.yml --check

The output from the command should look similar to the following. It shows that if we execute the actual command, we should expect seven changes to occur.

If everything looks good, then run the same command without using Check mode.

ansible-playbook \

-i inventories/webservers_gcp.yml \

playbooks/20_webserver_config.yml

The output from the command should look similar to the following. Note the number of items changed, seven, is identical to the results of using Check mode, above.

If we were to execute the command using Check mode for a second time, we should observe zero changed items. This means the last command successfully applied all changes and no new changes are present in the playbook.

Testing the Results

There are a number of methods and tools we could use to test the deployments of the Apache HTTP Server and server tools. First, we can use an ad-hoc ansible CLI command to confirm the httpd process is running on the VM, by calling systemctl . The systemctl application is used to introspect and control the state of the systemd system and service manager, running on the CentOS-based VM.

ansible webservers \

-i inventories/webservers_gcp.yml \

-a "systemctl status httpd"

The output from the command should look similar to the following. We see the Apache HTTP Server service details. We also see it being stopped and started as required by the tasks and handler in the role.

We can also check that the home page and PHP info documents, we deployed as part of the playbook, are in the correct location on the VM.

ansible webservers \

-i inventories/webservers_gcp.yml \

-a "ls -al /var/www/html"

The output from the command should look similar to the following. We see the two documents we deployed are in the root of the website directory.

Next, view our website’s home page by pointing your web browser to the external IP address we created earlier and associated with the VM, on port 80 (HTTP). We should observe the variable value in the playbook, ‘Hello Ansible on GCP!’, was injected into the Jinja2 template file, index.html.j2 , and the page deployed correctly to the VM.

If you recall from the httpd role, we had a task to deploy the server status configuration file. This configuration file exposes the /server-status endpoint, as shown below. The status page shows the internal and the external IP addresses assigned to the VM. It also shows the current version of Apache HTTP Server and PHP, server uptime, traffic, load, CPU usage, number of requests, number of running processes, and so forth.

Testing with Apache Bench

Apache Bench ( ab ) is the Apache HTTP server benchmarking tool. We can use Apache Bench locally, to generate CPU, memory, file, and network I/O loads on the VM. For example, using the following command, we can generate 100K requests to the server-status page, simulating 100 concurrent users.

ab -kc 100 -n 100000 http://your_vms_external_ip/server-status

The output from the command should look similar to the following. Observe this command successfully resulted in a sustained load on the web server for approximately 17.5 minutes.

Using the Compute Engine ⇒ VM instances ⇒ Monitoring tab, we see the corresponding Apache Bench CPU, memory, file, and network load on the VM, starting at about 10:03 AM, soon after running the playbook to install Apache HTTP Server.

Destroy GCP Resources

After exploring the results of our workflow, tear down the existing GCE resources before we continue to the next workflow. To delete resources, execute the part2_clean_up.sh script in the project’s root directory (gist).

The output from the script should look similar to the following.

Ansible Workflow

In the second workflow, we will provision and configure the GCP resources, and deploy Apache HTTP Server to the new GCE VM using Ansible. We will be using the same Project, Region, and Zone as the previous example. However this time, we will create a new global VPC network instead of using the default network as before, a new subnetwork instead of using the default subnetwork as before, and a new firewall with ingress rules to open ports 22 and 80. Lastly, will create an external IP address and assign it to the VM.

Provision GCP Resources

Instead of using the gcloud CLI tool, we will use Ansible to provision the GCP resources. To accomplish this, I have created one playbook, 10_webserver_infra.yml , with one role, gcpweb , but two sets of tasks, one to create the GCE resources, create.yml , and one to delete the GCP resources, delete.yml . This is a typical Ansible playbook pattern. The standard file directory structure of the role looks as follows, similar to the httpd role.

.

├── README.md

├── defaults

│ └── main.yml

├── files

├── handlers

│ └── main.yml

├── meta

│ └── main.yml

├── tasks

│ ├── create.yml

│ ├── delete.yml

│ └── main.yml

├── templates

├── tests

│ ├── inventory

│ └── test.yml

└── vars

└── main.yml

To provision the GCE resources, we run the 10_webserver_infra.yml playbook (gist).

This playbook runs the gcpweb role. The role’s default main.yml task file imports two other sets of tasks, one for create and one for delete. Each set of tasks have a corresponding tag associated with them (gist).

By calling the playbook and passing the ‘create’ tag, the role will run apply the associated set of create tasks. Tags are a powerful construct in Ansible. Execute the following command, passing the create tag.

ansible-playbook -t create playbooks/10_webserver_infra.yml

In the case of this playbook, the Check mode, used earlier, would fail here. If you recall, this feature is not designed to work with playbooks that have steps that use conditionals that depend on the results of prior commands, such as with this playbook.

The create.yml file contains six tasks, which leverage Ansible GCP Modules. The tasks create a global VPC network, subnetwork in the us-east1 Region, firewall and rules, external IP address, disk, and VM instance (gist).

If your interested in what is actually happening during the execution of the playbook, add the verbose option ( -v or -vv ) to the above command. This can be very helpful in learning Ansible.

The output from the command should look similar to the following. Note the changes applied to localhost. Since no GCE VM host(s) exist on GCP until the resources are provisioned, we reference localhost. The entire process took less than two minutes to create a global VPC network, subnetwork, firewall rules, VM, attached disk, and assign a public IP address.

All GCP resources are now provisioned and configured. Below, we see the new GCE VM created by Ansible.

Below, we see the new GCE VM’s network interface details console page, showing details about the VM, NIC, internal and external IP addresses, network, subnetwork, and ingress firewall rules.

Below, we see the VPC details showing each of the automatically-created regional subnets, and our new ‘ansible-subnet’, in the us-east1 region, and spanning 14 IP addresses in the 172.16.0.0/28 CIDR (Classless Inter-Domain Routing) block.

To deploy and configure Apache HTTP Server, run the httpd role exactly the same way we did in the first workflow.

ansible-playbook \

-i inventories/webservers_gcp.yml \

playbooks/20_webserver_config.yml

Role-based Testing

In the first workflow, we manually tested our results using a number of ad-hoc commands and by viewing web pages in our browser. These methods of testing do not lend themselves to DevOps automation. A more effective strategy is writing tests, which are part of the role, and maybe run each time the role is applied, as part of a CI/CD pipeline. Each role in this project contains a few simple tests to confirm the success of the tasks in the role. First, run the gcpweb role’s tests with the following command.

ansible-playbook \

-i inventories/webservers_gcp.yml \

roles/gcpweb/tests/test.yml

The playbook gathers facts about the GCE hosts in the host group and runs a total of five test tasks against those hosts. The tasks confirm the host’s timezone, vCPU count, OS type, OS major version, and hostname, using the facts gathered (gist).

The output from the command should look similar to the following. Observe that all five tasks ran successfully.

Next, run the the httpd role’s tests.

ansible-playbook \

-i inventories/webservers_gcp.yml \

roles/httpd/tests/test.yml

Similarly, the output from the command should look similar to the following. The playbook runs four test tasks this time. The tasks confirm both files are present, the home page is accessible, and that the server-status page displays properly. Below, we all four ran successfully.

Making a Playbook Change

To observe what happens if we apply a change to a playbook, let’s change the greeting variable value in the /roles/httpd/defaults/main.yml file in the httpd role. Recall, the original home page looked as follows.

Change the greeting variable value and re-run the playbook, using the same command.

ansible-playbook \

-i inventories/webservers_gcp.yml \

playbooks/20_webserver_config.yml

The output from the command should look similar to the following. As expected, we should observe that only one task, deploying the home page, was changed.

Viewing the home page again, or by modifying the associated test task, we should observe the new value is injected into the Jinja2 template file, index.html.j2 , and the new page deployed correctly.

Destroy GCP Resources with Ansible

Once you are finished, you can destroy all the GCP resources by calling the 10_webserver_infra.yml playbook and passing the delete tag, the role will run apply the associated set of delete tasks.

ansible-playbook -t delete playbooks/10_webserver_infra.yml

With Ansible, we delete GCP resources by changing the state from present to absent . The playbook will delete the resources in a particular order, to avoid dependency conflicts, such as trying to delete the network before the VM. Note we do not have to explicitly delete the disk since, if you recall, we provisioned the VM instance with the disks.auto_delete=true option (gist).

The output from the command should look similar to the following. We see the VM instance, attached disk, firewall, rules, external IP address, subnetwork, and finally, the network, each being deleted.

Conclusion

In this post, we saw how easy it is to get started with Ansible on the Google Cloud Platform. Using Ansible’s 300+ cloud modules, provisioning, configuring, deploying to, and testing a wide range of GCP, Azure, and AWS resources are easy, repeatable, and completely automatable.

All opinions expressed in this post are my own and not necessarily the views of my current or past employers or their clients.

Originally published at programmaticponderings.com on January 30, 2019.