Continuing on my journey of getting familiar with all things “Photon Controller” related, I wanted to take you through the process, step-by-step, of getting Docker SWARM running on top of Photon Controller. Now, my good pal William Lam has already described the process in a lot of detail over on his virtuallyGhetto blog. I thought I might try to expand on that a bit more, and highlight where things might go wrong (if, like me, you are a newbie to this stuff). I also wanted to do everything from the Photon CLI, rather than going through the UI for any of the steps.

*** Please note that at the time of writing, Photon Controller is still not GA ***

*** The steps highlighted here may change in the GA version of the product ***

My Setup

Physical environment (not nested)

4 hosts running ESXi 6.0U2

1 vSwitch with 2 port groups per host (one for ESXi management and the other for VMs)

VM Network provides both static and DHCP IP addresses

2 static IP addresses (one for Photon Controller, the other for the Docker SWARM etcd container)

1. Initial Configuration – deploy Photon Controller “Installer” OVA

The first step is to deploy the Photon Controller “installer” OVA. You can get v0.8 from this link. This appliance is the mechanism by which you can deploy Photon Controller.

I also have Photon CLI deployed on my desktop, which I will use for creating the Photon Controller, and then deploying a Docker SWARM cluster on top. You can get the CLI tools from the same link as Photon Controller v0.8.

2. Deploying Photon Controller via Photon CLI



Once the “installer” is deployed (its full name is ESXCloud Installer) and is up and running, you can roll out Photon Controller in a few ways. First note the IP address of the installer appliance – you will need this shortly. As I mentioned, you can deploy it via the UI, or you can deploy it via the Photon CLI tools. I will do this via the Photon CLI tools in this example. In my setup, I have 4 hosts. I am going to use one of these as my management (MGMT) host, and the other 3 as CLOUD hosts for deploying my containers. I created a YAML file, which contained all the relevant information about my hosts (although I have put xxxxxx where the actual passwords should be below). Note that the Photon Controller manager will need a static IP – here it is set to 10.27.51.117. The netmask, DNS and gateway all pertain to the management VM, and not the ESXi host.

hosts:
  - metadata:
      MANAGEMENT_DATASTORE: esxi-hp-05-local
      MANAGEMENT_PORTGROUP: VM Network
      MANAGEMENT_NETWORK_NETMASK: 255.255.255.0
      MANAGEMENT_NETWORK_DNS_SERVER: 10.27.51.252
      MANAGEMENT_NETWORK_GATEWAY: 10.27.51.254
      MANAGEMENT_VM_IPS: 10.27.51.117
    address_ranges: 10.27.51.5
    username: root
    password: xxxxxxxxxx
    usage_tags:
      - MGMT
  - address_ranges: 10.27.51.8,10.27.51.7,10.27.51.6
    username: root
    password: xxxxxxxxx
    usage_tags:
      - CLOUD
deployment:
  resume_system: true
  image_datastores: isilon-nfs-01
  auth_enabled: false
  syslog_endpoint:
  ntp_endpoint: 10.27.51.252
  use_image_datastore_for_vms: false
  loadbalancer_enabled: true
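Since the netmask, DNS and gateway belong to the management VM rather than the ESXi host, it can be worth sanity-checking those values before kicking off a long deployment. The following is only a sketch of such a check in Python, using the values from my YAML file. The dictionary layout and the check_management_network helper are my own illustration and are not part of Photon Controller.

```python
import ipaddress

# Values copied from the YAML file above; the dict layout is for illustration only.
mgmt = {
    "MANAGEMENT_NETWORK_NETMASK": "255.255.255.0",
    "MANAGEMENT_NETWORK_DNS_SERVER": "10.27.51.252",
    "MANAGEMENT_NETWORK_GATEWAY": "10.27.51.254",
    "MANAGEMENT_VM_IPS": "10.27.51.117",
}
esxi_hosts = ["10.27.51.5", "10.27.51.8", "10.27.51.7", "10.27.51.6"]


def check_management_network(meta, host_ips):
    """Return a list of problems found (hypothetical helper, not part of Photon)."""
    problems = []
    vm_ip = ipaddress.ip_address(meta["MANAGEMENT_VM_IPS"])
    # Build the subnet the management VM will live in from its IP and netmask.
    subnet = ipaddress.ip_network(
        f"{vm_ip}/{meta['MANAGEMENT_NETWORK_NETMASK']}", strict=False
    )
    gateway = ipaddress.ip_address(meta["MANAGEMENT_NETWORK_GATEWAY"])
    if gateway not in subnet:
        problems.append(f"gateway {gateway} is outside {subnet}")
    # The static IP for the management VM must not collide with an ESXi host.
    if str(vm_ip) in host_ips:
        problems.append(f"{vm_ip} is already used by an ESXi host")
    return problems


print(check_management_network(mgmt, esxi_hosts) or "no problems found")
```

This only catches obvious addressing mistakes; it says nothing about whether the datastore or port group names are correct.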

All CLOUD hosts have access to the NFS share, isilon-nfs-01, where images are to be stored. The first step is to point the target at the Photon Controller installer using the Photon CLI. In this example, the installer IP is 10.27.51.33.

C:\Users\chogan>photon target set http://10.27.51.33
Using target 'http://10.27.51.33'
API target set to 'http://10.27.51.33'

C:\Users\chogan>

The next step is to run a Photon CLI command to create my deployment. That command is “photon system deploy”, as shown below, and I’ve included the different steps that the command goes through:

C:\Users\chogan>photon system deploy Downloads\my_config.yaml
Using target 'http://10.27.51.33'
Created deployment c85aef0d-f79b-4271-b706-da987117ca9c
0h 0m 6s [= ] CREATE_HOST : CREATE_HOST | Step 1/1
0h 0m 2s [== ] PERFORM_DEPLOYMENT : PROVISION_CONTROL_PLANE_HOSTS | Step 2/6
0h12m42s [=== ] PERFORM_DEPLOYMENT : PROVISION_CONTROL_PLANE_VMS | Step 3/6
0h31m44s [==== ] PERFORM_DEPLOYMENT : PROVISION_CLOUD_HOSTS | Step 4/6
0h41m24s [===== ] PERFORM_DEPLOYMENT : PROVISION_CLUSTER_MANAGER | Step 5/6
0h42m44s [====== ] PERFORM_DEPLOYMENT : MIGRATE_DEPLOYMENT_DATA | Step 6/6
Deployment 'c85aef0d-f79b-4271-b706-da987117ca9c' is complete.

When the deployment succeeds, we can now set our target to the Photon Controller rather than the installer, and since we included load-balancer as an option, we include the load-balancer port of 28080 in the target:

C:\Users\chogan>photon target set http://10.27.51.117:28080
Using target 'http://10.27.51.117:28080'
API target set to 'http://10.27.51.117:28080'

The load-balancer port is important or you might run into the issue described here when trying to upload images.

3. Create tenant, project and resources

We are now at the point where we can begin to consume some of the resources on the Cloud hosts for our particular cluster deployment. To do this we need to create a tenant, a resource ticket, and a project. First we will create the tenant, and set it.

C:\Users\chogan>photon tenant create Cormac
Comma-separated security group names, or hit enter for no security groups):
Using target 'http://10.27.51.117:28080'
Created tenant 'Cormac' ID: d3ca1f91-5a54-4128-9af9-bf4512570dde

C:\Users\chogan>photon tenant set Cormac
Using target 'http://10.27.51.117:28080'
Tenant set to 'Cormac'

The next step is to create a resource ticket. In this resource ticket, we will set a limit of 100 VMs, and a VM.Memory limit of 16GB. We will call it gold:

C:\Users\chogan>photon resource-ticket create
Using target 'http://10.27.51.117:28080'
Resource ticket name: gold
Limit 1 (ENTER to finish)
Key: VM
Value: 100
Unit: COUNT
Limit 2 (ENTER to finish)
Key: VM.Memory
Value: 16
Unit: GB
Limit 3 (ENTER to finish)
Key: <hit ENTER>
Tenant name: Cormac
Creating resource ticket name: gold
Please make sure limits below are correct:
1: VM, 100, COUNT
2: VM.Memory, 16, GB
Are you sure [y/n]? y
Resource ticket created: ID = 1297060f-1f3f-44c6-b64a-9d082624810f

And then finally we will create a project to consume some of those resources in the resource ticket. In fact, this project will claim all of the resources in the resource ticket, but of course you could have multiple projects consuming resources from the same ticket. We will also set our project to this newly created project, called SWARM:

C:\Users\chogan>photon project create -r gold
Using target 'http://10.27.51.117:28080'
Project name: SWARM
Limit 1 (ENTER to finish)
Key: VM
Value: 100
Unit: COUNT
Limit 2 (ENTER to finish)
Key: VM.Memory
Value: 16
Unit: GB
Limit 3 (ENTER to finish)
Key: <hit ENTER>
Tenant name: Cormac
Resource ticket name: gold
Creating project name: SWARM
Please make sure limits below are correct:
1: VM, 100, COUNT
2: VM.Memory, 16, GB
Are you sure [y/n]? y
Project created: ID = f9807ab2-186b-4564-ae23-4acb5a09dbb4

C:\Users\chogan>photon project set SWARM
Using target 'http://10.27.51.117:28080'
Project set to 'SWARM'
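To make the relationship between a resource ticket and its projects a little more concrete, here is a small Python sketch of the bookkeeping as I understand it: each project carves its limits out of the ticket, and whatever is left is available to other projects. The remaining_after helper is hypothetical, and the assumption that Photon Controller accounts for limits exactly this way is mine.

```python
# Limits from the "gold" ticket above: key -> (value, unit).
gold_ticket = {"VM": (100, "COUNT"), "VM.Memory": (16, "GB")}


def remaining_after(ticket, projects):
    """Subtract each project's limits from the ticket; raise if over-committed."""
    left = {key: value for key, (value, _unit) in ticket.items()}
    for name, limits in projects.items():
        for key, (value, unit) in limits.items():
            if key not in ticket or ticket[key][1] != unit:
                raise ValueError(f"{name}: {key} ({unit}) is not in the ticket")
            left[key] -= value
            if left[key] < 0:
                raise ValueError(f"{name}: {key} exceeds the ticket")
    return left


# The SWARM project claims everything, so nothing is left for a second project.
swarm = {"SWARM": {"VM": (100, "COUNT"), "VM.Memory": (16, "GB")}}
print(remaining_after(gold_ticket, swarm))
```

If you wanted several projects to share the gold ticket, each one would simply ask for a smaller slice of the limits.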

Excellent, we can now move on to building our SWARM cluster.

4. Create a SWARM image

Before building our cluster, we first of all need an image that can be consumed by the nodes deployed on the cluster. We are deploying SWARM, so we need a docker swarm image. At present, there is only the photon management image available:

C:\Users\chogan>photon image list
Using target 'http://10.27.51.117:28080'
ID                                    Name                             \
 State  Size(Byte)   Replication_type  ReplicationProgress  SeedingProgress
465ee30f-7b53-41b3-93a9-3e4ab4b7355c  photon-management-vm-disk1.vmdk  \
 READY  85899345972  ON_DEMAND         20.0%                100.0%
Total: 1

The next step is to create a Docker SWARM image using the one provided on github here. Just download it to your desktop, and create it as follows (assuming it is in the Downloads folder):

C:\Users\chogan>photon image create Downloads\photon-swarm-vm-disk1.vmdk \
 -n swarm-vm.vmdk
Image replication type (default: EAGER):
Using target 'http://10.27.51.117:28080'
Created image 'swarm-vm.vmdk' ID: 83dbd815-3638-40cd-a553-dac48efcfe8f

Now there is a second image available:

C:\Users\chogan>photon image list
Using target 'http://10.27.51.117:28080'
ID                                    Name                             \
 State  Size(Byte)   Replication_type  ReplicationProgress  SeedingProgress
465ee30f-7b53-41b3-93a9-3e4ab4b7355c  photon-management-vm-disk1.vmdk  \
 READY  85899345972  ON_DEMAND         20.0%                100.0%
83dbd815-3638-40cd-a553-dac48efcfe8f  swarm-vm.vmdk                    \
 READY  85899345968  EAGER             20.0%                100.0%
Total: 2

Caution: There are two things to highlight here. The first is the image create command. Note that there is no command line option to provide the location of the image; you simply put the path in. In my case it was in the Downloads folder. There is also a -i option (which I did not include) for the replication type of the image, e.g. -i EAGER or -i ON_DEMAND, which replicates the image up-front or only when it is needed.

The second issue here is that the image now needs to be transferred to the image datastore. This can take some time, and it is probably worth waiting for these transfers to complete before trying to build the cluster, because the cluster build process may time out if the images are not yet available. I use the HTML5 Host Client to verify the progress of the transfer.

I’ve requested that we be able to track this progress from the Photon CLI.

5. Allow deployment to support Docker SWARM cluster

For this part of the process, you need both the ID of the SWARM image and the ID of the deployment. The image ID is above. The deployment ID can be captured as follows:

C:\Users\chogan>photon deployment list
Using target 'http://10.27.51.117:28080'
ID
c85aef0d-f79b-4271-b706-da987117ca9c
Total: 1
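If you would rather not copy and paste IDs by hand, the image ID can be picked out of the “photon image list” output by name. The Python sketch below is a hypothetical helper of my own. It assumes each data row starts with the ID followed by the image name, as in the listings above, and the sample text is trimmed to the first three columns.

```python
def image_id_by_name(listing, name):
    """Return the ID of the image called `name`, or None if it is not listed."""
    for line in listing.splitlines():
        fields = line.split()
        # Data rows start with the ID followed by the image name.
        if len(fields) >= 2 and fields[1] == name:
            return fields[0]
    return None


# Sample trimmed from the "photon image list" output shown earlier.
sample = """\
ID                                    Name                             State
465ee30f-7b53-41b3-93a9-3e4ab4b7355c  photon-management-vm-disk1.vmdk  READY
83dbd815-3638-40cd-a553-dac48efcfe8f  swarm-vm.vmdk                    READY
Total: 2
"""
print(image_id_by_name(sample, "swarm-vm.vmdk"))
```

The same approach should work for the deployment ID, although with a single deployment it is just as easy to read it off the screen.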

The command to enable the deployment to support a SWARM cluster with a particular image is as follows:

C:\Users\chogan>photon deployment enable-cluster-type \
 c85aef0d-f79b-4271-b706-da987117ca9c -k SWARM \
 -i 83dbd815-3638-40cd-a553-dac48efcfe8f
Are you sure [y/n]? y
Using target 'http://10.27.51.117:28080'
Cluster Type: SWARM
Image ID: 83dbd815-3638-40cd-a553-dac48efcfe8f

We are now ready to deploy the SWARM cluster.

6. Create a Docker SWARM cluster



As I mentioned at the beginning of the post, I have a pretty simple setup with only a single VM network. If you are deploying in an environment that has multiple VM networks, you will have to select the appropriate one by following the guidance in this article.

However, since we only have a single VM network, we do not need to worry about this. Let’s run the cluster create command. This is where the other static IP is needed, for the first etcd container. This is used for discovering the SWARM cluster nodes (master, slaves). We’re also going with the simplest setup where there is a single master and a single slave in our SWARM cluster.

C:\Users\chogan>photon cluster create -n SWARM -k SWARM \
 --dns 10.27.51.252 --gateway 10.27.51.254 --netmask 255.255.255.0 \
 --etcd1 10.27.51.138
Using target 'http://10.27.51.117:28080'
Slave count: 1
etcd server 2 static IP address (leave blank for none):

Creating cluster: SWARM (SWARM)
 Slave count: 1

Are you sure [y/n]? y
Cluster created: ID = 0b8b8639-5335-4b97-b8ab-5f90a6155cf3
Note: the cluster has been created with minimal resources. You can use the cluster now.
A background task is running to gradually expand the cluster to its target capacity.
You can run 'cluster show 0b8b8639-5335-4b97-b8ab-5f90a6155cf3' to see the state of the cluster.
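Because the etcd static IP has to sit on the same network as the gateway you pass in, a quick check before running the command can save a failed deployment. Here is a Python sketch that builds the argument list for the command above and refuses to do so if the etcd address is outside the subnet. The cluster_create_args helper is hypothetical and only uses the flags that appear in this post. Note that the CLI still prompts for the slave count and the confirmation when run this way.

```python
import ipaddress


def cluster_create_args(name, kind, dns, gateway, netmask, etcd1):
    """Build the argument list for the cluster create command shown above.

    Hypothetical helper: it only uses the flags that appear in this post.
    """
    # Derive the subnet from the gateway and netmask, then check the etcd IP.
    subnet = ipaddress.ip_network(f"{gateway}/{netmask}", strict=False)
    if ipaddress.ip_address(etcd1) not in subnet:
        raise ValueError(f"etcd IP {etcd1} is not in {subnet}")
    return [
        "photon", "cluster", "create", "-n", name, "-k", kind,
        "--dns", dns, "--gateway", gateway, "--netmask", netmask,
        "--etcd1", etcd1,
    ]


args = cluster_create_args(
    "SWARM", "SWARM", "10.27.51.252", "10.27.51.254", "255.255.255.0", "10.27.51.138"
)
print(" ".join(args))
```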

7. Verifying the state of the cluster

Excellent. The SWARM cluster has been created. Let’s check a few things using these useful Photon CLI commands:

C:\Users\chogan>photon cluster list
Using target 'http://10.27.51.117:28080'
ID                                    Name   Type   State  Slave Count
0b8b8639-5335-4b97-b8ab-5f90a6155cf3  SWARM  SWARM  READY  1
Total: 1
READY: 1

C:\Users\chogan>photon cluster list_vms 0b8b8639-5335-4b97-b8ab-5f90a6155cf3
Using target 'http://10.27.51.117:28080'
ID                                    Name                                         State
0010d6bc-84f0-4183-a77f-990399b96460  slave-a34d23ca-45aa-4aec-8a64-f93f85b3d2e4   STARTED
09029e50-d348-40b7-a33a-5ddf82e1c2e6  etcd-f1c423a2-4852-4f42-94ea-b30259da9800    STARTED
44c18c9b-e070-439d-ac25-a69c7be6474e  master-0362f6f7-0905-44dd-8382-d37be65df0c4  STARTED
Total: 3
STARTED: 3

C:\Users\chogan>photon cluster show 0b8b8639-5335-4b97-b8ab-5f90a6155cf3
Using target 'http://10.27.51.117:28080'
Cluster ID: 0b8b8639-5335-4b97-b8ab-5f90a6155cf3
 Name: SWARM
 State: READY
 Type: SWARM
 Slave count: 1
 Extended Properties: map[netmask:255.255.255.0 dns:10.27.51.252 \
 etcd_ips:10.27.51.138 gateway:10.27.51.254]

VM ID                                 VM Name                                      \
 VM IP
09029e50-d348-40b7-a33a-5ddf82e1c2e6  etcd-f1c423a2-4852-4f42-94ea-b30259da9800    \
 10.27.51.138
44c18c9b-e070-439d-ac25-a69c7be6474e  master-0362f6f7-0905-44dd-8382-d37be65df0c4  \
 10.27.51.39

C:\Users\chogan>

And this final output shows us the IP address of the master node, which is what we will need to run some docker commands.

Caution: I had some issues deploying SWARM, whereby the cluster create failed with “VmProvisionTaskService failed with error VM failed to acquire an IP address”. Sometimes this was when only the etcd was deployed, sometimes with etcd and master, and then other times when it was etcd, master and slave. My gut feeling is that etcd discovery was not working correctly. As part of the troubleshooting effort, I switched to another VLAN, and my SWARM cluster deployed without incident. I am not sure what the root cause is, but I am continuing to investigate.

If this happens, delete the cluster, and retry the create command. If that does not work, try an alternative network for the cluster.
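The delete-and-retry approach can be expressed as a simple loop. The Python sketch below is generic: create and delete are hypothetical callables standing in for whatever you use to run the cluster create and cluster delete steps, and I am assuming a failure surfaces as a RuntimeError.

```python
def create_with_retry(create, delete, attempts=3):
    """Call create(); if it fails, call delete() to clean up and try again.

    `create` and `delete` are hypothetical callables, not Photon CLI functions.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return create()
        except RuntimeError as error:
            last_error = error
            delete()  # remove the partially built cluster before retrying
    raise last_error


# Demo with stand-in callables: the first two attempts fail, the third works.
outcomes = iter([RuntimeError("VM failed to acquire an IP address")] * 2 + ["ok"])


def demo_create():
    result = next(outcomes)
    if isinstance(result, Exception):
        raise result
    return result


print(create_with_retry(demo_create, lambda: None))
```

If every attempt fails, the last error is raised, which is the point at which I would try an alternative network.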

8. Working with Docker SWARM

Now we need to log onto a host that is running Docker. The easiest way to do this is to log onto the Photon Controller “installer” which already has docker installed.

Once logged in, point the DOCKER_HOST variable at the Swarm master IP address, and port 8333. Then you can start to run some docker commands, and the “docker ps -a” command should show you the nodes that make up the SWARM cluster. Ignore the fact that some of the IDs in the outputs may not match up with what was shown earlier. I went through the exercise a number of times, which is the reason for that.

esxcloud [ ~ ]$ export DOCKER_HOST=tcp://10.27.51.39:8333
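The DOCKER_HOST value can also be derived from the “photon cluster show” output rather than typed by hand. This Python sketch is a hypothetical helper of my own. It assumes the real output puts the VM ID, VM name and VM IP on a single line per VM (the backslash wrapping earlier in this post is only for display), and that the master VM name starts with “master-”.

```python
def docker_host_from_cluster_show(output, port=8333):
    """Return tcp://<master-ip>:<port> for the master VM, or None if not found."""
    for line in output.splitlines():
        fields = line.split()
        # A VM row has exactly three fields: VM ID, VM name, VM IP.
        if len(fields) == 3 and fields[1].startswith("master-"):
            return f"tcp://{fields[2]}:{port}"
    return None


# VM rows taken from the cluster show output earlier in this post.
sample = """\
09029e50-d348-40b7-a33a-5ddf82e1c2e6 etcd-f1c423a2-4852-4f42-94ea-b30259da9800 10.27.51.138
44c18c9b-e070-439d-ac25-a69c7be6474e master-0362f6f7-0905-44dd-8382-d37be65df0c4 10.27.51.39
"""
print(docker_host_from_cluster_show(sample))
```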

esxcloud [ ~ ]$ docker info
Containers: 3
Images: 4
Role: primary
Strategy: spread
Filters: affinity, health, constraint, port, dependency
Nodes: 2
 master-0362f6f7-0905-44dd-8382-d37be65df0c4: 10.27.51.39:2375
  └ Containers: 2
  └ Reserved CPUs: 0 / 4
  └ Reserved Memory: 0 B / 8.187 GiB
  └ Labels: executiondriver=native-0.2, kernelversion=4.0.9, \
    operatingsystem=VMware Photon/Linux, storagedriver=overlay
 slave-a34d23ca-45aa-4aec-8a64-f93f85b3d2e4: 10.27.51.40:2375
  └ Containers: 1
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 4.053 GiB
  └ Labels: executiondriver=native-0.2, kernelversion=4.0.9, \
    operatingsystem=VMware Photon/Linux, storagedriver=overlay
CPUs: 5
Total Memory: 12.24 GiB
Name: 47786077a0a0

esxcloud [ ~ ]$ docker version
Client:
 Version: 1.8.1
 API version: 1.20
 Go version: go1.4.2
 Git commit: d12ea79
 Built: Thu Aug 13 02:49:29 UTC 2015
 OS/Arch: linux/amd64

Server:
 Version: swarm/0.4.0
 API version: 1.16
 Go version: go1.4.2
 Git commit: d647d82
 Built:
 OS/Arch: linux/amd64

esxcloud [ ~ ]$ docker ps -a
CONTAINER ID  IMAGE        COMMAND                 CREATED        \
 STATUS        PORTS                       NAMES
85577bf5fa55  swarm:0.4.0  "/swarm join --addr=1"  5 minutes ago  \
 Up 5 minutes  2375/tcp                    slave-a34d23ca-45aa-4aec-8a64-f93f85b3d2e4/elegant_nobel
47786077a0a0  swarm:0.4.0  "/swarm manage etcd:/"  6 minutes ago  \
 Up 6 minutes  10.27.51.39:8333->2375/tcp  master-0362f6f7-0905-44dd-8382-d37be65df0c4/goofy_wozniak
03db84efe339  swarm:0.4.0  "/swarm join --addr=1"  6 minutes ago  \
 Up 6 minutes  2375/tcp                    master-0362f6f7-0905-44dd-8382-d37be65df0c4/trusting_newton
esxcloud [ ~ ]$

Now let’s see what happens when I run a simple “hello-world” container a few times. In reality, these containers should get balanced across the nodes in the swarm cluster:

esxcloud [ ~ ]$ docker run hello-world

Hello from Docker.
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker Hub account:
 https://hub.docker.com

For more examples and ideas, visit:
 https://docs.docker.com/engine/userguide/

Now if I run it a few more times, then take a look at my containers:

esxcloud [ ~ ]$ docker ps -a
CONTAINER ID  IMAGE        COMMAND                 CREATED                 \
 STATUS                     PORTS                       NAMES
214c689f5d93  hello-world  "/hello"                Less than a second ago\
 Exited (0) 2 seconds ago                               master-0362f6f7-0905-44dd-8382-d37be65df0c4/focused_mestorf
03dc237d3c5d  hello-world  "/hello"                19 seconds ago          \
 Exited (0) 20 seconds ago                              slave-a34d23ca-45aa-4aec-8a64-f93f85b3d2e4/ecstatic_mestorf
4a9a914b4794  hello-world  "/hello"                29 seconds ago          \
 Exited (0) 30 seconds ago                              slave-a34d23ca-45aa-4aec-8a64-f93f85b3d2e4/prickly_bhabha
85577bf5fa55  swarm:0.4.0  "/swarm join --addr=1"  6 minutes ago           \
 Up 6 minutes               2375/tcp                    slave-a34d23ca-45aa-4aec-8a64-f93f85b3d2e4/elegant_nobel
47786077a0a0  swarm:0.4.0  "/swarm manage etcd:/"  7 minutes ago           \
 Up 7 minutes               10.27.51.39:8333->2375/tcp  master-0362f6f7-0905-44dd-8382-d37be65df0c4/goofy_wozniak
03db84efe339  swarm:0.4.0  "/swarm join --addr=1"  7 minutes ago           \
 Up 7 minutes               2375/tcp                    master-0362f6f7-0905-44dd-8382-d37be65df0c4/trusting_newton
esxcloud [ ~ ]$

And if we look closely, we can see that some of the “hello-world” containers were run on the master, and others have been run on the slave.
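Rather than eyeballing the NAMES column, you can count the containers per node. In the Swarm output above, each name takes the form node/container, so splitting on the first slash gives the node. The Python sketch below uses the three hello-world names from my output; the containers_per_node helper is just an illustration.

```python
from collections import Counter

# NAMES column for the three hello-world containers in the output above.
names = [
    "master-0362f6f7-0905-44dd-8382-d37be65df0c4/focused_mestorf",
    "slave-a34d23ca-45aa-4aec-8a64-f93f85b3d2e4/ecstatic_mestorf",
    "slave-a34d23ca-45aa-4aec-8a64-f93f85b3d2e4/prickly_bhabha",
]


def containers_per_node(container_names):
    """Count containers per node, taking the node as the text before the first '/'."""
    return Counter(name.split("/", 1)[0] for name in container_names)


for node, count in containers_per_node(names).items():
    print(node, count)
```

In my run that works out as one container on the master and two on the slave.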

9. Scale out the SWARM cluster

We can also scale out the number of slave nodes in the SWARM cluster using the Photon CLI tools. Here is how:

C:\Users\chogan>photon cluster resize 0b8b8639-5335-4b97-b8ab-5f90a6155cf3 3
Using target 'http://10.27.51.117:28080'

Resizing cluster 0b8b8639-5335-4b97-b8ab-5f90a6155cf3 to slave count 3
Are you sure [y/n]? y
RESIZE_CLUSTER completed for '' entity
Note: A background task is running to gradually resize the cluster \
 to its target capacity.
You may continue to use the cluster. You can run 'cluster show '
to see the state of the cluster. If the resize operation is still \
 in progress, the cluster state
will show as RESIZING. Once the cluster is resized, the cluster \
 state will show as READY.

Let’s examine the state of the cluster once more:

C:\Users\chogan>photon cluster show 0b8b8639-5335-4b97-b8ab-5f90a6155cf3
Using target 'http://10.27.51.117:28080'
Cluster ID: 0b8b8639-5335-4b97-b8ab-5f90a6155cf3
 Name: SWARM
 State: RESIZING
 Type: SWARM
 Slave count: 3
 Extended Properties: map[netmask:255.255.255.0 dns:10.27.51.252 \
 etcd_ips:10.27.51.118 gateway:10.27.51.254]

VM ID                                 VM Name                                      VM IP
40255f5c-0d63-4ad9-a6ac-4c95f7f583e3  etcd-b3e288c8-0151-49a1-9cc4-1754f2b62060    10.27.51.118
f1cbe821-f9bf-4fa5-93e6-3b55a5f52c91  master-184431d2-d4ff-4f21-a16d-66190ac95021  10.27.51.48
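Since the resize happens in the background, you may want to wait for the state to change from RESIZING to READY before using the extra slaves. Here is a Python sketch of a polling loop. Both helpers are hypothetical: fetch is a callable that you supply to return the text of “photon cluster show”, and the interval and attempt counts are arbitrary choices of mine.

```python
import re
import time


def cluster_state(show_output):
    """Extract the value of the 'State:' field from cluster show output."""
    match = re.search(r"\bState:\s*(\S+)", show_output)
    return match.group(1) if match else None


def wait_until_ready(fetch, interval=30, attempts=40, sleep=time.sleep):
    """Poll fetch() until the cluster state is READY; return False on timeout."""
    for _ in range(attempts):
        if cluster_state(fetch()) == "READY":
            return True
        sleep(interval)
    return False


# Demo with canned output instead of a real CLI call.
canned = iter(["Name: SWARM State: RESIZING Type: SWARM",
               "Name: SWARM State: READY Type: SWARM"])
print(wait_until_ready(lambda: next(canned), sleep=lambda seconds: None))
```

In practice fetch would run the CLI and return its output; I have left that part out so the sketch does not depend on a live Photon Controller.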

And let’s look at the new slave nodes:

C:\Users\chogan>photon cluster list_vms 7e14ce09-f6ab-4ece-bb73-ca2c96459495
Using target 'http://10.27.51.117:28080'
ID                                    Name                                         State
40255f5c-0d63-4ad9-a6ac-4c95f7f583e3  etcd-b3e288c8-0151-49a1-9cc4-1754f2b62060    STARTED
40fc5084-6e57-44b3-97c5-ac6a37fd0583  slave-4aeb2353-0720-417f-ba52-7603d1f781ad   STARTED
7dbcfd3a-5919-4912-8c8a-7d4b2478a017  slave-39b26ca6-4752-4d27-b481-32544e35d254   STARTED
edd38a15-a588-423a-809b-d03c0256a4dc  slave-5fdc2fe3-64a7-4790-b2a1-841e04dab7e8   STARTED
f1cbe821-f9bf-4fa5-93e6-3b55a5f52c91  master-184431d2-d4ff-4f21-a16d-66190ac95021  STARTED
Total: 5
STARTED: 5

C:\Users\chogan>

And if you now run the docker commands that were run previously, you should see the additional slave nodes in the SWARM cluster, and any additional containers you run should be balanced across the master and the additional slaves.