Recently, I stumbled on Hypriot, a custom Debian-based Linux distribution for the Raspberry Pi that is optimized to run Docker. This seemed like a good platform for exploring containers and other distributed computing technologies in a cost-efficient, controlled, and multi-node environment :)

So, as my first step in this exploration, here’s how I set up a cluster of Rasp Pis.

Components

To start with, I got the following components.

In addition, to set up the nodes and then serve as the terminal for accessing the nodes in the cluster, I used an Ubuntu Linux box with an SD card slot and an Ethernet interface.

Design Choices

Rasp Pis should use a WiFi connection to reach the Internet, e.g., to update software.

While I could have used iptables to set up the Linux box to bridge the subnet and the home network, using the Rasp Pis' WiFi support seemed simpler. It would also provide an opportunity to learn about cloud-init's support for configuring the network interfaces of the instances/nodes.

All Rasp Pis in the cluster should be on a dedicated subnet, 192.168.2.0/24.

The reasons were 1) to keep the network traffic of the cluster separate from the network traffic of the home network and 2) to not worry about if and how security holes in the cluster could affect the home network.

The Ethernet interface on each Rasp Pi should have a static IP in the 192.168.2.0/24 subnet.

This would ease network access (e.g., via ssh) to the Rasp Pis and allow me to not connect a keyboard and a monitor to the Rasp Pis (for the most part).

One Rasp Pi should be named master10 while the other Rasp Pis should be named worker1X.

I figured this naming scheme would help with future exercises :)

There should be at least 10GB of free storage space on the SD card after all the required software is installed.

A "full" installation of a Linux distribution on a Rasp Pi takes around 2GB. So, I picked 16GB SD cards, as around 14GB of free storage space would be available after a full installation.

Cloud-init should be used to initialize the nodes.

This would use the support for cloud-init in HypriotOS and, consequently, reduce the amount of work to configure the nodes and set up the cluster. It would also provide an opportunity to learn about cloud-init.

Flash should be used to flash the SD cards with the images.

Flash supports the preparation of cloud-init enabled images. So, after the above choices, this was obvious.

Set Up

Here’s how I set up the cluster, written as step-by-step instructions.

1. Configure an Ethernet interface of the Linux box with the static IP 192.168.2.1/24 (via Settings -> Network -> Cable -> IPV4).
2. Connect the Linux box to the switch.
3. Download a copy of hypriotos-rpi-v1.8.0.img.zip, the latest image of HypriotOS for Rasp Pi. Since this image includes Docker, I didn’t download other binaries from Hypriot’s Downloads page.
4. Create a Raspberry Pi config.txt file that is common to all images. This file is given below along with a description of its content.
5. Create a dedicated cloud-init user-data yaml file for each image. This file for the master10 image (master10.yaml) is given below along with a description of its content.
6. Download a copy of the Flash tool.
7. Insert an SD card into the Linux box.
8. Execute lsblk on the Linux box and identify the device associated with the SD card based on its size. On my Linux box, the device was /dev/mmcblk0.
9. Typically, the partitions on an SD card are mounted as soon as the card is inserted into the SD card slot of a Linux box. If the device is mounted, there will be an entry for it in the MOUNTPOINT column of the output of lsblk. If such an entry exists, unmount the device using umount <mount-point>. If the device has multiple partitions, make sure all partitions are unmounted.
10. Execute the following command to flash the SD card: flash -d /dev/mmcblk0 -C config.txt -u master10.yaml hypriotos-rpi-v1.8.0.img
11. Eject the SD card.
12. Repeat steps 4 thru 11 for each SD card with the dedicated user-data yaml file.
13. Insert the SD cards into the SD card slot of each Rasp Pi.
14. Connect the Rasp Pis to the switch.
15. Connect the Rasp Pis to the USB charger.
16. As the Rasp Pis boot and become available, their Ethernet interfaces will be accessible. To track which Rasp Pis are available, execute nmap -sn 192.168.2.0/24. This will list the hosts available in the 192.168.2.0/24 subnet. Typically, executing this command about 30 seconds after powering up the Rasp Pis should list all of them.
17. [Optional] Execute install-packages.sh (given below) to install vim, tmux, java 8, erlang, python3, sdkman, groovy, vertx, nodejs, and ruby.
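The per-card flow of steps 7 thru 11 can be sketched as a small shell script. This is a hedged sketch, not the post's own script: the device name, file names, and the flash_cmd helper are illustrative, and the script only prints the flash invocation (a dry run) rather than writing to the card.

```shell
#!/bin/sh
# Sketch of the per-card flashing workflow (identify device, unmount, flash).
# DEVICE and IMAGE are the values from this post; adjust for your machine.
DEVICE=/dev/mmcblk0
IMAGE=hypriotos-rpi-v1.8.0.img

# Build the flash invocation for a given node name (e.g., master10 -> master10.yaml).
flash_cmd() {
  echo "flash -d $DEVICE -C config.txt -u $1.yaml $IMAGE"
}

# For a real run, first unmount any auto-mounted partitions, e.g.:
#   for part in ${DEVICE}p*; do sudo umount "$part" 2>/dev/null; done
# and then execute the command printed below instead of just echoing it.
flash_cmd master10
```

Running this prints the exact command used in step 10; swapping the argument (worker11, worker12, ...) yields the command for each remaining card.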

Raspberry Pi’s config.txt file

This file contains commands to configure Raspberry Pi hardware. The command at line 1 is required to enable maximum HDMI compatibility; just in case :) The command at line 2 is required to enable WiFi on Raspberry Pi 3.
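The config.txt listing itself did not survive into this copy of the post. As a partial sketch: hdmi_safe=1 is the standard Raspberry Pi firmware setting for maximum HDMI compatibility, and is presumably what line 1 contained; the WiFi-related line from the original file is not reproduced here.

```
# Raspberry Pi firmware configuration (config.txt) - partial sketch.
# hdmi_safe=1 enables "safe mode" HDMI settings for maximum compatibility.
hdmi_safe=1
# (The WiFi-enabling line from the original file is not reproduced here.)
```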

Cloud-init’s user-data yaml file

This file contains instructions to cloud-init for initializing and configuring the instance/node.

The following are a few relevant parts of this file.

hostname is used to configure the host name of the node.

The users block is used to create users. In master10.yaml, it creates a user named life with the password asdf.

package_upgrade is used to upgrade the installed packages when the instance/node is booted for the first time. In master10.yaml, this is set to false, as upgrading the packages as part of the cloud-init process caused the system to run out of storage space.

The write_files block is used to write files into the instance’s file system (at the path: element). In master10.yaml, four files are created.

a) /etc/network/interfaces.d/eth0 is used to configure the Ethernet interface with a static IP address in the dedicated subnet.

b) /etc/network/interfaces.d/wlan0 and /etc/wpa_supplicant/wpa_supplicant.conf are used to connect the WiFi interface to the external network. Remember to add the SSID and the password of the external network on lines 38 and 39 of master10.yaml.

c) /etc/rc.local is used to perform some tasks when the nodes boot up. Specifically, pick up hostname change (line 51), set the time zone (line 52), bring up the Ethernet interface (line 53), and bring down the WiFi interface (line 54). Also, this file is made executable (line 57).
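The master10.yaml listing referenced above did not survive into this copy of the post. The following is a trimmed reconstruction of its overall shape using standard cloud-init keys; the exact field values, the eth0 address, and the file contents are assumptions (the user name, password, and package_upgrade choice are the ones the post states).

```yaml
#cloud-config
hostname: master10
manage_etc_hosts: true

users:
  - name: life
    # Plain-text password shown only for illustration, as in the post.
    plain_text_passwd: asdf
    lock_passwd: false
    sudo: ALL=(ALL) NOPASSWD:ALL
    shell: /bin/bash

# Upgrading during cloud-init ran the card out of space, hence false.
package_upgrade: false

write_files:
  # Static IP for the cluster-facing Ethernet interface (address assumed).
  - path: /etc/network/interfaces.d/eth0
    content: |
      auto eth0
      iface eth0 inet static
        address 192.168.2.10
        netmask 255.255.255.0
  # The wlan0, wpa_supplicant.conf, and rc.local entries follow the same
  # path:/content: pattern and are omitted here.
```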

Observations

After executing install-packages.sh, each node had around 13GB of free storage space.

While configuring cloud-init’s user-data yaml files, I found contradictory information about the semantics of runcmd and bootcmd. After experimentation, I figured out that the commands in the runcmd section are executed only when the instance/node is booted for the first time, while the commands in the bootcmd section are executed every time the instance/node is booted. While bootcmd seems like a good way to execute commands during the boot process, it is not a good way to execute commands that configure the system (e.g., bring down a network interface), as these commands may fail if the system has not completed booting.

If you execute install-packages.sh, you will observe that the ruby/rails installation takes quite a bit of time. I am not sure if this is solely due to the relatively slow processors on the Rasp Pi or due to the sequential nature of installing ruby via rvm. It is something to be aware of while dealing with sequential, compute-heavy jobs on Rasp Pis.
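The runcmd/bootcmd difference can be seen with a small user-data fragment. This is an illustration rather than anything from the post's own files; the log paths are made up, but the every-boot vs. first-boot-only behavior is cloud-init's documented default.

```yaml
#cloud-config
bootcmd:
  # Runs early during *every* boot, before networking is fully up.
  - echo "every boot" >> /var/log/bootcmd-demo.log
runcmd:
  # Runs once, near the end of the *first* boot only.
  - echo "first boot only" >> /var/log/runcmd-demo.log
```

After a few reboots, the first log would have one line per boot while the second would still have a single line.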

That said, since Rasp Pis have four cores, I’m curious about speed ups that can be achieved via parallelization on Rasp Pis.

Assembling the hardware took about two hours. It was a fun activity to do with kids :) I’d highly recommend it.

However, the rest of the work (researching tools such as flash and cloud-init, figuring out how to configure network interfaces, integrating the configurations into cloud-init, debugging issues such as runcmd vs bootcmd and package_upgrade=true, and all the reboots along the way) took about a day. While the exercise involved lots of experimentation, it was a great learning experience.

After all of this effort, now I can power up the cluster and it works like a charm :)

A cluster of Raspberry Pi 3 Model B

I hope you find this information useful. If any of the steps don’t work for you, then let me know (post a comment) and I’ll try my best to help.

All of the code (which will be evolving) is available here.

Next up, some distributed computing exercises on this cluster :)