Today I would like to present an article about setting up a GlusterFS cluster on FreeBSD with the Ansible and GNU Parallel tools.

To cite Wikipedia: “GlusterFS is a scale-out network-attached storage file system. It has found applications including cloud computing, streaming media services, and content delivery networks.” The GlusterFS page describes it similarly: “Gluster is a scalable, distributed file system that aggregates disk storage resources from multiple servers into a single global namespace.”

Here are its advantages:

Scales to several petabytes.

Handles thousands of clients.

POSIX compatible.

Uses commodity hardware.

Can use any on-disk filesystem that supports extended attributes.

Accessible using industry standard protocols like NFS and SMB.

Provides replication/quotas/geo-replication/snapshots/bitrot detection.

Allows optimization for different workloads.

Open Source.

Lab Setup

It will be entirely VirtualBox based and will consist of 6 hosts. To avoid repeating the same FreeBSD installation 6 times I used the 12.0-RELEASE virtual machine image available directly from the FreeBSD Project.

There are several formats available – qcow2 / raw / vhd / vmdk – but as I will be using VirtualBox I used the VMDK one.

I will use different prompts depending on where the command is executed to make the article more readable. When there is ‘ % ‘ at the prompt a regular user is enough, and when there is ‘ # ‘ at the prompt a superuser is needed.

gluster1 # // command run on the gluster1 node
gluster* # // command run on all gluster nodes
client # // command run on gluster client
vbhost % // command run on the VirtualBox host

Here is the list of the machines for the GlusterFS cluster:

10.0.10.11 gluster1

10.0.10.12 gluster2

10.0.10.13 gluster3

10.0.10.14 gluster4

10.0.10.15 gluster5

10.0.10.16 gluster6

Each VirtualBox virtual machine for FreeBSD is the default one (as suggested in the VirtualBox wizard) with 512 MB RAM and NAT Network as shown on the image below.

Here is the configuration of the NAT Network on VirtualBox.

The cloned/copied FreeBSD-12.0-RELEASE-amd64.vmdk images need to have different UUIDs, so we will use the VBoxManage internalcommands sethduuid command to achieve this.

vbhost % for I in $( seq 6 ); do cp FreeBSD-12.0-RELEASE-amd64.vmdk vbox_GlusterFS_${I}.vmdk; done
vbhost % for I in $( seq 6 ); do VBoxManage internalcommands sethduuid vbox_GlusterFS_${I}.vmdk; done

To start the whole GlusterFS environment on VirtualBox use these commands.

vbhost % VBoxManage list vms | grep GlusterFS
"FreeBSD GlusterFS 1" {162a3b6f-4ec9-4709-bff8-162b0c8c9c41}
"FreeBSD GlusterFS 2" {2e30326c-ac5d-41d2-9b28-483375df38f6}
"FreeBSD GlusterFS 3" {6b2747ab-3ec6-4b1a-a28e-5d871d7891b3}
"FreeBSD GlusterFS 4" {12379cf8-31d9-4ff1-9945-465fc3ed15f0}
"FreeBSD GlusterFS 5" {a4b0d515-5924-4517-9052-df238c366f2b}
"FreeBSD GlusterFS 6" {66621755-1b97-4486-aa15-a7bec9edb343}

Check which GlusterFS machines are running.

vbhost % VBoxManage list runningvms | grep GlusterFS
vbhost %

Start the machines in VirtualBox headless mode in parallel.

vbhost % VBoxManage list vms \
           | grep GlusterFS \
           | awk -F \" '{print $2}' \
           | while read I; do VBoxManage startvm "${I}" --type headless & done

After that command you should see these machines running.

vbhost % VBoxManage list runningvms
"FreeBSD GlusterFS 1" {162a3b6f-4ec9-4709-bff8-162b0c8c9c41}
"FreeBSD GlusterFS 2" {2e30326c-ac5d-41d2-9b28-483375df38f6}
"FreeBSD GlusterFS 3" {6b2747ab-3ec6-4b1a-a28e-5d871d7891b3}
"FreeBSD GlusterFS 4" {12379cf8-31d9-4ff1-9945-465fc3ed15f0}
"FreeBSD GlusterFS 5" {a4b0d515-5924-4517-9052-df238c366f2b}
"FreeBSD GlusterFS 6" {66621755-1b97-4486-aa15-a7bec9edb343}

Before we try to connect to our FreeBSD machines we need to do the minimal network configuration. Each FreeBSD machine will have a minimal /etc/rc.conf file like the one shown here for the gluster1 host.

gluster1 # cat /etc/rc.conf
hostname=gluster1
ifconfig_DEFAULT="inet 10.0.10.11/24 up"
defaultrouter=10.0.10.1
sshd_enable=YES

For setup purposes we will need to allow root login on these FreeBSD GlusterFS machines with the PermitRootLogin yes option in the /etc/ssh/sshd_config file. You will also need to restart the sshd(8) service after the change.

gluster1 # grep '^PermitRootLogin' /etc/ssh/sshd_config
PermitRootLogin yes
gluster1 # service sshd restart

By using NAT Network with Port Forwarding the FreeBSD machines will be accessible on the localhost ports. For example the gluster1 machine will be available on port 2211 , the gluster2 machine will be available on port 2212 and so on. This is shown in the sockstat utility output below.

vbhost % sockstat -l4
USER     COMMAND    PID   FD PROTO  LOCAL ADDRESS         FOREIGN ADDRESS
vermaden VBoxNetNAT 57622 17 udp4   *:*                   *:*
vermaden VBoxNetNAT 57622 19 tcp4   *:2211                *:*
vermaden VBoxNetNAT 57622 20 tcp4   *:2212                *:*
vermaden VBoxNetNAT 57622 21 tcp4   *:2213                *:*
vermaden VBoxNetNAT 57622 22 tcp4   *:2214                *:*
vermaden VBoxNetNAT 57622 23 tcp4   *:2215                *:*
vermaden VBoxNetNAT 57622 24 tcp4   *:2216                *:*
vermaden VBoxNetNAT 57622 28 tcp4   *:2240                *:*
vermaden VBoxNetNAT 57622 29 tcp4   *:9140                *:*
vermaden VBoxNetNAT 57622 30 tcp4   *:2220                *:*
root     sshd       96791 4  tcp4   *:22                  *:*

I think the correlation between the IP address and the port on the host is obvious 🙂

Here is the list of the machines with ports on localhost:

10.0.10.11 gluster1 2211

10.0.10.12 gluster2 2212

10.0.10.13 gluster3 2213

10.0.10.14 gluster4 2214

10.0.10.15 gluster5 2215

10.0.10.16 gluster6 2216
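The table above follows a simple numbering rule: node N gets the address 10.0.10.(10+N) and is forwarded to local port 2210+N. Here is a minimal sketch of that rule in sh. The node_ip and node_port helpers are hypothetical names used only for illustration; they are not part of VirtualBox or of the setup described here.

```shell
#!/bin/sh
# Hypothetical helpers that encode the lab's numbering rule:
# node N -> guest IP 10.0.10.(10+N), host port 2210+N.
node_ip() {
  echo "10.0.10.$(( 10 + $1 ))"
}

node_port() {
  echo "$(( 2210 + $1 ))"
}

# Print the same table as above for the six nodes.
for N in 1 2 3 4 5 6
do
  echo "$(node_ip "${N}") gluster${N} $(node_port "${N}")"
done
```

Keeping the rule in one place makes it harder for the addresses and ports to drift apart if more nodes are added later.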

To connect to such machine from the VirtualBox host system you will need this command:

vbhost % ssh -l root localhost -p 2211

To avoid typing that every time you need to log in to gluster1, let’s make some changes to the ~/.ssh/config file for convenience. This way it will be possible to log in with a very short command.

vbhost % ssh gluster1

Here is the modified ~/.ssh/config file.

vbhost % cat ~/.ssh/config
# GENERAL
  StrictHostKeyChecking no
  LogLevel quiet
  KeepAlive yes
  ServerAliveInterval 30
  VerifyHostKeyDNS no

# ALL HOSTS SETTINGS
Host *
  StrictHostKeyChecking no
  Compression yes

# GLUSTER
Host gluster1
  User root
  Hostname 127.0.0.1
  Port 2211

Host gluster2
  User root
  Hostname 127.0.0.1
  Port 2212

Host gluster3
  User root
  Hostname 127.0.0.1
  Port 2213

Host gluster4
  User root
  Hostname 127.0.0.1
  Port 2214

Host gluster5
  User root
  Hostname 127.0.0.1
  Port 2215

Host gluster6
  User root
  Hostname 127.0.0.1
  Port 2216
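The six Host stanzas differ only in the node number and the port, so they can also be generated instead of typed. This is a minimal sketch, assuming the same convention as above (glusterN on 127.0.0.1 port 2210+N). The gluster_ssh_stanzas function is a hypothetical helper and it only prints text; it does not touch ~/.ssh/config.

```shell
#!/bin/sh
# Print one ssh_config Host stanza per GlusterFS node.
# Assumes the lab convention: glusterN is reachable on 127.0.0.1 port 2210+N.
# Argument: number of nodes.
gluster_ssh_stanzas() {
  N=1
  while [ "${N}" -le "$1" ]
  do
    printf 'Host gluster%d\n' "${N}"
    printf '  User root\n'
    printf '  Hostname 127.0.0.1\n'
    printf '  Port %d\n' "$(( 2210 + N ))"
    printf '\n'
    N=$(( N + 1 ))
  done
}

# Review the output first, then append it to ~/.ssh/config by hand.
gluster_ssh_stanzas 6
```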

I assume that you already have some SSH keys generated (with ~/.ssh/id_rsa as the private key), so let’s remove the need to type a password on each SSH login.

vbhost % ssh-copy-id -i ~/.ssh/id_rsa gluster1
Password for root@gluster1:

vbhost % ssh-copy-id -i ~/.ssh/id_rsa gluster2
Password for root@gluster2:

vbhost % ssh-copy-id -i ~/.ssh/id_rsa gluster3
Password for root@gluster3:

vbhost % ssh-copy-id -i ~/.ssh/id_rsa gluster4
Password for root@gluster4:

vbhost % ssh-copy-id -i ~/.ssh/id_rsa gluster5
Password for root@gluster5:

vbhost % ssh-copy-id -i ~/.ssh/id_rsa gluster6
Password for root@gluster6:

Ansible Setup

As we already have SSH access set up, we will now configure Ansible to connect to our ‘localhost’ ports for the FreeBSD machines.

Here is the Ansible’s hosts file.

vbhost % cat hosts
[gluster]
gluster1 ansible_port=2211 ansible_host=127.0.0.1 ansible_user=root
gluster2 ansible_port=2212 ansible_host=127.0.0.1 ansible_user=root
gluster3 ansible_port=2213 ansible_host=127.0.0.1 ansible_user=root
gluster4 ansible_port=2214 ansible_host=127.0.0.1 ansible_user=root
gluster5 ansible_port=2215 ansible_host=127.0.0.1 ansible_user=root
gluster6 ansible_port=2216 ansible_host=127.0.0.1 ansible_user=root

[gluster:vars]
ansible_python_interpreter=/usr/local/bin/python2.7

Here is the listing of these machines using ansible command.

vbhost % ansible -i hosts --list-hosts gluster
  hosts (6):
    gluster1
    gluster2
    gluster3
    gluster4
    gluster5
    gluster6

Let’s verify that our Ansible setup works correctly.

vbhost % ansible -i hosts -m raw -a 'echo' gluster
gluster1 | CHANGED | rc=0 >>

gluster3 | CHANGED | rc=0 >>

gluster2 | CHANGED | rc=0 >>

gluster5 | CHANGED | rc=0 >>

gluster4 | CHANGED | rc=0 >>

gluster6 | CHANGED | rc=0 >>

It works as desired.

We are not able to use Ansible modules other than raw because Python is not installed on FreeBSD by default, as shown below.

vbhost % ansible -i hosts -m ping gluster
gluster1 | FAILED! => {
    "changed": false,
    "module_stderr": "",
    "module_stdout": "/bin/sh: /usr/local/bin/python2.7: not found\r\n",
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
    "rc": 127
}
gluster2 | FAILED! => {
    "changed": false,
    "module_stderr": "",
    "module_stdout": "/bin/sh: /usr/local/bin/python2.7: not found\r\n",
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
    "rc": 127
}
gluster4 | FAILED! => {
    "changed": false,
    "module_stderr": "",
    "module_stdout": "/bin/sh: /usr/local/bin/python2.7: not found\r\n",
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
    "rc": 127
}
gluster5 | FAILED! => {
    "changed": false,
    "module_stderr": "",
    "module_stdout": "/bin/sh: /usr/local/bin/python2.7: not found\r\n",
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
    "rc": 127
}
gluster3 | FAILED! => {
    "changed": false,
    "module_stderr": "",
    "module_stdout": "/bin/sh: /usr/local/bin/python2.7: not found\r\n",
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
    "rc": 127
}
gluster6 | FAILED! => {
    "changed": false,
    "module_stderr": "",
    "module_stdout": "/bin/sh: /usr/local/bin/python2.7: not found\r\n",
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
    "rc": 127
}

We need to get Python installed on FreeBSD.

We will partially use Ansible for this and partially the GNU Parallel.

vbhost % ansible -i hosts --list-hosts gluster \
           | sed 1d \
           | while read I; do ssh ${I} env ASSUME_ALWAYS_YES=yes pkg install python; done
pkg: Error fetching http://pkg.FreeBSD.org/FreeBSD:12:amd64/quarterly/Latest/pkg.txz: No address record
A pre-built version of pkg could not be found for your system.
Consider changing PACKAGESITE or installing it from ports: 'ports-mgmt/pkg'.
Bootstrapping pkg from pkg+http://pkg.FreeBSD.org/FreeBSD:12:amd64/quarterly, please wait...

… we forgot about setting up DNS in the FreeBSD machines, let’s fix that.

It is as easy as executing echo nameserver 1.1.1.1 > /etc/resolv.conf command on each FreeBSD machine.

Let’s verify what input will be sent to GNU Parallel before executing it.

vbhost % ansible -i hosts --list-hosts gluster \
           | sed 1d \
           | while read I; do echo "ssh ${I} 'echo nameserver 1.1.1.1 > /etc/resolv.conf'"; done
ssh gluster1 'echo nameserver 1.1.1.1 > /etc/resolv.conf'
ssh gluster2 'echo nameserver 1.1.1.1 > /etc/resolv.conf'
ssh gluster3 'echo nameserver 1.1.1.1 > /etc/resolv.conf'
ssh gluster4 'echo nameserver 1.1.1.1 > /etc/resolv.conf'
ssh gluster5 'echo nameserver 1.1.1.1 > /etc/resolv.conf'
ssh gluster6 'echo nameserver 1.1.1.1 > /etc/resolv.conf'

Looks reasonable, so let’s engage GNU Parallel.

vbhost % ansible -i hosts --list-hosts gluster \
           | sed 1d \
           | while read I; do echo "ssh ${I} 'echo nameserver 1.1.1.1 > /etc/resolv.conf'"; done | parallel

Computers / CPU cores / Max jobs to run
1:local / 2 / 2

Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
local:0/6/100%/1.0s
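The generate-then-pipe pattern used here (print one ssh line per node, inspect it, then pipe it into parallel) can be wrapped in a small helper. This is only a sketch. The gluster_cmds function is a hypothetical name, it hardcodes the six node names instead of asking Ansible for them, and it assumes the remote command contains no single quotes.

```shell
#!/bin/sh
# Hypothetical helper: print one "ssh HOST 'COMMAND'" line per node.
# Assumes COMMAND itself contains no single quotes and that the
# nodes are named gluster1..gluster6 as in this lab.
gluster_cmds() {
  for N in 1 2 3 4 5 6
  do
    printf "ssh gluster%s '%s'\n" "${N}" "$1"
  done
}

# Dry run: inspect the generated lines before executing anything.
gluster_cmds 'echo nameserver 1.1.1.1 > /etc/resolv.conf'

# To execute them, pipe the same output into GNU Parallel:
#   gluster_cmds 'echo nameserver 1.1.1.1 > /etc/resolv.conf' | parallel
```

Printing first and executing second keeps the safety check of the manual approach: nothing runs on the nodes until the output has been reviewed.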

We will now verify that the DNS is configured properly on the FreeBSD machines.

vbhost % for I in $( jot 6 ); do echo -n "gluster${I} "; ssh gluster${I} 'cat /etc/resolv.conf'; done
gluster1 nameserver 1.1.1.1
gluster2 nameserver 1.1.1.1
gluster3 nameserver 1.1.1.1
gluster4 nameserver 1.1.1.1
gluster5 nameserver 1.1.1.1
gluster6 nameserver 1.1.1.1

Next we verify the DNS by using host(1) to test name resolution and Internet connectivity.

vbhost % for I in $( jot 6 ); do echo; echo "gluster${I}"; ssh gluster${I} host freebsd.org; done

gluster1
freebsd.org has address 96.47.72.84
freebsd.org has IPv6 address 2610:1c1:1:606c::50:15
freebsd.org mail is handled by 10 mx1.freebsd.org.
freebsd.org mail is handled by 30 mx66.freebsd.org.

gluster2
freebsd.org has address 96.47.72.84
freebsd.org has IPv6 address 2610:1c1:1:606c::50:15
freebsd.org mail is handled by 30 mx66.freebsd.org.
freebsd.org mail is handled by 10 mx1.freebsd.org.

gluster3
freebsd.org has address 96.47.72.84
freebsd.org has IPv6 address 2610:1c1:1:606c::50:15
freebsd.org mail is handled by 30 mx66.freebsd.org.
freebsd.org mail is handled by 10 mx1.freebsd.org.

gluster4
freebsd.org has address 96.47.72.84
freebsd.org has IPv6 address 2610:1c1:1:606c::50:15
freebsd.org mail is handled by 30 mx66.freebsd.org.
freebsd.org mail is handled by 10 mx1.freebsd.org.

gluster5
freebsd.org has address 96.47.72.84
freebsd.org has IPv6 address 2610:1c1:1:606c::50:15
freebsd.org mail is handled by 10 mx1.freebsd.org.
freebsd.org mail is handled by 30 mx66.freebsd.org.

gluster6
freebsd.org has address 96.47.72.84
freebsd.org has IPv6 address 2610:1c1:1:606c::50:15
freebsd.org mail is handled by 10 mx1.freebsd.org.
freebsd.org mail is handled by 30 mx66.freebsd.org.

The DNS resolution works properly, now we will switch from the default quarterly pkg(8) repository to the latest one which has more frequent updates as the name suggests. We will need to use sed -i '' s/quarterly/latest/g /etc/pkg/FreeBSD.conf command on each FreeBSD machine.

Let’s verify what will be sent to GNU Parallel.

vbhost % ansible -i hosts --list-hosts gluster \
           | sed 1d \
           | while read I; do echo "ssh ${I} 'sed -i \"\" s/quarterly/latest/g /etc/pkg/FreeBSD.conf'"; done
ssh gluster1 'sed -i "" s/quarterly/latest/g /etc/pkg/FreeBSD.conf'
ssh gluster2 'sed -i "" s/quarterly/latest/g /etc/pkg/FreeBSD.conf'
ssh gluster3 'sed -i "" s/quarterly/latest/g /etc/pkg/FreeBSD.conf'
ssh gluster4 'sed -i "" s/quarterly/latest/g /etc/pkg/FreeBSD.conf'
ssh gluster5 'sed -i "" s/quarterly/latest/g /etc/pkg/FreeBSD.conf'
ssh gluster6 'sed -i "" s/quarterly/latest/g /etc/pkg/FreeBSD.conf'

Let’s send the command to the FreeBSD machines then.

vbhost % ansible -i hosts --list-hosts gluster \
           | sed 1d \
           | while read I; do echo "ssh $I 'sed -i \"\" s/quarterly/latest/g /etc/pkg/FreeBSD.conf'"; done | parallel

Computers / CPU cores / Max jobs to run
1:local / 2 / 2

Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
local:0/6/100%/1.0s

As shown below the latest repository is configured in the /etc/pkg/FreeBSD.conf file on each FreeBSD machine.

vbhost % ssh gluster3 tail -7 /etc/pkg/FreeBSD.conf
FreeBSD: {
  url: "pkg+http://pkg.FreeBSD.org/${ABI}/latest",
  mirror_type: "srv",
  signature_type: "fingerprints",
  fingerprints: "/usr/share/keys/pkg",
  enabled: yes
}

We may now get back to Python.

vbhost % ansible -i hosts --list-hosts gluster \
           | sed 1d \
           | while read I; do echo ssh ${I} env ASSUME_ALWAYS_YES=yes pkg install python; done
ssh gluster1 env ASSUME_ALWAYS_YES=yes pkg install python
ssh gluster2 env ASSUME_ALWAYS_YES=yes pkg install python
ssh gluster3 env ASSUME_ALWAYS_YES=yes pkg install python
ssh gluster4 env ASSUME_ALWAYS_YES=yes pkg install python
ssh gluster5 env ASSUME_ALWAYS_YES=yes pkg install python
ssh gluster6 env ASSUME_ALWAYS_YES=yes pkg install python

… and execution on the FreeBSD machines with GNU Parallel.

vbhost % ansible -i hosts --list-hosts gluster \
           | sed 1d \
           | while read I; do echo ssh ${I} env ASSUME_ALWAYS_YES=yes pkg install python; done | parallel

Computers / CPU cores / Max jobs to run
1:local / 2 / 2

Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
local:0/6/100%/156.0s

The Python package and its dependencies are installed.

vbhost % ssh gluster3 pkg info
gettext-runtime-0.19.8.1_2     GNU gettext runtime libraries and programs
indexinfo-0.3.1                Utility to regenerate the GNU info page index
libffi-3.2.1_3                 Foreign Function Interface
pkg-1.10.5_5                   Package manager
python-2.7_3,2                 "meta-port" for the default version of Python interpreter
python2-2_3                    The "meta-port" for version 2 of the Python interpreter
python27-2.7.15                Interpreted object-oriented programming language
readline-7.0.5                 Library for editing command lines as they are typed

Now the Ansible ping module works as desired.

vbhost % ansible -i hosts -m ping gluster
gluster1 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
gluster4 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
gluster5 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
gluster3 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
gluster2 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
gluster6 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

GlusterFS Volume Options

GlusterFS has a lot of options to setup the volume. They are described in the GlusterFS Administration Guide in the Setting up GlusterFS Volumes part. Here they are:

Distributed – Distributed volumes distribute files across the bricks in the volume. You can use distributed volumes where the requirement is to scale storage and the redundancy is either not important or is provided by other hardware/software layers.

Replicated – Replicated volumes replicate files across bricks in the volume. You can use replicated volumes in environments where high-availability and high-reliability are critical.

Distributed Replicated – Distributed replicated volumes distribute files across replicated bricks in the volume. You can use distributed replicated volumes in environments where the requirement is to scale storage and high-reliability is critical. Distributed replicated volumes also offer improved read performance in most environments.

Dispersed – Dispersed volumes are based on erasure codes, providing space-efficient protection against disk or server failures. It stores an encoded fragment of the original file to each brick in a way that only a subset of the fragments is needed to recover the original file. The number of bricks that can be missing without losing access to data is configured by the administrator on volume creation time.

Distributed Dispersed – Distributed dispersed volumes distribute files across dispersed subvolumes. This has the same advantages of distribute replicate volumes, but using disperse to store the data into the bricks.

Striped [Deprecated] – Striped volumes stripes data across bricks in the volume. For best results, you should use striped volumes only in high concurrency environments accessing very large files.

Distributed Striped [Deprecated] – Distributed striped volumes stripe data across two or more nodes in the cluster. You should use distributed striped volumes where the requirement is to scale storage and in high concurrency environments accessing very large files is critical.

Distributed Striped Replicated [Deprecated] – Distributed striped replicated volumes distributes striped data across replicated bricks in the cluster. For best results, you should use distributed striped replicated volumes in highly concurrent environments where parallel access of very large files and performance is critical. In this release, configuration of this volume type is supported only for Map Reduce workloads.

Striped Replicated [Deprecated] – Striped replicated volumes stripes data across replicated bricks in the cluster. For best results, you should use striped replicated volumes in highly concurrent environments where there is parallel access of very large files and performance is critical. In this release, configuration of this volume type is supported only for Map Reduce workloads.

Of all the volume types above that are still supported, the Dispersed volume seems to be the best choice. Like Minio, Dispersed volumes are based on erasure codes.

As we have 6 servers we will use a 4 + 2 setup, which is a logical RAID6 across these 6 servers. This means that we will be able to lose 2 of them without a service outage. This also means that if we upload a 100 MB file to our volume we will use 150 MB of space across these 6 servers, with 25 MB on each node.

We can visualize this with the following ASCII diagram.

+-----------+ +-----------+ +-----------+ +-----------+ +-----------+ +-----------+
| gluster1  | | gluster2  | | gluster3  | | gluster4  | | gluster5  | | gluster6  |
|           | |           | |           | |           | |           | |           |
|  brick1   | |  brick2   | |  brick3   | |  brick4   | |  brick5   | |  brick6   |
+-----+-----+ +-----+-----+ +-----+-----+ +-----+-----+ +-----+-----+ +-----+-----+
      |             |             |             |             |             |
    25|MB         25|MB         25|MB         25|MB         25|MB         25|MB
      |             |             |             |             |             |
      +-------------+-------------+------+------+-------------+-------------+
                                         |
                                      100|MB
                                         |
                                     +---+---+
                                     | file0 |
                                     +-------+
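The numbers in the diagram come from simple arithmetic: the file is split into 4 data fragments of 100 / 4 = 25 MB, and 2 redundancy fragments of the same size are added, so 6 × 25 = 150 MB is stored in total. Here is a minimal sketch of that calculation. The disperse_usage function is a hypothetical helper, and it assumes the file size divides evenly by the number of data bricks and ignores any metadata overhead.

```shell
#!/bin/sh
# Space used by a file on a dispersed (erasure-coded) volume.
# Arguments: file size in MB, data bricks, redundancy bricks.
# Assumes the size divides evenly by the number of data bricks
# and ignores metadata overhead.
disperse_usage() {
  SIZE=$1
  DATA=$2
  REDUNDANCY=$3
  PER_BRICK=$(( SIZE / DATA ))
  TOTAL=$(( PER_BRICK * (DATA + REDUNDANCY) ))
  echo "per-brick=${PER_BRICK}MB total=${TOTAL}MB"
}

# The 4 + 2 layout from this article with a 100 MB file.
disperse_usage 100 4 2
# prints: per-brick=25MB total=150MB
```

The storage overhead is therefore (4 + 2) / 4 = 1.5 times the file size, compared with 3 times for a replicated volume that survives the loss of 2 servers.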

Deploy GlusterFS Cluster

We will use gluster-setup.yml as our Ansible playbook.

Let’s create something to start with, for example a task that always installs the latest Python package.

vbhost % cat gluster-setup.yml
---
- name: Install and Setup GlusterFS on FreeBSD
  hosts: gluster
  user: root
  tasks:

  - name: Install Latest Python Package
    pkgng:
      name: python
      state: latest

We will now execute it.

vbhost % ansible-playbook -i hosts gluster-setup.yml

PLAY [Install and Setup GlusterFS on FreeBSD] **********************************

TASK [Gathering Facts] *********************************************************
ok: [gluster3]
ok: [gluster5]
ok: [gluster1]
ok: [gluster4]
ok: [gluster2]
ok: [gluster6]

TASK [Install Latest Python Package] *******************************************
ok: [gluster4]
ok: [gluster2]
ok: [gluster5]
ok: [gluster3]
ok: [gluster1]
ok: [gluster6]

PLAY RECAP *********************************************************************
gluster1 : ok=2 changed=0 unreachable=0 failed=0
gluster2 : ok=2 changed=0 unreachable=0 failed=0
gluster3 : ok=2 changed=0 unreachable=0 failed=0
gluster4 : ok=2 changed=0 unreachable=0 failed=0
gluster5 : ok=2 changed=0 unreachable=0 failed=0
gluster6 : ok=2 changed=0 unreachable=0 failed=0

We had just installed Python on these machines, so no update was needed.

As we will be creating a cluster we need to add time synchronization between its nodes. We will use the most obvious solution, the ntpd(8) daemon that is in the FreeBSD base system. These lines are added to our gluster-setup.yml playbook to achieve this goal.

  - name: Enable NTPD Service
    raw: sysrc ntpd_enable=YES

  - name: Start NTPD Service
    service:
      name: ntpd
      state: started

After executing the playbook again with the ansible-playbook -i hosts gluster-setup.yml command we will see additional output as the one shown below.

TASK [Enable NTPD Service] ************************************************
changed: [gluster2]
changed: [gluster1]
changed: [gluster4]
changed: [gluster5]
changed: [gluster3]
changed: [gluster6]

TASK [Start NTPD Service] ******************************************************
changed: [gluster5]
changed: [gluster4]
changed: [gluster2]
changed: [gluster1]
changed: [gluster3]
changed: [gluster6]

Random verification of the NTP service.

vbhost % ssh gluster1 ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 0.freebsd.pool. .POOL.          16 p    -   64    0    0.000    0.000   0.000
 ntp.ifj.edu.pl  10.0.2.4         3 u    1   64    1  119.956  -345759  32.552
 news-archive.ic 229.30.220.210   2 u    -   64    1   60.533  -345760  21.104

Now we need to install GlusterFS on FreeBSD machines – the glusterfs package.

We will add appropriate section to the playbook.

  - name: Install Latest GlusterFS Package
    pkgng:
      state: latest
      name:
      - glusterfs
      - ncdu

You can pass more than one package to the pkgng Ansible module. For example, I have also added the ncdu package.

You can read more about pkgng Ansible module by typing the ansible-doc pkgng command or at least its short version with -s argument.

vbhost % ansible-doc -s pkgng
- name: Package manager for FreeBSD >= 9.0
  pkgng:
      annotation:   # A comma-separated list of keyvalue-pairs of the form `[=]'. A `+' denotes adding an annotation, a `-' denotes removing an annotation, and `:' denotes modifying an annotation. If setting or modifying annotations, a value must be provided.
      autoremove:   # Remove automatically installed packages which are no longer needed.
      cached:       # Use local package base instead of fetching an updated one.
      chroot:       # Pkg will chroot in the specified environment. Can not be used together with `rootdir' or `jail' options.
      jail:         # Pkg will execute in the given jail name or id. Can not be used together with `chroot' or `rootdir' options.
      name:         # (required) Name or list of names of packages to install/remove.
      pkgsite:      # For pkgng versions before 1.1.4, specify packagesite to use for downloading packages. If not specified, use settings from `/usr/local/etc/pkg.conf'. For newer pkgng versions, specify a the name of a repository configured in `/usr/local/etc/pkg/repos'.
      rootdir:      # For pkgng versions 1.5 and later, pkg will install all packages within the specified root directory. Can not be used together with `chroot' or `jail' options.
      state:        # State of the package. Note: "latest" added in 2.7

You can read more about this particular module on the following – https://docs.ansible.com/ansible/latest/modules/pkgng_module.html – Ansible page.

We will now add GlusterFS nodes to the /etc/hosts file and add autoboot_delay=1 parameter to the /boot/loader.conf file so our systems will boot 9 seconds faster as 10 is the default delay setting.

Here is our gluster-setup.yml Ansible playbook so far.

vbhost % cat gluster-setup.yml
---
- name: Install and Setup GlusterFS on FreeBSD
  hosts: gluster
  user: root
  tasks:

  - name: Install Latest Python Package
    pkgng:
      name: python
      state: latest

  - name: Enable NTPD Service
    raw: sysrc ntpd_enable=YES

  - name: Start NTPD Service
    service:
      name: ntpd
      state: started

  - name: Install Latest GlusterFS Package
    pkgng:
      state: latest
      name:
      - glusterfs
      - ncdu

  - name: Add Nodes to /etc/hosts File
    blockinfile:
      path: /etc/hosts
      block: |
        10.0.10.11 gluster1
        10.0.10.12 gluster2
        10.0.10.13 gluster3
        10.0.10.14 gluster4
        10.0.10.15 gluster5
        10.0.10.16 gluster6

  - name: Add autoboot_delay to /boot/loader.conf File
    lineinfile:
      path: /boot/loader.conf
      line: autoboot_delay=1
      create: yes

Here is the result of the execution of this playbook.

vbhost % ansible-playbook -i hosts gluster-setup.yml

PLAY [Install and Setup GlusterFS on FreeBSD] **********************************

TASK [Gathering Facts] *********************************************************
ok: [gluster3]
ok: [gluster5]
ok: [gluster1]
ok: [gluster4]
ok: [gluster2]
ok: [gluster6]

TASK [Install Latest Python Package] *******************************************
ok: [gluster4]
ok: [gluster2]
ok: [gluster5]
ok: [gluster3]
ok: [gluster1]
ok: [gluster6]

TASK [Install Latest GlusterFS Package] ****************************************
ok: [gluster2]
ok: [gluster1]
ok: [gluster3]
ok: [gluster5]
ok: [gluster4]
ok: [gluster6]

TASK [Add Nodes to /etc/hosts File] ********************************************
changed: [gluster5]
changed: [gluster4]
changed: [gluster2]
changed: [gluster3]
changed: [gluster1]
changed: [gluster6]

TASK [Enable GlusterFS Service] ************************************************
changed: [gluster1]
changed: [gluster4]
changed: [gluster2]
changed: [gluster3]
changed: [gluster5]
changed: [gluster6]

TASK [Add autoboot_delay to /boot/loader.conf File] ****************************
changed: [gluster3]
changed: [gluster2]
changed: [gluster5]
changed: [gluster1]
changed: [gluster4]
changed: [gluster6]

PLAY RECAP *********************************************************************
gluster1 : ok=6 changed=3 unreachable=0 failed=0
gluster2 : ok=6 changed=3 unreachable=0 failed=0
gluster3 : ok=6 changed=3 unreachable=0 failed=0
gluster4 : ok=6 changed=3 unreachable=0 failed=0
gluster5 : ok=6 changed=3 unreachable=0 failed=0
gluster6 : ok=6 changed=3 unreachable=0 failed=0

Let’s check that FreeBSD machines can now ping each other by names.

vbhost % ssh gluster6 cat /etc/hosts
# LOOPBACK
127.0.0.1 localhost localhost.my.domain
::1 localhost localhost.my.domain

# BEGIN ANSIBLE MANAGED BLOCK
10.0.10.11 gluster1
10.0.10.12 gluster2
10.0.10.13 gluster3
10.0.10.14 gluster4
10.0.10.15 gluster5
10.0.10.16 gluster6
# END ANSIBLE MANAGED BLOCK

vbhost % ssh gluster1 ping -c 1 gluster3
PING gluster3 (10.0.10.13): 56 data bytes
64 bytes from 10.0.10.13: icmp_seq=0 ttl=64 time=1.924 ms

--- gluster3 ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 1.924/1.924/1.924/0.000 ms

… and our /boot/loader.conf file.

vbhost % ssh gluster4 cat /boot/loader.conf
autoboot_delay=1

Now we need to create directories for GlusterFS data. For lack of a better idea we will use the /data directory with /data/volume1 as the directory for volume1, and the bricks will be placed in directories such as /data/volume1/brick1. In this setup I will use just one brick per server but in a production environment you would probably use one brick per physical disk.

Here is the playbook task we will use to create these directories on the FreeBSD machines.

  - name: Create brick* Directories for volume1
    raw: mkdir -p /data/volume1/brick` hostname | grep -o -E '[0-9]+' `

After executing it with the ansible-playbook -i hosts gluster-setup.yml command the directories have been created.

vbhost % ssh gluster2 find /data -ls | column -t
2247168  8  drwxr-xr-x  3  root  wheel  512  Dec  28  17:48  /data
2247169  8  drwxr-xr-x  3  root  wheel  512  Dec  28  17:48  /data/volume2
2247170  8  drwxr-xr-x  2  root  wheel  512  Dec  28  17:48  /data/volume2/brick2
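The playbook task derives the brick number from the digits in each node's hostname, so gluster4 ends up with brick4. Here is a minimal sketch of that derivation on its own. The brick_path function is a hypothetical helper that takes the hostname as an argument so it can be tried on any machine, and it assumes the hostname contains a single run of digits.

```shell
#!/bin/sh
# Hypothetical helper: derive the brick path from a node's hostname,
# using the same grep expression as the playbook task
# (digits in the hostname -> brick number).
# Assumes the hostname contains a single run of digits, e.g. gluster4.
brick_path() {
  NUM=$( echo "$1" | grep -o -E '[0-9]+' )
  echo "/data/volume1/brick${NUM}"
}

brick_path gluster4
# On a node itself one would call: brick_path "$( hostname )"
```

A hostname with more than one group of digits (for example a fully qualified name that includes a numbered domain) would produce several matches, so the expression should be tightened before reusing it outside this lab.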

We now need to add glusterd_enable=YES to the /etc/rc.conf file on the GlusterFS nodes and then start the GlusterFS service.

This is the snippet we will add to our playbook.

  - name: Enable GlusterFS Service
    raw: sysrc glusterd_enable=YES

  - name: Start GlusterFS Service
    service:
      name: glusterd
      state: started

Let’s make a quick random verification.

vbhost % ssh gluster4 service glusterd status
glusterd is running as pid 2684.

Now we need to proceed to the last part of the GlusterFS setup – create the volume.

We will do this from the gluster1 – the 1st node of the GlusterFS cluster.

First we need to peer probe other nodes.

gluster1 # gluster peer probe gluster1
peer probe: success. Probe on localhost not needed
gluster1 # gluster peer probe gluster2
peer probe: success.
gluster1 # gluster peer probe gluster3
peer probe: success.
gluster1 # gluster peer probe gluster4
peer probe: success.
gluster1 # gluster peer probe gluster5
peer probe: success.
gluster1 # gluster peer probe gluster6
peer probe: success.
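Since only the node number changes between the probes, the commands can be generated with a short loop. This is a sketch under the naming convention used in this article (nodes called gluster1 to glusterN). The peer_probe_cmds function is a hypothetical helper that only prints the commands, and it skips gluster1 because the output above shows that probing the local node is not needed.

```shell
#!/bin/sh
# Print the peer probe commands for nodes 2..COUNT.
# gluster1 is skipped because probing the local node is not needed.
# Argument: total number of nodes in the cluster.
peer_probe_cmds() {
  N=2
  while [ "${N}" -le "$1" ]
  do
    echo "gluster peer probe gluster${N}"
    N=$(( N + 1 ))
  done
}

# Dry run; on gluster1 the output could be piped to sh to execute it.
peer_probe_cmds 6
```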

Then we can create the volume. We need to use the force option because in our example setup the bricks are directories on the root partition.

gluster1 # gluster volume create volume1 \
             disperse-data 4 \
             redundancy 2 \
             transport tcp \
             gluster1:/data/volume1/brick1 \
             gluster2:/data/volume1/brick2 \
             gluster3:/data/volume1/brick3 \
             gluster4:/data/volume1/brick4 \
             gluster5:/data/volume1/brick5 \
             gluster6:/data/volume1/brick6 \
             force
volume create: volume1: success: please start the volume to access data
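The brick arguments follow one pattern, glusterN:/data/volume1/brickN, so the long command line can be assembled rather than typed. This is a minimal sketch, assuming one brick per node with the paths used in this article. The brick_list function is a hypothetical helper, and the example prints the final command instead of running it.

```shell
#!/bin/sh
# Build the brick arguments for 'gluster volume create'.
# Assumes one brick per node at /data/VOLUME/brickN on host glusterN.
# Arguments: volume name, number of nodes.
brick_list() {
  VOLUME=$1
  COUNT=$2
  LIST=""
  N=1
  while [ "${N}" -le "${COUNT}" ]
  do
    # ${LIST:+ } adds a separating space only when LIST is not empty.
    LIST="${LIST}${LIST:+ }gluster${N}:/data/${VOLUME}/brick${N}"
    N=$(( N + 1 ))
  done
  echo "${LIST}"
}

# Print the full command instead of running it.
echo "gluster volume create volume1 disperse-data 4 redundancy 2 transport tcp $(brick_list volume1 6) force"
```

The number of bricks passed must equal disperse-data plus redundancy (4 + 2 = 6 here) for a single dispersed set.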

We can now start the volume1 GlsuerFS volume.

gluster1 # gluster volume start volume1 volume start: volume1: success gluster1 # gluster volume status volume1 Status of volume: volume1 Gluster process TCP Port RDMA Port Online Pid ------------------------------------------------------------------------------ Brick gluster1:/data/volume1/brick1 N/A N/A N N/A Brick gluster2:/data/volume1/brick2 N/A N/A N N/A Brick gluster3:/data/volume1/brick3 N/A N/A N N/A Brick gluster4:/data/volume1/brick4 N/A N/A N N/A Brick gluster5:/data/volume1/brick5 N/A N/A N N/A Brick gluster6:/data/volume1/brick6 N/A N/A N N/A Self-heal Daemon on localhost N/A N/A N 644 Self-heal Daemon on gluster6 N/A N/A N 643 Self-heal Daemon on gluster5 N/A N/A N 647 Self-heal Daemon on gluster2 N/A N/A N 645 Self-heal Daemon on gluster3 N/A N/A N 645 Self-heal Daemon on gluster4 N/A N/A N 645 Task Status of Volume volume1 ------------------------------------------------------------------------------ There are no active volume tasks gluster1 # gluster volume info volume1 Volume Name: volume1 Type: Disperse Volume ID: 68cf9607-16bc-4550-9b6b-16a5c7656f51 Status: Started Snapshot Count: 0 Number of Bricks: 1 x (4 + 2) = 6 Transport-type: tcp Bricks: Brick1: gluster1:/data/volume1/brick1 Brick2: gluster2:/data/volume1/brick2 Brick3: gluster3:/data/volume1/brick3 Brick4: gluster4:/data/volume1/brick4 Brick5: gluster5:/data/volume1/brick5 Brick6: gluster6:/data/volume1/brick6 Options Reconfigured: nfs.disable: on transport.address-family: inet

Here are the contents of a currently unused (empty) brick.

gluster1 # find /data/volume1/brick1
/data/volume1/brick1
/data/volume1/brick1/.glusterfs
/data/volume1/brick1/.glusterfs/indices
/data/volume1/brick1/.glusterfs/indices/xattrop
/data/volume1/brick1/.glusterfs/indices/entry-changes
/data/volume1/brick1/.glusterfs/quarantine
/data/volume1/brick1/.glusterfs/quarantine/stub-00000000-0000-0000-0000-000000000008
/data/volume1/brick1/.glusterfs/changelogs
/data/volume1/brick1/.glusterfs/changelogs/htime
/data/volume1/brick1/.glusterfs/changelogs/csnap
/data/volume1/brick1/.glusterfs/brick1.db
/data/volume1/brick1/.glusterfs/brick1.db-wal
/data/volume1/brick1/.glusterfs/brick1.db-shm
/data/volume1/brick1/.glusterfs/00
/data/volume1/brick1/.glusterfs/00/00
/data/volume1/brick1/.glusterfs/00/00/00000000-0000-0000-0000-000000000001
/data/volume1/brick1/.glusterfs/landfill
/data/volume1/brick1/.glusterfs/unlink
/data/volume1/brick1/.glusterfs/health_check

The 6-node GlusterFS cluster is now complete and volume1 is available to use.

Alternative

The Quick Start Guide in the GlusterFS documentation also suggests using Ansible to deploy and manage GlusterFS, with the gluster-ansible or gluster-ansible-cluster repositories, but they have the requirements below.

Ansible version 2.5 or above.

GlusterFS version 3.2 or above.

As GlusterFS on FreeBSD is at version 3.11.1, I did not use them.

FreeBSD Client

We will now use another VirtualBox machine, also based on the same FreeBSD 12.0-RELEASE image, to create a FreeBSD client machine that will mount our volume1 volume.

We will need to install the glusterfs package with the pkg(8) command. Then we will use the mount_glusterfs command to mount the volume. Keep in mind that mounting a GlusterFS volume requires the FUSE (fuse.ko) kernel module to be loaded.

client # pkg install glusterfs
client # kldload fuse
client # mount_glusterfs 10.0.10.11:volume1 /mnt
client # echo $?
0
client # mount
/dev/gpt/rootfs on / (ufs, local, soft-updates)
devfs on /dev (devfs, local, multilabel)
/dev/fuse on /mnt (fusefs, local, synchronous)
client # ls /mnt
ls: /mnt: Socket is not connected

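It is mounted but does not work. The solution to this problem is to add /etc/hosts entries for the GlusterFS nodes on the client. The volume refers to its bricks by host name, so the client must be able to resolve those names.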

client # cat /etc/hosts
::1 localhost localhost.my.domain
127.0.0.1 localhost localhost.my.domain
10.0.10.11 gluster1
10.0.10.12 gluster2
10.0.10.13 gluster3
10.0.10.14 gluster4
10.0.10.15 gluster5
10.0.10.16 gluster6
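The six entries follow the addressing scheme from the Lab Setup section, so they can be generated rather than typed on every client. Below is a minimal sketch, assuming the 10.0.10.11 to 10.0.10.16 addresses and the gluster1 to gluster6 names used in this article. The gluster_hosts function name is made up for this example.

```shell
# Print /etc/hosts entries for gluster1..glusterN (10.0.10.11 and up).
# gluster_hosts is a hypothetical helper name used only in this sketch.
gluster_hosts() {
  nodes=$1
  i=1
  while [ "$i" -le "$nodes" ]; do
    printf '10.0.10.%d gluster%d\n' "$((10 + i))" "$i"
    i=$((i + 1))
  done
}

# Review the output first; once it looks right, append it as root with:
#   gluster_hosts 6 >> /etc/hosts
gluster_hosts 6
```

Printing first and appending later keeps the sketch safe to run repeatedly, as it never touches /etc/hosts on its own.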

Let's mount it again, now with the needed /etc/hosts entries.

client # umount /mnt
client # mount_glusterfs gluster1:volume1 /mnt
client # ls /mnt
client #

We now have our GlusterFS volume properly mounted and working on the FreeBSD client machine.

Let's write a file there with dd(8) to see how it works.

client # dd < /dev/zero > FILE bs=1m count=100 status=progress
  73400320 bytes (73 MB, 70 MiB) transferred 1.016s, 72 MB/s
100+0 records in
100+0 records out
104857600 bytes transferred in 1.565618 secs (66975227 bytes/sec)

Let’s see how it looks in the brick directory.

gluster1 # ls -lh /data/volume1/brick1
total 25640
drw-------  10 root  wheel   512B Jan  3 18:31 .glusterfs
-rw-r--r--   2 root  wheel    25M Jan  3 18:31 FILE
gluster1 # find /data
/data/
/data/volume1
/data/volume1/brick1
/data/volume1/brick1/.glusterfs
/data/volume1/brick1/.glusterfs/indices
/data/volume1/brick1/.glusterfs/indices/xattrop
/data/volume1/brick1/.glusterfs/indices/xattrop/xattrop-aed814f1-0eb0-46a1-b569-aeddf5048e06
/data/volume1/brick1/.glusterfs/indices/entry-changes
/data/volume1/brick1/.glusterfs/quarantine
/data/volume1/brick1/.glusterfs/quarantine/stub-00000000-0000-0000-0000-000000000008
/data/volume1/brick1/.glusterfs/changelogs
/data/volume1/brick1/.glusterfs/changelogs/htime
/data/volume1/brick1/.glusterfs/changelogs/csnap
/data/volume1/brick1/.glusterfs/brick1.db
/data/volume1/brick1/.glusterfs/brick1.db-wal
/data/volume1/brick1/.glusterfs/brick1.db-shm
/data/volume1/brick1/.glusterfs/00
/data/volume1/brick1/.glusterfs/00/00
/data/volume1/brick1/.glusterfs/00/00/00000000-0000-0000-0000-000000000001
/data/volume1/brick1/.glusterfs/landfill
/data/volume1/brick1/.glusterfs/unlink
/data/volume1/brick1/.glusterfs/health_check
/data/volume1/brick1/.glusterfs/ac
/data/volume1/brick1/.glusterfs/ac/b4
/data/volume1/brick1/.glusterfs/11
/data/volume1/brick1/.glusterfs/11/50
/data/volume1/brick1/.glusterfs/11/50/115043ca-420f-48b5-af05-c9552db2e585
/data/volume1/brick1/FILE
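The 25M size of FILE on the brick follows from the 4 + 2 disperse layout: a 100 MB file is cut into 4 data fragments, and 2 redundancy fragments of the same size are added, one fragment per brick. Below is a minimal sketch of that arithmetic, assuming the file size divides evenly by the number of data bricks. The function names are made up for this example.

```shell
# fragment_mib FILE_MIB DATA: size of the fragment that each brick stores.
fragment_mib() {
  echo $(( $1 / $2 ))
}

# raw_mib FILE_MIB DATA REDUNDANCY: total space used across all bricks.
raw_mib() {
  echo $(( $1 / $2 * ($2 + $3) ))
}

echo "per brick:  $(fragment_mib 100 4) MiB"   # prints "per brick:  25 MiB"
echo "all bricks: $(raw_mib 100 4 2) MiB"      # prints "all bricks: 150 MiB"
```

So the 100 MB file takes 150 MB of raw space in total, which is the price of being able to lose two bricks.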

Linux Client

I will also show how to mount a GlusterFS volume on CentOS, the Red Hat clone, in its latest 7.6 incarnation. It requires installing the glusterfs-fuse package.

[root@localhost ~]# yum install glusterfs-fuse
[root@localhost ~]# rpm -q --filesbypkg glusterfs-fuse | grep /sbin/mount.glusterfs
glusterfs-fuse /sbin/mount.glusterfs
[root@localhost ~]# mount.glusterfs 10.0.10.11:volume1 /mnt
Mount failed. Please check the log file for more details.

As with the FreeBSD client, the /etc/hosts entries are needed.

[root@localhost ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.0.10.11 gluster1
10.0.10.12 gluster2
10.0.10.13 gluster3
10.0.10.14 gluster4
10.0.10.15 gluster5
10.0.10.16 gluster6
[root@localhost ~]# mount.glusterfs 10.0.10.11:volume1 /mnt
[root@localhost ~]# ls /mnt
FILE
[root@localhost ~]# mount
10.0.10.11:volume1 on /mnt type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

With the appropriate /etc/hosts entries it works as desired. We see the FILE file generated from the FreeBSD client machine.

GlusterFS Cluster Redundancy

After messing with the volume and creating and deleting various files I also tested its redundancy. In theory this RAID6-equivalent protection should protect us from the loss of two of the six servers. After shutting down two of the VirtualBox machines the volume was still available and ready to use.
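The rule behind this test is simple: a disperse volume stays available as long as at least as many bricks remain as there are data fragments. Below is a minimal sketch of that check for the 4 + 2 layout used here. The survives function name is made up for this example, and the script only does the arithmetic, it does not query the cluster.

```shell
# survives DATA REDUNDANCY FAILED: succeeds (exit status 0) when the number
# of remaining bricks is still at least the number of data fragments.
survives() {
  [ "$(( $1 + $2 - $3 ))" -ge "$1" ]
}

for failed in 0 1 2 3; do
  if survives 4 2 "$failed"; then
    echo "$failed node(s) down: volume available"
  else
    echo "$failed node(s) down: volume unavailable"
  fi
done
```

With 4 data and 2 redundancy bricks the check passes for up to two failed nodes and fails for the third, which matches the behaviour observed above.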

Closing Thoughts

It is a pity that FreeBSD does not provide a more modern GlusterFS package, as currently only version 3.11.1 is available.

EOF