A previous article, Galera Cluster for MySQL and MariaDB, covered the basics of Galera Cluster. This article goes further into the Galera technology and discusses the following topics:

1. Who provides Galera Cluster?

2. What is MariaDB Galera Cluster?

3. An overview of MariaDB Galera Cluster Setup.

Galera Cluster is a synchronous multi-master cluster built on the InnoDB storage engine; in MariaDB it supports both the XtraDB and InnoDB storage engines. Technically, it is the Galera replication plugin, which implements the wsrep API exposed by the underlying DBMS. Galera Cluster uses certification-based synchronous replication in a multi-master setup. This avoids many problems of clusters based on asynchronous replication, such as undetected write conflicts, replication lag between cluster nodes, slaves going out of sync with masters, and a single point of failure.

Certification-based replication relies on group communication and transaction ordering. Changes made on one node are collected into a write-set at COMMIT, and this write-set is broadcast to the other nodes. Each node, including the source node, runs a certification test to decide whether the write-set can be applied. If the test passes, the write-set is applied as a transaction and committed on the nodes; otherwise the changes are discarded and the transaction is rolled back on the source node.

The certification test depends on the global ordering of transactions, in which each transaction is assigned a global transaction ID. At commit time the write-set is checked against the preceding transactions for primary key conflicts, and if a conflict is detected the test fails. Because all nodes receive write-sets in the same global order, they all reach the same certification result.
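The certification test can be illustrated with a toy model. The Python sketch below is not Galera's implementation: the names WriteSet, Certifier, last_seen and replay are hypothetical, and the index keeps only one sequence number per primary key. It assumes that a write-set conflicts when a key it modifies was changed by a write-set ordered after the point the transaction had last seen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteSet:
    """Hypothetical stand-in for a replicated write-set."""
    source: str          # node where the transaction ran
    keys: frozenset      # primary keys the transaction modified
    last_seen: int       # last global seqno the transaction had seen


class Certifier:
    """Toy certification index; every node would hold its own copy."""

    def __init__(self):
        self.seqno = 0       # position in the global order
        self.index = {}      # key -> seqno of the last certified change

    def certify(self, ws):
        # Every delivered write-set takes the next slot in the global order.
        self.seqno += 1
        # Conflict: a key was changed by a write-set ordered after the
        # point this transaction had last seen.
        if any(self.index.get(key, 0) > ws.last_seen for key in ws.keys):
            return False     # fails certification; changes are discarded
        for key in ws.keys:
            self.index[key] = self.seqno
        return True          # passes certification; safe to apply and commit


def replay(write_sets):
    """Run one node's certifier over write-sets in their delivery order."""
    node = Certifier()
    return [node.certify(ws) for ws in write_sets]


# Two transactions update the same row concurrently; a third touches another row.
ordered = [
    WriteSet("node1", frozenset({("t1", 1)}), last_seen=0),
    WriteSet("node2", frozenset({("t1", 1)}), last_seen=0),
    WriteSet("node2", frozenset({("t1", 2)}), last_seen=0),
]

# Each node certifies independently but sees the same order, so results match.
results_node1 = replay(ordered)
results_node2 = replay(ordered)
print(results_node1)
```

In this run the first write-set passes, the second fails because it touches a row changed after its starting point, and the third passes. Both simulated nodes reach the same decisions without exchanging any further messages, which is the property the global ordering is meant to provide.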

The Galera Cluster plugin is an open source patch for MySQL developed by Codership. Codership distributes it as two software packages: the Galera replication library, and the original MySQL version extended with the Write Set Replication (wsrep) API implementation (mysql-wsrep). Including this Codership product, there are three Galera variants:

1. MySQL Galera Cluster by Codership

2. MariaDB Galera Cluster by MariaDB

3. Percona XtraDB Cluster for MySQL by Percona, which integrates Percona Server and Percona XtraBackup with the Codership Galera library.

The MariaDB Galera Cluster

Starting with MariaDB 10.1, the wsrep API for Galera Cluster is included in the standard MariaDB package. MariaDB Galera Cluster uses the Codership Galera library and mysql-wsrep to implement its cluster. MariaDB is available for the major Linux distributions, including openSUSE, Arch Linux, Fedora, CentOS, Red Hat, Mint, Ubuntu and Debian. Like any Galera Cluster implementation, MariaDB Galera Cluster is built from the following components:

DBMS: here, MariaDB Server.

wsrep API: defines the interface and responsibilities of the DBMS server and the replication provider.

wsrep hooks: the integration of the wsrep API inside the DBMS engine.

Galera plugin: implements the wsrep API for the Galera library to provide the write-set replication functionality.

Certification layer: prepares write-sets and performs certification.

Replication layer: manages the entire replication and provides total ordering capabilities.

GCS framework: provides a plugin architecture for the various group communication plugins of Galera Cluster.

Features of the MariaDB Galera Cluster

Like any other Galera Cluster, MariaDB Galera Cluster has the following features:

Certification based synchronous replication.

Multi-master topology.

Clients can read/write to any node.

Cluster membership control and dropping of failed nodes from the cluster with re-join upon recovery.

Automatic joining of nodes to running cluster.

Row level parallel replication.

Direct client connections through MariaDB interface.

Downloading and Installing MariaDB Galera Cluster

MariaDB Galera Cluster can be downloaded and installed using Yum or Apt. The Galera package is included by default in the MariaDB packages. On Ubuntu, the following commands install MariaDB Server with the Galera plugin:

sudo apt-get update

sudo apt-get install mariadb-server

Apart from the memory needed by MariaDB Server itself, additional memory is needed for the certification index and for uncommitted write-sets. Write-set caching takes place when a node cannot process incoming write-sets. This typically happens while a node is receiving a state transfer, because it does not yet have a state to apply the write-sets to. Depending on the state transfer method (for example mysqldump), the node sending the state may not be able to apply write-sets either. The pending write-sets are then cached in memory for a catch-up phase. Normally they are read back from the in-memory cache and committed, but if the system runs out of memory, either the state transfer fails or the cluster blocks waiting for the state transfer to end. The following Galera parameters control write-set caching:

gcs.recv_q_hard_limit : The maximum allowed size of the recv queue (write-sets received but not yet applied). The recommended value is half of (RAM + swap). If this limit is exceeded, Galera Cluster aborts the server.

gcs.recv_q_soft_limit : It is the fraction (in decimal form) of gcs.recv_q_hard_limit after which replication rate will be throttled to avoid a memory shortage and server abort. Its default value is 0.25.

gcs.max_throttle : It is a fraction (in decimal form) that specifies how much to throttle replication rate during state transfer so that running out of memory can be avoided. Its default value is 0.25.
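To make the relationship between the three parameters concrete, here is a small Python sketch. It computes the queue size at which throttling would begin and formats the values as a semicolon-separated string of the kind accepted by the wsrep_provider_options server variable (a variable not discussed in this article). The helper names and the 4 GiB hard limit are assumptions chosen for illustration, not recommendations.

```python
GIB = 1024 ** 3


def throttle_start_bytes(hard_limit_bytes, soft_limit_fraction=0.25):
    """Queue size at which the replication rate starts to be throttled."""
    if not 0.0 <= soft_limit_fraction <= 1.0:
        raise ValueError("soft limit must be a fraction between 0 and 1")
    return int(hard_limit_bytes * soft_limit_fraction)


def provider_options(hard_limit_bytes, soft_limit_fraction=0.25, max_throttle=0.25):
    """Format the three settings as a semicolon-separated option string."""
    return ";".join([
        f"gcs.recv_q_hard_limit={hard_limit_bytes}",
        f"gcs.recv_q_soft_limit={soft_limit_fraction}",
        f"gcs.max_throttle={max_throttle}",
    ])


hard_limit = 4 * GIB   # assumed value, for illustration only
print(throttle_start_bytes(hard_limit))
print(f'wsrep_provider_options="{provider_options(hard_limit)}"')
```

With a 4 GiB hard limit and the default soft limit of 0.25, throttling would start once the queue holds 1 GiB of write-sets.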

The following server requirements are applicable to a MariaDB Galera Cluster:

1. The general log and slow query log need to be of file type.

2. For versions before 5.5.40-galera and 10.0.14-galera, the query cache has to be disabled.

Running MariaDB Galera Cluster

A new cluster has to be bootstrapped, which tells the server that there is no existing cluster to connect to. To do this, start mysqld with the --wsrep-new-cluster option.

$ mysqld --wsrep-new-cluster

For systems using SysV init scripts:

$ service mysql bootstrap

To add a new node to a running cluster:

$ mysqld --wsrep_cluster_address=gcomm://192.168.0.1

A DNS name can be used in place of the IP address. A new node only needs to connect to any one existing node; it then automatically retrieves the cluster map and connects to the rest of the nodes.

MariaDB Galera Cluster Settings

The following mandatory settings apply to the cluster. Set them either in the MariaDB Server configuration file or as command-line options (given in brackets).

wsrep_provider : file location of the wsrep library. (--wsrep-provider=value)

wsrep_cluster_address : IP/DNS address of the cluster nodes to connect to when starting up, e.g. gcomm://192.168.0.1:1234?gmcast.listen_addr=0.0.0.0:2345. It is also possible to specify the addresses of all nodes: gcomm://<node1 or ip:port>,<node2 or ip2:port>,<node3 or ip3:port> (--wsrep-cluster-address=value)
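The address format can be sketched as a small Python helper that assembles a gcomm:// string from a list of nodes. The function name gcomm_address and the second and third node addresses are hypothetical; joining multiple options with & follows the usual URL query convention and should be checked against the Galera documentation for the version in use.

```python
def gcomm_address(nodes, options=None):
    """Build a gcomm:// address from (host, port) pairs; port may be None."""
    if not nodes:
        raise ValueError("at least one node address is required")
    hosts = ",".join(
        host if port is None else f"{host}:{port}" for host, port in nodes
    )
    url = f"gcomm://{hosts}"
    if options:
        # Options follow a '?' and are written as name=value pairs.
        url += "?" + "&".join(f"{name}={value}" for name, value in options.items())
    return url


# Single node with a custom listen address, as in the example above.
single = gcomm_address(
    [("192.168.0.1", 1234)],
    {"gmcast.listen_addr": "0.0.0.0:2345"},
)

# All nodes listed, using default ports (addresses are hypothetical).
full = gcomm_address([("192.168.0.1", None), ("192.168.0.2", None), ("192.168.0.3", None)])

print(single)
print(full)
```

Listing every node, as in the second call, lets a starting node try each address in turn instead of depending on one particular node being up.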

binlog_format : the binary log format, which determines whether replication is row-based, statement-based or mixed. Valid values are ROW, STATEMENT and MIXED; Galera requires ROW. (--binlog-format=format)

default_storage_engine : the default storage engine for new tables. The default value is InnoDB, which is the engine Galera replicates. (--default-storage-engine=name)

innodb_autoinc_lock_mode : a numeric value from 0 to 2 that selects the locking mode used for generating auto-increment values in tables. Default values: 2 for MariaDB 10.2.4 and later, 1 for MariaDB 10.2.3. Galera requires 2. (--innodb-autoinc-lock-mode=#)

0 – Traditional lock mode, which holds a table-level lock for all INSERTs until the auto-increment value is generated.

1 – Consecutive lock mode, which holds a table-level lock for all bulk INSERTs. Simple INSERTs take no table lock; a lightweight mutex (a mutual exclusion object that serializes access to a resource across threads) is used instead.

2 – Interleaved lock mode, which does not lock tables. It is the fastest and most scalable mode, but it is not suitable for statement-based replication.

innodb_doublewrite : a Boolean that defaults to 1 (ON), which makes InnoDB store data in a doublewrite buffer before writing it to the data file. Disabling it gives only a marginal performance improvement, and for Galera it should be left at the default. (--innodb-doublewrite)

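query_cache_size : the query cache size in bytes. The default is 1M. On versions before 5.5.40-galera and 10.0.14-galera it must be set to 0, because the query cache has to be disabled there. (--query-cache-size=#)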

wsrep_on : enables wsrep replication. The default is OFF, and it must be set to ON for transactions to be replicated to the other nodes in the cluster. Leaving it OFF does not affect cluster setup or node membership, but Galera replication needs ON. (--wsrep-on[={0|1}])
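Put together, the mandatory settings might look like the configuration fragment produced by the Python sketch below. The function name render_galera_cnf, the library path and the node addresses are assumptions for illustration; the library location in particular varies by distribution. The values ROW for binlog_format and 2 for innodb_autoinc_lock_mode are the ones Galera is documented to require.

```python
def render_galera_cnf(provider_path, cluster_address):
    """Return a [mysqld] fragment with the mandatory settings listed above."""
    settings = {
        "wsrep_on": "ON",
        "wsrep_provider": provider_path,
        "wsrep_cluster_address": cluster_address,
        "binlog_format": "ROW",              # Galera replicates row events
        "default_storage_engine": "InnoDB",
        "innodb_autoinc_lock_mode": "2",     # interleaved lock mode
        "innodb_doublewrite": "1",           # keep the default
    }
    lines = ["[mysqld]"]
    lines += [f"{name}={value}" for name, value in settings.items()]
    return "\n".join(lines) + "\n"


fragment = render_galera_cnf(
    "/usr/lib/galera/libgalera_smm.so",             # assumed path
    "gcomm://192.168.0.1,192.168.0.2,192.168.0.3",  # hypothetical node addresses
)
print(fragment)
```

The printed text can be reviewed and then placed in the server configuration file; query_cache_size is left out because it only needs to be forced to 0 on the older versions mentioned earlier.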

The Galera status variables can be queried with the statement:

SHOW STATUS LIKE 'wsrep_%';
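As a sketch of how the result of this statement might be used, the Python function below takes the (name, value) rows it returns and checks four commonly monitored variables: wsrep_cluster_status, wsrep_connected, wsrep_ready and wsrep_local_state_comment. The function name node_is_healthy and the sample rows are made up for illustration; fetching the rows from a live server is left out, and which checks are appropriate depends on the deployment.

```python
def node_is_healthy(status_rows):
    """Check (name, value) rows from SHOW STATUS LIKE 'wsrep_%'."""
    status = dict(status_rows)
    return (
        status.get("wsrep_cluster_status") == "Primary"        # part of the primary component
        and status.get("wsrep_connected") == "ON"              # connected to the cluster
        and status.get("wsrep_ready") == "ON"                  # able to accept queries
        and status.get("wsrep_local_state_comment") == "Synced"
    )


# Sample rows, invented for illustration.
synced_node = [
    ("wsrep_cluster_size", "3"),
    ("wsrep_cluster_status", "Primary"),
    ("wsrep_connected", "ON"),
    ("wsrep_ready", "ON"),
    ("wsrep_local_state_comment", "Synced"),
]
joining_node = [
    ("wsrep_cluster_status", "Primary"),
    ("wsrep_connected", "ON"),
    ("wsrep_ready", "OFF"),
    ("wsrep_local_state_comment", "Joining"),
]

print(node_is_healthy(synced_node))
print(node_is_healthy(joining_node))
```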