Recently Oracle announced several new features for the latest available development version of MySQL that is 5.7.2 at the time of writing this article. Most of them are performance and replication related that show us how incredible the new release will be.

In this post I’m going to try to explain in some easy steps how the new multi-source replication works and how we can configure it for our own tests. It is important to mention that this is a development release, so it is not production ready. Therefore this post is intend to people that want to test the new feature and see how it works with their application, always in a staging environment.

What is multi-source replication?

First, we need to have clear that multi-master and multi-source replication are not the same. Multi-Master replication is the usual circular replication where you can write on any server and data gets replicated to all others.

Multi-source is different. MySQL replication had a limitation, fixed with this new release, that said that one slave could have only one master. That is a limiting factor when we are designing our replication environment. There were some “hacks” to make it work, but now there is an official way. So, in a nutshell, Multi-Source means that a slave can have more than one master. Now, replication environments like these one are possible:

How does it work?

This will help us create some replication hierarchies that were impossible in the past. For example, you can have a slave in your office replicating the data from all servers that you have in the offices spread around the world.

Now we have the concept of communication channels. Each communication channel is a connection from slave to master to get the binary log events. That means we will have one IO_THREAD for each communication channel. We will need to run different “CHANGE MASTER” commands, one for each master, with the “FOR CHANNEL” argument that we will use to give a name to that channel.

CHANGE MASTER MASTER_HOST='something', MASTER_USER=... FOR CHANNEL="name_of_channel"; 1 CHANGE MASTER MASTER_HOST = 'something' , MASTER_USER = . . . FOR CHANNEL = "name_of_channel" ;

Pretty easy. There is one single pre-requisite. The slave should have been configured first with the crash-safe feature of MySQL 5.6. That means that info usually included in master.info or relay-log.info should be on a table. Let’s start with the configuration.

Show me an example!

First you need to download the lab version of mysql from this link.

We have a sandbox environment with 2 masters and 1 slave servers. I won’t go over the details of how to configure the server_id, binary logs or replication users. I assume they are well configured. If you need a howto, you can follow this one.

First, we have to enable the crash safe feature on the slave:

master_info_repository=TABLE; relay_log_info_repository=TABLE; 1 2 master_info_repository = TABLE ; relay_log_info_repository = TABLE ;

After a restart of the slave we can start creating the channels with the names “master1” and “master2”:

slave > change master to master_host="127.0.0.1", master_port=12047, master_user="msandbox",master_password="msandbox" for channel="master1"; slave > change master to master_host="127.0.0.1", master_port=12048, master_user="msandbox",master_password="msandbox" for channel="master2"; 1 2 slave > change master to master_host = "127.0.0.1" , master_port = 12047 , master_user = "msandbox" , master_password = "msandbox" for channel = "master1" ; slave > change master to master_host = "127.0.0.1" , master_port = 12048 , master_user = "msandbox" , master_password = "msandbox" for channel = "master2" ;

To start the slave processes you need to specify what channel are you referring to:

slave > start slave for channel="master1"; slave > start slave for channel="master2"; 1 2 slave > start slave for channel = "master1" ; slave > start slave for channel = "master2" ;

Now, we want to check the status of the slave:

slave > show slave statusG Empty set (0.00 sec) 1 2 slave > show slave statusG Empty set ( 0.00 sec )

Oh, it is empty. We have to specify again which channel we want to check:

slave > SHOW SLAVE STATUS FOR CHANNEL="master1"G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 127.0.0.1 Master_User: msandbox Master_Port: 12047 Connect_Retry: 60 Master_Log_File: mysql-bin.000002 Read_Master_Log_Pos: 232 Relay_Log_File: squeeze-relay-bin-master1.000003 Relay_Log_Pos: 395 Relay_Master_Log_File: mysql-bin.000002 Slave_IO_Running: Yes Slave_SQL_Running: Yes [...] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 slave > SHOW SLAVE STATUS FOR CHANNEL = "master1" G * * * * * * * * * * * * * * * * * * * * * * * * * * * 1. row * * * * * * * * * * * * * * * * * * * * * * * * * * * Slave_IO_State : Waiting for master to send event Master_Host : 127.0.0.1 Master_User : msandbox Master_Port : 12047 Connect_Retry : 60 Master_Log_File : mysql - bin . 000002 Read_Master_Log_Pos : 232 Relay_Log_File : squeeze - relay - bin - master1 . 000003 Relay_Log_Pos : 395 Relay_Master_Log_File : mysql - bin . 000002 Slave_IO_Running : Yes Slave_SQL_Running : Yes [ . . . ]

and we can also check that the IO_THREAD and SQL_THREADS are running:

slave > SHOW PROCESSLIST; +----+-------------+-----------------------------------------------------------------------------+ | Id | User | State | +----+-------------+-----------------------------------------------------------------------------+ | 2 | system user | Waiting for master to send event | | 3 | system user | Slave has read all relay log; waiting for the slave I/O thread to update it | | 4 | system user | Waiting for master to send event | | 5 | system user | Slave has read all relay log; waiting for the slave I/O thread to update it | +----+-------------+-----------------------------------------------------------------------------+ 1 2 3 4 5 6 7 8 9 slave > SHOW PROCESSLIST ; + -- -- + -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + | Id | User | State | + -- -- + -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + | 2 | system user | Waiting for master to send event | | 3 | system user | Slave has read all relay log ; waiting for the slave I / O thread to update it | | 4 | system user | Waiting for master to send event | | 5 | system user | Slave has read all relay log ; waiting for the slave I / O thread to update it | + -- -- + -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - +

Let’s test it:

master1 > create database master1; master2 > create database master2; slave > show databases like 'master%'; +--------------------+ | Database (master%) | +--------------------+ | master1 | | master2 | +--------------------+ 1 2 3 4 5 6 7 8 9 master1 > create database master1 ; master2 > create database master2 ; slave > show databases like 'master%' ; + -- -- -- -- -- -- -- -- -- -- + | Database ( master % ) | + -- -- -- -- -- -- -- -- -- -- + | master1 | | master2 | + -- -- -- -- -- -- -- -- -- -- +

It works, that easy!

Conclusion

The new multi-source feature allow us to create new replication environments that were impossible in the past without some complex “hacks”. Of course, your application should be designed and developed with this new architecture in mind. Like with multi-master, multi-source needs special care to not end up with your data messed up.

MySQL replication is getting better on every release giving us more configuration, performance and design possibilities. And all those new features can be combined. Your replication environment can be even better if you mix some of the new (and old) features added recently to the replication. For example, you can configure GTID or enable multi-threaded slave per schema.