Until recently, I was a steadfast fan of the PostgreSQL database, but MySQL's database replication implementation is very slick.

Not to disparage my favorite RDBMS too much, but PostgreSQL's replication mechanisms have all seemed sort of kludgey. I'm not interested in creating update/delete triggers on all of my tables, for example. MySQL's mechanism is completely transparent to the client and doesn't require any table definition changes; it's all done on the server(s).

When I refer to “database replication,” I'm not referring to a bulk file transfer, nor to a proxy mechanism that sends update/delete statements to multiple database servers. I'm referring to the ability to have a database update against one server reflected on multiple other servers without any additional intervention from the client, all in what I would call “near real-time”; there is necessarily some propagation delay.

By implementing replication (and, in fact, migrating my entire application to MySQL to take advantage of it), I am trying to mitigate two potential threats to my application, and thus my business: a complete server or service failure, and degraded performance during routine backup cycles. If my main database server dies, I can simply point my applications at the other replicated server and continue to do business; in my case, I simply have to move a DNS pointer to accomplish this task. On the other hand, I currently do database dumps late at night because of the load they place on my server. With replication, I can perform my daily backups during prime time, on a local read-only server. My main servers need never know that they've been backed up.

So before we talk about how to design and implement MySQL replication, let's talk briefly about how it works. Essentially, MySQL uses a master-slave model in which the master keeps a log of all of the database updates that it has performed. One or more slaves connect to the master, read each log entry, and perform the indicated update. The master server handles housekeeping issues such as log rotation and access control, while each slave maintains its own notion of its current position within the master's transaction log. As new transactions occur, they get logged on the master and downloaded by each slave. Once a slave has committed a transaction, it updates its position in the master's log and waits for the next transaction. This is all done asynchronously, which means that the master doesn't have to wait for the slaves to “catch up.” It also means that if a slave is unable to connect to the master for a period of time, it can simply download all of the pending transactions when connectivity is re-established. So far, I've found it to be pretty robust.
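In concrete terms, MySQL exposes this bookkeeping through a pair of statements. This is just an illustrative sketch, run from the mysql CLI:

```sql
-- On the master: show which binary log file is current and the
-- position at which the next transaction will be written.
SHOW MASTER STATUS;

-- On a slave: show where this slave believes it is in the master's
-- log, and whether its replication threads are running.
SHOW SLAVE STATUS;
```

Comparing the master's reported position with each slave's gives you a quick read on how far behind a slave is.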

Even though we're talking about a master-slave model, we can actually build quite a few different server topologies to fit different needs.

The base case is where we have two servers and simply slave one to the other, which results in a master-slave configuration. In this case, we could think of the slave server as a hot standby in case the master fails, or as a server on which to run time-consuming reports without affecting the master. By adding slaves, we can implement a star topology.

If we only have two servers, we can simply configure each to be a slave of the other, and we end up with a master-master configuration that works pretty well. Transactions that occur on one server are reflected on the other and vice versa. In this case, each server is completely equivalent to the other, so you could use this configuration as a load-balancing mechanism.
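To sketch how symmetrical the arrangement is, the relevant [mysqld] entries on the two servers might look something like this (the hostnames and server-id values here are hypothetical placeholders):

```ini
# On db1.example.com
server-id=1
log-bin=mysql-bin
master-host=db2.example.com

# On db2.example.com
server-id=2
log-bin=mysql-bin
master-host=db1.example.com
```

Each server both writes its own binary log and reads the other's, which is all a master-master pair really is.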

Variations on these themes give us the ability to build chains and even rings of replicated database servers, but this may not be as beneficial as you might think. In a chain topology, for example, each server and link adds to the data propagation time, as well as adding points of failure. So while you can build some truly amazing topologies, in general you want to keep things as simple as possible. For my particular case, I chose a master-master configuration with an additional server slaved to one of the masters. This configuration gave me a hot standby with automatic recovery from failure. I also gained the ability to run backups on my slave, or to run my entire business from the slave in the event that the data center housing the other two servers loses connectivity. That's a lot to ask from such a simple topology.

When I first set out to configure replication on my database servers, I really expected it to be a complex procedure; it wasn't. My approach was to read all of the documentation at the MySQL website, ask a few questions on the MySQL replication mailing list, and then begin the work. It took longer to read the documentation than it did to actually do the configuration, as it should. The two resources that I found most helpful were at:

dev.mysql.com/doc/refman/5.0/en/replication.html

and

www.howtoforge.com/mysql_database_replication

The first link is a fairly extensive document detailing all of the configuration issues and options. I recommend you read it completely before finalizing your design. The second link leads to a document that bills itself as a "copy & paste" HowTo.

Because I'm more interested in discussing some of the gotchas, I'm only going to discuss the configuration process briefly. Fortunately, all we need to do is modify the [mysqld] section of /etc/mysql/my.cnf, issue a couple of SQL commands at the CLI, and restart MySQL. This is what I added to one of my masters:

#skip-networking
bind-address=0.0.0.0
log-bin=mysql-bin
server-id=11
innodb_flush_log_at_trx_commit=1
sync_binlog=1
auto_increment_offset=2
auto_increment_increment=5
master-host=master2.example.com
master-user=slave1
master-password=password1

All of the items above are well documented at the resources mentioned earlier, so I won't go into too much detail here. However, the auto_increment_offset and auto_increment_increment items are kind of interesting.

In a master-master configuration, it is entirely possible that an insert on the same table can occur on both servers. What happens if one of the fields in that table is an auto_increment field? The problem is that we want such a field to have a value that is unique across servers, but we don't really have a mechanism for the servers to perform this bookkeeping chore. So, we use a slick work-around: instead of adding one each time we perform an insert on a table with an auto_increment field, we add a larger number, say 5, plus an offset. Each master is then assigned a unique offset that is no greater than the increment value. For example, we could use an increment of 5 and offsets of 1 and 2 for our two masters. Then, when we inserted into an auto_increment field, one of our servers would assign values of 1, 6, 11, 16, 21... while the other server would use the values 2, 7, 12, 17, 22... Neither server would ever assign a value that the other one might assign. In this case, we'd skip the values 3, 4, 5, 8, 9, 10, and so on, but we'd be guaranteed that each of our values would be unique. If we then added more master servers, we could assign them different offsets, as long as we had no more than five servers.
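The effect is easy to demonstrate. The following sketch simulates master 1's settings in a single session, using a hypothetical table:

```sql
-- Simulate master 1's settings in the current session only.
SET SESSION auto_increment_increment = 5;
SET SESSION auto_increment_offset    = 1;

CREATE TABLE seq_demo (id INT AUTO_INCREMENT PRIMARY KEY, note VARCHAR(10));
INSERT INTO seq_demo (note) VALUES ('a'), ('b'), ('c');
SELECT id FROM seq_demo;   -- 1, 6, 11

-- With auto_increment_offset = 2, the same inserts would have
-- produced 2, 7, 12 -- no overlap with master 1's values.
```

Because the two sequences never intersect, concurrent inserts on both masters can't collide on the auto_increment column.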

The resources mentioned earlier detail the rest of the required steps, so I'll just list the steps briefly.

The next step in setting up our replication is to grant permission to the slaves to connect to the master in order to download the transaction log. Then we have to find the current position in the transaction log on the master and inform each slave of that position so that it can pick up at the current transaction. Finally, we restart MySQL; to be safe, I restarted ALL of my servers. That was all I had to do, and it took me less than 30 minutes.
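For reference, the statements behind those steps look roughly like this. The user name, password, and hostname match the my.cnf excerpt above; the log file name and position are placeholder values you would read off of SHOW MASTER STATUS:

```sql
-- On the master: allow the slave account to read the binary log.
GRANT REPLICATION SLAVE ON *.* TO 'slave1'@'%' IDENTIFIED BY 'password1';
FLUSH PRIVILEGES;

-- Still on the master: note the current log file and position.
SHOW MASTER STATUS;

-- On the slave: point it at the master, at that position.
CHANGE MASTER TO
  MASTER_HOST='master2.example.com',
  MASTER_USER='slave1',
  MASTER_PASSWORD='password1',
  MASTER_LOG_FILE='mysql-bin.000001',
  MASTER_LOG_POS=98;
START SLAVE;
```

SHOW SLAVE STATUS afterward will tell you whether the slave connected and is applying transactions.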

Once you have replication working, there are a few things you need to be aware of. The biggest issue is how to get data into your newly replicated database. Obviously, it is MUCH easier to start with an empty database and just start adding data. Unfortunately, my databases already had data in them. The resources mentioned earlier detail a couple of methods, depending on how much downtime you can afford. I simply dumped and dropped all of my databases, configured the replication, and restored the data on one of the masters. This worked for me, but there are better methods.

As I said earlier, PostgreSQL is my favorite RDBMS, but replication is a very compelling reason to migrate to MySQL. I've found replication to be very easy to configure, and so far, it's been very stable.