Introduction

Evaluating the performance of database systems is a very demanding task. There are a lot of hard choices to be made, e.g.:

What operating system and operating system version is to be used

What configuration setup is to be used

What benchmarks are to be used and how long are the warm-up and measure times

What test setups are to be used

What version of the database management system is used

What storage engine is used

While performance evaluation is mostly machine time, there is still a lot of hard work for the human monitoring the tests. In this blog post we have made following choices:

We’re using an Intel Xeon E5-2690 @ 2.9GHz CPU containing 32-cores and Linux 3.4.12 with 132G main memory. The database is stored on a Fusion-IO ioDrive2 Duo 2.41TB Firmware v7.2.5, rev 110646, using Driver 3.3.4 build 5833069. The database file system is using NVMFS and all test logs and outputs are stored on EXT4.

We have selected following benchmarks: LinkBench [1] which is based on traces from production databases that store social graph data at Facebook, a major social network. LinkBench provides a realistic and challenging test for persistent storage of social and web service data. Percona provides an on-line transaction processing (OLTP) benchmark which is an implementation of TPC Benchmark C [2] like workload, which simulates a warehouse application. Approved in July of 1992, TPC Benchmark C is an on-line transaction processing (OLTP) benchmark. TPC-C is more complex than previous OLTP benchmarks such as TPC-A because of its multiple transaction types, more complex database and overall execution structure. TPC-C involves a mix of five concurrent transactions of different types and complexity either executed on-line or queued for deferred execution. The database is comprised of nine types of tables with a wide range of record and population sizes. TPC-C is measured in transactions per minute (tpmC).

We are using LinkBench with 10x database size i.e. about 100G and TPC-C with 1000 warehouses. InnoDB buffer pool is set to 50G, thus on both benchmarks the database does not fit in main-memory.

We selected alpha versions of MariaDB 10.1.0 and MySQL 5.7.4-labs-tplc, because we are interested how their development releases perform. We selected exactly these versions because they both include support for FusionIO hardware, atomic writes [3,4] and page compression [5].

We decided to use InnoDB storage engine on both servers tested.

We use FusionIO hardware atomic writes [3,4] on all tests.

Our reasons for these choices are the following:

We want to evaluate performance on modern hardware not some old soon-to-be-deprecated hardware. The selected hardware is something that could be used when new hardware is acquired today for a database server.

We do not want to evaluate the performance of any corner cases, i.e. single threaded client, all data in main-memory, ridiculous high or low number of client threads, only read-only or write-only workloads, and etc…. These cases are interesting for some applications but not all.

We want to use benchmarks that are well known and approved by the database community.

XtraDB is not available for MySQL and we have evidence that XtraDB does not perform that well on FusionIO hardware.

We use atomic writes on all tests because it provides better performance and thus we have disabled the doublewrite buffer, because when using atomic writes it is unnecessary.

MySQL/MariaDB configuration

We can’t use exactly the same configuration file (my.cnf) for both MariaDB and MySQL because there are some differences on the naming of the parameters. So, below are the configuration variable values so that first common variables and their values are shown followed with tag oracle containing MySQL parameters and with tag mariadb for MariaDB parameters. All parameters are selected so that they are fair and representative, but we do not claim that they optimal values for this hardware or these benchmarks.

[mysqld] innodb_buffer_pool_size = 50G innodb_use_native_aio = 1 innodb_data_file_path=ibdata1:10M:autoextend innodb_doublewrite = 0 innodb_file_per_table = 1 innodb_flush_log_at_trx_commit=1 innodb_buffer_pool_instances = 16 innodb_compression_level = 6 #following is used for page compressed tables 2=lz4 innodb_compression_algorithm = 2 #following is used for uncompressed tables #innodb_compression_algorithm = 0 innodb_flush_method = O_DIRECT innodb_thread_concurrency = 32 innodb_write_io_threads = 32 innodb_read_io_threads = 32 innodb_file_format=barracuda innodb_lru_scan_depth=25000 innodb_io_capacity=25000 innodb_io_capacity_max=35000 innodb_flush_neighbors=0 innodb_adaptive_flushing=1 innodb_log_buffer_size=256M innodb_log_files_in_group=2 innodb_log_file_size=8GB [oracle] innodb_compression_punch_hole = 1 [mariadb] innodb_use_fallocate = 1 innodb_use_atomic_writes = 1 innodb_use_trim = 1 ignore-builtin-innodb plugin-load=ha_innodb.so plugin-dir=/usr/local/mysql/ #these are used on MariaDB page compressed tables, not so good on other configurations innodb_mtflush_threads = 16 innodb_use_mtflush = 1

Benchmark setup

As mentioned before, for LinkBench we use 10x database size, this will create a database size of about 100G. The load command is as follows

./bin/linkbench -D dbid=linkdb -D maxid1=100000001 -c config/MyConfig.properties -l

The LinkBench measure phase uses 64 client threads and we use maxtime 86300 seconds, i.e. 24 hours, and warm-up is default 180 seconds. The measure command is as follows:

nohup ./bin/linkbench -D requesters=64 -Dmaxid1=100000001 -c config/MyConfig.properties -D requests=500000000 -D maxtime=86300 -r

For TPC-C we use 1000 warehouses, thus the load command is as follows:

sh -x load.sh tpcc1000 1000

We use the default number of threads, i.e. 64 client threads (if not something else mentioned). We have run TPC-C with time limit 10800 seconds, i.e. 3 hours, and warmup is 30 seconds, i.e. the measure command is as follows:

./tpcc_start -dtpcc1000 -u root -w1000 -c64 -r30 -l10800 > res

Performance results: LinkBench

Both servers contain support for atomic writes [3,4] and page compression [5] for FusionIO hardware. Below we compare the operations = transactions/second reported using this feature:

Both servers provide stable performance after the buffer pool is warm and full. In both servers the performance drops after the buffer pool is full and we really need to fetch pages from FusionIO hardware. However, MariaDB provides clearly superior performance after the buffer pool is full. In this test we used the LZ4 compression method that is provided by both servers. MariaDB can provide about 30% better performance and the difference at the end of the test is 5654 operations/second.

Both servers provide the following compression methods if available on the system:

MariaDB also provides support for the following compession methods if available on the system:

Below we present the same results using uncompressed tables on both servers.

Both servers start with very good performance, however performance slowly drops when the buffer pool is full and write amplification starts to slow down the I/O performance. This is due to Write amplification (WA), which can happen on Flash memory and Solid-state drives (SSDs) where the actual amount of physical information written is a multiple of the logical amount intended to be written. Because flash memory must be erased before it can be rewritten, the process to perform these operations results in moving (or rewriting) user data and metadata more than once. This multiplying effect increases the number of writes required over the life of the SSD which shortens the time it can reliably operate. The increased writes also consume bandwidth (computing) to the flash memory which mainly reduces random write performance to the SSD [6]. MariaDB provides about 8% better performance and at the end of the tests the difference is 2225 operations/second.

The next research point is row compressed tables. Because both MySQL and MariaDB contain the same, or almost the same, implementation, we do not expect any significant differences on performance. Results below:

Results are almost identical as expected and the small differences seen are clearly inside a statistical variation. Finally, let’s combine all previous results to see the differences between uncompressed, row-compressed, and page-compressed results.

As expected, the best performance is achieved by using uncompressed tables. Note that this result does not contradict the results presented on [5] because here we use atomic writes and have disabled the doublewrite buffer on all tests. At the end of the measure time, page compressed performance is about 22% less than uncompressed performance and the difference is 5500 operations/second. Similarly, at end of the measure time, row compressed performance is about 68% less than uncompressed performance and row compressed performance is about 37% less than page compressed performance. Finally, the difference is 11800 operations/second between uncompressed and row compressed and 6568 operations/second between page compressed and row compressed.

For our final test, we measured the load time used to create the 10x LinkBench database using MariaDB 10.1.

Clearly, creating a page-compressed database has the shortest time followed by an uncompressed database. Creating a row compressed database is significantly slower than either of those.

Performance results: TPC-C

We again start from presenting the results using TpmC from page compression feature [5] using a different number of client treads:

Again MariaDB provides clearly better performance on all tests. The server used has 32 cores and that is the reason why performance numbers drop when the number of client treads are increased to higher than 32 cores. Let’s see the performance results from New Order transactions and using the default number of client threads:

While MySQL occasionally can provide a higher number of completed New Order transactions the variation is significantly higher and the median is significantly lower than the MariaDB results. MariaDB can provide quite stable performance over time after the buffer pool is warm. Clearly, the MySQL flush implementation can’t keep up with very fast FusionIO hardware. The bottleneck on all tests on this post is not the FusionIO hardware, instead the bottleneck is in the MySQL/MariaDB implementation.

In the following graph we have compared the results using uncompressed tables and page compressed tables while varying the number of TPC-C test suite client threads from 4 to 512:

When client threads are between 4 and 32 MariaDB can provide clearly better performance for both uncompressed and page compressed tables. Actually, MariaDB performance for page compressed tables is the same or even better than MySQL performance for uncompressed tables. When the number of client threads is increased, MariaDB performance drops earlier than MySQL performance but not as significantly as MySQL performance seen on the final results using 512 client threads. With high numbers of client threads, MariaDB page compressed can offer significantly better performance compared with uncompressed tables.

Conclusions

First of all, both alpha versions of these database servers were very stable. We did not encounter even a single crash or other problem while the performance evaluations were being done. This is a very good sign for the quality of these development releases.

Secondly, there is clearly a need to identify and fix new bottlenecks introduced by the significantly faster FusionIO storage device. In all tests we could not even nearly use the full throughput provided by the FusionIO hardware. FusionIO hardware can be seen as a new level on memory hierarchy [7] that requires new kinds of optimizations and research in a similar way as main-memory has required [8]. Some early research results can be seen on [9].

Finally, MariaDB can clearly offer better performance to applications having a similar workload as presented by LinkBench and TPC-C.

References

[1] Timothy G. Armstrong, Vamsi Ponnekanti, Dhruba Borthakur, and Mark Callaghan. 2013. LinkBench: a database benchmark based on the Facebook social graph. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data (SIGMOD ’13). ACM, New York, NY, USA, 1185-1196. http://doi.acm.org/10.1145/2463676.2465296.

[2] http://www.tpc.org/tpcc/

[3] MariaDB Introduces Atomic Writes, https://blog.mariadb.org/mariadb-introduces-atomic-writes/

[4] Xiangyong Ouyang and David W. Nellans and Robert Wipfel and David Flynn and Dhabaleswar K. Panda: Beyond block I/O: Rethinking traditional storage primitives, in Proceedings of the 2011 IEEE 17th International Symposium on High Performance Computer Architecture, pages 301-311, http://dx.doi.org/10.1109/HPCA.2011.5749738

[5] Significant performance boost with new MariaDB page compression on FusionIO, https://blog.mariadb.org/significant-performance-boost-with-new-mariadb-page-compression-on-fusionio/

[6] Write amplification. https://en.wikipedia.org/wiki/Write_amplification

[7] Memory hierarchy. https://en.wikipedia.org/wiki/Memory_hierarchy

[8] Stefan Manegold, Peter A. Boncz, and Martin L. Kersten. 2000. Optimizing database architecture for the new bottleneck: memory access. The VLDB Journal 9, 3 (December 2000), 231-246. DOI=10.1007/s007780000031 http://dx.doi.org/10.1007/s007780000031

[9] Seok-Hoon Kang, Dong-Hyun Koo, Woon-Hak Kang and Sang-Won Lee: A Case for Flash Memory SSD in Hadoop Applications,

International Journal of Control and Automation, Vol. 6, No. 1, February, 2013. http://www.sersc.org/journals/IJCA/vol6_no1/17.pdf