In this blog post, I’ll look at MyRocks performance through some benchmark testing.

As the MyRocks storage engine (based on the RocksDB key-value store http://rocksdb.org ) is now available as part of Percona Server for MySQL 5.7, I wanted to take a look at how it performs on a relatively high-end server and SSD storage. I wanted to check how it performs for different amounts of available memory for the given database size. This is similar to the benchmark I published a while ago for InnoDB (https://www.percona.com/blog/2010/04/08/fast-ssd-or-more-memory/).

In this case, I plan to use a sysbench-tpcc benchmark (https://www.percona.com/blog/2018/03/05/tpcc-like-workload-sysbench-1-0/) and I will execute it for both MyRocks and InnoDB. We’ll use InnoDB as a baseline.

For the benchmark, I will use 100 TPC-C warehouses, with a set of 10 tables (to shift the bottleneck from row contention). This should give roughly 90GB of data size (when loaded into InnoDB) and is roughly equivalent to 1000 warehouses data size.

To vary the memory size, I will change innodb_buffer_pool_size from 5GB to 100GB for InnoDB, and rocksdb_block_cache_size for MyRocks.

For MyRocks we will use LZ4 as the default compression on disk. The data size in the MyRocks storage engine is 21GB. Interesting to note, that in MyRocks uncompressed size is 70GB on the storage.

For both engines, I did not use FOREIGN KEYS, as MyRocks does not support it at the moment.

MyRocks does not support SELECT .. FOR UPDATE statements in REPEATABLE-READ mode in the Percona Server for MySQL implementation. However, “SELECT .. FOR UPDATE” is used in this benchmark. So I had to use READ-COMMITTED mode, which is supported.

The most important setting I used was to enable binary logs, for the following reasons:

Any serious production uses binary logs With disabled binary logs, MyRocks is affected by a suboptimal transaction coordinator

I used the following settings for binary logs:

binlog_format = ‘ROW’

binlog_row_image=minimal

sync_binlog=10000 (I am not using 0, as this causes serious stalls during binary log rotations when the content of binary log is flushed to storage all at once)

While I am not a full expert in MyRocks tuning yet, I used recommendations from this page: https://github.com/facebook/mysql-5.6/wiki/my.cnf-tuning. The Facebook-MyRocks engineering team also provided me input on the best settings for MyRocks.

Let’s review the results for different memory sizes.

This first chart shows throughput jitter. This helps to understand the distribution of throughput results. Throughput is measured every 1 second, and on the chart, I show all measurements after 2000 seconds of a run (the total length of each run is 3600 seconds). So I show the last 1600 seconds of each run (to remove warm-up phases):

To better quantify results, let’s take a look at them on a boxplot. The quickest way to understand boxplots is to take a look at the middle line. It represents a median of measurements (see more at https://www.percona.com/blog/2012/02/23/some-fun-with-r-visualization/):

Before we jump to the summary of results, let’s take a look at a variation of the throughput for both InnoDB and MyRocks. We will zoom to a 1-second resolution chart for 100 GB of allocated memory:

We can see that there is a lot of variation with periodical 1-second performance drops with MyRocks. At this moment, I do not know what causes these drops.

So let’s take a look at the average throughput for each engine for different memory settings (the results are in tps, and more is better):

Memory, GB InnoDB MyRocks 5 849.0664 4205.714 10 1321.9 4298.217 20 1808.236 4333.424 30 2275.403 4394.413 40 2968.101 4459.578 50 3867.625 4503.215 60 4756.551 4571.163 70 5527.853 4576.867 80 5984.642 4616.538 90 5949.249 4620.87 100 5961.2 4599.143

This is where MyRocks behaves differently from InnoDB. InnoDB benefits greatly from additional memory, up to the size of working dataset. After that, there is no reason to add more memory.

At the same time, interestingly MyRocks does not benefit much from additional memory.

Basically, MyRocks performs as expected for a write-optimized engine. You can refer to my article How Three Fundamental Data Structures Impact Storage and Retrieval for more details.

In conclusion, InnoDB performs better (compared to itself) when the working dataset fits (or almost fits) into available memory, while MyRocks can operate (and outperform InnoDB) on small memory sizes.

IO and CPU usage

It is worth looking at resource utilization for each engine. I took vmstat measurements for each run so that we can analyze IO and CPU usage.

First, let’s review writes per second (in KB/sec). Please keep in mind that these writes include binary log writes too, not just writes from the storage engine.

Memory, GB InnoDB MyRocks 5 244754.4 87401.54 10 290602.5 89874.55 20 311726 93387.05 30 313851.7 93429.92 40 316890.6 94044.94 50 318404.5 96602.42 60 276341.5 94898.08 70 217726.9 97015.82 80 184805.3 96231.51 90 187185.1 96193.6 100 184867.5 97998.26

We can also calculate how many writes per transaction each storage engine performs:

This chart shows the essential difference between InnoDB and MyRocks. MyRocks, being a write-optimized engine, uses a constant amount of writes per transaction.

For InnoDB, the amount of writes greatly depends on the memory size. The less memory we have, the more writes it has to perform.

What about reads?

The following table shows reads in KB per second.

Memory, GB InnoDB MyRocks 5 218343.1 171957.77 10 171634.7 146229.82 20 148395.3 125007.81 30 146829.1 110106.87 40 144707 97887.6 50 132858.1 87035.38 60 98371.2 77562.45 70 42532.15 71830.09 80 3479.852 66702.02 90 3811.371 64240.41 100 1998.137 62894.54

We can translate this to the number of reads per transaction:

This shows MyRocks’ read-amplification. The allocation of more memory helps to decrease IO reads, but not as much as for InnoDB.

CPU usage

Let’s also review CPU usage for each storage engine. Let’s start with InnoDB:

The chart shows that for 5GB memory size, InnoDB spends most of its time in IO waits (green area), and the CPU usage (blue area) increases with more memory.

This is the same chart for MyRocks:

In tabular form:

Memory, GB engine us sys wa id 5 InnoDB 8 2 57 33 5 MyRocks 56 11 18 15 10 InnoDB 12 3 57 28 10 MyRocks 57 11 18 13 20 InnoDB 16 4 55 25 20 MyRocks 58 11 19 11 30 InnoDB 20 5 50 25 30 MyRocks 59 11 19 10 40 InnoDB 26 7 44 24 40 MyRocks 60 11 20 9 50 InnoDB 35 8 38 19 50 MyRocks 60 11 21 7 60 InnoDB 43 10 36 10 60 MyRocks 61 11 22 6 70 InnoDB 51 12 34 4 70 MyRocks 61 11 23 5 80 InnoDB 55 12 31 1 80 MyRocks 61 11 23 5 90 InnoDB 55 12 32 1 90 MyRocks 61 11 23 4 100 InnoDB 55 12 32 1 100 MyRocks 61 11 24 4

We can see that MyRocks uses a lot of CPU (in us+sys state) no matter how much memory is allocated. This leads to the conclusion that MyRocks performance is limited more by CPU performance than by available memory.

MyRocks directory size

As MyRocks writes all changes and compacts SST files down the road, it would be interesting to see how the data directory size changes during the benchmark so we can estimate our storage needs. Here is a chart of datadirectory size:

We can see that datadirectory goes from 20GB at the start, to 31GB during the benchmark. It is interesting to observe the data growing until compaction shrinks it.

Conclusion

In conclusion, I can say that MyRocks performance increases as the ratio of dataset size to memory increases, outperforming InnoDB by almost five times in the case of 5GB memory allocation. Throughput variation is something to be concerned about, but I hope this gets improved in the future.

MyRocks does not require a lot of memory and shows constant write IO while using most of the CPU resources.

I think this potentially makes MyRocks a great choice for cloud database instances, where both memory and IO can cost a lot. MyRocks deployments would make it cheaper to deploy in the cloud.

I will follow up with further cloud-oriented benchmarks.

Extras

Raw results, scripts, and config

My goal is to provide fully repeatable benchmarks. To this end, I’m sharing all the scripts and settings I used in the following GitHub repo:

https://github.com/Percona-Lab-results/201803-sysbench-tpcc-myrocks

MyRocks Performance Settings

rocksdb_max_open_files=-1 rocksdb_max_background_jobs=8 rocksdb_max_total_wal_size=4G rocksdb_block_size=16384 rocksdb_table_cache_numshardbits=6 # rate limiter rocksdb_bytes_per_sync=16777216 rocksdb_wal_bytes_per_sync=4194304 rocksdb_compaction_sequential_deletes_count_sd=1 rocksdb_compaction_sequential_deletes=199999 rocksdb_compaction_sequential_deletes_window=200000 rocksdb_default_cf_options="write_buffer_size=256m;target_file_size_base=32m;max_bytes_for_level_base=512m;max_write_buffer_number=4;level0_file_num_compaction_trigger=4;level0_slowdown_writes_trigger=20;level0_stop_writes_trigger=30;max_write_buffer_number=4;block_based_table_factory={cache_index_and_filter_blocks=1;filter_policy=bloomfilter:10:false;whole_key_filtering=0};level_compaction_dynamic_level_bytes=true;optimize_filters_for_hits=true;memtable_prefix_bloom_size_ratio=0.05;prefix_extractor=capped:12;compaction_pri=kMinOverlappingRatio;compression=kLZ4Compression;bottommost_compression=kLZ4Compression;compression_opts=-14:4:0" rocksdb_max_subcompactions=4 rocksdb_compaction_readahead_size=16m rocksdb_use_direct_reads=ON rocksdb_use_direct_io_for_flush_and_compaction=ON 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 rocksdb_max_open_files = - 1 rocksdb_max_background_jobs = 8 rocksdb_max_total_wal_size = 4G rocksdb_block_size = 16384 rocksdb_table_cache_numshardbits = 6 # rate limiter rocksdb_bytes_per_sync = 16777216 rocksdb_wal_bytes_per_sync = 4194304 rocksdb_compaction_sequential_deletes_count_sd = 1 rocksdb_compaction_sequential_deletes = 199999 rocksdb_compaction_sequential_deletes_window = 200000 rocksdb_default_cf_options = "write_buffer_size=256m;target_file_size_base=32m;max_bytes_for_level_base=512m;max_write_buffer_number=4;level0_file_num_compaction_trigger=4;level0_slowdown_writes_trigger=20;level0_stop_writes_trigger=30;max_write_buffer_number=4;block_based_table_factory={cache_index_and_filter_blocks=1;filter_policy=bloomfilter:10:false;whole_key_filtering=0};level_compaction_dynamic_level_bytes=true;optimize_filters_for_hits=true;memtable_prefix_bloom_size_ratio=0.05;prefix_extractor=capped:12;compaction_pri=kMinOverlappingRatio;compression=kLZ4Compression;bottommost_compression=kLZ4Compression;compression_opts=-14:4:0" rocksdb_max_subcompactions = 4 rocksdb_compaction_readahead_size = 16m rocksdb_use_direct_reads = ON rocksdb_use_direct_io_for_flush_and_compaction = ON

InnoDB settings

# files innodb_file_per_table innodb_log_file_size=15G innodb_log_files_in_group=2 innodb_open_files=4000 # buffers innodb_buffer_pool_size= 200G innodb_buffer_pool_instances=8 innodb_log_buffer_size=64M # tune innodb_doublewrite= 1 innodb_support_xa=0 innodb_thread_concurrency=0 innodb_flush_log_at_trx_commit= 1 innodb_flush_method=O_DIRECT_NO_FSYNC innodb_max_dirty_pages_pct=90 innodb_max_dirty_pages_pct_lwm=10 innodb_lru_scan_depth=1024 innodb_page_cleaners=4 join_buffer_size=256K sort_buffer_size=256K innodb_use_native_aio=1 innodb_stats_persistent = 1 #innodb_spin_wait_delay=96 # perf special innodb_adaptive_flushing = 1 innodb_flush_neighbors = 0 innodb_read_io_threads = 4 innodb_write_io_threads = 2 innodb_io_capacity=2000 innodb_io_capacity_max=4000 innodb_purge_threads=4 innodb_adaptive_hash_index=1 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 # files innodb_file_per _ table innodb_log_file_size = 15G innodb_log_files_in_group = 2 innodb_open_files = 4000 # buffers innodb_buffer_pool_size = 200G innodb_buffer_pool_instances = 8 innodb_log_buffer_size = 64M # tune innodb_doublewrite = 1 innodb_support_xa = 0 innodb_thread_concurrency = 0 innodb_flush_log_at_trx_commit = 1 innodb_flush_method = O_DIRECT_NO _ FSYNC innodb_max_dirty_pages_pct = 90 innodb_max_dirty_pages_pct_lwm = 10 innodb_lru_scan_depth = 1024 innodb_page_cleaners = 4 join_buffer_size = 256K sort_buffer_size = 256K innodb_use_native_aio = 1 innodb_stats_persistent = 1 #innodb_spin_wait_delay=96 # perf special innodb_adaptive_flushing = 1 innodb_flush_neighbors = 0 innodb_read_io_threads = 4 innodb_write_io_threads = 2 innodb_io_capacity = 2000 innodb_io_capacity_max = 4000 innodb_purge_threads = 4 innodb_adaptive_hash_index = 1

Hardware spec

Supermicro server:

CPU: Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz 2 sockets / 28 cores / 56 threads



Memory: 256GB of RAM

Storage: SAMSUNG SM863 1.9TB Enterprise SSD

Filesystem: ext4

Percona-Server-5.7.21-20

OS: Ubuntu 16.04.4, kernel 4.13.0-36-generic

You May Also Like

For a detailed look at how MyRocks stacks up against typical InnoDB deployments, read my blog MyRocks Engine: Things to Know Before You Start. We go over the differences, major and minor, in the storage engine and discuss its implementation with Percona Server. MyRocks could also be beneficial for your cloud deployment. Saving With MyRocks in The Cloud shows how the storage engine performed under heavy I-O workloads in the cloud and what that means for your storage costs.