Introduction

MongoDB 3.2 introduces a new option for at-rest data encryption. In this post we take a closer look at the forces driving the need for increased encryption, MongoDB features for encrypting your data, as well as the performance characteristics of the new Encrypted Storage Engine.

Data security is top of mind for many executives due to increased attacks as well as a series of data breaches in recent years that have negatively impacted several high-profile brands. For example, in 2015, a major health insurer was the victim of a massive data breach in which criminals gained access to the Social Security numbers of more than 80 million people, resulting in an estimated cost of $100M. In the end, one of the critical vulnerabilities was that the health insurer did not encrypt sensitive patient data stored at-rest.

Data encryption is a key part of a comprehensive strategy to protect sensitive data. However, encrypting and decrypting data is potentially very resource intensive. It is important to understand the performance characteristics of your encryption technology to accurately conduct capacity planning.

MongoDB 3.2: Delivering Native Encryption At-Rest

MongoDB 3.2 provides a comprehensive encryption solution that protects your data, both in-flight and at-rest. For encryption-in-flight, MongoDB uses SSL/TLS, which secures communication between your database and its clients, as well as intra-cluster traffic between nodes. Learn more about MongoDB and SSL/TLS.

With the latest version 3.2, MongoDB also includes a fully integrated encryption-at-rest solution that reduces cost and performance overhead. Encryption-at-rest is part of MongoDB Enterprise Advanced only, but is freely available for development and evaluation. We will take a closer look at this new option later in the post.

Before 3.2, the primary way to provide encryption-at-rest was to use third-party applications that encrypt files at the application, file system, or disk level. These methods work well with MongoDB but can add extra cost, complexity, and overhead.

Additionally, disk and file system encryption might not protect against all situations. While disk-level encryption protects against someone taking the physical drive from the machine, it does not protect against someone who has physical access to the machine and can override the file system. Similarly, file system encryption will prevent someone from overriding the file system, but does not preclude someone from gaining unauthorized access through the application or database layer.

Database encryption mitigates these problems by adding an extra layer of security. Even if an administrator has access to the file system, they will first need to be authenticated to the database before the data files can be decrypted.

MongoDB’s Encrypted Storage Engine supports a variety of encryption algorithms from the OpenSSL library. AES-256 in CBC mode is the default, while other options include GCM mode, as well as FIPS mode for FIPS-140-2 compliance.

Encryption is performed at the page level to provide optimal performance. Instead of having to encrypt/decrypt the entire file or database for each change, only the modified pages need to be encrypted or decrypted.
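To make the page-level idea concrete, here is a minimal Python sketch of dirty-page tracking. It is not how the Encrypted Storage Engine is implemented. The `PagedStore` class, the 4096-byte page size, and the `toy_cipher` function are all hypothetical, and `toy_cipher` is an insecure stand-in for a real cipher such as AES-256, used only to keep the example dependency-free. The point it illustrates is that a flush touches only the pages that changed.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size, chosen only for illustration


def toy_cipher(key: bytes, page_no: int, data: bytes) -> bytes:
    """Placeholder for a real cipher such as AES-256. NOT secure.

    XORs the page with a SHA-256-derived keystream so the example needs no
    third-party packages. Applying it twice returns the original bytes.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        block_input = key + page_no.to_bytes(8, "big") + counter.to_bytes(8, "big")
        stream += hashlib.sha256(block_input).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class PagedStore:
    """Hypothetical store: plaintext pages in memory, ciphertext pages 'on disk'."""

    def __init__(self, key: bytes, num_pages: int) -> None:
        self.key = key
        self.memory = [bytes(PAGE_SIZE) for _ in range(num_pages)]
        self.disk = [b""] * num_pages
        self.dirty = set(range(num_pages))  # every page needs an initial write

    def write_page(self, page_no: int, data: bytes) -> None:
        """Replace one page in memory and remember that it changed."""
        if len(data) != PAGE_SIZE:
            raise ValueError("data must be exactly one page")
        self.memory[page_no] = data
        self.dirty.add(page_no)

    def flush(self) -> int:
        """Encrypt and persist only the dirty pages; return how many were encrypted."""
        encrypted = 0
        for page_no in sorted(self.dirty):
            self.disk[page_no] = toy_cipher(self.key, page_no, self.memory[page_no])
            encrypted += 1
        self.dirty.clear()
        return encrypted

    def read_page_from_disk(self, page_no: int) -> bytes:
        """Decrypt a single page without touching any other page."""
        return toy_cipher(self.key, page_no, self.disk[page_no])
```

In this sketch, changing one page out of a hundred and calling `flush()` encrypts a single page rather than the whole store, which is the property the paragraph above describes.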

Additionally, the Encrypted Storage Engine provides safe and secure management of the encryption keys. Each encrypted node contains an internal database key that is used to encrypt/decrypt the data files. The database key is wrapped with an external master key, which must be given to the node for it to initialize. MongoDB uses operating system protection mechanisms, such as VirtualLock and mlock, that lock the process’ virtual memory space into memory, ensuring that keys are never written or paged to disk in unencrypted form.
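As a rough illustration of handing a master key to a node at startup, the Python sketch below writes a local key file and assembles a `mongod` command line. The helper names (`write_local_key_file`, `mongod_args`) and all paths are hypothetical. The `--enableEncryption`, `--encryptionCipherMode`, and `--encryptionKeyFile` options are the ones documented for MongoDB Enterprise, but check the documentation for your version before relying on them. A local key file is generally suited to development and evaluation only; production deployments would normally obtain the master key from an external key manager.

```python
import base64
import os
import secrets
import tempfile
from typing import List


def write_local_key_file(path: str) -> None:
    """Write a base64-encoded 32-byte random key, readable only by the owner."""
    key = base64.b64encode(secrets.token_bytes(32))
    # Create the file with mode 0600 so the key is not readable by other users.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(key + b"\n")
    os.chmod(path, 0o600)


def mongod_args(dbpath: str, key_file: str, cipher_mode: str = "AES256-CBC") -> List[str]:
    """Build (but do not run) a mongod command line with encryption enabled."""
    if cipher_mode not in ("AES256-CBC", "AES256-GCM"):
        raise ValueError("unsupported cipher mode: " + cipher_mode)
    return [
        "mongod",
        "--dbpath", dbpath,
        "--enableEncryption",
        "--encryptionCipherMode", cipher_mode,
        "--encryptionKeyFile", key_file,
    ]


def demo() -> None:
    """Generate a throwaway key file and print the resulting command line."""
    with tempfile.TemporaryDirectory() as workdir:
        key_path = os.path.join(workdir, "mongodb-keyfile")  # hypothetical path
        write_local_key_file(key_path)
        print(" ".join(mongod_args("/data/db", key_path)))
```

The sketch only builds the argument list; starting the server, and choosing between CBC and GCM, is left to the operator.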

Evaluating Performance

Encrypting and decrypting data requires the use of additional resources, and administrators will want to understand the performance impact to adjust capacity planning accordingly.

In our Encrypted Storage Engine benchmarking tests, we saw an average throughput overhead between 10% and 20%. Let’s take a closer look at some benchmark data to show the results for Insert Only, Read Only, and 50%-Read/50%-Insert workloads.

For our benchmark, we used Intel Xeon X5675 CPUs, which support the AES-NI instruction set, and ran the CPUs at high load (100%). We evaluated four different configurations: “Working Set Fits In Memory”, “Working Set Exceeds Memory”, “Encrypted”, and “Unencrypted”. The “Working Set” refers to the data and indexes that are actively used by your system.

Let’s first look at an Insert-Only workload. With a high CPU load, we see an encryption overhead of around 16%.

Now, let’s take a look at the results of our Read-Only workload. We ran the benchmark under two scenarios: “Working Set Fits In Memory” and “Working Set Exceeds Memory”.

From the benchmark results, the decryption overhead for a Read-Only workload ranges from 5% to 20%.

Lastly, here are the benchmark results for a 50%-Read, 50%-Insert workload.

For the 50%-Read/50%-Insert workloads, the encryption overhead ranges from 12% to 20%.
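For capacity planning, the throughput overheads reported above can be turned into a rough sizing factor. The sketch below assumes that "overhead" means the fractional drop in throughput relative to the unencrypted configuration and that capacity scales linearly. Both are simplifying assumptions, and the function name `capacity_scale_factor` is ours.

```python
def capacity_scale_factor(throughput_overhead: float) -> float:
    """Return how much more unencrypted-equivalent capacity is needed.

    If encryption reduces throughput by a fraction `o`, an encrypted node
    delivers (1 - o) of its unencrypted throughput, so sustaining the same
    target requires 1 / (1 - o) times the capacity.
    """
    if not 0.0 <= throughput_overhead < 1.0:
        raise ValueError("overhead must be in the range [0, 1)")
    return 1.0 / (1.0 - throughput_overhead)


if __name__ == "__main__":
    for overhead in (0.10, 0.16, 0.20):
        factor = capacity_scale_factor(overhead)
        print(f"{overhead:.0%} overhead -> plan for {factor:.2f}x capacity")
```

Under these assumptions, a 20% throughput overhead corresponds to planning for about 1.25 times the unencrypted capacity.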

In addition to throughput, latency is also a critical component of encryption overhead. In our benchmark, average latency overheads ranged from 6% to 30%. Though average latency overhead was slightly higher than throughput overhead, latencies were still very low, all under 1ms.

Average Latency (us)                                     Unencrypted   Encrypted   % Overhead
Insert Only                                              32.4          40.9        -26.5%
Read Only, Working Set Fits In Memory                    230.5         245.0       -6.3%
Read Only, Working Set Exceeds Memory                    447.0         565.8       -26.6%
50% Insert/50% Read, Working Set Fits In Memory          276.1         317.4       -15.0%
50% Insert/50% Read, Working Set Exceeds Memory          722.3         936.5       -29.7%
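The overhead column can be approximately reproduced from the two latency columns. The short Python sketch below assumes overhead is computed as (encrypted - unencrypted) / unencrypted and reports it as a positive percentage, whereas the table shows it with a negative sign. The row labels and the function name are ours.

```python
def latency_overhead_pct(unencrypted_us: float, encrypted_us: float) -> float:
    """Percentage increase in average latency when encryption is enabled."""
    return (encrypted_us - unencrypted_us) / unencrypted_us * 100.0


# (workload, unencrypted latency in us, encrypted latency in us), from the table above
LATENCIES = [
    ("Insert Only", 32.4, 40.9),
    ("Read Only, fits in memory", 230.5, 245.0),
    ("Read Only, exceeds memory", 447.0, 565.8),
    ("50/50, fits in memory", 276.1, 317.4),
    ("50/50, exceeds memory", 722.3, 936.5),
]

if __name__ == "__main__":
    for name, plain, enc in LATENCIES:
        print(f"{name}: {latency_overhead_pct(plain, enc):.1f}%")
```

With the published latencies, four of the five rows match the table to one decimal place. The Insert Only row comes out near 26.2% rather than 26.5%, which may simply reflect rounding in the published latency figures.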

MongoDB Atlas Encryption At Rest

MongoDB Atlas is a database as a service that provides all the features of the database without the operational heavy lifting. Developers no longer need to worry about provisioning, configuration, patching, upgrades, backups, and failure recovery. Atlas offers elastic scalability, either by scaling up on a range of instance sizes or scaling out with automatic sharding, all with no application downtime. MongoDB Atlas provides encryption of data-in-flight over the network and at rest on disk. Data-at-rest can optionally be protected using encrypted data volumes. Encrypted data volumes secure your data without the need for you to build, maintain, and secure your own key management infrastructure.

Summary

In this post, we looked at a few workloads to determine the impact of encryption with MongoDB's new Encrypted Storage Engine. The results demonstrate that the Encrypted Storage Engine provides a secure way to encrypt your data-at-rest, while maintaining exceptional performance. With the Encrypted Storage Engine and diligent capacity planning, you shouldn't have to make a tradeoff between high performance and strong security when encrypting data-at-rest. For users interested in a database as a service, MongoDB Atlas provides encrypted data volumes to ensure your data at rest is secure.

Environment

These tests were conducted on bare metal servers. Each server had the following specification:

CPU: 3.06GHz Intel Xeon Westmere (X5675-Hexcore)
RAM: 6x16GB Kingston 16GB DDR3 2Rx4
OS: Ubuntu 14.04-64
Network Card: SuperMicro AOC-STGN-i2S
Motherboard: SuperMicro X8DTN+_R2
Document Size: 1KB
Workload: YCSB
Version: MongoDB 3.2

Additional Resources

Try MongoDB’s New Encrypted Storage Engine. Users can try the Encrypted Storage Engine free for unlimited development and evaluation.

Read our installing MongoDB Enterprise 3.2 documentation.

Learn More About Encryption and all of the security features available for MongoDB by reading our guide.

About the Author - Jason Ma

Jason is a Principal Product Marketing Manager based in Palo Alto, and has extensive experience in technology hardware and software. He previously worked for SanDisk in Corporate Strategy doing M&A and investments, and as a Product Manager on the Infiniflash All-Flash JBOF. Before SanDisk, he worked as a HW engineer at Intel and Boeing. Jason has a BSEE from UC San Diego, MSEE from the University of Southern California, and an MBA from UC Berkeley.