Packet Capturing MySQL with Rust

Recently, AgilData launched the Gibbs MySQL Scalability Advisor, a free self-service tool that allows users to capture a live stream of queries to be uploaded to Gibbs and analyzed by AgilData’s experts. Spyglass is the database traffic capture tool for Gibbs. Built using the Rust programming language, it provides exceptional performance for profiling your MySQL database. Running Spyglass on your application or database server allows you to pick up all kinds of information about your queries, performance and database health.

Spyglass will watch interactions between your MySQL Servers and client applications, unobtrusively and without any changes to either. By logging statistics from live-interactions between clients and MySQL instances, Spyglass provides for eventual analysis to chart query performance, database schema, table indexes, and return a report of optimization opportunities.

User information is protected by replacing actual values, such as dates, numbers, and text, with the ‘?’ question-mark character. It’s cool stuff, especially for a free service.

Spyglass is Open Source

Spyglass has inherent security concerns associated with packet capturing of production data. In order to demonstrate confidence that nothing suspicious or malicious is under the hood we felt it was important to allow everyone access to the source. Spyglass is open-source software licensed under the GPL3 license. The source code, as well as some pre-built binaries can be found on AgilData’s GitHub account.

To run Spyglass, you need extra permissions above that of a normal user in order to capture network traffic at the data-link layer, below IP, and without having to alter or interfere with the regular data flow between the client app and database servers. We recommend running it using “sudo.”

As a Spyglass user, you have the opportunity to examine the file and verify if you want to send the contents to Gibbs for full analysis, prior to it being uploaded. You can use the Spyglass command-line utility on 64-bit Linux, OS X and Windows operating systems after cloning from GitHub and building it. Or use the pre-built musl-based Linux image, which has no shared-library requirements, on any current 64-bit distro, from the GitHub repo releases page for Spyglass.

What it “Must Have”

Gibbs required the ability to run on production systems and sit on an app server without disruption to production processes. If it altered anything, caused a noticeable slowdown, or required a tedious setup process, nobody would use it. Our “absolute must-haves” list included:

• No reconfiguration of the client or app server

• No reconfiguration of the MySQL database server

• A small, single executable of the pre-built binary

• Run on any modern 64-bit Linux distribution

• Production friendly – Minimal use of the system resources

• Does not require any restarts of the app or MySQL servers

• User controlled “start and stop” at any time

• Easy User Interface on the command line, allowing it to run on headless servers

• User control to examine the data file captured prior to transmitting

• And most importantly have the performance necessary to keep up in real-time

We want to provide you a tool built with the memory and type safety commonly found in garbage-collected languages, but also the truly high-performance that is lacking in them.

T[Rust]

We needed a reliable and trustworthy systems language to produce a high-performance executable, that was also able to examine network packets at a level below the typical TCP layer. While this could be done with the venerable C language, Rust provides all of the same low-level control but with the memory-safety you would typically get from using a managed, garbage-collected language. For us, a combination of low-level performance and direct compatibility with other native ‘C’ code using higher-level functional constructs and “zero-cost” abstractions for these constructs was key. Concurrently, it provides modern, high-level functional programming concepts.

Teams pushing the boundaries for today’s newest infrastructure have correctly characterized the language as “cutting edge, highly sophisticated hybrid functional language … which is uniquely high-performance, low-footprint and reliable.” The Rust team themselves state “Rust is a systems programming language that runs blazingly fast, prevents segfaults, and guarantees thread safety.”

Programmers familiar with Scala, Haskell, OCaml, and other similar languages will find much to like here. Rust provides “zero-cost abstractions” allowing manipulation of collections using map, filter, folds, and other familiar paradigms; the compiler does most of the work, resulting in high-performance executables. For example, many languages implement Option using a wrapper object. In Rust, that will compile down to be as efficient as if the programmer manually checked for “null values” everywhere, without having to write that code. Rust allows idiomatic functional code, without the run-time penalty. (As an added bonus, it does away with the dreaded null ).

Is Rust > Java?

Many people will point out that the Java Virtual Machine is an awesome piece of engineering. We agree; its twenty-plus year history has allowed it to evolve into something used effectively by millions of programmers. But has anyone else noticed it has reached its limits? While Scala came on the scene over ten years ago, bringing functional programming to Java and vastly improving the experience, it is still saddled with the JVM as its run-time. For example, in the Big Data space, JVM-based products are regularly using the “unsafe” interface within the JVM to directly invoke native code. Using things such as memory-mapped files and custom collections written in ‘C’, numerous database vendors are pushing the boundary of performance you can get from the JVM.

The issue is, once you’ve done this, you’re really no longer completely within the Java ecosystem. If you’re going to write and link with native code, it is far easier to do it in languages like Go or Rust. And, unlike using C, programmers still benefit from using modern functional programming features such as map, filter, and fold over collections. And use native code you must, if you want to stray beyond the Java standard library in the realm of networking.

Rust’s protection features actually make it safer than the JVM with “unsafe calls” via JNI. It’s even shielded better, in my opinion, than straight, plain Java in regards to memory leaks. Borrow-checking by the Rust compiler generally assures not only that no “use after free” memory access occurs, but also avoids the programmer forgetting to free. Long-time C developers know keeping track of your mallocs and frees is a tedious, but critical task. Java promised, and delivered, freedom from much of this. It was a key enabler for millions of developers to create reliable software, without the strict memory accounting. It could not do it all, as memory leaks, and how to find and fix them, became an issue for JVM software. While the segfaults became almost extinct, the garbage collection costs soared, and JVM profilers joined the toolbox to help count objects. While not having to explicitly drop class instances and their associated memory saved programmers plenty of time coding and debugging, the opposite problem of holding onto too much replaced it. And the cost of garbage-collection, including the large variance in when it occurs, and how long it takes, limited the practical heap size for production Java apps on standard JVMs if you have a maximum response time you’re attempting to adhere to (and who doesn’t?).

Borrow checking in Rust is a compile-time, instead of run-time operation. There is no equivalent to the Java Virtual Machine, or any other large runtime requirement, thereby enabling small self-contained executables, as you would produce with straight C. Unlike C, you still get safe memory access, without having to explicitly handle it yourself in most instances, just as you do with Java. And the compiler tends to aggressively, and automatically, generate the code to drop your constructs and free their memory, resulting in a much smaller set of situations where you may accidentally hold onto things longer than actually needed, which does tend to occur in the Java ecosystem more often than we would hope.

“Rust”proofing Spyglass – How It’s Made

Spyglass had to just work without any reconfiguration of the systems being studied using packet-capturing. For an unobtrusive means to silently gather payloads from the data-link layer, the excellent pnet crate was used (a crate is the Rust equivalent of a library or package in other languages). Rust’s community repository of crates already offers over 5,000 libraries (as of this writing), and has topped over 42 million crate downloads. With a large following of performance-oriented systems developers contributing new crates daily, it usually isn’t difficult to locate what you need.

The bulk of the new work was in implementing code to handle the MySQL wire protocol, without actually receiving the TCP-layer stream. Since TCP handles missing and out-of-order packets, the programmer is normally allowed to ignore these details. But merely listening in on the data-link layer does not. While ignoring that for expediency in early iterations of Spyglass produced acceptable results in a simple and lightly loaded test harness, it wasn’t adequate where real systems were involved. Implementation of a minimalist Finite State Machine was necessary to monitor out-of-order and missing packets for each individual MySQL connection, and sync-up with the stream if too much divergence occurs.

Gibbs needs the schema for tables involved in the queries being captured, and waiting around for someone to “show tables” and “show columns” wouldn’t cut it. Luckily, a MySQL client crate already existed, providing not only a place to corroborate our interpretation of the MySQL wire protocol, but also easy client code for the portion of Spyglass which doesn’t rely upon data-link snooping. So we just used this client to query those items directly, and acquire the results in the usual manner. Look how short the code is to gather the create table information from a Vector of Strings (table names), including formatting and timestamping the results:

for t in tables { let timespec = time::get_time(); let millis = timespec.sec * 1000 + timespec.nsec as i64 / 1000 / 1000; let _ = pool.prep_exec(format!("show create table {}", t), ()) .map(|res| { res .map(|x| x.unwrap()) .fold((), |_, row| { let (_, c,): (String, String) = mysql::from_row(row); let msg = format!("--GIBBStTYPE: DDLtTIMESTAMP: {}tSCHEMA: {}tSQL:n{};n", millis, db, c); write_cap(&mut cap, &msg); // write to capture file printfl!("."); // use custom macro which flushes to console }) }); } 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 for t in tables { let timespec = time::get_time(); let millis = timespec.sec * 1000 + timespec.nsec as i64 / 1000 / 1000; let _ = pool.prep_exec(format!("show create table {}", t), ()) .map(|res| { res .map(|x| x.unwrap()) .fold((), |_, row| { let (_, c,): (String, String) = mysql::from_row(row); let msg = format!("--GIBBStTYPE: DDLtTIMESTAMP: {}tSCHEMA: {}tSQL:n{};n", millis, db, c); write_cap(&mut cap, &msg); // write to capture file printfl!("."); // use custom macro which flushes to console }) }); }

Nobody would approve of uploading their actual captured data values over the Internet, and they are not necessary for the Gibbs performance analysis anyways. So our next step was to find a way to handle it. While it might be tempting to include a full SQL parser, or other means of tokenizing and classifying each substring in the capture, a far simpler method presented itself. How about the venerable regular expressions (regex)? Would that work? Back to https://crates.io to see if a library for regex exists; sure enough it does. Isn’t the use of regular expressions something that programmers seeking “type safety” frown upon, since, like a string containing SQL, you won’t find out about your typos until you’re actually executing the program? Enter regex macros! While it is presently slower, and requires a Rust nightly, it has the very appealing property that if your regex is not a correct expression, your program won’t compile! Instead of having to find some interactive regex testing UI, regular expressions which are improper will be flagged by your regular compiler, allowing you to continue your workflow with your standard code editor and build tools. What we essentially end up with is a custom compiled set of code geared explicitly for our use case, redacting literal strings, ranges, numbers, and booleans without destroying the SQL statement surrounding them. And the amount of source code needed to accomplish this is astonishingly small:

let redact = regex!(r#"(?x)( (?P<p>[\s=\(\+-/\*]) ( '[^'\\]*((\\.|'')[^'\\]*)*' | ;"[^"\\]*((\\.|"")[^"\\]*)*" | [\.][\d]+ | [\d][\.\d]* ) )"#); let cr = redact.replace_all(&x, "$p?"); 1 2 3 4 5 6 7 8 9 10 11 let redact = regex!(r#"(?x)( (?P<p>[\s=\(\+-/\*]) ( '[^'\\]*((\\.|'')[^'\\]*)*' | ;"[^"\\]*((\\.|"")[^"\\]*)*" | [\.][\d]+ | [\d][\.\d]* ) )"#); let cr = redact.replace_all(&x, "$p?");

Where x is the String containing the captured statement, and the “$p?” exchanges all the captured literals with a question mark, a single call against the compiled regex suffices. You may also notice some of the other Rust features used. Multi-line raw string literals eliminated the string escaping, thereby preventing the obfuscation of the regex escaping. Nobody likes dealing with multiple layers of escapes within a single string. Especially with regex, already a syntax which looks like someone scrambled random punctuation about like a kid playing with a box of Alpha-Bits cereal spilled on the floor.

Now that we have our text in a condition suitable for transmission back to AgilData, we have to deal with all that being an http client entails. Combining hyper and multipart crates provides convenient APIs for this task. While the resulting code isn’t quite as terse as the regex Rust macro example above, be sure to check out Spyglass’ comm.rs source file to see how posting a multipart form was used to upload the data capture file.

Incorporating these crates, along with the Rust standard library and language features, yielded the results we were looking for. In far less time than what we estimated we would need if using C alone, even with the wealth of decades-worth of C libraries at our disposal, Spyglass was born.

What else can Spyglass see? It’s up to you.

We are open to suggestions. For the systems programmer, Spyglass code provides a good starting base for any form of case-specific protocol traffic capture at a level below TCP/IP. The Finite State Machine (FSM) may be modified to match the specifics of your situation and will definitely give you a head-start of a few days to a few weeks, especially if you are new to the Rust ecosystem. Please let us know how you’ve adapted it, or even better, send some pull requests our way with improvements! And if you’re ever in Michigan, stop by a Rust Detroit Meetup or follow us on Twitter.

Resources

Learn more about using the free Gibbs Scalability Advisor with Spyglass to scale your MySQL.

Spyglass repository on Github.

Read about our new product, AgilData Scalable Cluster for MySQL, available today.

Understand AgilData Scalable Cluster for MySQL pricing.

Download the AgilData Scalable Cluster for MySQL Whitepaper for more information.