A Russian programmer named Sergey Aleynikov was picked up this past Friday by the FBI for allegedly stealing and passing along code that, if circulating out in the wild, could expose US markets to manipulation and cost Aleynikov's former employer, Goldman Sachs, millions. Bloomberg quotes assistant US Attorney Facciponti saying that "there is a danger that somebody who knew how to use this program could use it to manipulate markets in unfair ways. The copy in Germany is still out there, and we at this time do not know who else has access to it."

So how could a 32MB compressed source code archive pose a threat to markets and to America's most powerful investment bank? The story is actually less complex than it may sound.

Recovering the black box

In a nutshell, the "black box" trading platforms of Goldman and other banks use a combination of proprietary, secret algorithms and the fastest hardware available to take in a torrent of news and other market data and generate a stream of trades that are timed to the millisecond. So, instead of operating on the old stock market adage, "buy on the rumor, sell on the news," a high-frequency trading platform like Goldman's will buy a few milliseconds after the news hits, then sell moments later at a very small premium to other traders and platforms who didn't get the news in (or their trades out) quite as fast. Do this billions of times a day, and voilà, you're printing money.

Obviously, to make a scheme like this work, you need a few things, one of which is hardware that's at least a few milliseconds faster than everyone else's (see my previous post on the high-frequency trading "arms race"). The second main ingredient is software that, given a set of data inputs, can figure out which trades everyone else is likely to make in response to those same inputs, so that the platform can get there first and be holding those assets when everyone else suddenly decides they want them.

If you have your hands on the code that runs on Goldman's trading platform—again, one of the largest in the world—then you know with 100 percent accuracy which trades Goldman's computers are going to make in response to a given set of inputs. All you need then is even faster hardware so that you can get to those trades just a few milliseconds before Goldman, and you'll always beat the bank and therefore be able to sell to Goldman at a slight premium. Goldman will therefore make less on every trade, since you'll essentially be usurping their place in the pecking order.
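To make the "100 percent accuracy" point concrete, here is a minimal Python sketch. It is a toy, not anything resembling Goldman's actual code: the `toy_strategy` rule, the `Order` type, the signal values, and the thresholds are all hypothetical. The only point is that a deterministic program, given the same inputs, emits the same orders for anyone who holds a copy.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Order:
    """A hypothetical order record, invented for this illustration."""
    side: str
    symbol: str
    quantity: int


def toy_strategy(signal: float, symbol: str = "XYZ") -> Optional[Order]:
    """Hypothetical deterministic rule: buy on a strong positive signal,
    sell on a strong negative one, otherwise do nothing."""
    if signal > 0.5:
        return Order("buy", symbol, 100)
    if signal < -0.5:
        return Order("sell", symbol, 100)
    return None


def run(signals: List[float]) -> List[Optional[Order]]:
    """Feed a sequence of market signals through the rule."""
    return [toy_strategy(s) for s in signals]


# The same made-up inputs, seen by the original owner and by a copy holder.
signals = [0.9, 0.1, -0.7]
owner_orders = run(signals)
copy_holder_orders = run(signals)

# Identical code plus identical inputs yields identical orders.
print(owner_orders == copy_holder_orders)
```

Predicting the orders is only half of the scenario described above. Acting on the prediction first still depends on having the faster hardware, which this sketch does not model.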

When US government prosecutors claim that the release of Goldman's secret sauce could potentially expose markets to manipulation, what they're really saying is that some unknown party could use it to out-manipulate Goldman, and possibly even do something more ambitious like frustrate Goldman's platform so that it fails while simultaneously finding some way to short it. Given that Goldman's platform is one of the main providers of liquidity to the market (i.e., it fills a market function by holding assets that everyone will want shortly, and then selling them to all comers), it would ostensibly be a bad thing if it suddenly blew up.

How Aleynikov did it

The FBI's complaint (PDF) in the case describes how Aleynikov pulled off his heist of the code. During the first five days in June, the programmer, whose LinkedIn profile describes him as VP of Equity Strategy, ran some scripts via a bash shell that copied and compressed a bunch of source code, then sent it out via HTTPS to a German server—about 32MB total over four separate occasions, which is actually quite a lot of compressed ASCII source code.
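For a rough sense of scale, the back-of-the-envelope estimate below converts 32MB of compressed text into lines of code. Only the 32MB figure comes from the complaint. The compression ratios (3:1 to 5:1) and the 40-byte average line length are my own assumptions, and the complaint does not say that the archives contained nothing but source text, so treat the result as an order-of-magnitude guess.

```python
def estimate_source_lines(compressed_bytes: int,
                          compression_ratio: int,
                          avg_bytes_per_line: int) -> int:
    """Estimate line count from compressed size.

    compression_ratio and avg_bytes_per_line are assumptions,
    not figures taken from the FBI complaint.
    """
    uncompressed_bytes = compressed_bytes * compression_ratio
    return uncompressed_bytes // avg_bytes_per_line


# 32MB total, per the complaint.
COMPRESSED_BYTES = 32 * 1024 * 1024

# Assumed average source line length, in bytes.
AVG_BYTES_PER_LINE = 40

# Try a few assumed compression ratios for plain-text source code.
for ratio in (3, 4, 5):
    lines = estimate_source_lines(COMPRESSED_BYTES, ratio, AVG_BYTES_PER_LINE)
    print(f"assumed ratio {ratio}:1 -> roughly {lines:,} lines")
```

Under these assumptions the total lands somewhere in the low millions of lines, which is why 32MB counts as a lot of compressed source code.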

Aleynikov tried to cover his tracks by having the script erase his bash history, but Goldman's machines actually keep a backup of everyone's bash history, which is how they figured out what he had done. The bank was tipped off by the HTTPS transfers, which seem to have set off some sort of alarm that invited further scrutiny.

The programmer had informed Goldman that he was quitting and going to work for Chicago-based Teza Technologies, LLC, another high-frequency trading shop that has now suspended his employment in the wake of his arrest. He was released on a $750,000 bond today and now awaits trial.

This story is still developing, and I encourage you to read the second half of Matthew Goldstein's Reuters story, which is where the arrest first came to light, to get a sense of where it's headed. (Zero Hedge is also on top of this.) In particular, there are a number of very odd twists here, the latest of which makes the New York Stock Exchange look particularly bad.

The NYSE puts out a weekly list of the top program traders by volume, and Goldman typically tops this list by a country mile. Then last week's list came out, and Goldman's name was shockingly absent. And today, now that the code theft story is out, the NYSE has put out a statement claiming that Goldman's absence from the list was the result of a "system error"; it has also released a revised list showing Goldman once again dominating program trading activity.

Needless to say, many econ bloggers are incredulous that the top entrant in the weekly program trading list suddenly went missing last week and nobody at the NYSE caught the error before now, especially given the Aleynikov news and the timing of the "error." Conspiracy theories are legion, and even if none of them are true, it's hard to shake the feeling that this story is about to blow up into a major scandal.

Listing image by Flickr: Nick McCarthy