When 22-year-old programmer Aaron Swartz decided last fall to help an open-government activist amass a public and free copy of millions of federal court records, he did not expect he'd end up with an FBI agent trying to stake out his house.

But that's what happened, as Swartz found out this week when he got his FBI file through a Freedom of Information Act request. A partially redacted FBI report shows the feds mounted a serious investigation of Swartz for helping put public documents onto the public web.

The FBI ran Swartz through a full range of government databases starting in February, and drove by his home, after the U.S. court system told the feds he'd pilfered approximately 18 million pages of documents worth $1.5 million. That's how much the public records would have cost through the federal judiciary's pay-walled PACER record system, which charges eight cents a page for most legal filings.
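
The dollar figure is easy to sanity-check. The sketch below uses only the two numbers quoted above (roughly 18 million pages, eight cents a page) and assumes a flat per-page fee with no caps or waivers; the function name is ours, for illustration only. It comes out to about $1.44 million, which rounds to the $1.5 million the courts cited.

```python
# Rough check of the dollar figure, using only numbers quoted in the article.
# Assumption: a flat fee on every page, with no per-document caps or waivers.
PRICE_PER_PAGE_CENTS = 8  # PACER's per-page fee for most filings


def pacer_cost_dollars(pages: int) -> float:
    """Return the total fee in dollars for the given number of pages."""
    return pages * PRICE_PER_PAGE_CENTS / 100


estimate = pacer_cost_dollars(18_000_000)
print(f"${estimate:,.0f}")  # prints "$1,440,000"
```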

"I think it's pretty silly they go after people who use the library to try to get access to public court documents," Swartz said. "It is pretty silly that instead of calling me up, they sent an FBI agent to my house."

The feds also checked Swartz's Facebook page, ran his name against the Department of Labor to figure out his work history, looked for outstanding warrants and prior convictions, checked to see if his mobile phone number had ever come up in a federal wiretap or pen register, and checked him against the records in a private data broker's database.

The Great Court Records Caper began last year when the judiciary and the Government Printing Office experimented with giving away free access to PACER at 17 select libraries around the country. Swartz decided to use the trial to grab as many of the public court records as he could and, perversely, release them to the public.

He visited one of the libraries, the 7th U.S. Circuit Court of Appeals library in Chicago, and installed a small Perl script he'd written. The code cycled sequentially through case numbers, requesting a new document from PACER every three seconds. In this manner, Swartz got nearly 20 million pages of court documents, which his script uploaded to Amazon's EC2 cloud computing service.
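
For readers curious what "cycling sequentially through case numbers" looks like in practice, here is a minimal sketch of the pattern in Python. It is not Swartz's Perl script, which the FBI file does not reproduce. The `fetch` callable is a hypothetical stand-in for whatever retrieved a document, no real PACER address or interface is used, and the upload step is left out.

```python
import time
from typing import Callable, Iterator, Optional, Tuple


def crawl_sequentially(
    fetch: Callable[[int], Optional[bytes]],  # hypothetical document fetcher
    first_case: int,
    last_case: int,
    delay_seconds: float = 3.0,  # the article says one request every three seconds
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Tuple[int, bytes]]:
    """Yield (case_number, document) pairs, pausing after every request."""
    for case_number in range(first_case, last_case + 1):
        document = fetch(case_number)
        if document is not None:  # skip case numbers with nothing to download
            yield case_number, document
        sleep(delay_seconds)


# Stand-in fetcher for demonstration: pretends even-numbered cases are empty.
def fake_fetch(case_number: int) -> Optional[bytes]:
    if case_number % 2 == 0:
        return None
    return f"filing {case_number}".encode()


# The pause is replaced with a no-op so the demo finishes instantly.
results = list(crawl_sequentially(fake_fetch, 1, 5, sleep=lambda seconds: None))
print(results)
```

The fixed pause is what kept the script slow and steady. It also meant the download took a couple of weeks rather than a couple of hours.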

Or, as the FBI report put it, the public records were "exfiltrated."

The script ran for a couple of weeks, from September 4 to 22, until the court system's IT department realized something was wrong. Someone was downloading everything. None of the records, of course, were private or sealed, and LexisNexis has a copy of PACER's database that it sells at a high markup. But Swartz wasn't paying anything.

The Government Printing Office abruptly shut down the free trial and reported to the FBI that PACER was "compromised," the FBI file reveals. The Administrative Office of the U.S. Courts told the FBI in March that Swartz had gained unauthorized access to the free PACER account.

"AARON SWARTZ would have known his access was unauthorized because it was with a password that did not belonged [sic] to him," reads the FBI report summarizing the judiciary's position.

Swartz says his script only ran on the library computer. It didn't use a password at all, but used the PACER authentication cookie set in the PC's browser.
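
The distinction matters technically. A script that rides on an existing browser session never handles a password; it just sends the cookie the site already issued. The sketch below builds such a request without sending it. The URL, cookie name and cookie value are made-up placeholders, since the FBI file gives none of them.

```python
from urllib.request import Request

# Hypothetical placeholders: the real URL and cookie are not in the FBI file.
EXAMPLE_URL = "https://example.invalid/document/1"
SESSION_COOKIE = "session=abc123"


def build_request(url: str, cookie: str) -> Request:
    """Build (but do not send) a request authenticated by an existing cookie."""
    return Request(url, headers={"Cookie": cookie})


request = build_request(EXAMPLE_URL, SESSION_COOKIE)
print(request.get_header("Cookie"))         # prints "session=abc123"
print(request.has_header("Authorization"))  # prints "False": no credentials are sent
```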

He donated the 19,856,160 pages to public.resource.org, an open government initiative spearheaded by Carl Malamud as part of a broader project to make public as many government databases as Malamud can find. It was Malamud who previously shamed the SEC into putting all its EDGAR filings online in the '90s, and he used $600,000 in donations to buy 50 years of documents from the nation's appeals courts, which he promptly put on the internet for anyone to download in bulk.

The Washington bureau of the FBI opened its investigation of Swartz just a week or so before the New York Times published its account of the caper. The bureau didn't contact him then, but in April, the FBI asked to interview the code jock, saying it needed his help to close the "security hole" he'd exploited. When Swartz declined, on the advice of counsel, the feds dropped the investigation after the Justice Department's Computer Crime and Intellectual Property Section closed the case.

Swartz, a former employee of Reddit (a sister company of Wired.com), requested his FBI file in August, and describes it as the "usual mess of confusions that shows the FBI's lack of sense of humor." (Threat Level notes that the FBI filled Swartz's FOIA request at an admirable speed that would have been unheard of as recently as last year.)

That's how Swartz learned that a Chicago-based FBI agent got Swartz's driver's license photo, and considered a stakeout of his home. But any surveillance, the agent concluded, would be conspicuous, since so few cars were parked on Swartz's dead-end street in Highland Park, Illinois.

The feds evidently identified Swartz in the first place by approaching Amazon, which provided his name, phone number and address. It's not clear if the feds got a subpoena to learn his identity, but they may not have needed one; Amazon's user agreement for its cloud computing solutions gives it the right to turn over customer information to the government on request.

Amazon did not reply to a call and online request for comment.

Two months after opening an investigation, the feds finally called Swartz on April 14. He declined to speak to them, and demurred again through his lawyer two days later.

The investigation was closed on April 20.

PACER records still cost eight cents a page, but now PACER users running the Firefox browser can donate their downloads to the public domain with a simple plug-in called RECAP.

Use of the plug-in is not likely to start an investigation of you.

But then again, who knows.

Photo: Flickr/Creative Commons
