Swartz indicted for JSTOR theft

Digital activist gained access through MIT network drops

CORRECTION TO THIS ARTICLE: This article incorrectly references the “MIT Student Processing Board.” It is the Student Information Processing Board, or SIPB.

Aaron H. Swartz is an accomplished 24-year-old by anyone’s standards. He co-authored the now widely used RSS 1.0 specification at age 14, was one of three owners of the massively popular social news site Reddit, and recently completed a fellowship at the Harvard Ethics Center Lab on Institutional Corruption.

On Jan. 6, 2011, Swartz allegedly entered the basement of MIT’s Building 16, using his white bicycle helmet as a mask to hide his identity from passersby. A federal indictment, unsealed on July 19, describes his entering a restricted network wiring closet, retrieving a laptop and external hard drive he had hidden there under a cardboard box weeks before, and cautiously stepping out of the wiring closet with his makeshift mask in place.

According to the indictment, Swartz’s laptop had been using MIT’s network to rapidly download articles from JSTOR. JSTOR is an archive of academic journals to which many universities, including MIT, pay large amounts of money for access. The indictment describes these events as the final phase of Swartz’s three-month JSTOR downloading operation, bringing his total count of acquired JSTOR articles to 4.8 million. MIT valued that information at $50,000, according to the Cambridge Police incident report.

Swartz’s intention, the indictment claimed, was to upload all of the documents to a peer-to-peer file-sharing site, where anyone could access them for free.

He never got the chance. Within two hours of fleeing Building 16, Swartz was captured by Secret Service Agent Michael Pickett, in what was the culmination of three months of detective work by MIT Information Services & Technology, the MIT and Cambridge Police Departments, and the United States Secret Service.

“Ghost laptop”

Aaron Swartz’s alleged JSTOR downloading operation was far less daring in its early stages. The indictment states that it started on Sept. 24, 2010, three months before his arrest, with the purchase of an Acer laptop from a local store. The new computer was put to use on the same day, registered on MIT’s network as a guest. When prompted, Swartz provided the name “Gary Host,” which he had abridged to form the machine’s client name, “ghost laptop,” according to the indictment.

He put his newly assigned MIT IP address (18.55.6.215) to use the next day, the indictment says, running a program on the laptop that downloaded JSTOR articles at a staggering rate. While the indictment describes the program as being smart enough to avoid being automatically flagged by JSTOR’s systems, the strain it put on JSTOR’s servers was enough to have impaired other research institutions attempting to access the materials. It wasn’t long before JSTOR and MIT took notice. That evening, JSTOR blocked the IP address of the laptop, preventing it from accessing the JSTOR archives.

This setback didn’t deter Swartz for long, according to the indictment. The next day the “ghost laptop” was assigned a new IP address, 18.55.6.216, and continued to rapidly download JSTOR materials. JSTOR again detected the activity, and this time took a more drastic measure: noticing that the offender’s two IP addresses had begun with 18.55.6, JSTOR blocked a broad range of similar MIT IP addresses. This action denied many MIT affiliates access to JSTOR for three days.
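The prefix-based block described above can be illustrated with a short sketch. It assumes, purely for illustration, that the blocked range was the 256-address block 18.55.6.0/24. The article says only that JSTOR blocked a broad range of similar MIT IP addresses, so the exact range is an assumption, as is the helper name `is_blocked`.

```python
import ipaddress

# Assumption for illustration only: the article does not state the exact
# range JSTOR blocked, only that both offending addresses began with 18.55.6.
BLOCKED_RANGE = ipaddress.ip_network("18.55.6.0/24")


def is_blocked(address: str) -> bool:
    """Return True if the address falls inside the assumed blocked range."""
    return ipaddress.ip_address(address) in BLOCKED_RANGE


if __name__ == "__main__":
    # The first two addresses are the ones named in the article;
    # the third is a made-up neighbor outside the assumed range.
    for addr in ["18.55.6.215", "18.55.6.216", "18.55.7.10"]:
        print(addr, "blocked" if is_blocked(addr) else "allowed")
```

A block like this stops the offending laptop no matter which address in the range it is assigned, but it also cuts off every other machine that shares the prefix, which is consistent with the article's account of MIT affiliates losing access.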

By the time JSTOR reversed its ban on that MIT IP address range on Sept. 29, MIT had taken a more targeted approach to keeping the offender off the network: blocking his laptop’s MAC address. A MAC address is a sequence of characters which uniquely identifies a machine’s hardware. Though it is meant to be a permanent identifier, it can be changed — a trivial operation for someone with Swartz’s expertise. The Acer laptop was registered again on MIT’s network less than a week later, still under the name “Gary Host” but with a slightly altered MAC address.
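To make the weakness of a MAC address block concrete, the sketch below parses a MAC address into its six bytes and checks it against a block list of exact addresses. The addresses are invented for illustration (the article does not give the laptop's real MAC address), and `parse_mac`, `mac_is_blocked`, and `BLOCKED_MACS` are hypothetical names, not anything MIT is known to have used.

```python
def parse_mac(mac: str) -> bytes:
    """Parse a colon- or hyphen-separated MAC address into 6 raw bytes."""
    parts = mac.replace("-", ":").split(":")
    if len(parts) != 6:
        raise ValueError(f"expected 6 octets, got {len(parts)}: {mac!r}")
    # bytes() rejects values outside 0..255, so malformed octets raise ValueError.
    return bytes(int(part, 16) for part in parts)


# Hypothetical block list holding one invented address.
BLOCKED_MACS = {parse_mac("02:00:00:AA:BB:01")}


def mac_is_blocked(mac: str) -> bool:
    """Return True only if the exact address is on the block list."""
    return parse_mac(mac) in BLOCKED_MACS


if __name__ == "__main__":
    print(mac_is_blocked("02:00:00:aa:bb:01"))  # same address, different case
    print(mac_is_blocked("02:00:00:AA:BB:02"))  # last byte changed by one
```

Because the check matches exact addresses, altering even one byte produces an identifier the list has never seen, which is why a slightly altered MAC address was enough to get the laptop back on the network.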

Grace Host

“Grace Host” first made her appearance on MIT’s network on Oct. 8. That was the name, states the federal indictment, that Swartz provided when he registered a second machine, this time a MacBook, as a guest on the network. Together, Grace and Gary Host downloaded JSTOR articles at such an astounding pace that several of JSTOR’s servers crashed.

This time, JSTOR’s response was far more severe: all of MIT was denied access to the JSTOR archives for several days. After access was restored, the indictment suggests, Swartz used his newfound knowledge of MIT’s networking infrastructure to take a new approach.

The restricted basement wiring closet

In the basement of Building 16 there is a wiring and telephony closet, known as Room 16-004t. Between November and December 2010, Aaron Swartz accessed this room and hard-wired his Acer laptop into the network, assigning himself two IP addresses. The computer was hidden under a cardboard box in the closet, and it remained there undetected for weeks. In this time it downloaded over 2 million JSTOR articles, more than 100 times the number of legitimate JSTOR downloads at MIT during that time period.

It was Jan. 4, 2011, when IS&T discovered the machine beneath the cardboard box, according to the officer report released by the Cambridge Police Department. By 10:30 a.m. an MIT police officer was on the scene, and before long he was joined by Cambridge police detective Joseph Murphy and U.S. Secret Service agent Michael Pickett. The indictment states that the laptop was running a script called “keepgrabbing.py,” which was responsible for downloading the JSTOR articles. Fingerprints were lifted from the laptop and hard drive, and then the detective, the officer, and the agent left Building 16.

The laptop and hard drive remained under the cardboard box in 16-004t. However, they were now accompanied by a hidden network camera, installed by IS&T.

Less than five hours later, a “white male, dark or black shoulder length wavy hair, wearing a dark coat, gray backpack, jeans with a white bicycle helmet” was observed on camera entering 16-004t, carrying what looked like a hard drive. When the man matching Swartz’s description returned on Jan. 6, 2011, he was spotted by the MIT police officer monitoring the video feed. But by the time police units arrived at 16-004t, Swartz had disappeared, along with his laptop and hard drives.

The arrest of Aaron Swartz

According to the officer report and a statement released by the MIT Student Information Processing Board (SIPB), Swartz didn’t leave MIT’s campus immediately on Jan. 6. His next stop was the fifth floor of MIT’s Student Center (Building W20). “Around 1:30 p.m., a man matching Aaron Swartz’s description visited the SIPB office. He left shortly afterward, around 1:50 p.m.,” wrote David Wilson, SIPB Chairman, in an email to The Tech. “[At 4:20 p.m.], the MIT Police and representatives of IS&T came by and removed a laptop and external hard drive that had been hidden underneath a table. At the time, SIPB did not know where the machine had come from, nor was SIPB informed of the reason for its removal,” Wilson said. Though Swartz was not affiliated with SIPB, the student group welcomes visitors to use its office if there are members present.

It was 2:11 p.m. on Jan. 6 when Swartz was spotted on a bicycle on Massachusetts Avenue by an MIT police officer, according to the officer’s report. The report states that when he encountered Captain Albert Pierce of the MIT Police Department, Swartz jumped off his bike and ran down Lee Street, a few blocks north of City Hall in Central Square. He made it approximately 400 feet before being handcuffed and charged with breaking and entering. Though he refused to give the officers his name, a USB drive found on his person left little doubt that this was the man they were after — it contained “keepgrabbing2.py.”

Legal ramifications

Swartz faces up to 35 years in prison and up to $1 million in fines if he is convicted of the following charges: wire fraud, computer fraud, unlawfully obtaining information from a protected computer, and recklessly damaging a protected computer. The next hearing will be on Aug. 8. He is out on $100,000 bail.

These charges come despite JSTOR’s not pressing charges. “The criminal investigation and today’s indictment of Mr. Swartz has been directed by the United States Attorney’s Office,” said a statement released by JSTOR on July 19. “It was the government’s decision whether to prosecute, not JSTOR’s. As noted previously, our interest was in securing the content. Once this was achieved, we had no interest in this becoming an ongoing legal matter.”

Demand Progress, a group that Aaron Swartz founded, runs online campaigns to fight internet censorship. The organization is currently rallying support for Swartz with an online petition that has been signed by over 35,000 people.

This isn’t the first time Swartz has run into trouble with the government for excessive downloading. The case is reminiscent of a 2008 incident in which Swartz was involved in downloading hundreds of thousands of legal documents from the Public Access to Court Electronic Records (PACER) system and releasing them for free. Though the incident led to an FBI investigation, Swartz was not indicted.

Earlier in 2008, Aaron Swartz authored a document titled “Guerilla Open Access Manifesto.”

“We need to download scientific journals and upload them to file sharing networks. We need to fight for Guerilla Open Access,” said Swartz in the manifesto. “With enough of us, around the world, we’ll not just send a strong message opposing the privatization of knowledge — we’ll make it a thing of the past. Will you join us?”