On a bright April morning in Menlo Park, California, I became an Internet spy.

This was easier than it sounds because I had a willing target. I had partnered with National Public Radio (NPR) tech correspondent Steve Henn for an experiment in Internet surveillance. For one week, while Henn researched a story, he allowed himself to be watched—acting as a stand-in, in effect, for everyone who uses Internet-connected devices. How much of our lives do we really reveal simply by going online?

Henn let me into his Silicon Valley home and ushered me into his office with a cup of coffee. Waiting for me there was the key tool of my new trade: a metal-and-plastic box that resembled nothing more threatening than an unlabeled Wi-Fi router. This was the PwnPlug R2, a piece of professional penetration testing gear designed by Pwnie Express CTO Dave Porcello and his team and on loan to us for this project.

The box would soon sink its teeth into the Internet traffic from Henn's home computer and smartphone, silently gobbling up every morsel of data and spitting it surreptitiously out of Henn's home network for our later analysis. With its help, we would create a pint-sized version of the Internet surveillance infrastructure used by the National Security Agency. Henn would serve as a proxy for Internet users, Porcello would become our one-man equivalent of the NSA’s Special Source Operations department, and I would become Henn's personal NSA analyst.

As Henn cleared a spot on his desk for the PwnPlug, he joked that it might not provide anything useful for us to analyze. In the year since Edward Snowden pulled back the curtain of secrecy around the NSA’s dragnet surveillance programs, many of the major Internet service providers targeted by the spy agency have publicly announced plans to better protect customers, often through the expanded use of encryption.

Our experiment would answer the question: could a passive observer of Internet traffic still learn much about a target in this post-Snowden world?

Henn dialed up Porcello and put him on speakerphone as we finalized the location and setup of the PwnPlug. As I snapped in an Ethernet cable, Henn turned on his iPhone and connected to the PwnPlug’s Wi-Fi network. Porcello watched remotely as data from Henn's network suddenly poured into a specially configured Pwnie Express server.

“Whoa,” Porcello said. “Yep, there’s Yahoo, NPR... there’s an HTTP request to Google... the phone is checking for an update. Wow, there’s a lot of stuff going on here. It's just thousands and thousands of pages of stuff... Are you sure you’re not opening any apps?”

“I didn’t do anything!” Henn replied. “My phone is just sitting here on my desk.”

He checked his phone and found that Mail, Notes, Safari, Maps, Calendar, Messages, Twitter, and Facebook were running in the background—and making connections to the Internet. The Safari Web browser proved the most revealing. Like most people who use the iPhone, Henn had left open dozens of websites; when his phone had connected to the PwnPlug’s network, the browser had refreshed them, revealing movies he was checking out for his kids, a weather report, and research he was doing for work.

In the first two minutes of our test, we had already captured a snapshot of Henn’s recent online life—and the real surveillance hadn't even begun.

Your own personal NSA

While the NSA runs hundreds of surveillance programs, its broad, passive surveillance of the Internet has just two key components: Turbulence, a network monitoring system that skims traffic from the Internet’s fiber-optic backbone, and XKeyscore, an analytics database that processes the captured traffic, using rules that look for specific strings of text or patterns in data (such as e-mail addresses, phone numbers, and file attachments). According to leaked NSA documents and whistleblower testimony, pieces of both Turbulence and XKeyscore are scattered about the world near Internet chokepoints, such as the infamous “secret room” at AT&T’s San Francisco offices that has been described by former AT&T employee Mark Klein.
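To make the idea of rule-based matching concrete, here is a minimal Python sketch of a scanner that looks for patterns in captured payloads. The rule names, the simplified regular expressions, and the `match_rules` function are all illustrative assumptions of mine; nothing here is drawn from XKeyscore itself.

```python
import re

# Hypothetical selector rules. The patterns are deliberately simplified
# illustrations and are not taken from any leaked document.
RULES = {
    "email": re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_phone": re.compile(rb"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),
}


def match_rules(payload: bytes) -> dict[str, list[bytes]]:
    """Return every rule that fired on the payload, with the matched strings."""
    hits = {}
    for name, pattern in RULES.items():
        found = pattern.findall(payload)
        if found:
            hits[name] = found
    return hits
```

Run against an unencrypted HTTP request, a function like this would flag any e-mail address or phone number in the payload. Against properly encrypted traffic, it would generally find nothing to match.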

To recreate this setup in miniature, the PwnPlug in Henn’s office was configured as a Wi-Fi access point; it acted as our equivalent of the NSA’s Turbulence. While the PwnPlug is generally used for network penetration testing, Porcello configured the device used in our test to intercept only traffic outbound to or inbound from the Internet, not traffic that began and ended on Henn's home network. The device captured every packet matching these criteria and sent it over a secure SSH connection back to a server at Pwnie Express headquarters in Berlin, Vermont.
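The capture rule described above, keeping only packets that cross between the home network and the Internet, can be sketched in a few lines of Python. This is an illustration, not Porcello's actual configuration: the `192.168.1.0/24` subnet and the `crosses_internet_boundary` function are assumptions, since the real addressing of the test network isn't given here.

```python
import ipaddress

# Assumed home LAN range; the actual subnet used in the test is not stated.
LOCAL_NET = ipaddress.ip_network("192.168.1.0/24")


def crosses_internet_boundary(src: str, dst: str, local_net=LOCAL_NET) -> bool:
    """True when exactly one endpoint is on the local network.

    Traffic that begins and ends on the LAN is skipped, as is anything
    where neither endpoint is local.
    """
    src_local = ipaddress.ip_address(src) in local_net
    dst_local = ipaddress.ip_address(dst) in local_net
    return src_local != dst_local
```

Under these assumptions, a packet from a phone at 192.168.1.20 to a public Web server would be captured, while the same phone talking to a printer at 192.168.1.30 would be ignored.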

The remote machine at Pwnie acted as our diminutive version of XKeyscore. To emulate the NSA's processing of captured traffic, Porcello ran a number of open source analytics tools against Henn's traffic, including the ngrep packet search tool, the tshark and Wireshark traffic analysis tools, the tcpflow data stream capture tool, the dsniff suite’s passive monitoring tools, and tcpxtract for capturing files within Internet traffic.
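File-carving tools such as tcpxtract work by spotting the telltale byte signatures that mark the start of common file types inside a stream of captured data. The Python sketch below shows the general idea only. The `SIGNATURES` table and the `find_file_starts` function are hypothetical, and the real tool uses a larger signature set and also handles where files end.

```python
# A few well-known file signatures ("magic bytes"). This short table is an
# illustrative assumption, not tcpxtract's own configuration.
SIGNATURES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpeg": b"\xff\xd8\xff",
    "pdf": b"%PDF-",
    "gif": b"GIF8",
}


def find_file_starts(stream: bytes) -> list[tuple[int, str]]:
    """Return sorted (offset, file type) pairs for each signature in the stream."""
    hits = []
    for kind, magic in SIGNATURES.items():
        pos = stream.find(magic)
        while pos != -1:
            hits.append((pos, kind))
            pos = stream.find(magic, pos + 1)
    return sorted(hits)
```

Each offset marks a spot where a file may begin in the reassembled traffic; a fuller carver would then read forward to the file's end marker or a size limit and write the bytes to disk.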

For more than a month before the experiment began, Ars Technica and NPR made technical and legal preparations to ensure that any data captured from Henn would be handled with confidentiality and care. The focus would be solely on Henn’s personal online activities; we explicitly did not attempt to penetrate NPR’s corporate network, to hack Henn’s computer or phone, or to grab traffic from Henn's other family members. We would simply watch the traffic passing between our test Wi-Fi network and the Internet in the same way that the NSA collects data from millions of Internet users around the world each day.

Our full access to Henn's activities lasted for several days while he reported a single story. To make Henn as accurate a proxy as possible for the average unsuspecting Internet user, we stipulated that while the PwnPlug was active, Henn wouldn’t take extra measures to avoid surveillance (though he followed his normal operational security protocols). Henn could also pull the plug on our test at any time.

The experiment unfolded in two phases. In the first, we simply observed Henn’s normal Internet traffic. In the second, Henn, Porcello, and I stopped the broad surveillance of Henn and turned our tools on specific traffic created by leading Web applications and services. Here's what we found.