In March, security researcher Chris Vickery made a remarkable discovery. In one of the most notable operations of its kind, he said in a blog post, a group called River City Media had collected about 1.4 billion records of personal information and was using them for spam. “Chances are that you, or at least someone you know, is affected,” Vickery wrote. RCM said it had used legitimate marketing practices to collect the data. Regardless, the scope of the program was massive, and when the records leaked, a mountain of personal information was exposed.

Often when these sorts of records leak, they’re used for fraud. But RCM’s data may have been put to a political use. As The Verge reported yesterday, tens of thousands of identical anti-net neutrality comments tied to real names and addresses were bombarding the FCC. Some clues suggest those identities were pulled from dumps, or from other, similar databases with overlapping information.

The Verge received a tip piecing together a likely scenario involving information gleaned in another, smaller spam dump known as Special K. The Verge manually examined names and addresses found in that dump, and compared them to the data leak database Have I Been Pwned, finding substantial overlap with the RCM database. Afterward, independent data analyses found that a high percentage of personal information used in the FCC spam — more than 65 percent — overlapped with information in data breaches. Developer Chris Sinchok wrote on Medium that a control sample was closer to 31 percent, suggesting the spammers may have been “working with breach data directly, or with a data warehouse whose lists ended up in one of these breaches.”
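The comparison behind those percentages can be sketched in a few lines. This is a minimal illustration, not the method Sinchok or anyone else actually used: it assumes the analyst already has a local list of breached identifiers (email addresses here, which is an assumption), and the function name `overlap_rate` and all of the sample data are invented placeholders.

```python
def overlap_rate(sample, breached):
    """Return the fraction of identifiers in `sample` that also appear in `breached`."""
    # Normalize case and whitespace so trivially different spellings still match.
    normalized = [s.strip().lower() for s in sample]
    if not normalized:
        return 0.0
    breached_set = {b.strip().lower() for b in breached}
    hits = sum(1 for s in normalized if s in breached_set)
    return hits / len(normalized)


# Placeholder identifiers, invented purely for illustration.
breached = ["a@example.com", "b@example.com", "c@example.com"]
spam_sample = ["A@example.com", "b@example.com", "x@example.com", "c@example.com"]
control_sample = ["a@example.com", "y@example.com", "z@example.com", "w@example.com"]

spam_rate = overlap_rate(spam_sample, breached)
control_rate = overlap_rate(control_sample, breached)

print(f"spam sample overlap:    {spam_rate:.0%}")
print(f"control sample overlap: {control_rate:.0%}")
```

The control sample is what gives the number meaning. Many ordinary people appear in breach data, so a high overlap rate is only suggestive when it is well above the rate for a comparable set of comments not suspected of being spam.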

The text of the identical comments posted to the FCC was written by a conservative group called the Center for Individual Freedom, which provided a form intended for people to send to the FCC. Someone could either have automated a way to fill out the form, or taken the text and used it in their own spam campaign. (CFIF told The Verge yesterday that it was looking into what might have happened.)

If someone did want to use the dump to manufacture FCC comments, it wouldn’t have been hard. There are a number of widely available tools for filling leaked data into web forms, letting criminals submit millions of entries at a time. Typically, those tools are used for account takeovers (seeing if leaked LinkedIn credentials can be used to access a person’s Gmail account, for instance), but they would have worked just as well on the CFIF form. CFIF may not even have been aware of the attack, since many of the tools include measures to disguise the source IP address. But someone appears to have copied the text and automated the spam in another way: Sinchok writes that the spam comments appeared to be submitted via an API.

The anti-net neutrality campaign seems to have slowed down since yesterday, but at its peak earlier this week it produced tens of thousands of comments — roughly 17,000 in one 24-hour period — and accounted for a significant portion of the total comments. “The unprecedented regulatory power the Obama Administration imposed on the internet is smothering innovation, damaging the American economy and obstructing job creation,” the comment begins. “I urge the Federal Communications Commission to end the bureaucratic regulatory overreach of the internet known as Title II and restore the bipartisan light-touch regulatory consensus that enabled the internet to flourish for more than 20 years.”

The attack coincided with more controversy over the comment system. John Oliver organized a pro-net neutrality drive, directing viewers to contribute. The increased traffic appeared to crash the comment system, although the FCC later claimed the problems were caused by malicious DDoS attacks.

The Verge this week contacted several individuals whose names appeared in the FCC comments. They said they had no knowledge of the comments made in their names and did not know how their personal information had come to be uploaded. The spam lists may go some way toward answering those questions.

Vickery offered to provide the RCM data to the FCC so it could explore and weed out fraudulent comments. The Verge has asked the FCC for comment on the offer.

Update, May 18, 11:00AM ET: Includes information on independent data analyses.