In October 2017, Twitter general counsel Sean Edgett faced difficult questions from the Senate Judiciary Committee about foreign interference in the 2016 election. Flanked by representatives from Facebook and Google, Edgett explained how Russia’s Internet Research Agency (IRA) had systematically spread fake news and stoked partisan sentiment through a carefully coordinated, years-long social media campaign.

A year later, Twitter released an archive of more than 10 million tweets from 3,841 accounts it said were affiliated with the IRA, hoping to encourage “open research and investigation of these behaviors from researchers and academics.” The company has followed up with additional data dumps, most recently last month, when it released details of accounts linked to Russia, Iran, Venezuela, and the Catalan independence movement in Spain. All told, Twitter has shared more than 30 million tweets from accounts it says were “actively working to undermine” healthy discourse.

Researchers say the trove has been invaluable in learning about state-sponsored disinformation campaigns and how to combat them. Patrick Warren and Darren Linvill of Clemson University used the data to identify different kinds of troll behavior and examine how each contributed to the IRA campaign. “A lot of people have been using the data to try to come up with strategies to make our political conversation more robust,” Warren says. He points to a recent Stanford report that recommends regulating political ads, strengthening internal monitoring at social media companies, and standardizing labels for content linked to disinformation campaigns.

Still, there’s much missing from Twitter’s data dumps, and many unanswered questions about how impactful these accounts really were, how they operated, and how successful Twitter is at finding and shutting them down.

The data releases include the text of the tweets, the account names, the number of accounts each profile followed, the number of followers it had, and how many times each tweet was liked and retweeted. But Twitter doesn’t release the names of accounts that followed or were followed by these state-sponsored profiles, to protect the privacy of those users. “The real thing that we don’t know is who saw these tweets?” says Cody Buntain, a postdoctoral researcher at NYU’s Social Media and Political Participation Lab. “That’s the critical piece of information that Twitter does not provide.”
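To make the shape of the released data concrete, here is a minimal sketch of how a researcher might tally engagement from those per-tweet fields. The column names are illustrative assumptions, not Twitter's exact schema, and the rows are invented stand-ins for the real archive.

```python
import pandas as pd

# Illustrative rows mimicking the kinds of fields in Twitter's releases:
# account name, tweet text, follow counts, and like/retweet tallies.
# Column names and values are assumptions, not Twitter's actual schema.
tweets = pd.DataFrame([
    {"account": "troll_a", "tweet_text": "...", "follower_count": 1200,
     "following_count": 900, "like_count": 15, "retweet_count": 4},
    {"account": "troll_a", "tweet_text": "...", "follower_count": 1200,
     "following_count": 900, "like_count": 2, "retweet_count": 0},
    {"account": "troll_b", "tweet_text": "...", "follower_count": 30,
     "following_count": 2000, "like_count": 0, "retweet_count": 1},
])

# Total engagement (likes + retweets) per account across its tweets.
engagement = (
    tweets.assign(interactions=tweets.like_count + tweets.retweet_count)
          .groupby("account")["interactions"]
          .sum()
)
print(engagement.to_dict())  # {'troll_a': 21, 'troll_b': 1}
```

Aggregates like this are exactly what the public dumps support; what they cannot answer, as Buntain notes, is who actually saw or followed these accounts.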

Without those follower networks, Buntain and others say it’s hard to assess the impact of the accounts and how they grew and evolved over time. Did a bunch of fake accounts start following each other to give themselves the appearance of normalcy? Or did they start following specific people and grow their following organically? Researchers can’t say. With that information, “we could see what kind of content was the most engaging,” says Buntain. He says that information would also help us understand which niches of Twitter were targeted and how.
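The kind of analysis the missing follower networks would enable can be sketched in a few lines. The edges below are invented for illustration; real follower lists are precisely what Twitter's releases omit. Mutual follows among suspect accounts are one signal of the "accounts following each other to look normal" pattern described above.

```python
import networkx as nx

# Hypothetical follower graph: an edge a -> b means account a follows b.
# All account names and edges are invented for this sketch.
G = nx.DiGraph([
    ("troll_1", "troll_2"), ("troll_2", "troll_1"),
    ("troll_1", "troll_3"), ("troll_3", "troll_1"),
    ("troll_2", "troll_3"), ("troll_3", "troll_2"),
    ("organic_user", "troll_1"),  # a one-way, organic follow
])

# Reciprocated follows: pairs where each account follows the other.
# Dense reciprocity among new accounts can suggest coordination.
mutual = {frozenset((u, v)) for u, v in G.edges if G.has_edge(v, u)}
print(len(mutual))  # 3 mutual pairs, all among the trolls
```

With real edge lists, researchers could also track when each follow occurred, distinguishing mutual-follow rings from organically grown audiences.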

The follower networks are public while an account is active, but they disappear once Twitter shuts it down. Exposing those followers could subject users to abuse or harassment. “I can see why the platforms would be hesitant,” says Ben Nimmo, a senior fellow of the Atlantic Council’s Digital Forensic Research Lab. People who followed IRA or other state-sponsored accounts may have been manipulated, but they weren’t breaking the law or even violating Twitter’s terms of service.

“We're committed to publishing every tweet, video, and image that we can reliably attribute to a state-backed information operation,” a Twitter spokesperson says via email. “We have an obligation to balance these important public disclosures with our commitment to protecting people's reasonable expectation of privacy, and we conduct thorough impact assessments before each.”