How does a journalist end up with 2.6TB of leaked classified data? And when you do, where do you start?

Hannes Munzinger has had to face exactly this problem. As a data journalist working on Süddeutsche Zeitung, he had to help untangle the 11.5 million leaked documents that made up the Panama Papers, which revealed how the rich use offshore tax regimes to avoid tax. I spoke to him at the British Film Institute, set up for the sixth annual Lovie Awards.

Around six months after the publication of the Panama Papers, we talked about the importance of trust, the future of offshore entities and the changing position of data journalism within long-term investigations. I asked Munzinger how, in a situation where 107 different news organisations were collaboratively poring over documents, Süddeutsche Zeitung showed restraint when it came to publishing news – waiting for the vast quantity of information to be reviewed in its entirety before going public. Was there ever a feeling that, in spite of working through the International Consortium of Investigative Journalists (ICIJ), his paper needed to push ahead if they stumbled across something volatile, anticipating that another organisation would do the same?

“I think that the openness of Bastian Obermayer and Frederik Obermaier, who first got the data, was very important,” he told me. “I think there was a basis of trust, because they published everything they found. I think it was very important that the two guys that started this whole investigation showed to the others that they are very open about this.

“It was about building up the trust you needed by showing that you don’t keep anything for yourself. Just share it, share it, share it.”

(Above: “Hello. This is John Doe. Interested in data?” – The message that started the Panama Papers investigation)

Sharing the data internally was one thing, but the ethos and approach of the Panama Papers journalists is markedly different from the publishing culture of an organisation such as WikiLeaks. What convinces a source to give information to journalists instead of simply uploading it in bulk onto the internet? “I think it depends on the source,” said Munzinger. “The source left us a manifesto, and wrote why he or she wanted to have journalists dig into this [data], and that WikiLeaks didn’t respond to his or her messages. I think it will become even more important that journalists dig into [data] because WikiLeaks has some problems, such as what we’ve seen around [accusations of interfering with] the US election.”

Aside from ethos, a crucial consideration for a source is inevitably safety. When you’re tied to the largest leak of data in history, you want to make sure that the journalists handling your information can offer protection. I asked whether assuring anonymity becomes a more difficult task when you’re dealing with a global investigation that spans hundreds of organisations. Once again, his response hinged on the idea of trust.

“That’s why you have networks like the ICIJ, where you can trust every reporter on the team,” said Munzinger. “On the Panama Papers, no-one worked on that project that hadn’t worked with the ICIJ before, so they knew they could trust them. But of course this is the most important question, because if any one of these reporters had leaked something, the source could be in trouble. So it was very important that everyone respected the rules that the ICIJ set.”

But can you enforce those rules? Outside of trust in individual reporters, is there a way to safeguard guidelines set down by organisations such as the ICIJ?

“I think if you want to have access to colleagues like the ones that are in the ICIJ, then you have to stick to the rules,” he said. “The project manager told everyone the only important thing is the story. No ego trips. You can only join this project if you say right from the beginning I won’t break the rules. It’s mostly based on trust.

“I cannot imagine a larger project right now, so this case was the best test you can do. There were 400 colleagues from more than 80 countries, so what bigger team can you manage?”

The changing place of data journalism

While the Panama Papers investigation was far from the first to harness vast amounts of financial data, it has been central in establishing a new, large-scale, data-centred form of investigative journalism. While Bastian Obermayer and Frederik Obermaier initiated the Panama Papers investigation, Süddeutsche Zeitung hired Vanessa Wormer to become head of data for the newspaper. “The journalists that were sent the data weren’t that experienced with millions of documents,” Munzinger told me. “They had done great stories before – big investigations – but they were not technical journalists, they were writers.

“They realised they needed data journalists – that they couldn’t do this on their own.”

In 2015, ahead of the Panama Papers investigation, the ICIJ had been crucial in co-ordinating an investigation into the tax-evasion scheme allegedly operated with the encouragement of HSBC Private Bank (Suisse). The resulting report, “Swiss Leaks: Murky Cash Sheltered By Bank Secrecy”, was put together with the help of journalists from 45 countries. Given the echoes with what would go on to become the Panama Papers, it’s little surprise that Süddeutsche Zeitung sought a similar means to organise their efforts. Nevertheless, the success of the Panama Papers has arguably had major structural repercussions, not only for Süddeutsche Zeitung but for journalism at large. Does Munzinger think the perception of data journalism has changed since the Panama Papers?

“We could discuss what we mean when we say ‘data journalism’,” he teased, “but I think what we learned at Süddeutsche Zeitung is that there is very fruitful co-operation between investigative journalism and data journalism, and that’s something we didn’t do before. Since the Panama Papers we’ve been doing it regularly. This really changed the way we do investigations.”

(Above: Mar Cabra (L) of the ICIJ with Hannes Munzinger (R) of Süddeutsche Zeitung at The Lovie Awards)

I mentioned the 2015 film Spotlight, and how its depiction of investigative journalism plays up to the ideal many people may have of the job: knocking on doors, arguing with officials, leafing through reams of paper. How different is something like that to an investigation like the Panama Papers? Munzinger pointed out that much of what The Boston Globe’s Spotlight team did is indeed a form of data journalism, albeit one with less in the way of pan-global networking. That may be the case, but when you have a new process – one that is so reliant on software to help sift through documents – do you end up with an older generation that finds it harder to think about this data-led environment?

“We had great investigative reporters that have been working with their phones for decades,” said Munzinger. “But now we have younger colleagues who grew up with this whole open-data mindset, and this whole idea of exchanging knowledge. That’s obviously changed how we do investigative journalism. I think the future will involve more collaboration, because the biggest data leak was managed with these techniques. We have the proof that it’s a good approach, and I think we will all head in this direction.”

To trawl through millions of scanned files, Süddeutsche Zeitung used the software Nuix to perform optical character recognition (OCR) and search through the documents. You can read a full account of the logistics Süddeutsche Zeitung developed with such systems via the newspaper’s own explainer. Munzinger told me that, in a nutshell, while there may be a change in mindset towards the structure of data, the technological processes were ultimately underpinned by traditional journalistic techniques.

“You can’t read 11.5 million documents, but you get to know the structure. You get to know the things that reoccur. The software helps, but in the end, you have to read a lot.”

Before letting Munzinger get ready to accept his award, I asked him whether he thinks offshore tax havens have a future in the wake of the Panama Papers. “Of course offshore will have a future, but I think it has become harder to do these things [that] behind-the-curtains offshore companies provide,” he said. “Panama recently signed the OECD treaty, for mutual exchange of tax information, as the 105th country. It’s around 200 days after the Panama Papers were published, so we see there’s change, but if you want to hide something there’s other ways. We’ll have to try and find it again.”