The ICIJ's exploration of offshore secrets began when a computer hard drive packed with corporate data arrived in the post. Gerard Ryle, ICIJ's director, obtained the small black box as a result of his three-year investigation of Australia's Firepower scandal, a case involving offshore havens and corporate fraud.

The hard drive contained more than 260 gigabytes of data, the equivalent of half a million books. Its files included 2m emails and four large databases. There were details of more than 122,000 offshore companies or trusts, and nearly 12,000 intermediaries (agents or "introducers").

Unlike the smaller cache of US cables and war logs passed in 2010 to WikiLeaks, the offshore data was not structured or clean, but an unsorted collation of internal memos and instructions, official documents, emails, large and small databases and spreadsheets, scanned passports and accounting ledgers.

Analysing the immense quantity of information required "free text retrieval" software, which can work with huge volumes of unsorted data. Such high-end systems have been sold for more than a decade to intelligence agencies, law firms and commercial corporations. Journalism is just catching up.
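The core idea behind free text retrieval can be illustrated with a toy inverted index, which maps every word to the documents that contain it so that a query never has to rescan the raw files. This is only a minimal sketch: the document names and contents are invented for illustration, and it does not represent the commercial software ICIJ actually used.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents, invented for this sketch.
documents = {
    "memo-1": "Please transfer shares to the nominee director.",
    "email-7": "The nominee will sign on behalf of the trust.",
    "ledger-3": "Annual fee paid by the director of the trust.",
}

index = build_index(documents)
print(search(index, "nominee director"))
```

Real systems add ranking, phrase search, fuzzy matching and text extraction from scanned images, but the principle of indexing once and querying many times is the same.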

The named people who administered offshore companies included shareholders, directors, secretaries, lawyers, accountants, nominees and trustees. But many such structures were simply legal devices designed to conceal. The real beneficial owners often proved to be the so-called "settlors" or "protectors" of offshore trusts, and those holding legal powers of attorney that enabled them to exert secret control over the bank accounts.

China, Hong Kong, Taiwan, the Russian Federation and former Soviet republics appeared to provide the majority of secret offshore owners. The British Virgin Islands are the second-largest source of capital investment in China – on paper at least. Cyprus, an offshore island currently in financial crisis as a result, is also identified in the data as a huge source of Russian investment.

ICIJ's collaborating journalists from 46 countries constituted one of the largest groups ever to have worked together on a data project.

Interestingly, the team's attempts to use encrypted email systems such as PGP ("Pretty Good Privacy") were abandoned because of complexity and unreliability that slowed them down.

Meanwhile, computer programmers in Germany, the UK and Costa Rica also designed sophisticated data mining and cleaning software for ICIJ. Manual analysis in New Zealand proved crucial in early decisions on which countries ICIJ needed reporters in.

ICIJ's own search system – named Interdata – was developed by a British programmer as dozens of new journalists joined the expanding project. Interdata allowed them to download copies of whichever of the 2.5m offshore documents were relevant to their countries.

ICIJ rebuilt some of the databases in an effort to run them in their original format. There were surprises. The databases were formatted to record who really lay behind each entity, as required by international regulations on money laundering and "due diligence". Journalists hoped the truth was just a click away.

In fact, entries for "beneficial owners" were often empty. The offshore agencies had frequently passed off their supposed legal responsibility to intermediaries in other countries. The lesson was that the empty fields were no accident; they were part of the design.
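A simple way to gauge how widespread such gaps are is to count the blank entries in a rebuilt table. The sketch below assumes each record is a dictionary with a hypothetical `beneficial_owner` field; the field name and the sample rows are invented for illustration and are not taken from the leaked databases.

```python
def is_blank(value):
    """Treat None, empty strings and whitespace-only strings as blank."""
    return value is None or str(value).strip() == ""


def missing_owner_share(records, field="beneficial_owner"):
    """Return the fraction of records whose owner field is blank."""
    if not records:
        return 0.0
    missing = sum(1 for record in records if is_blank(record.get(field)))
    return missing / len(records)


# Hypothetical rows, invented for this sketch.
records = [
    {"entity": "Alpha Holdings Ltd", "beneficial_owner": ""},
    {"entity": "Beta Trust", "beneficial_owner": None},
    {"entity": "Gamma Corp", "beneficial_owner": "   "},
    {"entity": "Delta Inc", "beneficial_owner": "J. Example"},
]

share = missing_owner_share(records)
print(f"{share:.0%} of entries have no beneficial owner recorded")
```

A high share across an entire table points to a systematic practice rather than to sloppy data entry.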

Only occasionally would an alert screen pop up, giving contact details for the persons who really owned the assets. ICIJ's fundamental lesson therefore had to be patience and perseverance.

But persistently following leads through incomplete data yielded some great rewards: not just occasional and unexpected top names, but also the inside details of many nuanced and complex schemes for hiding wealth.

• Duncan Campbell was the data journalism manager for the ICIJ project.