Security researchers have uncovered 1.5 billion business and consumer files exposed online – just a month before Europe's General Data Protection Regulation comes into force.

During the first three months of 2018, threat intel firm Digital Shadows detected 1,550,447,111 publicly available files across open Amazon Simple Storage Service (S3) buckets, rsync, Server Message Block (SMB), File Transfer Protocol (FTP) servers, misconfigured websites, and Network Attached Storage (NAS) drives.

This included documents spanning payroll data, tax returns, medical records, credit cards and intellectual property. A staggering 64,176,425 files came from the UK alone.

The trove amounts to more than 12PB (12,000TB) of exposed data – more than 4,000 times larger than the Panama Papers leak, which weighed in at a measly 2.6TB.
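For readers who want to check the comparison, the arithmetic can be sketched in a few lines of Python. This is a back-of-the-envelope calculation, not anything from the Digital Shadows report: it assumes decimal units (1PB = 1,000TB, as the figures above imply), treats 12PB as a lower bound, and the names used are purely illustrative.

```python
# Rough check of the size comparison, using the figures quoted in the article.
# Assumption: decimal units, so 12PB = 12,000TB and 1TB = 10**12 bytes.
EXPOSED_TB = 12_000          # "more than 12PB", treated here as a lower bound
PANAMA_PAPERS_TB = 2.6       # size quoted for the Panama Papers leak
EXPOSED_FILES = 1_550_447_111


def size_ratio(exposed_tb: float, reference_tb: float) -> float:
    """Return how many times larger the exposed trove is than the reference leak."""
    return exposed_tb / reference_tb


def average_file_bytes(total_tb: float, file_count: int) -> float:
    """Return the mean file size in bytes implied by the totals."""
    return total_tb * 10**12 / file_count


ratio = size_ratio(EXPOSED_TB, PANAMA_PAPERS_TB)
avg_bytes = average_file_bytes(EXPOSED_TB, EXPOSED_FILES)

print(f"Ratio to Panama Papers: about {ratio:,.0f}x")
print(f"Implied average file size: about {avg_bytes / 10**6:.1f}MB")
```

On these numbers the ratio comes out at roughly 4,600, which squares with the "more than 4,000 times" claim.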

Payroll and tax return files were the most commonly exposed data types, accounting for 700,000 and 60,000 files respectively. However, consumers were also at risk from 14,687 instances of leaked contact information and 4,548 patient lists. A large volume of point-of-sale terminal data – transactions, times, places, and even some credit card details – was also publicly available.

Although misconfigured Amazon S3 buckets have hogged headlines recently, in this study (registration required) cloud storage leaks accounted for only 7 per cent of exposed data. Instead, it is older, yet still widely used, technologies – such as SMB (33 per cent), rsync (28 per cent) and FTP (26 per cent) – that have contributed the most.
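The quoted shares do not add up to 100 per cent, and a quick tally shows what is left over. The sketch below assumes all four percentages are measured against the same total, which the article does not state explicitly; the dictionary keys and function names are illustrative only.

```python
# Tally of the per-technology shares quoted in the article (per cent).
# Assumption: all four figures share the same base (total exposed data).
SHARES = {
    "SMB": 33,
    "rsync": 28,
    "FTP": 26,
    "cloud storage (S3)": 7,
}

# The older technologies the article singles out.
LEGACY = ("SMB", "rsync", "FTP")


def combined_share(shares: dict, names: tuple) -> int:
    """Sum the percentage share of the named technologies."""
    return sum(shares[name] for name in names)


def unattributed_share(shares: dict) -> int:
    """Return the percentage not covered by any listed technology."""
    return 100 - sum(shares.values())


legacy_total = combined_share(SHARES, LEGACY)
remainder = unattributed_share(SHARES)

print(f"SMB, rsync and FTP combined: {legacy_total} per cent")
print(f"Not attributed to a listed technology: {remainder} per cent")
```

Under that assumption, the three older technologies account for 87 per cent between them, leaving 6 per cent for the other sources listed earlier, such as misconfigured websites and NAS drives.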

Business-critical information also leaked. For example, a patent summary for renewable energy technology, in a document marked "strictly confidential", was discovered. Another case involved a document containing proprietary source code submitted as part of a copyright application. This file included the code that outlined the design and workflow of a site providing Electronic Medical Records software, as well as details about the copyright application.

Third parties and contractors were identified as one of the most common sources of sensitive data exposure. The leaked information included security assessments and penetration tests. In addition, Digital Shadows identified consumer backup devices that were misconfigured to be internet-facing and were inadvertently making private information public.

Digital Shadows has attempted to notify, where practical, organisations that are leaking data.

"We have a responsible disclosure process in place and we have alerted some of the organisations," said Rick Holland, VP Strategy. "We couldn't always attribute the data to specific organisations and the scale was also problematic."

Some of the files identified as leaked might be encrypted but the extent of this is unclear.

"We did a sampling of the data and we did find some archive/backup files that were encrypted," Holland added. "As we continue to develop both our collection and analysis framework we could do further investigation into specific file/disk encryption software (given that they use unique extensions)." ®