On March 16th I found an unprotected, and thus publicly accessible, Elasticsearch instance that appeared to be managed by a UK-based security company, according to its SSL certificate and reverse DNS records. The irony of the discovery is that it was a ‘data breach database’: a huge collection of previously reported security incidents spanning 2012 to 2019.

NOTE: The company’s own data and customer records were not exposed; the incident involved only collections of previously reported data breaches.

The instance's IP address, based in France, was indexed by the BinaryEdge search engine on March 15th.

The Elasticsearch cluster in question had two collections:

leaks_v1, with 5,088,635,374 records (more than 5 billion records)

leaks_v2, with more than 15 million records, updating in real time

The data was very well structured and included:

hashtype (the way a password was presented: MD5/hash/plaintext etc)

leak date (year)

password (hashed, encrypted or plaintext, depending on the leak)

email

email domain

source of the leak (I was able to confirm a few of the most prominent ones: Adobe, Last.fm, Twitter, LinkedIn, Tumblr, VK and others).
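As a rough illustration, the fields listed above could be modeled like this. It is only a sketch: the field names, the `LeakRecord` class and the `make_record` helper are assumptions made for this example, not the actual index mapping, and the values are placeholders rather than data from the exposed database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeakRecord:
    """One record, modeled on the field list above (names are assumed)."""
    hashtype: str      # how the password is stored, e.g. "MD5" or "plaintext"
    leak_date: int     # year of the leak
    password: str      # hashed, encrypted or plaintext value
    email: str
    email_domain: str  # domain part of the email address
    source: str        # name of the breached service


def make_record(hashtype: str, leak_date: int, password: str,
                email: str, source: str) -> LeakRecord:
    """Build a record, deriving email_domain from the email address."""
    domain = email.rsplit("@", 1)[-1].lower()
    return LeakRecord(hashtype, leak_date, password, email, domain, source)


# Placeholder values only; nothing here comes from the exposed database.
example = make_record("plaintext", 2013, "<redacted>",
                      "user@example.com", "ExampleService")
```

Storing the email domain as its own field, as the database did, is what makes it trivial to pull every record for a given organization. That is part of why such a structured collection is more dangerous than the raw dumps it was built from.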

Example of records structure:

I immediately sent a security alert to the company that seemed to be responsible for the exposure but never received a reply. The database, however, was taken offline within an hour of the notification being sent.

Read more in the company’s statement.

Dangers of exposed data

Even though most of the data seems to have been collected from previously known sources, such a large and structured collection would pose a clear risk to the people whose data was exposed. An identity thief or phishing actor couldn’t ask for a better payload.

Fraudsters might target affected people with scams and phishing campaigns, using their personal information to craft targeted messages.

Phishing messages often impersonate trusted people or organizations to trick victims into giving up sensitive information or money. They often contain links to phishing websites, which mimic genuine websites. In fact, they exist only to steal information, such as passwords and payment information.

How and why we discovered this exposure

Our goal is to help to protect data on the Internet by identifying data leaks and following responsible disclosure policies. Our mission is to make the cyber world safer by educating businesses and communities worldwide.

Our extensive cybersecurity knowledge lends itself well to searching for and analyzing data leaks. Our due diligence demands that we make every attempt to identify who is responsible and notify them as quickly as possible.

Our hope is to minimize harm to end users whose data was exposed. We take steps to find out what each database contained, for how long it was exposed, and what threats to end users might arise as a result. Our findings are compiled into reports like this one to raise awareness and curb misuse of personal data by malicious parties.