Only a couple of weeks ago, there were a lot of news headlines about how Germany had banned an internet-connected doll called "Cayla" over fears hackers could target children. One of their primary concerns was the potential risk to the privacy of children:

conversations between the child and others can be recorded and forwarded

The Germans had a good point: kids' toys which record their voices and send the recordings up to the web pose some serious privacy risks. It's not that the risks are particularly any different to the ones you and I face every day with the volumes of data we produce and place online (and if you merely have a modern phone, that's precisely what you're doing), it's that our tolerances are very different when kids are involved. I've got young kids myself and frankly, I'm with the Germans on this one; I don't see a need for them to have things like their voices recorded and stored online. That's not to say I don't want them to have an online presence and I'm gradually exposing both of them to more and more modern internet things, but I don't particularly want innocent childish behaviour like playing with a toy to be recorded and stored on other people's computers.

Cayla isn't the first connected toy to raise concerns either. Just over a year ago it was "Hello Barbie" making the headlines for precisely the same reasons. Yes, it's a cool idea but no, I (and many others) don't want my kids exposed in that way. In fact, just before that, we had the VTech data breach which exposed a huge amount of very personal information after parents bought their kids connected tablets, joined them to the wifi network and created accounts for them. Those accounts were ultimately exposed and included the kids' names, genders, birth dates, photos and links to parent with full physical addresses. That should have been the wakeup call where we all said "hey, if we put our kids' data on the web, we need to expect it to be leaked", but evidently it hasn't stopped the flood of connected toy things.

Which brings us to CloudPets (a brand owned by Spiral Toys) which is a toy that represents the nexus of both the problems discussed above: kids' voices being recorded and their data consequently being leaked. The best way to understand what these guys do is to simply watch the video:

Now firstly, put yourself in the shoes of the average parent, that is one who's technically literate enough to know the wifi password but not savvy enough to understand how the "magic" of daddy talking to the kids through the bear (and vice versa) actually works. They don't necessarily realise that every one of those recordings – those intimate, heartfelt, extremely personal recordings – between a parent and their child is stored as an audio file on the web. They certainly wouldn't realise that in CloudPets' case, that data was stored in a MongoDB that was in a publicly facing network segment without any authentication required and had been indexed by Shodan (a popular search engine for finding connected things).

Unfortunately, things only went downhill from there. People found the exposed database online. Many people and the worrying thing is, it's highly unlikely anyone knows quite how many. The first I knew of it was when earlier last week, someone sent me data from the table holding the user accounts, about 583k records in total (this subsequently turned out to be a subset of the total number in the CloudPets service). I started going through my usual verification process to ensure it was legitimate and by pure coincidence, I was in the US running a private security workshop at the time and one of the guys in my class had a CloudPets account. Sure enough, his email address was in the breach and it was time-stamped Christmas day, the day his daughter had been given the toy. His record looked somewhat like these, the first few in the data I was given:

The password was stored as a bcrypt hash and to verify it was legitimate, he gave me his original password (I asked him to change it on CloudPets first) and I successfully validated that the hash against his record was the correct one (I'd previously validated the Dropbox data breach by doing the same thing with my wife's account). The data was real.

CloudPets left their database exposed publicly to the web without so much as a password to protect it.

Getting back to who sent it to me, this is someone who travels in data breach trading circles so I have no idea how far the data had actually circulated. However, I subsequently discovered that the database had definitely been accessed well beyond just this individual. But first, it gets worse still...

The guy who sent me the data had tried to contact CloudPets three times to warn them about the exposure. The first was on December 31 when he reached out to the email address listed on their support page but the message immediately bounced back. He subsequently followed up with the email address on their WHOIS record (it appears the WHOIS contact is a marketing company called On Demand):

4 days later after still not getting a response, he contacted their hosting provider:

Note the record count here – he'd identified "over 820k users" – the 583k in circulation was not the full amount. So 3 attempts to warn the organisation of a serious security vulnerability and not a single response. I've said many times before in many blog posts, public talks and workshops that one of the greatest difficulties I have in dealing with data breaches is getting a response from the organisation involved. Time and time again, there are extensive delays or no response at all from the very people that should be the most interested in incidents like this. If you run any sort of online service whatsoever, think about what's involved in ensuring someone can report this sort of thing to you because this whole story could have had a very different outcome otherwise. (For reference, check out Tesla's Security Vulnerability Reporting Policy which is beautiful in its simplicity.) But it gets worse still and this brings me back to the earlier point about multiple people having accessed the data...

Now knowing that not only was the data legitimate, it was highly likely to be circulating and the company in question wasn't responding to emails, I reached out to Lorenzo from Motherboard. I worked with Lorenzo on that VTech data breach I mentioned earlier and he's both someone who knows how this industry works and a guy I trust to be fair, accurate and responsible in dealing with these incidents. Journalists have a knack of getting responses from organisations and I had confidence that he'd do his utmost to approach CloudPets in an appropriate fashion and alert them to this incident before publishing anything. When he heard the details of the breach, he was immediately interested and took up the task of investigating it. And then he found these images in previous communications he'd had with someone else:

He had already been contacted about CloudPets! It wasn't just the images either, he'd received small samples of data from a selection of the tables in the exposed DB. Journos get a heap of communication from people about all number of things and a bunch of it turns out to be red herrings but in this case, it was exactly the same vulnerability identified by someone completely different. Not only that, but this individual had also contacted CloudPets on December 30 and sent this message:

I want to inform you that 45.79.147.159 is running a MongoDB instance which appears not to be correctly configured or protected by a firewall allowing connections via port 2701

Clearly, CloudPets weren't just ignoring my contact, they simply weren't even reading their emails.

4 attempts (that we know of) were made to contact CloudPets and warn them of this risk.

The images sent to Lorenzo both confirm the findings of the individual who sent me the data (it shows 821,296 records) and tell us new things about the extent of the exposure. For example, we can see 2,182,337 voice recordings in the system which seems to be a feasible number for 821k registered users. We can also see two databases of identical size at almost 10GB each, "cloudpets-staging" and "cloudpets-test". Assuming their names are self-explanatory, these support both a staging environment and a test environment although they're both facing the public web and have real customer data in them. This breaks the cardinal rule of never putting production data into a non-production system (read my post on test data done right for more on this). It also potentially exposes the production system (and production customer data) to developers building the software (another cardinal rule broken), but at this stage when it's entirely open to the internet anyway, that would be the least of their worries. The point is, what's disclosed in the images above suggests the problems go deeper than data exposure alone.

There are references to almost 2.2 million voice recordings of parents and their children exposed by databases that should never have contained production data.

But then I dug a little deeper and took a look at the mobile app:

This app communicates with a website at spiraltoys.s.mready.net which is on a domain owned by Romanian company named mReady. That URL is bound to a server with IP address 45.79.147.159, the exact same address the exposed databases were on. That's a production website there too because it's the one the mobile app is hitting so in other words, the test and staging databases along with the production website were all sitting on the one box. The most feasible explanation I can come up with for this is that one of those databases is being used for production purposes and the other non-production (a testing environment, for example).

I wanted to understand more about how the application was communicating with the server so I took a closer look at the traffic. With the help of a Have I been pwned (HIBP) subscriber I found in the data, I was added as a friend and was then able to observe how the app communicated with the back end services. One of the first things I noticed was that the profile picture I uploaded (I just screen-captured a bunny and used that), was stored in an Amazon S3 bucket:

My profile photo: https://cloudpets-prod.s3.amazonaws.com/d0b5962f8a168dbb1c7761bf795f929e_CB53FC23-9E61-4551-9661-BC252975A931.jpeg

As you can see by loading the image, all that's required to access the file is the path which is returned by the app every time my profile is loaded. That profile also contains other personal information; the data sent to Lorenzo shows that along with references to their profile photos, it contained the names of children and their day and month of birth (although not year). It also contains relationships to parents and "friends" (i.e. grandmother, uncle) that have been authorised to share messages with the child.

I was curious as to whether or not the voice recordings would demonstrate the same behaviour, so I left a test message for my new "friend" and discovered a similar pattern:

My voice recording: https://cloudpets-prod.s3.amazonaws.com/990e5b17ad63146f4aa729209b5fea7a_3FBDE242-0954-466E-800D-421BF0CB0951.wav

Once again, an Amazon S3 bucket with no specific authorisation required, merely knowledge of the file path which is obviously stored in the app itself (returned via the API). Based on how CloudPets position their toys, you can imagine the sorts of voice messages the system contains. By virtue of the support I got from members of the service, I was given access to some of the short clips recorded directly on the toy. One little girl who sounded about the same age as my own 4-year old daughter left a message to her parents:

Hello mommy and daddy, I love you so much

Another one has her singing a short song, others have precisely the sorts of messages you'd expect a young child to share with her parents. I didn't download either pictures or recordings from other parties, only those I was specifically granted access to by HIBP subscribers, but the risk was clear:

The services sitting on top of the exposed database are able to point to the precise location of the profile pictures and voice recordings of children.

But even if you didn't know the exact location of the files on AWS, there's still another risk which would expose them and that relates to the passwords on profiles. Here's where things are a bit of a double-edged sword: on the one hand, CloudPets stored passwords as a bcrypt hash which is a good thing. It's a slow hashing algorithm designed to be more resilient to cracking should the hashes be leaked in precisely the sort of circumstance we have here. However, counteracting that is the fact that CloudPets has absolutely no password strength rules. When I say "no rules", I mean you can literally have a password of "a". That's right, just a single character. Not only that, check out how the tutorial demonstrates account creation and particular, how to choose a password:

The password used here in the demonstration is literally just "qwe"; 3 characters and a keyboard sequence. What this meant is that when I passed the bcrypt hashes into hashcat and checked them against some of the world's most common passwords ("qwerty", "password", "123456", etc.) along with the passwords "qwe" and "cloudpets", I cracked a large number in a very short time:

That's just a very small sample from a brief run, but the figures showed there would be thousands of passwords adhering to this very small handful of bad examples.

Due to there being absolutely no password strength requirements whatsoever, anyone with the data could crack a large number of passwords, log on to accounts and pull down the voice recordings.

By now it's pretty obvious that multiple parties identified the exposed database, it remained open for a long period of time and it exposed some very personal data. It would be a safe bet to assume that many other parties located and then exfiltrated the same data because that's what people do; scanning for this sort of thing is enormously prevalent and that data – including the kids' and parents' intimate audio clips – is now in the hands of an untold number of people. But it gets even worse again...

My good friend Niall Merrigan was doing some really good work cataloguing exposed MongoDBs recently so I reached out to him for some support to investigate the extent of CloudPets' exposure. He used Shodan's API to go back and look at the historical states of the IP address and the exposed databases running on it. December 25 was the first recorded instance of the data being exposed (that's as far back as Shodan's search will take us):

You're looking the JSON emitted by Shodan's API here and it showed both those databases present on Xmas day. As of Jan 5, both databases were still there. However, a couple of days later on Jan 7 they were joined by a new one:

This is where things took a turn for the worse because as innocuous as the name may seem, it has a more sinister meaning:

Niall is highlighting a pattern he began to see only just before this new database appeared which was the use of that name to demand a ransom. He was seeing databases named "PLEASE_READ" appear across many compromised systems containing a ransom as follows:

You DB is backed up on our servers, send 1 BTC to 1J5ADzFv1gx3fsUPUY1AWktuJ6DF9P6hiF then send your ip address to email:kraken0@india.com

Whilst Shodan doesn't index the contents of exposed databases it finds, it's a safe bet that the exposed CloudPets one contained the same message as so many other compromised ones with the same name did. The analysis that Niall was doing at the time showed that at this stage, the two original CloudPets databases had been deleted which is what you'd expect when a ransom is being demanded.

On January 8, it got worse again:

Like the earlier image, these are yet more indicators of compromise (IOC) consistent with the ransom demands that were going around for MongoDBs in early Jan. Niall called them out later that month as part of his commentary on how the whole saga was unfolding:

After 2 1/2 weeks the amount of data lost in the #Mongodb ransack is about 150TB (screenshots 08-01 and 25-01) pic.twitter.com/xJQ4Jf5GAq — Niall Merrigan (@nmerrigan) January 25, 2017

There were many malicious parties taking action against exposed databases during this period and we frequently saw the same system accessed multiple times by different actors, each demanding their own ransom. It wasn't until Jan 13 that Shodan reported no publicly accessible databases remained on CloudPets' IP Address.

The CloudPets data was accessed many times by unauthorised parties before being deleted and then on multiple occasions, held for ransom.

For a great write-up on how MongoDBs were being compromised in this fashion, have a read of Extortionists Wipe Thousands of Databases, Victims Who Pay Up Get Stiffed by Brian Krebs.

So just to tie the whole saga together in a neat chronological fashion, here's the entire timeline of events as I know them:

Dec 30: Lorenzo's contact attempted to alert CloudPets Dec 31: My contact attempted to alert CloudPets via their published support address Dec 31: My contact attempted to alert CloudPets via the WHOIS record contact Jan 4: My contact attempted to alert Linode, CloudPets' hosting provider Jan 7: The original databases were deleted and a ransom demand was left on the exposed system via the IOC named "PLEASE_READ" Jan 8: Another ransom demand was left for "README_MISSING_DATABASES" and another again for "PWNED_SECURE_YOUR_STUFF_SILLY" Jan 13: No remaining databases were found to still be publicly accessible

It's impossible to believe that CloudPets (or mReady) did not know that firstly, the databases had been left publicly exposed and secondly, that malicious parties had accessed them. Obviously, they've changed the security profile of the system and you simply could not have overlooked the fact that a ransom had been left. So both the exposed database and intrusion by those demanding the ransom must have been identified yet this story never made the headlines. Certainly, the guy in my workshop whose data was in the set I was provided had never heard anything about the exposure and that his private conversations with his daughter had potentially been illegally accessed.

Unauthorised access must have been detected but impacted parents were never notified.

There's another angle to all this as well which helps explain why nobody was returning emails or phone calls (Lorenzo tried in vain to phone both CloudPets and Spiral Toys) as well as why there appear to be some serious shortcuts taken with the hosting situation. One look at the stock price for Spiral Toys tells the story:

Spiral Toys is worth less than half a cent per share. Since late 2015, they've been in rapid decline to the point where the company is near worthless with a market cap of only $262k (that's down more than 99% of their peak value). Not even the launch of their new connected piggy bank (yes, you read that correctly) back in November could save them; they saw a momentary bump in the share price then it went back downhill and stayed flat. The CloudPets Twitter account has also been dormant since July last year so combined with the complete lack of response to all communications, it looks like operations have well and truly been shuttered.

Circling back to the parents' position for a moment, you must assume data like this will end up in other peoples' hands. Whether it's the Cayla doll, the Barbie, the VTech tablets or the CloudPets, assume breach. It only takes one little mistake on behalf of the data custodian – such as misconfiguring the database security – and every single piece of data they hold on you and your family can be in the public domain in mere minutes. If you're fine with your kids' recordings ending up in unexpected places then sobeit, but that's the assumption you have to work on because there's a very real chance it'll happen. There's no doubt whatsoever in my mind that there are many other connected toys out there with serious security vulnerabilities in the services that sit behind them. Inevitably, some would already have been compromised and the data taken without the knowledge of the manufacturer or parents.

If you think you've been impacted by the CloudPets breach, you can now search for your email address in HIBP. You can also read Lorenzo's full writeup in Internet of Things Teddy Bear Leaked 2 Million Parent and Kids Message Recordings.

Update 1 (about 8 hours later): I've just seen the first actual response from Spiral Toys and am a bit stunned. Let's dissect this here:

The MongoDB was exposed. That's not negotiable, the data is now in the wild. Mark said "Were voice recordings stolen? Absolutely not", which I suspect implies they were not obtained from the exposed MongoDB which is correct. The MongoDB contains references to both profile pictures and voice records which are stored in Amazon S3. If you know the reference to the S3 file, you can download it without authorisation, for example: https://cloudpets-prod.s3.amazonaws.com/990e5b17ad63146f4aa729209b5fea7a_3FBDE242-0954-466E-800D-421BF0CB0951.wav If you can crack a password, you can login to an account using the app and access the voice recordings. This is the intended function of the app. With regards to there being no rules on password strength, Mark suggested there needs to be a balance and questions "How much is too much?". Allowing a password of "a" is too little. Creating a tutorial showing a password of "qwe" is too little. Usually, 6-8 characters of mixed types is a bare minimum and the decision to have absolutely no password strength requirements whatsoever means many passwords can be readily cracked, thus granting access to the voice recordings for the account. There is a claim made that "the company never received the warnings". There were many messages sent over a period of time as detailed above. Lorenzo also tried to get in touch within the last week via multiple channels and as he said in his story, CloudPets "could not be reached for comment". There is also a record in their ZenDesk support ticket system warning them of the risk from December 31. Mark is quoted as saying that "the company found no evidence that any hackers broke into customer accounts", which indicates they knew of the incident before this blog post. Their databases were deleted and 3 different ransoms were left. Unauthorised parties downloaded their databases.

To suggest that the exposure and ransom of a database containing 821k user records and providing access to millions of voice recordings from and to children represents "a very minimal issue" is just unfathomable. Further to this, California (where Spiral Toys is based), has mandatory data breach reporting laws:

California law requires a business or state agency to notify any California resident whose unencrypted personal information, as defined, was acquired, or reasonably believed to have been acquired, by an unauthorized person.

(As of the first of January, that scope was expanded to also include encrypted personal information.)

Personal information includes "A user name or email address, in combination with a password or security question and answer that would permit access to an online account". Data breach disclosure laws require impacted parties be notified when their personal information (such as email addresses and passwords) is disclosed. It is not up to the company who lost the data to make a judgement decision on how likely it was that malicious parties then logged into individuals' accounts.

Update 2 (about 23 hours after posting):

CloudPets have now sent a notice to the California Attorney General's Office which is, well, let's just go through and correct it all here, bit by bit:

"Spiral Toys was told about a potential breach on February 22": No, they were told about it on December 31, here's the ZenDesk ticket "after receiving an inquiry from Canadian Vice Media journalist Lorenzo Franceschi-Bicchierai": No, Lorenzo is from Brooklyn, just like it says on his profile (along with "He's also a defrocked lawyer from Barcelona, Spain—although he is actually Italian") "After receiving [Franceschi-Bicchierai's] email, we carried out an internal investigation": No, they literally just said "you don't respond to some random person about a data breach" (unless they're saying that they acted on it but just didn't respond) "we carried out an internal investigation and detected an issue with a migration server MongoDB": No, the server that exposed the data was 45.79.147.159 which is the IP address that spiraltoys.s.mready.net resolves to (the host name the production app hits) "We immediately conducted a comprehensive check of the development site and confirmed that the data breach was fixed on January 9th as the server was being developed": So they knew about this more than 6 weeks before acknowledging contact and seem to be acknowledging that they put production data on their development database (real customer data was exposed) "the data breach was part of a massive cyber attack on MongoDB that affected over 28,000 instances globally": Let us be clear - this wasn't a 0-day in the product like Heartbleed was, they didn't put a password on the database and they left it facing the public internet rather than keeping it on a private network segment "we took extra precautions and also researched if the message and

image date were exposed": What's an image date? Is it a typo of "data"? "At that time the data was on a different server and

could not have been affected by the security breach": The message and image data was stored on AWS S3; the MongoDB contains references to the URIs to retrieve the data from S3 "The statement that 2M+ messages were leaked is misleading readers into believing that all messages and images on our servers were obtained by hackers": There were 2,182,337 records in the "VoiceMessage" table with references to the files on AWS S3; we don't know how many voice messages or profile photos were subsequently downloaded using these references "In the leaked data all passwords were encrypted": No, no passwords were encrypted, they were hashed with bcrypt and because of the design decision to enforce absolutely zero password strength, many are easily cracked in the presence of a good adaptive hashing algorithm "The messages and images of a customer account could not be accessed unless a hacker “guessed” the password": This is what hash cracking is and it's a highly-automated process that's particularly effective against databases that had no password rules "The hacker could have stolen the email addresses and could start running tests to find simple passwords such as “1234” or “password”": Yes, the world's most common passwords would be tested for because CloudPets allowed customers to use them! "In the CloudPets terms of use we do recommend customers to use complex passwords": The CloudPets Getting Started tutorial literally shows how to create a password of "qwe" (also, nobody reads the terms of use!) "Since there is a potential that hackers could try to guess passwords to acquire customers information we have invalidated all current passwords": I can still login with the password I used when I created an account last week (it's "abc") "For the protection of our users we are now requiring users to choose new increased security passwords": The iOS app was last updated on Jan 27, but the back end service no longer allows me to use "abc" as a password; I had to use "cloudpets" instead "The CloudPet services have been running safely since March 2015": No, they haven't, multiple people accessed the open database and took data out whilst 3 other parties tried to ransom it "It is very unfortunate that during a standard development we were

exposed to a cyber attack": It's not unfortunate, it's an egregious error that was then covered up and not reported "Spiral Toys was not contacted by any cyber security professionals nor a hacker holding the data for ransom": Yes they were, it's all documented above and whilst the ransoms didn't involve contacting them, when the only databases remaining on the system demand Bitcoin in exchange for the return of data, that's a bit of a red flag "The CloutPets production server": That's not how you spell the product name "The CloutPets production server and app were at no time affected by this incident": We've been through this already - 45.79.147.159 is serving production traffic "We will be contacting all of our customers with emails, around 500,000 users": I was sent 583,503 email addresses and that was only about 71% of the user records

CloudPets need to get someone local on-board that can help them through this both technically and legally because the way they're going at the moment, it likely won't end well.