Harvesting Cb Response Data Leaks for fun and profit

Carbon Black’s Cb Response product is one of the more popular endpoint detection and response (EDR) tools available in an ever-growing marketspace. However, as a function of how the tool is architected, it is also a prolific data leaker.

This threat report blog will help security organizations understand how our vulnerability assessment experts harvested data from Carbon Black’s Cb Response customers and how it is nearly impossible to stop this with the architecture they devised.

How severe is the problem? Our experts could recover the following types of information from several Fortune 1000 companies:

Cloud keys (AWS, Azure, Google Compute) – which could provide you with access to all cloud resources

App store keys (Google Play Store, Apple App Store) – letting you upload rogue applications that will be updated in place

Internal usernames, passwords, and network intelligence

Communications infrastructure (Slack, HipChat, SharePoint, Box, Dropbox, etc.)

Single sign-on/two factor keys

Customer data

Proprietary internal applications (custom algorithms, trade secrets)

The leaked data exist primarily in various executable formats (we haven’t yet seen evidence of this in documents or PDFs). However, if handled incorrectly, even executables can easily leak information that is hazardous to a company’s security posture.

Carbon Black Background:

Carbon Black started life as an application whitelisting company called Bit9. Bit9 (now called Cb Protection), like many other whitelisting solutions, decided that the way that signature-based antivirus worked was broken, and that it was safer and easier to choose those files that you want to run, rather than continuously chasing the ones you do not want to run or could be harmful. While this concept has merit, the largest challenge to maintaining a whitelisting solution is keeping up with a list of all known good files. Good files change quite often, and this can make quite a lot of work for administrators to maintain by themselves.

To deal with the onslaught of new files (updates, new versions, newly purchased or developed applications, etc.), Carbon Black created a cloud lookup service that would tell you if a file was good or bad, to assist in making the right decisions. The problem was, without a large sample set to determine what was good or bad already available to their users, Carbon Black deferred the decision to a cloud-based multiscanner service. Ultimately Carbon Black would have a bunch of antivirus (AV) solutions decide which files were bad, and remove the offending file from the set of things customers could whitelist. This worked well for their customers, but it brought a new wrinkle into the equation. What about the good files that haven’t been seen on the cloud-based multiscanner? The answer was obvious. The files must be uploaded, have all the AV engines scan them, and then use those scores. So, Carbon Black began uploading files from their customers to their cloud, and from their cloud to the multiscanner solution.

Over time, Bit9 acquired a company called Carbon Black (which became the name of the new joint entity). Carbon Black (now called Cb Response) was an early player in EDR, or endpoint detection and response. EDR solutions record all the activities happening on endpoints and aggregate this information in a central location. The ability to dig deep into an endpoint and understand what really happened after a security incident makes for an excellent forensics tool, and enterprise customers quickly adopted the Carbon Black/Bit9 solution. However, Cb Response customers soon faced a significant challenge. The EDR approach generated too much data (noise) for most organizations to staff for, and sifting through that volume to make relevant decisions (signal) took a significant amount of time. In response, Carbon Black started leveraging cloud-based multiscanner lookups to reduce the time needed to review files. Just as before, Cb Response takes suspect or new files from the local system and forwards them to a local server or cloud server, which in turn sends the files on to another cloud-based multiscanner.

After Bit9, Carbon Black’s next acquisition was a company called Confer (now called Cb Defense), which is a solution providing “next-generation AV”. Cb Defense is powered with a new engine; however, it appears to also be using a cloud-based multiscanner for part of its processing and analysis.

The real cost of transitive trust:

Many Cb Response customers were sold on the benefits of using a cloud-based multiscanner. After all, why use one AV when you can get 50? However, few customers were aware of the costs. The real cost here, beyond the enormous storage, compute, headcount, and performance costs, is in abused trust relationships.

Trust is a funny thing. If I trust you and tell you a secret, I may believe you won’t tell anyone else. But if you do, I’m implicitly trusting anyone else you tell that secret to. The same goes for data. For example, when I send you sensitive information, and you send that sensitive information to another entity, as the owner of that sensitive information I am assuming all of the risk, not you. This is called transitive trust. So, what happens when a trusted party, in this case Cb Response, passes files on to a multiscanner? They leak.

Cloud-based multiscanners operate as for-profit businesses. They survive by charging for access to advanced tools sold to malware analysts, governments, corporate security teams, security companies, and basically whoever is willing to pay. Access to these tools includes access to the files submitted to the multiscanner corpus (it’s hard to analyze malware that you don’t have). This means that files uploaded by Cb Response customers first go to Carbon Black (or their local Carbon Black server instance), but then are immediately forwarded to a cloud-based multiscanner, where they are dutifully spread to anyone who wants them and is willing to pay.

Welcome to the world’s largest pay-for-play data exfiltration botnet.
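The chain of trust described above can be sketched as a small graph walk. This is only an illustration: the node names and the sharing graph are hypothetical, chosen to mirror the endpoint, vendor cloud, multiscanner, and paying-subscriber roles in this post.

```python
from collections import deque

# Hypothetical sharing graph: an edge A -> B means "A passes files to B".
SHARES_WITH = {
    "customer_endpoint": ["edr_vendor_cloud"],
    "edr_vendor_cloud": ["multiscanner"],
    "multiscanner": ["paying_subscriber_1", "paying_subscriber_2"],
}


def reachable_from(start, graph):
    """Return every party that can end up holding a file that `start` shares."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# The customer only chose to trust the vendor, yet the file reaches everyone downstream.
exposed_to = reachable_from("customer_endpoint", SHARES_WITH)
```

The customer's explicit trust decision covers one edge of the graph; the exposure covers every node reachable from it.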

How big is it, exactly? According to Carbon Black’s own website, “The company expects that by the end of 2015 it will achieve 7 million+ software licenses sold, almost 2,000 customers worldwide.”

When you think about 2,000 customers and 7 million endpoints (by end of 2015, presumably larger now) uploading every new file to a trusted partner that gives these files to anyone who pays, it starts to come into focus. Additionally, Gartner has called EDR a “1% solution”, meaning that these endpoints likely crosscut the most sensitive, serious companies who would be most adversely affected by a leak of sensitive information, such as financial services and banking companies.

How could this happen? As previously stated, when a new file appears on a protected endpoint, a cryptographic hash is calculated. This hash is then used to look the file up in Carbon Black’s cloud. If Carbon Black has a score for this file, it gives the existing score, but if no entry exists, it requests an upload of the file. Since Carbon Black doesn’t know if this previously unseen file is good or bad, it then sends the file to a secondary cloud-based multiscanner for scoring. This means that all new files are uploaded to Carbon Black at least once.
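A minimal sketch of that lookup-then-upload flow follows. It is not Carbon Black's actual API: the ReputationCloud class, its methods, and the choice of SHA-256 are assumptions made purely for illustration (the post only says "a cryptographic hash").

```python
import hashlib


class ReputationCloud:
    """Hypothetical stand-in for a vendor reputation cloud (not a real API)."""

    def __init__(self, multiscanner_score):
        self.scores = {}      # SHA-256 hex digest -> score
        self.forwarded = []   # digests of files sent on to the multiscanner
        self._multiscanner_score = multiscanner_score

    def lookup(self, digest):
        # Returns the stored score, or None if the file has never been seen.
        return self.scores.get(digest)

    def upload(self, content):
        # An unknown file is forwarded to the multiscanner for scoring.
        digest = hashlib.sha256(content).hexdigest()
        self.forwarded.append(digest)
        score = self._multiscanner_score(content)
        self.scores[digest] = score
        return score


def check_file(cloud, content):
    """Return (score, uploaded) for one file seen on an endpoint."""
    digest = hashlib.sha256(content).hexdigest()
    score = cloud.lookup(digest)
    if score is not None:
        return score, False          # known hash: only the hash left the endpoint
    return cloud.upload(content), True  # unknown hash: the whole file leaves


# The multiscanner is faked here as a function that always returns score 0.
cloud = ReputationCloud(multiscanner_score=lambda content: 0)
first = check_file(cloud, b"windows-update-bytes")
second = check_file(cloud, b"windows-update-bytes")
internal = check_file(cloud, b"internal-build-with-secrets")
```

In this sketch the second sighting of the same bytes is answered from the stored score, while anything never seen before is uploaded in full and forwarded.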

Generally speaking, this isn’t such a big deal. Take a Windows update, for example. The first Carbon Black customer that receives a Windows update and uploads it doesn’t leak much information. However, let’s extrapolate this along real-world lines. Not every file is a Windows update, and many files contain sensitive details and change frequently. This degree of change is what spurred Carbon Black, in its Bit9 form, to create this system in the first place.

Imagine you have this solution deployed on a developer workstation. Each time a new piece of code is compiled, that newly compiled code is a file that nobody has ever seen. It gets uploaded. Now imagine a build or deployment system that packages up a bunch of executables (and configuration files). You could easily imagine the types of combined data that could constitute a “new file”.
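To see why every build looks "new" to a hash-based lookup, consider this small sketch. The build_artifact function is a hypothetical stand-in for a compiler or packager, and SHA-256 is an assumed hash; the point is only that any change in the output bytes yields a hash nobody has seen before.

```python
import hashlib


def build_artifact(source: bytes, build_number: int) -> bytes:
    """Hypothetical 'build': the output embeds a build number plus the source."""
    return b"BUILD:" + str(build_number).encode() + b"\n" + source


# The source (including its embedded placeholder secret) never changes.
source = b'API_KEY = "example-not-a-real-key"\n'

seen = set()
new_files = 0
for build_number in range(3):
    digest = hashlib.sha256(build_artifact(source, build_number)).hexdigest()
    if digest not in seen:
        # A hash-keyed lookup would treat this as a never-before-seen file.
        seen.add(digest)
        new_files += 1
```

Each of the three builds produces a distinct digest even though the embedded secret is identical, so under the flow described earlier each one would be uploaded.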

Discovering the Vulnerability

In mid-2016, some of our staff were responding to a potential breach at a customer’s site. As part of our process, we were analyzing a potential piece of malware using the analyst interface of a large cloud-based multiscanner. One of the useful features of this multiscanner is that it allows searching for similar malware to get some context, and in doing so, we stumbled across a couple of files that were very different. These seemed to be internal applications from a very large telecommunications equipment vendor that was completely unrelated to our original customer. After determining they were unrelated, we became curious about how such files could have ended up in the multiscanner corpus to begin with.

We noticed that the other files had all been uploaded by the same uploader. The multiscanner obscures the uploader’s identity behind an API key, in this case: 32d05c66.

By doing some research, we determined that this is the primary key for uploading files by Carbon Black for Cb Response. By searching for similar uploads from this key, we found hundreds of thousands of files comprising terabytes of data. We started downloading some of these and digging a little deeper.

We downloaded about 100 files (we found JAR files and script files to be the easiest to analyze by script), and ran these files through some simple pattern matching. When we got hits, we’d try to extrapolate where they came from. We were not trying to be exhaustive in analysis, and only repeated this operation a few times to see if it still held true.
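As a rough idea of what "simple pattern matching" over JAR and script files can look like, here is a minimal sketch. The patterns below are illustrative assumptions, not the ones we actually used, and the sample JAR is built in memory from placeholder values (the AWS key shown is the placeholder that appears in AWS's own documentation).

```python
import io
import re
import zipfile

# Illustrative patterns only; tune and extend these for real use.
PATTERNS = {
    "aws_access_key_id": re.compile(rb"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "slack_token": re.compile(rb"\bxox[abprs]-[0-9A-Za-z-]{10,}"),
    "password_assignment": re.compile(rb"password\s*[=:]\s*\S+", re.IGNORECASE),
}


def scan_bytes(data):
    """Return the sorted names of all patterns that match the given bytes."""
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(data))


def scan_jar(jar_bytes):
    """A JAR is a ZIP archive: scan each member and map member name -> hits."""
    hits = {}
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        for member in jar.namelist():
            found = scan_bytes(jar.read(member))
            if found:
                hits[member] = found
    return hits


# Build a tiny in-memory JAR with placeholder content to exercise the scanner.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as jar:
    jar.writestr(
        "config/app.properties",
        "aws.accessKeyId=AKIAIOSFODNN7EXAMPLE\npassword = hunter2\n",
    )
    jar.writestr("README.txt", "nothing sensitive here\n")

hits = scan_jar(buffer.getvalue())
```

Plain script files can be passed straight to scan_bytes. A scanner this naive produces false positives, so each hit still needs a manual look to work out where the file came from.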

Here are a few actual use cases, kept anonymous to respect our customers’ privacy.

Case 1 – Large Streaming Media Company:

We identified several pieces of compiled Java bytecode associated with this company. While this may not be that odd in itself, as this company is a prolific supporter of open source, a bit of analysis proved there was a serious leak here. Upon inspection of the programs (which were being uploaded about every 10 seconds), we identified the following leaked data:

Amazon Web Services (AWS) Identity and Access Management (IAM) Credentials for the Company

Slack API Keys for the Company

The Company’s Crowd (Atlassian Single Sign On) Admin Credentials

Google Play keys

Apple Store ID

Figure 1 – Example of Slack API information found in content uploaded from Cb Response

Figure 2 – Example of Google OAuth settings in content uploaded from Cb Response

Case 2 – Social Media Company

Several months after our initial discovery, we had more time to explore and again went looking for files being uploaded via Carbon Black’s key. After some brief sifting of recent uploads, we identified files pertaining to a large social media company. This time it was a series of Python scripts that contained:

Hardcoded AWS and Azure keys

Other internal proprietary information, such as usernames and passwords.

Case 3 – Financial Services Company

We replicated this process for a third time, and again found another customer’s data in the stream of files Carbon Black’s key is uploading. These files included:

Shared AWS keys that granted access to customer financial data

Trade secrets that included financial models and possibly direct consumer data.

Figure 3 – Example of AWS and database information found in Java code uploaded by Cb Response

Figure 4 – Example of AWS secret key located in Java code uploaded by Cb Response

This leak led us to decide to make this public.

We have not tried to create an exhaustive search for leaks. This is almost certainly a broader problem than we have time to explore. Additionally, it is eminently likely that there are other EDR sources and products to exploit (perhaps even other keys being used by Carbon Black’s solutions, and even other vendors). Over the last couple of years, more than 50 EDR companies have launched, and some of them likely follow the same inspection model as Carbon Black.

The problems we noted seemed to be more prevalent around developer and build/deploy systems, but this could easily be a bias in our search approach. This isn’t universal, but for the types of problems we were analyzing, it seems likely that these systems had Cb Response agents deployed on them, and that the leaks are a consequence of an architecture that sends data up to a third-party, cloud-based multiscanner.

In all cases, the customers were notified and the leaks presumably stopped or slowed (we checked shortly after notification, but didn’t do any follow-up). Our intention in releasing this information was not to attack customers or security vendors, and we don’t pretend that we’ve performed an exhaustive analysis of the breadth of the leaks. We only know that every time we looked, we found this same serious breach of confidentiality. We also do not know if this is the only key Carbon Black uses, nor if this problem is unique to Carbon Black, only that Carbon Black’s prevalence in the marketspace and the design of their solution’s architecture seem to be enabling a significant amount of data exfiltration.

Recommendations: Protecting Your Sensitive Data

If you are a Cb Response user, here are four steps you can take to make sure your sensitive data isn’t being leaked to third parties.