Protecting the internal network as well as the users of Facebook is an unenviable task. Facebook users constantly are the target of all manner of phishing, malware and other attacks, and the company’s own network is a major prize for attackers, as well. To help better defend those assets, Facebook’s security team has built an internal framework known as ThreatData that sucks up and processes massive amounts of threat information and helps the company respond more quickly to emerging threats.

Many large enterprises build their own custom security tools and analysis systems, but the details of those systems typically never see the light of day. Under the old security saw that publishing any information about your defensive methods is tantamount to giving aid and comfort to the enemy, most companies prefer to remain silent about what they’re doing on this front. But Facebook has on a couple of occasions been quite open about the way it protects its internal networks and Facebook users. Last year at the CanSecWest conference, a pair of Facebook security employees detailed a complex red team exercise that the company ran in preparation for a real-world hack.

And now the company is providing an inside look at the ThreatData framework that Facebook security engineers created to help them stay abreast of the new malware and phishing threats that emerge every day. Facebook’s answer was to build a set of feeds of malicious URLS, malware hashes and other information, store it in a database that has a couple of custom search capabilities and then push the data through a custom processing engine to look for new threats that need immediate responses.

“The ThreatData framework is comprised of three high-level parts: feeds, data storage, and real-time response. Feeds collect data from a specific source and are implemented via a light-weight interface. The data can be in nearly any format and is transformed by the feed into a simple schema we call a ThreatDatum. The datum is capable of storing not only the basics of the threat (e.g., evil-malware-domain.biz) but also the context in which it was bad. The added context is used in other parts of the framework to make more informed, automatic decisions,” Mark Hammel, a threat researcher at Facebook, wrote in an explanation of the framework.

The ThreatData framework consumes feeds from VirusTotal, malicious URL repositories, paid data from vendors and other sources, the data is pushed into the company’s Hive and Scuba data repositories. Hive helps analysts answer long-term queries about whether the system has seen a specific piece of malware before, while Scuba is focused on shorter-term problems, like emerging phishing site clusters. Facebook’s team can then implement various responses, such as sending any new malicious URLs to a blacklist that is used on Facebook.

The system has enabled the Facebook security team to identify some ongoing attacks that were affecting the company’s users, as well as other victims. One example is an odd spam campaign last summer that was sending links from fake Facebook accounts that led users to a piece of malware designed to infect some feature phones. The malware had the ability to steal contact list data and send premium-rate SMS spam messages.

“With this discovery, we were able to analyze the malware, disrupt the spam campaign, and work with partners to disrupt the botnet’s infrastructure,” Hammel said.

“We realize that not all aspects of this approach are entirely novel, but we wanted to share what has worked for us to help spark new ideas. We’ve found that the framework lets us easily incorporate fresh types of data and quickly hook into new and existing internal systems, regardless of their technology stack or how they conceptualize threats.”

Image from Flickr photos of Coletivo Mambembe.