New System Uses Machine Learning to Scan Tweets for Security Flaws

Machine learning and Twitter could be the future of catching security flaws and vulnerabilities early.

The future of catching security flaws and vulnerabilities could come down to the popular social media website once known for telling your friends what you are having for lunch. Researchers hope to tap into the community of Twitter users who tweet about security vulnerabilities around the clock by building free software that automatically tracks tweets to pull out hackable software flaws and rate their severity.

Researchers at Ohio State University, the security company FireEye, and research firm Leidos published a paper describing a new system that reads millions of tweets for mentions of software security vulnerabilities and then, using a machine-learning algorithm, assesses the threat level they represent based on how they've been described.

The researchers found not only that Twitter can predict the majority of security flaws that will show up days later in the National Vulnerability Database, but also that they could use natural language processing to roughly predict which of those vulnerabilities will be given a "high" or "critical" severity rating, with better than 80 percent accuracy.
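To make the general idea concrete, here is a toy sketch in Python of the two steps described above: spotting vulnerability mentions in tweets, then scoring how serious the description sounds. It is not the researchers' system. Their tool relies on a trained model, while this sketch swaps in a hand-written keyword list. The cue phrases, weights, threshold, function names, and placeholder CVE identifiers are all assumptions made up for illustration.

```python
import re

# CVE identifiers follow the pattern CVE-<year>-<four or more digits>.
CVE_PATTERN = re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE)

# Hypothetical cue phrases and weights, picked by hand for illustration only.
# A real system would learn these signals from labeled data.
SEVERITY_CUES = {
    "remote code execution": 3,
    "actively exploited": 3,
    "unauthenticated": 2,
    "privilege escalation": 2,
    "denial of service": 1,
}


def find_cves(tweet):
    """Return the CVE identifiers in a tweet, uppercased, in order, without duplicates."""
    seen = []
    for match in CVE_PATTERN.findall(tweet):
        cve = match.upper()
        if cve not in seen:
            seen.append(cve)
    return seen


def severity_score(tweet):
    """Sum the weights of every cue phrase that appears in the tweet."""
    text = tweet.lower()
    return sum(weight for cue, weight in SEVERITY_CUES.items() if cue in text)


def triage(tweets, threshold=3):
    """Map each mentioned CVE to its highest score, keeping scores at or above the threshold."""
    best = {}
    for tweet in tweets:
        score = severity_score(tweet)
        for cve in find_cves(tweet):
            best[cve] = max(best.get(cve, 0), score)
    return {cve: score for cve, score in best.items() if score >= threshold}
```

Calling `triage` on a list of tweet strings returns only the identifiers whose descriptions score at or above the threshold. A keyword list like this would miss most real-world phrasing, which is one reason a trained language model is the more plausible approach.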

"We think of it almost like Twitter trending topics," says Alan Ritter, an Ohio State professor who worked on the research and will be presenting it at the North American Chapter of the Association for Computational Linguistics in June. "These are trending vulnerabilities."

Ohio State's Ritter cautions that despite promising results, their automated tool probably shouldn't be used as anyone's sole source of vulnerability data—and that at the very least, a human should click through to the underlying tweet and its linked information to confirm its findings. "It still requires people to be in the loop," he says. He suggests that it might be best used, in fact, as a component in a broader feed of vulnerability data curated by a human being.

Given the accelerating pace of vulnerability discovery and the growing sea of social media chatter about new flaws, Ritter suggests the tool might become increasingly important for finding the signal in the noise.

"Security has gotten to the point where there's too much information out there," he says. "This is about creating algorithms that help you sort through it all to find what’s actually important."