Editor’s note: I’ve updated the article with some new (and in some cases) clarifying detail from Jeremi. I’ve left changes in where they were made. The biggest changes: 1) an updated link to slides 2) clarifying that VCL refers to Virtual OpenCL and 3) that the quote regarding 14char passwords falling in 6 minutes was for LM encrypted – not NTLM encrypted passwords. Long (8 char) NTLM passwords would take much longer…around 5.5 hours. 😉 – Paul

There needs to be some kind of Moore’s law analog to capture the tremendous advances in the speed of password cracking operations. Just within the last five years, there’s been an explosion in innovation in this ancient art, as researchers have realized that they can harness specialized silicon and cloud based computing pools to quickly and efficiently break passwords.

A presentation at the Passwords^12 Conference in Oslo, Norway (slides available here – PDF), has moved the goalposts, again. Speaking on Monday, researcher Jeremi Gosney (a.k.a epixoip) demonstrated a rig that leveraged the Open Computing Language (OpenCL) framework and a technology known as Virtual OpenCL Open Cluster (VCL) to run the HashCat password cracking program across a cluster of five, 4U servers equipped with 25 AMD Radeon GPUs and communicating at 10 Gbps and 20 Gbps over Infiniband switched fabric.

Gosney’s system elevates password cracking to the next level, and effectively renders even the strongest passwords protected with weaker encryption algorithms, like Microsoft’s LM and NTLM, obsolete.

In a test, the researcher’s system was able to churn through 348 billion NTLM password hashes per second. That renders even the most secure password vulnerable to compute-intensive brute force and wordlist (or dictionary) attacks. A 14 character Windows XP password hashed using LM NTLM (NT Lan Manager) , for example, would fall in just six minutes, said Per Thorsheim, organizer of the Passwords^12 Conference.

[Note of clarification from Jeremi: “LM Is what is used on Win XP, and LM converts all lowercase chars to uppercase, is at most 14 chars long, and splits the password into two 7 char strings before hashing — so we only have to crack 69^7 combinations at most for LM. At 20 G/s we can get through that in about 6 minutes. With 348 billion NTLM per second, this means we could rip through any 8 character password (95^8 combinations) in 5.5 hours.” ]

“Passwords on Windows XP? Not good enough anymore,” Thorsheim said.

Tools like Gosney’s GPU cluster aren’t suited for an “online” attack scenario against a live system. Rather, they’re used in “offline” attacks against collections of leaked or stolen passwords that were stored in encrypted form, Thorsheim said. In that situation, attackers aren’t limited to a set number of password attempts – hardware and software limitations are all that matter.

The clustered GPUs clocked impressive speeds against more sturdy hashing algorithms as well, including MD5 (180 billion attempts per second, 63 billion/second for SHA1 and 20 billion/second for passwords hashed using the LM algorithm. So called “slow hash” algorithms fared better. The bcrypt (05) and sha512crypt permitted 71,000 and 364,000 per second, respectively.

In an IRC chat with Security Ledger, Gosney said he has been working on CPU clustering for about five years and GPU clustering for the last four years.

“Then we just started trying to build the biggest GPU rigs we could, packing as many GPUs into a single server as possible so that we wouldn’t have to deal with clustering or distributing load,” Gosney wrote.

He started developing the new platform since stumbling on VCL in April , after trying his hand at pooling traditional CPUs for password cracking .

“I was extremely disappointed that setting up a clustered VMware instance wouldn’t allow me to create a VM that spanned all the hosts in the cluster. E.g. if i had five VMware ESX hosts with 8 processor cores, I wanted to be able to create a single vm with 40 cores and use all nodes in the cluster,” he wrote.

Then he came across VCL, or Virtual Open Cluster, a small and heretofore little recognized project from the scientists who manage the MOSIX distributed operating system first released in the 1970s.

“It did just what I wanted, not with an entire OS per se, but with an entire OpenCL application. and that’s good enough for me.”

After playing around with VCL for a while, Gosney approached Prof. Amnon Barak, one of Mosix’s creators. Gosney was interested in adding features to VCL that would allow it to run the HashCat password cracking tool.

“Once we convinced Amnon that we did not aspire to turn the world into one giant botnet, he was very cooperative in working with (us) to resolve issues with VCL that was preventing it from working 100% with hashcat,” he said.

VCL makes load balancing across the cluster – once an arduous task that required months of custom scripting – a trivial matter. As a result, Gosney said that his team is at a point where their implementation of Hashcat on VCL could be scaled up far above the 25GPU rig he has created – supporting “at least 128 AMD GPUs.

“I always had these dreams of doing very simple and very manageable grid/cloud computing,” Gosney wrote. “It really is the marriage of two absolutely fantastic programs, which allows us to do unprecedented things,” he wrote.

Gosney is no stranger to password cracking. After 6.4 million Linkedin password hashes were leaked online, Gosney was one of the first researchers to decrypt them and analyze the findings. He and a partner were ultimately able to crack between 90% and 95% of the password values.

Gosney’s GPU cluster is just the latest leap forward in password cracking in a year that has already seen prominent encryption algorithms deemed compromised by an onslaught of cheap compute power. In June, Poul-Henning Kamp, creator of the md5crypt() function used by FreeBSD and Linux-based operating systems was forced to acknowledge that the hashing function is no longer suitable for production use – a victim of GPU powered systems that could perform “close to 1 million checks per second on COTS (commercial off the shelf) GPU hardware,” he wrote. Gosney’s cluster cranked out more than 70 times that number – 77 million brute force attempts per second against MD5crypt.

Recent years have also seen the launch of services like Moxie Marlinspike’s WPACracker and then CloudCracker, a cloud-based platform for penetration testers that can do lookups of password hashes and other encrypted content against a dictionary of over hundreds of millions – or even billions – of potential matches — all for under $200. And if that price is too rich, a team of U.S. based researchers have shown how you can do the same thing – on the cheap – by leveraging Google’s MapReduce and cloud based browsers. Then, in 2011, researcher Thomas Roth, who developed the Cloud Cracking Suite (CCS) – a tool that leveraged eight Amazon EC2-based Nvidia GPU instances to crack the SHA1 encryption algorithm and dispense with tens of thousands of passwords per second.

Gosney said he plans to “make a bit of money” off his invention, either by renting out time on it or by offering it as a paid password recovery and domain auditing service. “I have way too much invested in this to not get some kind of return out of it,” he wrote.