The criteria for a “good” password have long been defined by confusing rules, outdated advice, and more recently by those annoying colored “password strength” bars on websites. Ultimately, understanding password strength requires that you understand what you’re defending against. Passwords have to be stored in a database by the site that requires them of you, where they can (and will) be stolen by hackers. Password databases are protected against this eventuality, but once a database is stolen, hackers can go to work cracking it at their leisure.

Ergo, the primary requirement for a password is that it be as hard as possible for a hacker to crack. This equates to “hard to guess,” because almost every technique for cracking passwords revolves around guessing the password. Passwords in a database are typically protected by “hashing”: they’re fed into a clever mathematical algorithm that scrambles them up in such a way that it’s practically impossible to “reverse engineer” what the original password string was. The algorithm is typically an openly known standard (SHA-1 and MD5, to give two examples); the security lies in the service provider storing only the derived data, the scrambled-up string from which the password cannot be “reverse-engineered.” Every time you enter your password on the site, it’s quickly hashed and the scrambled (“hashed”) value is compared to the one in the database. This way, the service provider’s website never actually stores the passwords themselves where they can be stolen by hackers.
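The store-and-compare cycle described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the sample password are made up for the example, and it deliberately mirrors the fast, unsalted scheme the text describes rather than recommending it for real password storage.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Return the hex digest of a fast, unsalted SHA-1 hash of the password."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def verify(attempt: str, stored_digest: str) -> bool:
    """Hash the login attempt and compare it with the stored digest."""
    return hmac.compare_digest(hash_password(attempt), stored_digest)


# At sign-up the site keeps only the digest, never the password itself.
stored_digest = hash_password("correct horse")

print(verify("correct horse", stored_digest))  # True
print(verify("wrong guess", stored_digest))    # False
```

Note that nothing here ever turns the digest back into the password; the only way to check a password is to hash it and compare.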

What an attacker can do, however, is attack the password from the other end. The need to quickly compute the hash every time a user enters their password (for quick comparison) means that generating hashes for comparison to a stolen database of password hashes can be done very fast. Thus guessing the password is a viable attack. It’s the oldest trick in the book, and exactly the thing most old password rules (symbols, numbers, limited login attempts) were meant to defeat. That was enough in days of yore, especially when sensitive systems tended to be mainframes with only on-premises terminal access, but modern hackers steal and download entire password databases. Yesteryear’s threats were like safecrackers with a stethoscope. Modern threats involve stealing the whole safe and taking it home to meet the cutting torches.

Old Tricks, New Tools

The simplest way of guessing a password has always been trying every possible password combination as fast as possible until one matches up – the “brute force” attack. (This is why sites almost always require a minimum password length; each additional character multiplies the number of possible combinations by the size of the character set, so the total grows exponentially with length.) For a long time “brute forcing” passwords was an uncommon threat; it’d take a state actor (or a determined corporate rival) to build an expensive computer cluster to muster the raw computing power needed to get results before the heat death of the Universe. Moore’s law shook that paradigm, the advent of dual and then quad-core processors shifted it, and modern video cards (whose processors are optimized for highly repetitive video-processing tasks very similar to password cracking) definitively killed it. A modern computer with a single video card can process billions of password guesses every second against “fast” hashes (which most sites use for the aforementioned reasons).
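To put rough numbers on that, here is a small sketch of the arithmetic. The guess rate of one billion per second is an assumption chosen to match the “billions of guesses” figure above, and the 95-character alphabet assumes passwords drawn from printable ASCII; neither is a measurement.

```python
def keyspace(alphabet_size: int, length: int) -> int:
    """Number of possible passwords of exactly this length."""
    return alphabet_size ** length


def seconds_to_exhaust(alphabet_size: int, length: int, guesses_per_second: float) -> float:
    """Worst-case time to try every password of this length."""
    return keyspace(alphabet_size, length) / guesses_per_second


ALPHABET_SIZE = 95                   # assumption: printable ASCII characters
GUESSES_PER_SECOND = 1_000_000_000   # assumption: one billion guesses per second

for length in (6, 8, 10, 12):
    days = seconds_to_exhaust(ALPHABET_SIZE, length, GUESSES_PER_SECOND) / 86_400
    print(f"{length} characters: {keyspace(ALPHABET_SIZE, length):.2e} combinations, "
          f"about {days:.2e} days to exhaust")
```

Each extra character multiplies the work by 95 under these assumptions, which is why length matters so much against pure brute force.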

Brute-force attacks alone are still relatively easy to defeat, of course. The fact that they’re even possible on consumer-grade equipment simply illustrates how much computing power is now available to password crackers. When that power’s used intelligently it gets much, much worse. Cryptanalysis is a science centuries old – combining it with modern computing power puts password cracking into the hands of even untrained “script kiddies.” The threat vector is vast.

Pattern Recognition

From its earliest days, cryptanalysis has always been about recognizing patterns, and unfortunately password requirements hand would-be hackers pre-set patterns to exploit:

The passwords are being made by human beings, who tend to gravitate to the same general solutions when faced with the same problem.

All these people have the same problem – they need to be able to remember the password.

And they need to adhere to the “password rules” enforced by the website/service itself in the name of making sure their passwords are secure.

Those considerations, together with a few simple “rules,” instantly remove billions of possible combinations from the pool of possible guesses. Combined with databases of commonly used words (“dictionary attacks”) and some educated guesswork from the crackers, the efficiency of the password-guessing can be increased drastically. As illustrated by this excellent Ars Technica article, a hacker can crack 10,000 hashed passwords in just 16 minutes with a combination of brute-forcing, a few educated guesses, and the use of dictionary attacks to limit the “possibility space” to something manageable in minutes rather than months.
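A toy version of a dictionary attack shows why predictable passwords fall so quickly. Everything in this sketch is hypothetical: the three-word list, the suffix list, the “stolen” hashes and the function names are invented for illustration, and real cracking tools use far larger word lists and rule sets.

```python
import hashlib
from itertools import product


def sha1_hex(text: str) -> str:
    """Fast, unsalted hash, matching the kind of storage described earlier."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def candidates(words, suffixes):
    """Yield each word, lowercase and capitalized, combined with each suffix."""
    for word, suffix in product(words, suffixes):
        yield word + suffix
        yield word.capitalize() + suffix


def dictionary_attack(stolen_digests, words, suffixes):
    """Return {digest: password} for every stolen digest a candidate matches."""
    remaining = set(stolen_digests)
    cracked = {}
    for guess in candidates(words, suffixes):
        digest = sha1_hex(guess)
        if digest in remaining:
            cracked[digest] = guess
            remaining.discard(digest)
    return cracked


# Hypothetical toy data: two pattern-following passwords and one random one.
WORDS = ["password", "dragon", "monkey"]
SUFFIXES = ["", "1", "123", "!", "2024"]
stolen = {sha1_hex("Dragon123"), sha1_hex("monkey!"), sha1_hex("x9$Lq7#vT2")}

cracked = dictionary_attack(stolen, WORDS, SUFFIXES)
print(sorted(cracked.values()))  # ['Dragon123', 'monkey!']
```

Only 30 guesses are needed to recover the two human-style passwords, while the random one is untouched because it follows no pattern the word list can reproduce.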

The sheer amount of data available to hackers to study is what makes these guesses so educated. Once upon a time this was the realm of clever government spies perusing peer-reviewed psychology studies – but now anyone can download millions of cracked passwords dumped on the internet over the years and use them as a vast database that quickly reveals common trends in the passwords people make. The advent of “machine learning” (programming a computer to recognize patterns on its own) makes this terrifyingly effective. Hackers turn their software loose on that vast body of data and “train” it very well. Worse, it makes site-specific attacks much more effective: weak passwords broken by simple brute-force/word-list attacks can be used to “train” the software in patterns that the longer, more difficult passwords are likely to follow as well. Even comparatively simple machine-learning techniques can be effective, such as the Markov chain – whose mathematics was developed over one hundred years ago. If such dated techniques can produce solid results, one shudders to think what a good intruder (with funding, motivation and support) could accomplish.
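To give a feel for how little machinery a Markov chain needs, here is a minimal character-level sketch. It is an illustration under stated assumptions, not a description of any particular cracking tool: the five training passwords are made up, the boundary markers and function names are hypothetical, and the add-one smoothing over 96 symbols (95 printable ASCII characters plus the end marker) is just one simple choice.

```python
import math
from collections import Counter, defaultdict

START, END = "\x02", "\x03"  # hypothetical boundary markers, not printable characters
SYMBOLS = 96                 # assumption: 95 printable ASCII characters plus END


def train(passwords):
    """Count how often each character follows each other character."""
    transitions = defaultdict(Counter)
    for password in passwords:
        chars = [START, *password, END]
        for current, following in zip(chars, chars[1:]):
            transitions[current][following] += 1
    return transitions


def log_likelihood(transitions, candidate):
    """Sum of log2 transition probabilities, using add-one smoothing."""
    chars = [START, *candidate, END]
    total = 0.0
    for current, following in zip(chars, chars[1:]):
        counts = transitions.get(current, Counter())
        probability = (counts[following] + 1) / (sum(counts.values()) + SYMBOLS)
        total += math.log2(probability)
    return total


# Hypothetical stand-in for a dump of already-cracked passwords.
model = train(["password1", "password123", "passw0rd", "pass1234", "letmein1"])

for candidate in ("password9", "q7Zk#2vLx"):
    print(candidate, round(log_likelihood(model, candidate), 1))
```

A higher (less negative) score means the candidate looks more like the training data, so a cracker would try it sooner. The pattern-following candidate scores well above the random one even with this tiny training set.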

Thankfully, you can fight back against this effectively – by removing the patterns entirely.

Randomization and Password Management

Passwords – unlike messages or metadata – can be entirely arbitrary; they’re not required to convey any specific information on their lonesome. Thus one can make them entirely random, which at one stroke removes any and all patterns that all those sophisticated cracking methods sniff out. A good random password generator can generate perfectly random passwords with a simple mouse-click.
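As a minimal sketch of what such a generator does, the following uses Python's standard `secrets` module, which draws from the operating system's cryptographically secure random source. The alphabet, the default length and the function names are choices made for this example, not a standard.

```python
import math
import secrets
import string

# 52 letters + 10 digits + 32 punctuation marks = 94 symbols
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 20, alphabet: str = ALPHABET) -> str:
    """Pick each character independently and uniformly at random."""
    return "".join(secrets.choice(alphabet) for _ in range(length))


def entropy_bits(length: int, alphabet_size: int) -> float:
    """Bits of entropy for a uniformly random password of this length."""
    return length * math.log2(alphabet_size)


print(generate_password())
print(f"{entropy_bits(20, len(ALPHABET)):.1f} bits of entropy")
```

Because every character is chosen independently, there is no pattern for a dictionary or a trained model to latch onto; an attacker is pushed back to plain brute force.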

Unfortunately, perfectly random passwords are ungainly things that are hard to remember. And because the point of stealing passwords is to gain access to multiple other sites using the same credentials, you need a different password for every website/service.

This is what password managers are for – and precisely why a password manager needs a good password generator to go with it. FreePAVE/TeamPAVE include one for this very reason, as do all serious competitors. Care must be taken when using password generator web apps: generating properly (cryptographically secure) random data in browsers is hard (the naive Math.random() method is not random enough, and window.crypto is not supported by all browsers), and it is difficult for users to determine whether a website is using a safe algorithm. Password managers are the safer bet.

This leaves us with a final problem – you still need a password to get into your password manager, and you need to be able to remember it. Pass phrases made of random words (as suggested by XKCD) are easy to remember and can be just as safe as other passwords – as long as they are properly randomized (with a password generator), and not “made up”. If nothing else is available, Diceware pass phrases (with six or seven words) are easy to generate. Still, utilizing software to automate password management is the best – perhaps the only – solid counter to the increased automation of password cracking in the modern era.
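For the random-word pass phrases mentioned above, a sketch of the software equivalent of rolling dice might look like this. The eight-word list is a hypothetical stand-in so the example runs on its own; a real Diceware list has 7,776 words (five dice rolls per word), and the function names are invented for this example.

```python
import math
import secrets


def generate_passphrase(wordlist, word_count: int = 6, separator: str = " ") -> str:
    """Pick each word independently and uniformly at random from the list."""
    return separator.join(secrets.choice(wordlist) for _ in range(word_count))


def passphrase_entropy_bits(wordlist_size: int, word_count: int) -> float:
    """Bits of entropy, assuming the list contains no duplicate words."""
    return word_count * math.log2(wordlist_size)


# Hypothetical stand-in list; far too short for real use.
DEMO_WORDS = ["anchor", "breeze", "cobalt", "drift", "ember", "fjord", "glacier", "harbor"]

print(generate_passphrase(DEMO_WORDS))
print(f"6 words from 7,776: {passphrase_entropy_bits(7776, 6):.1f} bits")
print(f"7 words from 7,776: {passphrase_entropy_bits(7776, 7):.1f} bits")
```

The strength comes entirely from the random selection, not from the words being obscure, which is why a phrase you “made up” yourself does not get the same guarantee.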