Password Handling in 2018

Many developers know that they should hash passwords. Most know that they should use a per-password salt to mitigate rainbow table attacks. Most developers also know that they shouldn’t use SHA-* but rather a KDF — a hashing function specifically designed for password hashing. [1][2][3]

In this short “my best practices” post we will cover the things mentioned above and a bit more. Firstly, we will permute the password on the client side. Secondly, we will encrypt the final hash before writing it into our database, similar to how Dropbox does it. Following the “Dropbox way” of presenting password protection, I generated a diagram that shows all the cryptographic layers. In the figure below, everything regarding encoding (e.g. base64 , hex , etc.) is omitted. [4]

Multiple layers of protection for passwords ✔️

Although I really like the onion diagram above, I think that for explanation purposes another figure, based on the flow through the system, is easier to understand.

The password flow

Everything starts with our users entering the password into our website and submitting the login. Here comes the first layer, which most developers think is irrelevant. Before we send the username and password over the wire, we perform a single SHA3-512 round on the plain-text password plus a unique name for our service, for example the domain. This means that if the user logs in at auth.example.com , we compute SHA3-512("plain-text-password-from-user" + "auth.example.com") . Why we add this public, well-known “salt”, identical for every user of the service, is explained later.
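The client-side permutation described above can be sketched in a few lines (shown here in Python for brevity; in a real browser frontend this would be JavaScript or WebCrypto — the example passwords and domains are of course made up):

```python
import hashlib

def permute_password(password: str, service_salt: str) -> str:
    """One SHA3-512 round over password + service-unique salt (e.g. the domain)."""
    return hashlib.sha3_512((password + service_salt).encode("utf-8")).hexdigest()

# The same password yields a different "actual" password per service:
h1 = permute_password("hunter2", "auth.example.com")
h2 = permute_password("hunter2", "auth.example.org")
```

Note that the output is always 128 hex characters (64 bytes), regardless of how long the user's password is — a property we rely on later for normalization.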

Password flow through the software system, with ever increasing security 🔒

Right now most developers think: “Don’t we have https for keeping the password secure?”. And that’s right. But keeping the password secure from eavesdropping was never the intention of this step. Instead, think of this step as a conversion function: after this round the permuted version (i.e. the hash) becomes the user’s, so to speak, “actual” password, which gets submitted to the server side.

Why should we do this?

Simple: It shows respect for the user’s password and that you are aware of the fact that, in most cases, it is not exclusive to your software. Additionally we gain a few smaller security bonuses (castle approach): There is no way we could ever accidentally store the user’s plain-text password in our logging system, unlike GitHub and Twitter, which both admitted in May 2018 that they had found plain-text passwords in their logging systems. The user’s password would also be slightly protected in a MITM attack or on a compromised server. “Slightly” because the strength of a single SHA3-512 hash depends purely on the user’s input, which is admittedly often not very good. The last point is that client-side hashing is the only simple way to prove you are not “farming passwords” ✔️ [5][6][7][8]

Let’s think about the last statement and our assumption about users reusing their passwords: If every site started to use the client-side hashing approach with SHA3-512(password) , the whole idea of respecting the user’s password privacy would be destroyed, as every service could use the hash against other services (just as with plain-text passwords today). Therefore this approach wouldn’t give us any enhancement if deployed widely. However, if every login system added a globally unique salt (e.g. the domain — SHA3-512(password + domain) ), each website’s server side would receive a different “permuted” password, even if the user uses the same password for every service.

The only two drawbacks to this approach are that:

- you cannot enforce password policies on your server side. Whether password policies make sense at all is a different question anyway; I’ll cover that in a later post.

- in case you ever change your company’s domain, you need to either keep your old domain in the hashing code or update the hashing scheme transparently during user authentication (i.e. perform the client-side hashing with both the old and the new domain, send both to the server, check whether the old-domain hash matches the database entry and, if so, replace it with the new-domain value)

Normalization

OK, let’s continue with our password flow: The username and the permuted version of the password get transmitted over https (!!) to our server. Normally it is recommended to perform a single round of SHA3-512 now. This is done to normalize the input to a fixed 64 bytes, because a few password hashing functions truncate after N bytes (for example bcrypt truncates its input after 72 bytes or at a NUL byte), which reduces the entropy of the password. Other password hashing algorithms ( PBKDF2 ) are vulnerable to DoS attacks if passwords can be arbitrarily long. [9]

Because the client-side permutation of the password was already a normalization, this shouldn’t concern us, as long as we check whether the client-provided string is a valid representation of a SHA3-512 hash. If it is, we pass it into our KDF — if not, we must abort, as we have received tampered, malicious input.
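That validation check is cheap: a hex-encoded SHA3-512 digest always has exactly 128 hex characters. A minimal sketch (the function name is my own):

```python
def looks_like_sha3_512_hex(candidate: str) -> bool:
    """A SHA3-512 digest is 64 bytes, i.e. exactly 128 lowercase hex characters."""
    return len(candidate) == 128 and all(c in "0123456789abcdef" for c in candidate)
```

If this returns False, the request should be rejected before any KDF work is spent on it.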

Client-side password permutation and normalization

KDF (Password hashing functions)

🔑 Speaking about KDFs: There are a few acceptable algorithms from which you can choose — namely Argon2 , bcrypt , scrypt and PBKDF2 .

Argon2 won the Password Hashing Competition in July 2015, out of 24 candidates. Since then nobody has found a real attack vector against it. Therefore most cryptographers believe that Argon2 is highly unlikely to fall victim to attacks that make it worse in practice than one of the others, and subsequently recommend using it. In Argon2 you can specify not just a single cost parameter, as in bcrypt , but three parameters: number of iterations, memory consumption and number of threads. Despite all these benefits, bcrypt has been out there since 1999 — that’s close to 20 years without major vulnerabilities! Therefore it can be seen as much more battle-tested than Argon2 . Also, not all cryptography libraries provide first-class Argon2 support. In these cases you should use bcrypt . [10][11]

In 2009 scrypt was published, a bcrypt -like function that requires more RAM, which makes it more resistant to hardware-accelerated attacks. Unfortunately, due to its massive memory requirements it‘s very hard to scale and practically not usable for an authentication system. Lowering the memory usage is not feasible, as scrypt then becomes, technically, weaker than bcrypt . Therefore its main usage is in places where spending hundreds of megabytes of memory and multiple seconds’ worth of CPU time for a single hash computation isn’t a problem (e.g. protecting the encryption key for your computer’s main hard disk).

The most widely deployed algorithm is probably PBKDF2 , although it shouldn’t be your choice when building a new application nowadays, unless you need FIPS certification.

In the end it all comes down to your personal flavour and how conservative you are (and you usually should be when thinking about cryptography). I personally have only used bcrypt until now, but I will switch to Argon2 for my next project. [12]

OWASP, a big online community that tries to increase web application security through freely available articles, methodologies, documentation and tools, recommends the following in its “Password Storage Cheat Sheet”: [13]

- Argon2 is the winner of the password hashing competition and should be considered as your first choice for new applications;

- PBKDF2 when FIPS certification or enterprise support on many platforms is required;

- scrypt where resisting any/all hardware accelerated attacks is necessary but support isn’t.

- bcrypt where PBKDF2 or scrypt support is not available.

Usage of KDF

The usage of KDFs is pretty self-explanatory: The credential-specific salt is loaded from the database and used together with the client-side provided hash to compute the KDF output.
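As a concrete sketch of this flow: Python’s standard library ships neither bcrypt nor Argon2 , so the example below uses PBKDF2-HMAC-SHA512 purely to illustrate the shape of the code (per-credential random salt in, derived key out); in a real system you would swap in bcrypt or Argon2 via a third-party package. The iteration count is illustrative, not a recommendation:

```python
import hashlib
import hmac
import os

ITERATIONS = 210_000  # illustrative cost parameter; tune to your hardware budget

def create_credential(permuted_pw: str) -> tuple[bytes, bytes]:
    """Registration: draw a fresh per-credential salt and derive the value to store."""
    salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac("sha512", permuted_pw.encode(), salt, ITERATIONS)
    return salt, derived

def verify_credential(permuted_pw: str, salt: bytes, stored: bytes) -> bool:
    """Login: re-derive with the stored salt and compare in constant time."""
    derived = hashlib.pbkdf2_hmac("sha512", permuted_pw.encode(), salt, ITERATIONS)
    return hmac.compare_digest(derived, stored)
```

The `permuted_pw` argument is the client-side SHA3-512 hex string from the earlier step; only `salt` and the derived key ever touch the database.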

Server-side salting with a strong KDF. On the left the conservative way with bcrypt. On the right the futuristic version with Argon2d

As you can see in the diagram above, after the initial password permutation in the frontend, I maintain two different branches: the left one, which is a little more conservative, and the right one, which is a little more futuristic. Both are completely safe, and choosing between them mainly depends on one’s preferences.

Symmetric Encryption

🔒 After the KDF our password is computationally secure, i.e. implausible to recover — note that nothing is impossible; “implausible” here refers to the computational hardness assumption: the hypothesis that a particular problem cannot be solved efficiently (where efficiently typically means “in polynomial time”). [14][15]

Still, we perform a last step before persisting it into our database: we encrypt the hash using a symmetric encryption algorithm like AES256-GCM or ChaCha20-Poly1305 , as this makes a database dump absolutely worthless for brute-force attacks. That’s a fact that can be inferred from thermodynamics: [16]

“These numbers have nothing to do with the technology of the devices; they are the maximums that thermodynamics will allow. And they strongly imply that brute-force attacks against 256-bit keys will be infeasible until computers are built from something other than matter and occupy something other than space.”, Bruce Schneier

This holds as long as we manage to keep the key secure (and, of course, no significant vulnerabilities are found in the used algorithm and its implementation). The algorithms AES256-GCM and ChaCha20-Poly1305 are used because they provide AEAD (authenticated encryption with associated data). [17]
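A minimal sketch of the AES256-GCM branch, using the third-party `cryptography` package (`pip install cryptography`); the user ID used as associated data and the placeholder KDF output are made up for illustration:

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM  # third-party package

key = AESGCM.generate_key(bit_length=256)  # in production this key lives in Vault/an HSM, never next to the DB
aead = AESGCM(key)

kdf_output = b"\x42" * 32      # stand-in for the KDF result we want to persist
nonce = os.urandom(12)         # 96-bit nonce, must be unique per encryption
record = aead.encrypt(nonce, kdf_output, b"user:42")  # AAD binds the ciphertext to this record

# Store (nonce, record). During login, decrypt and compare with the freshly computed hash:
recovered = aead.decrypt(nonce, record, b"user:42")
```

Because GCM is authenticated, `decrypt` raises an exception if either the ciphertext or the associated data was tampered with, rather than silently returning garbage.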

Symmetric encryption before persisting the hash into a database. On the left the conservative way with AES256-GCM. On the right the futuristic version with ChaCha20-Poly1305

In order to keep the key as secure as we can (without resorting to HSMs) we make use of HashiCorp’s Vault and its ability to do EaaS (Encryption as a Service 😅). We send the output of the KDF to our Vault instance, get the encrypted hash back and store it inside our database. The next time the user wants to log in, we load the encrypted hash from the database, decrypt it with Vault and compare it with the hash generated during this authentication cycle. Don’t forget to do a constant-time comparison (i.e. be resistant against timing side-channel attacks). Some people will probably say it’s not important, as we only compare hashes. I would advise you to do it anyway, as it’s a good habit whenever a comparison drives a security decision. For example, the — very good — supplementary cryptography package provided by the Go team also does it in its bcrypt implementation. [18][19][20][21][22]
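In Python, the standard library already provides such a comparison; the example hashes below are dummies:

```python
import hmac

stored_hash = bytes.fromhex("aa" * 32)
candidate = bytes.fromhex("aa" * 32)

# A naive `stored_hash == candidate` can return as soon as the first byte differs,
# leaking timing information. compare_digest takes time independent of the content:
match = hmac.compare_digest(stored_hash, candidate)
mismatch = hmac.compare_digest(stored_hash, bytes.fromhex("ab" + "aa" * 31))
```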

The final result