Why Salt Your Hashes?

Note: This post has been updated. See the edit at the end.

Salted hashes? Have I decided to blog about breakfast?

No. By “Hash”, I mean “cryptographic hash”, and by “Salt”, I mean “additional input added to a one-way hashing function”. Back in Episode 4 of my podcast, I talked about a system that was written from the ground up to manage users, passwords, and permissions. During my little rant, I talked about storing passwords as the result of a one-way hash, but I didn’t really elaborate.

I realize that many of my regular readers may know this information, but I’ve been surprised at how many people I’ve met who do not. Hopefully, I can shed some light for those who don’t know and also become a useful search engine result when the question is asked.

Let’s get the easy part out of the way first. We KNOW not to store plain text passwords, right? Some people know that and choose instead to store the passwords via two-way cryptography, meaning they can encrypt and then decrypt the password to compare it or email it to you. That is also a terrible idea. Now, your entire system is only as secure as the protection around your decryption key or decryption certificate. You’ve just made an attacker’s job very easy.

The better way to store passwords is to only store the result of a one-way hash. Then, when someone presents their password for authentication, you just hash the input and compare that to what you have stored in the database. However, even though this is good, it is still not right.
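Here is a minimal sketch of that hash-and-compare flow, using only Python's standard library. The function names `hash_password` and `verify_password` are mine, purely for illustration, and I'm assuming the password is encoded as UTF-8 before hashing. Remember that this is the unsalted version, which the rest of the post explains is still not enough.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Return the hex digest of a one-way hash of the password (no salt yet)."""
    return hashlib.sha3_256(password.encode("utf-8")).hexdigest()


def verify_password(candidate: str, stored_hash: str) -> bool:
    """Hash the presented password and compare it to the stored digest."""
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(hash_password(candidate), stored_hash)


# What would be written to the database when the account is created.
stored = hash_password("password")

print(verify_password("password", stored))
print(verify_password("letmein", stored))
```

The database never sees the password itself, only the digest, and authentication is just "hash the input and compare".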

Take this for instance. Here is a sample table with hashed passwords.

user  password
pete  b68fe43f0d1a0d7aef123722670be50268e15365401c442f8806ef83b612976b
bill  59dea5f67aea4662c26a5ac6452233e783407d55c4f96d6c4df6f0d7c06c58af
jeff  b68fe43f0d1a0d7aef123722670be50268e15365401c442f8806ef83b612976b
andy  b6642c42bd670b0c070dd45d087877a4bc8d6ee29c88df59273ea48ed72b76c4
ron   b68fe43f0d1a0d7aef123722670be50268e15365401c442f8806ef83b612976b

Right away, you should be able to see a problem. The hashes for pete, jeff, and ron are all the same. A common attack against hashed passwords is a precomputed lookup table (often loosely called a rainbow table). In that case, dictionary words (or commonly known phrases) are pre-hashed, and those hashes can then be compared against a compromised database. Let’s take a look.

password   SHA-3 (256) Value
password   b68fe43f0d1a0d7aef123722670be50268e15365401c442f8806ef83b612976b
letmein    ceaa5fd0a764ad8202f43f2efc860d8c7472911ca9d1ccea2dc232713ae1fc0d
blink182   aadfce5bdba224673c168fb861f45cdd6ebf4e34d35001ae933bd53b7f6b337f
password1  abbe6325ea0d23629e7199100ba1e9ba2278c0a33a9c4bfc6cd091e5a2608f1a

Now, by comparing, we can see that the password for pete is the word “password”. That means that the passwords for jeff and ron are also “password”. By cracking only one hash, we gain access to two other accounts. This is not good.
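The comparison above can be sketched in a few lines of Python. To be clear about what is assumed: the `leaked` table is rebuilt inside the script rather than copied from the post, the passwords I gave bill and andy are made-up placeholders, and `hashlib.sha3_256` is a stand-in whose digests may not match the values printed above if those came from a different SHA-3 variant or encoding.

```python
import hashlib


def h(word: str) -> str:
    """Unsalted one-way hash, as in the first user table."""
    return hashlib.sha3_256(word.encode("utf-8")).hexdigest()


# Simulated compromised table: user -> stored hash.
# bill's and andy's passwords are hypothetical placeholders.
leaked = {
    "pete": h("password"),
    "bill": h("placeholder-not-in-dictionary-1"),
    "jeff": h("password"),
    "andy": h("placeholder-not-in-dictionary-2"),
    "ron": h("password"),
}

# The attacker pre-hashes a dictionary once and keeps it as hash -> word.
dictionary = ["password", "letmein", "blink182", "password1"]
lookup = {h(word): word for word in dictionary}

# Every stored hash that appears in the lookup is cracked instantly.
cracked = {user: lookup[digest] for user, digest in leaked.items() if digest in lookup}

for user, password in cracked.items():
    print(f"{user}: {password}")
```

The lookup table is built once and can be reused against every unsalted database the attacker ever gets hold of, which is exactly why one cracked hash exposes all three accounts.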

The fix is to “salt” the password before hashing it. You want that salt to be unique for each user. Some people create a random value and then store the salt alongside the password in another database column. Others derive the salt from something like the row’s primary key. Either way is fine (as long as your derived value won’t change).

Now, let’s examine our user table.

user  salt                  password
pete  I7Yrs9THQyLxpVllSwbf  9b7ec6d82075a9e7d8227897e8919785031b9a7cdab5750dea044390d1fd1f46
bill  K0kJJCQcVVqfLzykcpbP  297d00ae29ff3c32fe874c00d0154085ac862a154b061c17cd465de7f1cdee9a
jeff  NwV7PdmPUKY6GgScEUqu  c2936d36583d0513980e496005872e4954d142ed823b7b0b1abf28211efc538f
andy  GpHrXjbQRTjObZWM7jbd  0338bd60f7d761ce9c8922087e87c9ccb7936bb5f9c5c28d72fd28f4d8708e6b
ron   iHh8SX7fQEF2WFUOfxEp  07f459276c9be7d63aa8d57dac7468c8b16dd4367e91615fb9972543a707c403

We notice right away that none of the users’ hashes are the same. I didn’t change the passwords, but the salt values made the hashed inputs unique, so they all hashed differently. We can no longer tell whose passwords are identical. Also, our plain dictionary attack no longer works. Even though we’ve telegraphed to the attacker what salt to use, the attacker would have to rebuild their precomputed tables across their entire dictionary for each individual salt.
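A rough sketch of the salted version follows. Several things here are my assumptions rather than anything the tables above confirm: the salt is a random hex string from the `secrets` module, it is simply prepended to the password before hashing, and the helper names (`create_record`, `verify`, `crack_with_salt`) are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Optional


def salted_hash(password: str, salt: str) -> str:
    """Hash salt + password together (prepending the salt is an assumption)."""
    return hashlib.sha3_256((salt + password).encode("utf-8")).hexdigest()


def create_record(password: str) -> tuple[str, str]:
    """Generate a fresh random salt and return (salt, hash) for storage."""
    salt = secrets.token_hex(10)  # 20 hex characters, unique per user
    return salt, salted_hash(password, salt)


def verify(candidate: str, salt: str, stored: str) -> bool:
    """Re-hash the candidate with the stored salt and compare."""
    return hmac.compare_digest(salted_hash(candidate, salt), stored)


def crack_with_salt(salt: str, stored: str, dictionary: list[str]) -> Optional[str]:
    """What the attacker is reduced to: re-hashing the dictionary per salt."""
    for word in dictionary:
        if salted_hash(word, salt) == stored:
            return word
    return None


# Same password, two users, two different stored hashes.
pete_salt, pete_hash = create_record("password")
jeff_salt, jeff_hash = create_record("password")
print(pete_hash == jeff_hash)

# A lookup table of unsalted hashes no longer contains pete's stored value.
dictionary = ["password", "letmein", "blink182", "password1"]
unsalted_lookup = {hashlib.sha3_256(w.encode("utf-8")).hexdigest(): w for w in dictionary}
print(pete_hash in unsalted_lookup)

# The attacker can still guess, but only one salt (one user) at a time.
print(crack_with_salt(pete_salt, pete_hash, dictionary))
```

Notice that the last call still recovers a weak password. The salt does not make guessing impossible; it only forces the attacker to repeat the whole dictionary for every user instead of reusing one table.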

This isn’t 100% secure (nothing is), but this is a best practice and certainly will slow the attackers down. This method of storage, combined with strong passwords, should keep your data as safe as it can be.

Thoughts? Disagreements? Share them in the comments section below.

EDIT (5/16/2014): On the podcast referenced above, I talked about how easy it is to fall behind or to overlook things if you do your own security, which is yet another reason NOT to do it. I recommended using existing products or frameworks that have already been hardened rather than rolling your own. As a perfect example, I talked about doing all of this, but forgot about bcrypt (and others), which are much more secure, salt the value for you, and already have libraries in all of the major languages.
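To give a feel for what those purpose-built functions look like in practice, here is a small sketch. In Python, bcrypt lives in a third-party package, so to keep the example runnable with the standard library alone I have swapped in PBKDF2 (`hashlib.pbkdf2_hmac`), which is one of the "others". Unlike bcrypt, PBKDF2 does not generate the salt for you, so the sketch does that itself. The iteration count and the `iterations$salt$digest` storage format are illustrative assumptions, not recommendations.

```python
import hashlib
import hmac
import secrets

# Illustrative work factor only; pick a value based on current guidance.
ITERATIONS = 100_000


def hash_password(password: str) -> str:
    """Derive a slow, salted hash and pack everything needed to verify it."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    # Hypothetical storage format: iterations$salt$digest, all in one column.
    return f"{ITERATIONS}${salt.hex()}${digest.hex()}"


def verify_password(candidate: str, stored: str) -> bool:
    """Unpack the stored record, repeat the derivation, and compare."""
    iterations, salt_hex, digest_hex = stored.split("$")
    digest = hashlib.pbkdf2_hmac(
        "sha256",
        candidate.encode("utf-8"),
        bytes.fromhex(salt_hex),
        int(iterations),
    )
    return hmac.compare_digest(digest.hex(), digest_hex)


record = hash_password("password")
print(verify_password("password", record))
print(verify_password("letmein", record))
```

The important difference from a single fast hash is the deliberate slowness: every guess costs the attacker many thousands of hash operations, on top of the per-user salt.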