

Author: “Sergeant Major” Hare
Job Title: Security-Enforcing Developer
Hobbies: Disciplining, More Disciplining, Even More Disciplining

Abstract. Password hashing, which provides some protection if the password database is compromised, is a recurring theme in security research. One necessary element of proper password hashing is a rather slow hash function, which in turn may cause significant server load, for example in mass-reconnect scenarios. This article discusses client-plus-server password hashing as a way to improve the security of hashed passwords without increasing server load.



Previous Work

There’s been a lot of research in the field of “server relief” (i.e. moving some of the server-side hashing costs to the client side). However, most of these works either consider “server relief” as a part of the password hashing function ([Catena]), or (such as [ChangEtAl]) differ from Schema 3 recommended here and are more along the lines of Schema 2, which suffers from the information leak described below, a leak that is difficult to address. One interesting work in this field is the one by Scott Contini [Contini], which provides an analysis close to ours in many areas, though [Contini] requests the client-side salt from the server side and has “residual information leaks” similar to those described for Schema 2. Also, as we found after we posted this article, a schema substantially similar to our Schema 3 was published earlier in a StackExchange question by user paj28 [paj28] (the analysis here being different from [paj28]).

Introduction

For a while now, the topic of password hashing has remained popular in software development circles (and, to a lesser extent, in security circles). While I’m usually very critical of the widespread misperception that password hashing is (almost) the only thing application developers need to care about security-wise, I do accept that password hashing is one (out of a few dozen) of the things every security-conscious developer shall implement.

In this article, we are not going to discuss the intricacies of existing password hashing best practices; they’re already described in detail in several places, including [CrackStation]. Instead, we’ll discuss a few major points to make it clear what this article is about.

The problem password hashing aims to address is not online brute-force attacks (where the attacker tries to brute-force the password by simply logging in); these attacks can and should be thwarted by limiting the login rate (implementation details vary, but are quite well-known). The attack password hashing aims to address is an offline attack on the password database, where the attacker has already obtained the whole password database and now tries to get passwords out of it to be able to impersonate users. If the passwords are stored in plain text, this is trivial; if the passwords are hashed but not salted, it is not that trivial; and so on. While some (myself included) consider this a relatively unimportant scenario, and think that most of the effort shall be aimed at not allowing the attacker access to the password database in the first place, password hashing does provide additional protection and shall be used as one of dozens of security-related tools in the developer toolbox.

One of the cornerstones of modern password hashing is per-user hash salting (i.e. hashing the password concatenated with a ‘salt’). This is necessary, in particular, to protect against rainbow-table-based attacks. Note that the ‘salt’ shall be unique for each user (it is usually stored in the database alongside the hash itself) and shall be a long enough (such as 128-256 bits) crypto-safe random number.
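As a minimal sketch of per-user salting (the helper names `new_salt` and `salted_hash` are hypothetical, and plain SHA-256 is used only to keep the illustration short, not as a recommended password hash), the following shows that two users with the same password end up with unrelated stored hashes:

```python
import hashlib
import secrets

SALT_BYTES = 16  # 128 bits, the lower end of the range mentioned above


def new_salt() -> bytes:
    """Return a fresh crypto-safe random salt for one user."""
    return secrets.token_bytes(SALT_BYTES)


def salted_hash(salt: bytes, password: str) -> bytes:
    """Hash salt||password. SHA-256 is fast; it is here only to show salting."""
    return hashlib.sha256(salt + password.encode("utf-8")).digest()


# Two users who happen to pick the same password get different stored hashes,
# so one precomputed table cannot be reused across users.
alice_salt, bob_salt = new_salt(), new_salt()
alice_hash = salted_hash(alice_salt, "correct horse")
bob_hash = salted_hash(bob_salt, "correct horse")
print(alice_hash != bob_hash)
```

The salt is stored next to the hash, so verification simply recomputes `salted_hash` with the stored salt.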

The second cornerstone of secure password hashing is a slow password hash function, which makes the attacker spend more time on a brute-force attack. For example, a typical 8-char password with upper-case letters, lower-case letters, and digits requires about (26+26+10)^8=2.2e14 tries to get through the whole search space, so it all comes down to the time necessary to try one hash. If the hash is fast (such as SHA256()), a single stock GPU can make as many as 1e9 tries per second [HashCat], so even a single GPU dedicated to the purpose will take about 2.5 days (half that on average) to crack one password by brute force. On Amazon EC2, renting a comparable GPU (with a lot of other stuff we don’t really need) costs $0.65/hour, so the (upper-bound) cost of breaking one SHA256()-based password becomes a measly $39 (significantly less in practice). In other words, if the information protected by the password is worth more than about $20 (the average-case cost), the protection provided by simple SHA256()-based hashing of an 8-character password is inadequate (even if the hash is properly salted, etc.).
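The arithmetic behind these estimates can be reproduced with a few lines. This is only a back-of-the-envelope sketch; the constants are the article's own figures, and the variable names are made up for the illustration:

```python
# Figures below are the estimates quoted in the text, not measurements.
ALPHABET_SIZE = 26 + 26 + 10      # upper-case, lower-case, digits
PASSWORD_LENGTH = 8
TRIES_PER_SECOND = 1e9            # fast hash on a single stock GPU [HashCat]
GPU_COST_PER_HOUR = 0.65          # USD per hour for a rented GPU instance

search_space = ALPHABET_SIZE ** PASSWORD_LENGTH
hours = search_space / TRIES_PER_SECOND / 3600
days = hours / 24
worst_case_cost = hours * GPU_COST_PER_HOUR   # whole search space
average_cost = worst_case_cost / 2            # on average half of it is needed

print(f"search space:    {search_space:.2e} candidates")
print(f"full search:     {days:.2f} days on one GPU")
print(f"worst-case cost: ${worst_case_cost:.2f}")
print(f"average cost:    ${average_cost:.2f}")
```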

Hash Functions which are Intentionally Slow

As the answer to these brute-force attacks, developers have started to use hash functions which are intentionally slow; several such functions are widely used (including PBKDF2, bcrypt, and scrypt). In this article we will not argue which of these functions is better, or which parameters need to be used for them. For our purposes it is important that all of these functions are slow (i.e. they take a substantial chunk of server resources, and this is exactly the reason why they’re more secure against brute-force attacks than fast hash functions).

Let’s say we’re using a hash function which is 100’000x slower than a single SHA256(). Then the cost of the attack will grow 100’000-fold too, into a much more comfortable $4’000’000 range (which might still not be enough for certain tasks, but we’d argue that in such cases using single-factor password-only authentication is a Really Bad Idea to start with).

Server-Side CPU Cost

However, on the server side (where we don’t want to use GPUs in practice) such a 100’000x-slower-than-SHA256() function would take on the order of 0.005 sec on a single core (typical benchmarks for SHA256 on modern non-GPU hardware show up to 20M SHA256/second). It means that if we have 50’000 users per server (which, according to my colleague ‘No Bugs’, is a typical number [NoBugs]), and all of them arrive at the same time (which is often the case when, for example, a mass reconnect occurs in a multi-player game), processing them will take 250 seconds on a single core (and dividing by a typical number of cores per such a server, we get into a 20-or-so-second range). While in practice such a delay is rarely fatal, it is certainly annoying. In addition, taking into account future improvements in GPUs etc., it would be nice to get a safety margin of several orders of magnitude without loading our server by those same several orders of magnitude. So, is there something that we can do about it?
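The mass-reconnect estimate can be checked the same way. In this sketch the core count of 12 is purely an assumption, chosen because it matches the “20-or-so-second” figure; the text itself does not name a number:

```python
# Figures from the text, except CORES, which is an assumed value.
SHA256_PER_SECOND = 20e6      # single CPU core, typical benchmark
SLOWDOWN = 100_000            # slow hash = 100'000 SHA256() computations
USERS_PER_SERVER = 50_000     # [NoBugs]
CORES = 12                    # ASSUMPTION: not stated in the article

seconds_per_hash = SLOWDOWN / SHA256_PER_SECOND
single_core_seconds = USERS_PER_SERVER * seconds_per_hash
all_cores_seconds = single_core_seconds / CORES

print(f"one hash:            {seconds_per_hash:.3f} s")
print(f"mass reconnect:      {single_core_seconds:.0f} s on one core")
print(f"spread over {CORES} cores: {all_cores_seconds:.1f} s")
```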

What we can (and shall) try to do is to move some of the computation to the client side (which is essentially the same as the “server relief” known in the field).

Schema 1: Unsalted-Client-Plus-Salted-Server Hashing (marginally better than simple Salted-Server Hashing)

The first hashing schema we’re about to consider, is very simple:

1. The user enters UserId and password P.
2. Password P is hashed on the Client side first, using a hash function HCli() without ‘salt’, obtaining at least 256-bit hash P’. HCli() is a slow hash function, with the benefits of being slow discussed below.
3. UserId and P’ are transferred over the wire (more precisely, over a protected channel, such as TLS).
4. P’ is hashed again, calculating P”=HSrv(server-salt||P’), where || denotes concatenation. HSrv() is also a slow hash function.
5. P” is compared against the database-stored password for the user with UserId.
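The steps of Schema 1 can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: PBKDF2 from Python's `hashlib` stands in for both HCli() and HSrv(), the iteration counts are small demo values, the salt is passed through PBKDF2's salt parameter rather than literally concatenated, and a fixed all-zero salt (identical for every user) plays the role of “no salt” because PBKDF2 requires a salt argument. The `register` and `verify` helpers and the in-memory `db` are hypothetical.

```python
import hashlib
import hmac
import secrets

# Hypothetical demo parameters; real deployments would tune these much higher.
CLIENT_ITERATIONS = 50_000
SERVER_ITERATIONS = 5_000
FIXED_CLIENT_SALT = b"\x00" * 16  # same for every user, i.e. effectively unsalted


def h_cli(password: str) -> bytes:
    """Step 2: slow, unsalted client-side hash; returns a 256-bit P'."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), FIXED_CLIENT_SALT, CLIENT_ITERATIONS
    )


def h_srv(server_salt: bytes, p_prime: bytes) -> bytes:
    """Step 4: slow, per-user-salted server-side hash of P'; returns P''."""
    return hashlib.pbkdf2_hmac("sha256", p_prime, server_salt, SERVER_ITERATIONS)


def register(db: dict, user_id: str, p_prime: bytes) -> None:
    """Store (server-salt, P'') for a new user; the server never sees P."""
    server_salt = secrets.token_bytes(16)
    db[user_id] = (server_salt, h_srv(server_salt, p_prime))


def verify(db: dict, user_id: str, p_prime: bytes) -> bool:
    """Step 5: recompute P'' and compare it with the stored value."""
    record = db.get(user_id)
    if record is None:
        return False
    server_salt, stored = record
    return hmac.compare_digest(stored, h_srv(server_salt, p_prime))


db = {}
register(db, "alice", h_cli("s3cret-pw"))       # client sends only P'
print(verify(db, "alice", h_cli("s3cret-pw")))  # correct password
print(verify(db, "alice", h_cli("wrong-pw")))   # wrong password
```

Note that `h_cli` returns the same P’ for the same password regardless of the user, which is exactly the property that the pre-computation attacks discussed for Schema 1 exploit.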

This schema is very simple, and is easy to implement. Now we need to see what it means in practice. One usage scenario of this client-plus-server-hashing schema is described below.

Usage Scenario: An Extension to Existing Hashing

In this scenario HSrv() is exactly the same hashing function that was used for server-side hashing before we started our work on improving security (i.e. it shall use a per-user salt, and should be rather slow itself). It means that when we switch to the client-plus-server hashing schema above, we’re just adding client-side hashing.

Security-wise, client-side hashing doesn’t change security much (except for brute-force attacks). In short, it can be said that for security purposes, it doesn’t really matter much whether we transfer P or P’ over the wire; in fact, when transferring P’ instead of P, P’ effectively becomes a password (see, for example, discussion in [CrackStation] on this matter).

Schema 1: Security Against Brute-Force Attack

When it comes to a brute-force attack, and provided the client-side hash function HCli() is slow enough, the situation is different. With the schema above, the attacker has two options for a brute-force attack. The first option is to attack P’; however, as we’ve said, P’ is at least 256 bits long, so brute-forcing it would require exhausting a search space of 2^256, which is not feasible. The second option is to attack P; this is certainly possible, but to mount this kind of attack the attacker will need to hash each candidate password using both HCli() and HSrv(), therefore incurring the costs of both HCli() and HSrv().

To rephrase the same thing in a different way: by adding HCli() and P’, we’re trading one attack vector (on P via HSrv()) for two attack vectors (on P’ via HSrv(), and on P via HCli()+HSrv()), but both of these attack vectors are significantly more expensive than the original one. The first attack vector, on P’, requires using only HSrv(), but as the search space grows from 2e14 for the attack on an 8-char password to 2^256=1e77 for the attack on a 256-bit P’, it will be more than 60 orders of magnitude more expensive than the original attack on P via HSrv(). The second attack vector (on P via HCli()+HSrv()) has the same search space as the original attack, but its cost is increased by the cost of HCli(). Out of these two attack vectors, in most practical cases the second one will be cheaper for the attacker, and this is the one we will refer to in our further brute-force attack cost analysis.
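The “more than 60 orders of magnitude” gap between the two search spaces is easy to verify; the snippet below is just that arithmetic, with illustrative variable names:

```python
import math

password_space = (26 + 26 + 10) ** 8   # 8-char alphanumeric password P
p_prime_space = 2 ** 256               # 256-bit client-side hash P'

# Difference in orders of magnitude between attacking P' and attacking P.
orders_of_magnitude = math.log10(p_prime_space) - math.log10(password_space)

print(f"passwords:  {password_space:.1e}")
print(f"P' values:  {p_prime_space:.1e}")
print(f"difference: about {orders_of_magnitude:.1f} orders of magnitude")
```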

However, with Schema 1 as described, for the purposes of attacking HCli() it is possible to pre-calculate all the HCli() hashes for 8-char passwords, and therefore to eliminate the cost of HCli() during the attack itself. A rainbow table attack seems to apply here too. While these attacks on HCli() do not make Unsalted-Client-Plus-Salted-Server Hashing more vulnerable than simple Salted-Server Hashing (the one without client-side hashing at all), they greatly reduce the advantages of having HCli(). I won’t go as far as to say that Schema 1 is useless, but its usefulness is relatively limited.

These two attacks (pre-computation and rainbow table) rely heavily on one feature missing from Schema 1, namely a salt when computing HCli(). As it turns out, it is possible to introduce a random salt into HCli() too (with a minimal change in protocols), which brings us to Schema 2.

Schema 2: Randomly-Salted-Client-Plus-Salted-Server Hashing (significantly better than simple Salted-Server Hashing, but with Subtle Information Leak)

With Schema 2, for each user we’re storing a hashed password and two (not one) salts; let’s name them client-salt and server-salt. Both salts need to be crypto-random numbers of sufficient length, and they shall be independent of each other. Then, Schema 2 looks as follows:

1. The user enters UserId and password P.
2. Client app requests client-salt for the user identified by UserId from the server.
3. Server returns client-salt for this UserId (see discussion below on handling non-existing UserIds).
4. Password P is hashed on the Client side first, calculating P’=HCli(client-salt||P). As a result of HCli() computation, we’re obtaining at least 256-bit hash P’. Once again, HCli() is a slow hash function.
5. UserId and P’ are transferred over the wire (more precisely, over a protected channel, such as TLS).
6. P’ is hashed again, calculating P”=HSrv(server-salt||P’). HSrv() is also a slow hash function (though potentially less slow than HCli(), see further discussion below).
7. P” is compared against the database-stored user password for the user identified by UserId.
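A rough sketch of the Schema 2 exchange is shown below. As before, PBKDF2 is only a stand-in for the slow hash functions, the iteration counts are demo values, and the salts go through PBKDF2's salt parameter instead of literal concatenation. The helpers `create_user`, `set_password`, `get_client_salt`, and `login` are hypothetical, and this sketch deliberately does not handle non-existing UserIds.

```python
import hashlib
import hmac
import secrets

CLIENT_ITERATIONS = 50_000  # hypothetical demo value
SERVER_ITERATIONS = 5_000   # hypothetical demo value


def h_cli(client_salt: bytes, password: str) -> bytes:
    """Step 4: slow client-side hash salted with the per-user client-salt."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), client_salt, CLIENT_ITERATIONS
    )


def h_srv(server_salt: bytes, p_prime: bytes) -> bytes:
    """Step 6: slow server-side hash salted with the per-user server-salt."""
    return hashlib.pbkdf2_hmac("sha256", p_prime, server_salt, SERVER_ITERATIONS)


def create_user(db: dict, user_id: str) -> bytes:
    """Allocate two independent random salts; return the client-salt."""
    db[user_id] = {
        "client_salt": secrets.token_bytes(16),
        "server_salt": secrets.token_bytes(16),
        "p_double_prime": None,
    }
    return db[user_id]["client_salt"]


def set_password(db: dict, user_id: str, p_prime: bytes) -> None:
    record = db[user_id]
    record["p_double_prime"] = h_srv(record["server_salt"], p_prime)


def get_client_salt(db: dict, user_id: str) -> bytes:
    """Steps 2-3. Raises KeyError for unknown users; a real server must not."""
    return db[user_id]["client_salt"]


def login(db: dict, user_id: str, p_prime: bytes) -> bool:
    """Steps 6-7: recompute P'' and compare with the stored value."""
    record = db[user_id]
    candidate = h_srv(record["server_salt"], p_prime)
    return hmac.compare_digest(record["p_double_prime"], candidate)


db = {}
salt = create_user(db, "alice")
set_password(db, "alice", h_cli(salt, "s3cret-pw"))

salt_from_server = get_client_salt(db, "alice")            # steps 2-3
print(login(db, "alice", h_cli(salt_from_server, "s3cret-pw")))
print(login(db, "alice", h_cli(salt_from_server, "wrong-pw")))
```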

Schema 2: Security Against Brute-Force Attack

Schema 2 is substantially more secure against brute-force attacks than Schema 1: as pre-computation and rainbow table attacks no longer apply (due to the random client-salt), the attacker will indeed need to incur the costs of both HCli() and HSrv() (see the analysis for Schema 1 above).

In practice, if a client-side app is used, the cost of HCli() can easily be in the 0.5 sec range, making it a 100x improvement over server-side-only hashing at 0.005 sec. I.e. with this HCli() in place, the attacker will need to spend 100x more time on breaking the password, bringing the upper-bound cost from the $4’000’000 range into the hundreds of millions of dollars (100 x $4’000’000 = $400’000’000).

Schema 2: Dealing with Information Leak

There is one potential issue with Schema 2: we’re giving away the client-salt for a UserId to any attacker (it is given away before authentication has occurred). As long as the salt is perfectly random, it won’t give any real information to the attacker, except in the scenario where the attacker is simply phishing for existing UserIds; giving away whether a certain UserId exists is generally considered poor practice. So, if the UserId doesn’t exist, what should the server do in Step #3 of Schema 2? We cannot return an error (it would give away the non-existence of the UserId right away), and we cannot return a fresh random number (as the attacker is able to make multiple requests for the same UserId and detect that the answers differ between requests).

What we can do, however, is to return (only if UserId is not found in the database) PRNG(UserId), where PRNG is calculated, for example, as SHA256(site-wide-salt-for-nonexisting-users||UserId). With site-wide-salt-for-nonexisting-users being unavailable to the attacker (at least to the attacker who needs to ‘phish’ for UserIds, and so has not obtained access to the database), it will be impossible for him to find out whether the client-salt he got is real or not.
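A minimal sketch of this idea follows. The function `client_salt_for`, the 16-byte salt length, and the in-memory `client_salts` mapping are assumptions made for the illustration; in particular, a real server would have to persist the site-wide secret, since regenerating it would change every fake salt.

```python
import hashlib
import secrets

SALT_BYTES = 16  # assumed length of real client-salts

# ASSUMPTION: generated here for the demo; in practice a persistent secret
# that is not stored alongside the user records.
SITE_WIDE_SALT_FOR_NONEXISTING_USERS = secrets.token_bytes(32)


def client_salt_for(client_salts: dict, user_id: str) -> bytes:
    """Return the real client-salt, or a stable fake one for unknown users."""
    real = client_salts.get(user_id)
    if real is not None:
        return real
    # SHA256(site-wide-salt-for-nonexisting-users || UserId), truncated so the
    # fake salt has the same length as a real one.
    digest = hashlib.sha256(
        SITE_WIDE_SALT_FOR_NONEXISTING_USERS + user_id.encode("utf-8")
    ).digest()
    return digest[:SALT_BYTES]


client_salts = {"alice": secrets.token_bytes(SALT_BYTES)}

first = client_salt_for(client_salts, "mallory")
second = client_salt_for(client_salts, "mallory")
print(first == second)                    # repeated requests agree
print(len(first) == len(client_salts["alice"]))
```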

I draw your attention to the fact that this approach still leaves a residual information leak (for example, when a user is added to the database, there is a potential for the attacker to figure it out). A related question is that of password change (should a password change cause the client-salt to change or not?), with further implications for security due to this information leak. [Contini] provides a more detailed analysis of ways to deal with the information leak, and of residual information leaks in Schema 2 and other schemas which request the client-salt from the server side.

Schema 3: UserId-Salted-Client-Plus-Random-Salted-Server Hashing (significantly better than simple Salted-Server Hashing, no Information Leak)

With Schema 3, for each user we’re storing a hashed password and only one salt, the server-salt. Then, Schema 3 looks as follows:

1. The user enters UserId and password P.
2. Client app calculates P’ as HCli(public-site-salt||UserId||P), where HCli(), as before, is a slow hash function, || denotes concatenation, and public-site-salt is a publicly known string (embedded into the client app), which is supposedly unique for the site (it is added as an additional precaution to prevent per-user pre-computation attacks from working across different sites).
3. UserId and P’ are transferred over the wire (more precisely, over a protected channel, such as TLS).
4. P’ is hashed again, calculating P”=HSrv(server-salt||P’). HSrv() is also a slow hash function (though potentially less slow than HCli(), see discussion for Schema 2 above).
5. P” is compared against the database-stored user password for the user identified by UserId.
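Schema 3 can be sketched along the same lines. Again, this is only an illustration: PBKDF2 stands in for HCli() and HSrv(), the iteration counts are demo values, `PUBLIC_SITE_SALT` is a made-up string, and public-site-salt||UserId is fed into PBKDF2's salt parameter (with P as the password input) instead of one literal concatenation. The `register` and `verify` helpers are hypothetical.

```python
import hashlib
import hmac
import secrets

CLIENT_ITERATIONS = 50_000  # hypothetical demo value
SERVER_ITERATIONS = 5_000   # hypothetical demo value
PUBLIC_SITE_SALT = b"example.com/login/v1"  # hypothetical, embedded in client


def h_cli(user_id: str, password: str) -> bytes:
    """Step 2: slow client-side hash keyed by public-site-salt and UserId."""
    salt = PUBLIC_SITE_SALT + user_id.encode("utf-8")
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, CLIENT_ITERATIONS
    )


def h_srv(server_salt: bytes, p_prime: bytes) -> bytes:
    """Step 4: slow server-side hash with a random per-user server-salt."""
    return hashlib.pbkdf2_hmac("sha256", p_prime, server_salt, SERVER_ITERATIONS)


def register(db: dict, user_id: str, p_prime: bytes) -> None:
    server_salt = secrets.token_bytes(16)
    db[user_id] = (server_salt, h_srv(server_salt, p_prime))


def verify(db: dict, user_id: str, p_prime: bytes) -> bool:
    """Step 5: recompute P'' and compare it with the stored value."""
    record = db.get(user_id)
    if record is None:
        return False
    server_salt, stored = record
    return hmac.compare_digest(stored, h_srv(server_salt, p_prime))


db = {}
register(db, "alice", h_cli("alice", "s3cret-pw"))
register(db, "bob", h_cli("bob", "s3cret-pw"))

# No salt request is needed before login: the client already knows its inputs.
print(verify(db, "alice", h_cli("alice", "s3cret-pw")))
print(verify(db, "alice", h_cli("alice", "wrong-pw")))
# Same password, different users: different P', so tables are per-user.
print(h_cli("alice", "s3cret-pw") != h_cli("bob", "s3cret-pw"))
```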

Schema 3: Security Against Brute-Force Attack

Schema 3 is substantially more secure against brute-force attacks than Schema 1: pre-computation and rainbow table attacks are next-to-impossible to mount (while it is possible to build a rainbow table for a specific user on a specific site and re-use it in case of the user changing the password, the practicality of such an attack is extremely limited, as described below).

Therefore, while in theory giving up the random client-salt and replacing it with a UserId-based salt is a disadvantage, in practice this difference doesn’t seem to provide much benefit to the attacker in this particular case. The most practical attack in this regard seems to be the following: if the attacker is interested in the password of one specific user (such as “admin”), and he has stolen a password database once, he can brute-force P and in the process write down the valid values of P’ (for an 8-char password the database of valid P’ values will take about 7000 TB, which is large, but still might be feasible for certain classes of attackers). These values can be re-used for a subsequent attack, this time without incurring the cost of HCli() [Contini2].
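The 7000 TB figure follows directly from the size of the search space, assuming each stored P’ is a 256-bit (32-byte) value and counting decimal terabytes; the variable names below are illustrative:

```python
CANDIDATES = (26 + 26 + 10) ** 8   # all 8-char alphanumeric passwords
P_PRIME_BYTES = 256 // 8           # one 256-bit P' per candidate
BYTES_PER_TB = 10 ** 12            # decimal terabyte

table_bytes = CANDIDATES * P_PRIME_BYTES
table_tb = table_bytes / BYTES_PER_TB

print(f"precomputed P' table for one user: about {table_tb:.0f} TB")
```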

This has the following practical implications:

1. Two-factor authentication for critical accounts remains a requirement (it is a Really Good Idea in any case; protecting really valuable data against an attacker who has 7000 TB of storage with a simple 8-char password is extremely imprudent at the very least).
2. For critical accounts, it is useful to have HSrv() more expensive (the cost of HCli() still needs to be the same to avoid an information leak), and the limit on minimum password length higher (the latter is again a Good Idea regardless of password hashing schema).
3. Making HSrv() a really fast function (such as plain SHA256()) is to be avoided. While HSrv() can be faster than HCli() (due to restrictions on server load), it still should be as slow as possible.

If we rule out single-user attacks (for example, with precaution #1 above), practical resilience against brute-force attack for Schema 3 is the same as for Schema 2. In other words, in terms of the additional CPU power required by the attacker to break “at least some” or “all” passwords in a stolen database, Schema 3 provides as much improvement as Schema 2 (i.e. for our 0.005 sec server-side hashing example above, the improvement will be from 10x to 100x).

It is important that unlike Schema 2, Schema 3 doesn’t give away any information, so the issue of information leak from Schema 2 is non-existent. In general, security of Schema 3 is a pure improvement over simple Salted-Server hashing (and can be easily implemented on top of it just by adding client-side hashing). For most cases, this is the password hashing schema which I argue for use in practice.

Applicability to Client Apps and Web-Based Apps

While in theory, hashing schemas discussed here don’t depend on the nature of the client, their practical application can depend on the client being a downloadable client-side app, or a web-based app.

For client-side apps, these schemas are clearly applicable, with the effect on the attacker’s cost depending on the time the user is ready to wait, multiplied by the CPU power of the minimal user device you need to support. For example, for (modern ARM-based) smartphones I expect numbers that correspond to very roughly 1e7x improvement over SHA256 costs (and therefore about 2 orders of magnitude over our 0.005 sec example of server-side hashing).

For web-based apps, it is much less clear how expensive CPU-wise HCli() can be made in practice without hurting user experience too much; this kind of analysis is out of scope of the present article, and I would appreciate any information about further research in this field.

Conclusion

Several client-plus-server hashing schemas have been presented, with an expected improvement in resilience to brute-force attacks on the stored hashes, while avoiding excessive CPU requirements on the server side. Unless/until a flaw in the analysis above is found, Schema 3 (originally proposed in [paj28]) provides a significant security improvement over existing practices with respect to brute-force attack resilience (and without incurring unreasonable server-side costs).

Acknowledgement

Cartoons by Sergey Gordeev from Gordeev Animation Graphics, Prague.