Juggling hashes with your eyes closed

A simple method to identify improper use of the == operator in PHP applications from a black box perspective, in order to exploit type juggling

This blog post provides some insight into a problem that can often be found in PHP applications, more specifically when comparing strings using the equality operator ==. There are currently several posts on this topic; however, I will focus on how it can be exploited from a black box perspective against any web app, in a way that is suitable for a common penetration test. First, I will analyse the root cause of the problem in order to better understand how it works and how we can get the most out of it.

The Problem

In 2011, an interesting thread on the official PHP bug tracking system [1] discussed some weird behaviour when comparing strings with numbers. The thread was not written from a security standpoint, but one of its comments shows the following:

php > var_dump('0xff' == '255');
bool(true)

In fact, this specific example is not a bug but the result of a documented behaviour known as 'type juggling' that PHP provides [2] when using loose comparison operators such as ==. Essentially, for certain comparison operators (==, !=, <>), PHP first tries to work out the type of each operand based on its content [3, 4], and only then does the comparison take place. These conversions may change the expected result, with important security implications, as demonstrated by this privilege escalation [5, 6] or by this insecure password validation reported on Full Disclosure [7].

Gynvael has a great blog post on this, 'PHP equal operator ==' [8], which covers it extensively for different data types and includes a great comparison reference table [9] and several examples such as the following:

"1.00000000000000001" == "0.1e1" → bool(true)
"+1" == "0.1e1" → bool(true)
"1e0" == "0.1e1" → bool(true)
"-0e10" == "0" → bool(true)
"1000" == "0x3e8" → bool(true)
"1234" == " 1234" → bool(true)

As you can see, these numeric strings are compared as actual numbers when using ==, which is particularly interesting from a security perspective. A string that represents a number in scientific notation will be evaluated by PHP as that number. Such a string could actually come from a hashing algorithm (whose output is usually represented in hexadecimal): if the hash happens to read as 0 multiplied by ten to any power (0e followed only by decimal digits), it will always equal 0 in a loose comparison. For a given hashing algorithm, such passwords become interchangeable: each is slightly more likely to be accepted, since its hash, once converted into a number, matches every other hash that happens to represent that same number, even if written differently.
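To make the mechanism concrete, here is a minimal Python sketch that approximates how == treats two strings. It is not PHP's actual implementation: the NUMERIC pattern and the loose_equals name are assumptions made for illustration, and hexadecimal strings (numeric only before PHP 7) are deliberately left out.

```python
import re

# Rough approximation of PHP's "numeric string" grammar: optional leading
# whitespace, optional sign, decimal digits with optional fraction/exponent.
# Hex strings such as "0x3e8" are intentionally not handled in this sketch.
NUMERIC = re.compile(r"^\s*[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?\Z")


def loose_equals(a: str, b: str) -> bool:
    """Approximate PHP's == for two strings (hypothetical helper)."""
    if NUMERIC.match(a) and NUMERIC.match(b):
        # Both operands look numeric, so they are compared as numbers.
        return float(a) == float(b)
    # Otherwise it is a plain string comparison.
    return a == b


print(loose_equals("1e0", "0.1e1"))          # True
print(loose_equals("-0e10", "0"))            # True
print(loose_equals("0e557632468345060543073989263828",
                   "0e749889617409631915178731435707"))  # True
print(loose_equals("0e12ab", "0e34cd"))      # False: not numeric strings
```

Two different digests that both read as "zero times ten to some power" collapse to the same number, which is exactly what makes the corresponding passwords interchangeable.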

Finally, this problem became better known through a post from Ed Skoudis [10], later through Robert Hansen's 'Magic Hashes' [11], in which he compiled a table of inputs that, for different hashing algorithms, generate an output matching the ^0+e\d*$ format (i.e. 0 in scientific notation), and through others [12].

From a black box perspective

It's relatively trivial to spot these bugs through static analysis, but what can we do from a black box perspective? For any account within an application, if we can find a pair of these interchangeable passwords for the most popular hashing algorithms (SHA-1, MD5, ...) that are both accepted for the same account, then in all likelihood we have identified improper use of a loose comparison for password hashes. In a typical penetration testing engagement, where you would expect to be provided with at least one account, you can set your password to one of the pair and then try to log in with the other. Obviously this depends on which hashing algorithm is in use, so you need two of these passwords per hashing algorithm, and it assumes that no salt is in use.
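The check described above can be sketched as a small harness. Everything application-specific is an assumption here: set_password and try_login are hypothetical callables the tester would implement for the target (for example, as HTTP requests against the password change and login forms), and pairs maps an algorithm name to two passwords whose hashes are interchangeable for that algorithm.

```python
from typing import Callable, Dict, Optional, Tuple


def find_loose_comparison(
    set_password: Callable[[str], None],   # hypothetical: changes our own password
    try_login: Callable[[str], bool],      # hypothetical: True if login succeeds
    pairs: Dict[str, Tuple[str, str]],     # algorithm name -> (password A, password B)
) -> Optional[str]:
    """Return the first algorithm whose password pair is interchangeable."""
    for algorithm, (first, second) in pairs.items():
        set_password(first)
        if not try_login(first):
            # The change was rejected (e.g. by a password policy), so this
            # pair tells us nothing; move on to the next one.
            continue
        if try_login(second):
            # A different password was accepted for the same account.
            return algorithm
    return None
```

A None result is not proof that the application is safe: the hash may be salted, or a hashing algorithm outside the supplied pairs may be in use.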

Now, before bringing the heat to find those password pairs, there is one more consideration to keep in mind: password requirements. It would be a shame to work on finding a couple of these passwords and then fail to prove the vulnerability just because they do not meet the password complexity requirements. So let's make sure we look for passwords longer than 8 characters, with letters in mixed case, numbers and at least one special character, as follows:

import hashlib
import re
import sys

# Hex digests that PHP reads as zero in scientific notation:
# one or more zeros, then "e", then only decimal digits.
prof = re.compile(r"^0+e\d*$")

# Each process gets its own key space from the letter given on the
# command line, e.g. "c" yields candidates of the form c!C<num>.
prefix = sys.argv[1].lower() + '!' + sys.argv[1].upper() + "%s"
num = 0

while True:
    num += 1
    candidate = prefix % num
    # Swap in hashlib.md5 or hashlib.sha1 to target those algorithms.
    b = hashlib.sha256(candidate.encode()).hexdigest()
    if b[0] == '0' and prof.match(b):
        print(candidate, b)



I did not go to great lengths to optimise performance: this is just a carefully written Python script running on all available cores of my AMD FX8350 using the PyPy interpreter. I used hashlib with OpenSSL's implementation of the hashing functions and, to make sure I don't get bitten by the Python GIL, I simply spawned independent processes, each of them looking at a different key space. For that, I used an ultra-sophisticated technique: giving each process a different password prefix (which in turn is derived from a user-supplied letter), as you can see above.
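The script above was simply launched once per letter. As an alternative, here is a hedged sketch of the same idea driven from a single program with multiprocessing. The search and run_parallel names, the limit parameter and the stricter \d+ pattern (at least one digit after the e) are assumptions for illustration, not part of the original script.

```python
import hashlib
import re
from multiprocessing import Pool

# One or more zeros, "e", then at least one decimal digit and nothing else.
MAGIC = re.compile(r"^0+e\d+$")


def search(job):
    """Scan one key space: <letter>!<LETTER><n> for n in 1..limit."""
    letter, algorithm, limit = job
    hits = []
    for n in range(1, limit + 1):
        candidate = f"{letter.lower()}!{letter.upper()}{n}"
        digest = hashlib.new(algorithm, candidate.encode()).hexdigest()
        if MAGIC.match(digest):
            hits.append((candidate, digest))
    return hits


def run_parallel(letters, algorithm="md5", limit=10**6):
    """Give each letter its own worker process so the GIL is not a bottleneck."""
    jobs = [(letter, algorithm, limit) for letter in letters]
    with Pool(processes=len(jobs)) as pool:
        results = pool.map(search, jobs)
    # Flatten the per-process result lists into one list of (password, digest).
    return [hit for hits in results for hit in hits]
```

Calling run_parallel("abcdefgh", "sha1", 10**10) from under an if __name__ == "__main__": guard would start eight workers. Unlike the original endless loop, each worker stops at limit and only reports its hits when it finishes.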

Results

Within about an hour I got not 2 but 4 hashes for SHA-1 and, surprisingly, it took me a bit longer to get 4 of those passwords for MD5. I'm sure that if you run this on your machine it won't take long until you uncover many more of these, but we really only need 2 per hashing algorithm here.

Here are the colliding passwords:

MD5

word            hash
c!C123449477    0e557632468345060543073989263828
d!D206687225    0e749889617409631915178731435707
e!E160399390    0e680455198929448171766997030242
f!F24413812     0e666889174135968272493873755352

SHA-1

word            hash
aA1537368460!   0e98690883042693380036268365370177656718
aA3539920368!   0e80128521090954700858853090442722395969
cC6593433400!   0e65495612893131014886449602893230063369
fF3560631665!   0e49205137236861153120561516430463247071

To prove it, you can pick any two of those and compare:

php > var_dump(md5('c!C123449477') == md5('d!D206687225'));
bool(true)
php > var_dump(sha1('aA1537368460!') == sha1('fF3560631665!'));
bool(true)

If the above doesn't work and you are feeling lucky, you could try concatenating your password with the username, hoping that the username is used as a salt, and look for numeric hashes of the combined string. You can do this with a very simple change to the script provided here.

The Solution

PHP provides a solution for this: if you want to compare hashes, you should use either the password_verify() [13] or the hash_equals() [14] function. These enforce a strict comparison and close this vector. Note that hash_equals() may also be used for strings other than hashes and should be preferred over === because it offers protection against timing attacks.
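For readers who think in Python rather than PHP, the closest analogue to hash_equals() that I am aware of is hmac.compare_digest(), which compares two strings byte for byte in constant time and never interprets them as numbers. The sketch below is only an illustration of that principle: verify_md5 is a hypothetical helper, and MD5 appears solely to mirror the examples in this post, not as a recommendation.

```python
import hashlib
import hmac


def verify_md5(stored_hex: str, candidate_password: str) -> bool:
    """Strict, constant-time comparison of hex digests (hypothetical helper)."""
    candidate_hex = hashlib.md5(candidate_password.encode()).hexdigest()
    # compare_digest treats both arguments as opaque strings, so two
    # different digests never compare equal, whatever they look like.
    return hmac.compare_digest(stored_hex, candidate_hex)


stored = hashlib.md5(b"c!C123449477").hexdigest()
print(verify_md5(stored, "c!C123449477"))   # True: same password
print(verify_md5(stored, "d!D206687225"))   # False: different digest string
```

With a strict comparison like this, the second password from the MD5 table is rejected even though its digest also has the 0e form.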

Conclusions

Whilst trivial to execute, the approach described here may provide valuable knowledge that one can get from a black box perspective. If the passwords provided above are found to be interchangeable for a given application, several conclusions can be drawn (that can go much beyond the PHP datatype juggling issue itself), such as:

Strict comparisons are missing from security-related processes (hash_equals(), password_verify(), the === operator) and it may be possible to log in to some accounts through this type of password hash collision;

Fast and weak hashing algorithms should not be used to validate passwords. Use scrypt or bcrypt instead;

Salts should also be in use and should be randomly generated for each password;

Last but not least, this is an indicator of a design weakness - password validation functions should only return a boolean response (e.g. out of a DB query) whether the password is valid or not. In other words, if we consider a typical n-tier deployment the Application itself should not be able to retrieve user passwords/hashes on its own but only to validate them when provided. The existence of functions that provide user credentials (even if hashes) adds increased risk.

Going Beyond

There is much more that can be done to explore this further. An attacker could include those passwords in a wordlist and launch a horizontal password brute force attack against all users. Also, if the application has an insecure password recovery mechanism, attackers may be able to trigger an arbitrary number of password resets against a targeted account and keep trying until one succeeds. This is still quite unlikely to work, at least without flooding that account with a "few" million password reset emails. Note, however, that this problem may not be exclusive to password validation: it could happen in any security-sensitive comparison, including session validation (repeatedly sending a session ID of 0 against a custom session validation routine in PHP could work!).

It would also be interesting to get a couple of these passwords for SHA-2, but a rough estimate tells me this is not going to fly, at least not with the approach described above. SHA-2 hashes are longer, so the odds of finding a string whose hash matches our desired output format are too slim. For this reason, this is also less likely to be exploitable in a realistic scenario, but it would still be a nice exercise for someone with a well-tuned GPU rig.
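That rough estimate can be reproduced with a few lines of arithmetic. The sketch below assumes that every hex character of a digest is independent and uniformly distributed, and it counts digests made of one or more zeros, an e, and then only decimal digits. The magic_probability name is made up for this example.

```python
def magic_probability(hex_len: int) -> float:
    """Chance that a random hex digest is zeros, then 'e', then only digits."""
    total = 0.0
    # k leading zeros, one 'e', then (hex_len - k - 1) decimal digits.
    for k in range(1, hex_len - 1):
        digits = hex_len - k - 1
        # Each '0' and the 'e' have probability 1/16; a decimal digit 10/16.
        total += (1 / 16) ** k * (1 / 16) * (10 / 16) ** digits
    return total


for name, hex_len in (("MD5", 32), ("SHA-1", 40), ("SHA-256", 64)):
    p = magic_probability(hex_len)
    print(f"{name}: about {1 / p:.1e} attempts per hit on average")
```

Under these assumptions the expected work per hit is on the order of 10^8 attempts for MD5 and 10^10 for SHA-1, but around 10^15 for SHA-256, which is why the CPU-bound script above is not a realistic way to find SHA-2 pairs.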

Source Code

Check here [PY].

References

[1] PHP.net Bug #54547 wrong equality of string numbers

[2] PHP: Type Juggling - Manual

[3] PHP: Strings - Manual

[4] PHP: Comparison Operators - Manual

[5] PT-2012-29: Administrator Privilege Gaining in Simple Machines Forum

[6] Exploiting Exotic Bugs: PHP Type Juggling

[7] Humhub insecure password validation and reset design

[8] PHP equal operator ==

[9] PHP equal operator == reference table

[10] SANS: PHP Weak Typing Woes —

[11] Magic Hashes

[12] Type-coercing comparison operators will convert numeric strings to numbers

[13] PHP: password_verify - Manual

[14] PHP: hash_equals - Manual