NodeJS Security Updates

You might have heard about the high impact security vulnerability issue recently fixed and announced by nodeJS team. This post will attempt to explain the issue, how and why it happened.

Thanks to the amazing node team, this vulnerability has been fixed immediately across all lines (4.x, 6.x, 7.x, and 8.x).

Make sure you upgrade your node version as soon as you can (preferably when you finish reading this post!). For upgrading your node head to the official nodeJS blog for details.

The vulnerability roots from HashTables, so lets start with a quick recap on HashTables, just in case you missed your computer science classes.

HashTables

HashTables fulfill the dream of *constant time* insertions and access.

Some of the places where HashTables are used:

Method lookups.

Sets, Objects, and Maps.

HTTP headers.

JSON representations.

URL-encoded post form data.

A HashTable is a group of linked lists, the linked-lists must be small in order for the performance to be good.

The create HashTable function pre-allocates the number of linked lists it can contain, throughout this post, the number of linked-lists the HashTable can hold is denoted by the word l .

l = 2**n

The best scenario for optimal performance is to have n/l entries in each linked list, where n is the number of entries in the HashTable. so if we have 256 entries in the HashTable, the perfect hash function would distribute them into the 256 lists, each holding only 1 entry.

Example: Storing strings one to ten in a HashTable

In this example, our hash function is the following:

H(s) = first byte of s

l = 256

In our “one” to “ten” example, inserting the values into the HashTable will result in something like the following:

{

“o” : [“one”],

“t” : [“two”, “three”, “ten”],

“f” : [“four”, “five”],

“s” : [“six”, “seven”],

“e” : [“eight”],

“g” : [],

“h” : [],

“n” : [“nine”],

“r” : [],

…

}

Notice that we have empty linked-lists in the HashTable, because they are not filled up by any value. Additionally, there is a pile up on the t list in the HashTable. This is a bad hash function, since it does not do a good job in distributing the strings through the linked lists. No matter how many linked lists you have l , if the hash function does a bad job in the distribution among them you will get bad performance, since linked lists are slow, hash tables are fast by having only short linked lists in them.

To avoid overflowing the HashTable, since there are more H(s) results than l , the hash function is always reduced by l using mod

H(s) % l

H(s) = first byte of s is not a good hashing function. NodeJS implements a very good hash function which quickly look through the whole string, and gives randomly looking results to spread the results across the HashTable.

Since we have a good hashing function, normal usage should *never* cause HashTables to over-flood. However an attacker just might.

Hashing malicious strings.

Attacker provides strings where H(s)%l will result in the same HashTable index, hence all these strings will be stored in the same linked list, making the app go super slow.

To solve such issue, programming languages looked at different solutions, some are the following:

Solution #1

Replace linked lists with another structure, such as red black trees or any balanced tree structure.

However, most programs go with the simpler approach of using a simple linked lists to avoid bugs and heavy re-writings, debugging, and the implementation complexity.

Solution #2

Stop inserting into the linked list if it gets more than n entries.

This solution looks pleasant, but for real world applications, you cannot just discard newly coming results, hence this solution is not viable for all situations. It can be only used in caching solutions.

Solution #3

Use crypto hashing functions like sha256.

Crypto hash functions should be collision-resistant, you should not find any two strings with the same output.

But this solution is troublesome for two main reasons:

The hash function should to be fast, crypto hashes are not the fastest. Even if the crypto hashes were fast enough, this does not solve the problem. The attacker does not need to find collisions in the crypto hashing function itself, instead, they need to find collisions in the output which is reduced to the number of l linked lists specified. H(s)%l is not collision resistant based on the hashing function, since the output will always be reduced to the 2**n lists. So the attacker might run a few million iterations to find critical amounts of collisions.

Current Solution

Randomizing the hash function with a secret key.

These attacks were repeated many times during the years, with different names, low bandwidth denial of service attacks, hash flooding, algorithmic complexity attacks, etc.

Attacks where targeted at many programming languages, such as Perl, Redis, and python.

The response to these attacks was to secretly randomize the hash function on each application boot, or hashtable initialization, this way the attacker has no way to guess the hashing funciton, the attacker cannot find collisions because they do not know the secret key of the hashing function.

NodeJS uses the same approach, by randomizing its “HashTable seed”.

**Wether this is actually secure or not is a debate for another time.