There are many remaining mysteries in the Gauss and Flame stories. For instance, how do people get infected with the malware? Or, what is the purpose of the uniquely named Palida Narrow font that Gauss installs?

Perhaps the most interesting mystery is Gauss encrypted warhead. Gauss contains a module named Godel that features an encrypted payload. The malware tries to decrypt this payload using several strings from the system and, upon success, executes it. Despite our best efforts, we were unable to break the encryption. So today we are presenting all the available information about the payload in the hope that someone can find a solution and unlock its secrets. We are asking anyone interested in cryptology and mathematics to join us in solving the mystery and extracting the hidden payload.

The containers

Infected USB sticks have two files that contain several encrypted sections. Named System32.dat and System32.bin, they are 32-bit and 64-bit versions of the same code. These files are loaded from infected drives using the well-known LNK exploit introduced by Stuxnet. Their primary goal is to extract a lot of information about the victim system and write it back to a file on the drive named .thumbs.db. Several known versions of the files contain three encrypted sections (one code section, two data sections).

The decryption key for these sections is generated dynamically and depends on the features of the victim system, preventing anyone except the designated target(s) from extracting the contents of the sections.

By the way, the 64-bit version of the module has some debug information left in it. The module contains debug assertion strings and names of the modules:

.loader.cpp

NULL != encSection

Path

NULL != pathVar && curPos < pathVarSize

NULL != progFilesDirs && curPos < progFilesDirsSize

NULL != isExpected

NULL != key

(NULL != result) && (NULL !=str1) && (NULL != str2)

.encryption_funcs.cpp

The data

The mysterious encrypted data is stored in three sections:

The files also contain an encrypted resource 100 that seems to be the actual payload, given the relatively small size of the encrypted sections. It is most likely that the section .exsdat contains the code for decrypting the resource and executing its contents.

The algorithm

The code that decrypts the sections is very complex compared to any regular routine we usually find in malware. Here is a brief description of the algorithm:

Validation

Make a list of all entries from GetEnvironmentVariableW(Path), split by separator ; Append the list with all entries returned by FindFirstFileW / FindNextFileW by mask %PROGRAMFILES%*, where cFileName[0] > 0x007A (UNICODE z)

Note: in essence, this means the specific program which is installed in %PROGRAMFILES% has a name which starts either with a special char such as ~, as in our example, or uses an UNICODE special char table, such as Arabic or Hebrew, where all chars are higher than 0x007A.

Make all possible pairs from the entries of the resulting list. For each pair, append the first hard-coded 16-byte salt and calculate MD5 hash.

Example of the string pair, second string starting from ~dir and first salt

Calculate MD5 hash from the hash ( i.e. hash = md5(hash) ), 10000 times. Compare if the MD5 hash matches the hard-coded value. If not, then exit.

Decryption

The sections are decrypted in the following order: .exsdat, .exrdat, .exdat

Use the PATH/PROGRAMFILES pair that was used to generate the expected MD5 hash in the validation code above. Append the pair with the second hard-coded 16-byte salt and bytes 0x15, 0x00

Example of the string pair, second string starting from ~dir and first salt

Calculate MD5 hash from the resulting buffer. Calculate MD5 hash from the hash ( i.e. hash = md5(hash) ), 10000 times. Derive the RC4 key from the resulting hash using WinAPIs CryptDeriveKey(hProv, CALG_RC4, hBaseData, 0, &hKey). Decrypt the section (RC4), treating its first DWORD as the length of the buffer to decrypt and encrypted buffer starting at offset 4 of the section. Compare DWORDs in the decrypted buffer at positions 0 or 7 with magic value 0x20332137. Proceed only if any of the DWORDs match. Increase the last WORD in the pair+salt buffer (the one initially set to 0x0015) by 1. Decrypt another section, goto 3.

After all the sections are decrypted: call the function at the beginning of the .exsdat section.

Sample data for validating the algorithm:

The string pair is created by concatenating the strings. The strings and the salt buffer are not separated by any character.

Sample test Strings, Unicode (without quotes):

“C:Documents and SettingsjohnLocal SettingsApplication DataGoogleChromeApplication”

“~dir1”

First salt, hex dump: 97 48 6C AA 22 5F E8 77 C0 35 CC 03 73 23 6D 51

MD5 at validation step 6: 76405ce7f4e75e352c1cd4d9aeb6be41

Second salt, hex dump: BB 49 4E 77 F9 25 EE C0 3B 89 FC ED C2 22 4A 21

MD5 at decryption step 5: 00916031b3e9513044436ee42b6aa273

Join the quest

We have tried millions of combinations of known names in %PROGRAMFILES% and Path, without success. The check for the first character of the folder in %PROGRAMFILES% indicates that the attackers are looking for a very specific program with the name written in an extended character set, such as Arabic or Hebrew, or one that starts with a special symbol such as “~”.

Of course, it is obvious that it is not feasible to break the encryption with a simple brute-force attack. We are asking anyone interested in breaking the code and figuring out the mysterious payload to join us.

The resource section is big enough to contain a Stuxnet-like SCADA targeted attack code and all the precautions used by the authors indicate that the target is indeed high profile.

We are providing the first 32 bytes of encrypted data and hashes from known variants of the modules. If you are a world class cryptographer or if you can help us with decrypting them, please contact us by e-mail: theflame@kaspersky.com.

Source data

We are providing up to 32 bytes from the beginning of each encrypted section, skipping the DWORD that contains the length of the encrypted buffer. Please contact us by e-mail theflame@kaspersky.com if you need more encrypted data.

Sample 56e4fb972828fafbbdc11158a1b5fa72 Salt 1 97 48 6C AA 22 5F E8 77 C0 35 CC 03 73 23 6D 51 Reference MD5 758EA09A147DCBCAD6BD558BE30774DE Salt 2 BB 49 4E 77 F9 25 EE C0 3B 89 FC ED C2 22 4A 21 Exsdat 4C CC BA E2 E0 BA 2E 44 C7 60 17 9A 72 F4 2F 27 DD FD DB 11 03 94 E3 4B 0A 16 66 F3 36 97 6C D8 Exrdat C9 27 BE 67 4D 3B 39 36 AB 14 44 32 88 60 7A 64 B0 92 9B 3A A1 5B C5 21 A7 6E 09 0C F8 71 84 87 Exdat B8 EB 6D 61 2B 4F 70 65 75 A2 1C 03 1C DF 26 2F

Sample 695056ffacef1fdaa326d7c8bb0f88ba Salt 1 6E E3 47 2C 06 A5 C8 59 BD 16 42 D1 D4 F5 BB 3E Reference MD5 EB2F172398261ED94C8D05216650919B Salt 2 8F 42 B5 87 E8 9A B2 32 C8 1C 1A EC B5 2D 55 19 Exsdat CE 31 D0 5D 7D CB 57 9A 83 06 09 8D 42 2B 44 34 24 13 B2 39 22 48 8F F3 76 E5 9C DA 87 8F BC 42 Exrdat 50 1F F8 BA 18 1B 3E 36 23 9D 95 DC 5A 07 E4 EC 76 38 78 79 BA 84 A5 4E 24 BA 0E 27 94 63 F7 3D Exdat 9D 5B B8 3B B2 17 00 DC 76 81 1D 4E 54 80 9B 31

Sample 089d45e4c3bb60388211aa669deab26a Salt 1 0E A5 01 D1 24 71 CD CD 0E 9E AC 6E 48 5A F9 32 Reference MD5 52DD4D6B792D84C422E6A08E4272ACB8 Salt 2 38 F9 A6 5B 82 08 E7 61 1D 10 73 53 50 BC B4 F0 Exsdat D3 CA 9D 9F 87 FB 25 43 7E C6 57 7C D9 06 10 8D D2 5B B2 88 18 6E FD B4 C4 30 12 2E 1E EC E0 64 Exrdat B4 43 8F B8 0A 67 7D 88 C1 CD F3 E8 D9 61 1B E9 5A 8A 41 16 8B 8A 18 AD 25 5A 81 87 8F 8D 1A 40 Exdat F6 C9 81 C9 86 27 16 0C B7 33 93 AB 3E 71 5B E2