Laooo 1337 H4x0!2






An example of reversing an undocumented game file format



The following is a tutorial I wrote about 3 years ago. I found it very recently while going through old files and figured it would be helpful to some people here. I chose to leave it mostly unedited, so I hope you don't mind my past writing style / way of explaining things. There may also be some inconsistencies here and there.





------------------------------------------------



Preface



Hello,

I haven't seen a lot of threads discussing the reversal of unknown file formats. People generally tend to use available extractors, but sometimes no public information is available for the format in question (for example, when the developer uses its own format to protect its files).

Formats can be very different, hence the importance of the title: this is simply an example. The aim of this tutorial is to show you some of the basic steps to work with an unknown format.



Prerequisites

An understanding of the following languages is required:

- C++

- x86 assembly (Intel syntax)

I will attempt to make it as easy as possible for anyone to follow, but make sure you're not completely clueless about the above.



Tools

I will be using the following tools:

- HexEdit (or a hex editor of your choice)

- OllyDbg (with the Stealth64 plugin; choose whatever works best for you)

- IDA Pro (I'm using 6.8) with the Hex-Rays Decompiler

I also assume you're familiar with each of those tools.





So, what will we be working on exactly?

We're going to analyze the file format Brawl Busters uses to protect its data. Brawl Busters is an action-combat game that is no longer published anywhere. Luckily, the setup is still available for download, and you can find it very easily with a simple Google search.

Throughout the tutorial, I will also provide some pseudocode to show basic steps of making an extractor based on what we will have analyzed.



Setup

After downloading the game, open its installation folder, and then the "bin" folder. You should see a file called "pbclient.exe": that's the game client.

I've uploaded a patched version of pbclient.exe, which is the one I have been using. I've simply patched some conditional jumps to prevent XTrap from loading. In the bin folder, rename pbclient.exe to anything you like and put the patched pbclient.exe in there.



https://www.unknowncheats.me/forum/d...=file&id=26703







Once that's done, we should have the same installation folder. Time to get started!



Taking a look at the root folder, the most logical thing to do would be to find out where the actual game data is stored. This isn't hard to find at all, since there is a "Data" folder, so let's open it and see what's inside:





This does look like game data indeed! If you scroll down, you will also see some config files (.ini) as well as a few images (Splash.png for example).

Most of the files in this folder have a weird .bus extension and seem to store most of the UI data. And that's pretty much what we will be interested in.

For starters, what could .bus mean? The game is called "Brawl Busters" so it's very likely that ".bus" stands for ".busters".



We haven't done much so far, but we already know that the game stores most of its data in .bus files, and that it definitely looks like a proprietary format (Google won't give you any information).

So, what do they look like? This is where the hex editor will be useful. I always use it before using Notepad, because it's highly likely that the file can't be read as a plain text file. Let's open a small one, such as "Anim.bus":





That doesn't tell us much... it looks like the data is encrypted.

Let's open a larger one, say, "Anim_Mob_Zombie_Unique_Heavy.bus":





This isn't really telling us much either... except one similarity: the first 33 bytes of each file are zeroed. If we open more .bus files, it turns out the first 33 bytes are always filled with 0's. This might be a useful detail, so keep it in mind.
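If you'd rather check that observation on every file instead of opening them one by one, here is a minimal sketch. It assumes you have already read the file into a byte vector; has_zero_prefix and kZeroPrefixSize are names I made up for illustration, they don't come from the game.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Size of the zero-filled prefix observed in the .bus files opened so far.
constexpr std::size_t kZeroPrefixSize = 33;

// Returns true if the buffer holds at least 33 bytes and the first 33 are all zero.
bool has_zero_prefix(const std::vector<std::uint8_t>& file_bytes) {
    if (file_bytes.size() < kZeroPrefixSize) {
        return false;
    }
    return std::all_of(file_bytes.data(),
                       file_bytes.data() + kZeroPrefixSize,
                       [](std::uint8_t b) { return b == 0; });
}
```

A file that fails this check would be worth a closer look, since it would contradict the pattern seen so far.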

There is another detail: If you scroll down to 0x4ED0, you will notice some plain text:





This is way more interesting already! If we Google "Gamebryo File Format, Version 20.6.0.0", we can see that it's a known format called "NIF" (it stands for NetImmerse File, but that isn't important).

NIF files seem to be models, and their textures are loaded from DDS files. If we search for "DDS" in HexEdit, we also get some results such as texname="FX_GhostTrail_A_00.dds". This means some .bus files also store DDS files.

Searching for "Gamebryo" again returns several results. What conclusions can we draw from this? Well, .bus files seem to be archives, that is, they store multiple files at once. The data before the file data is most probably encrypted metadata (information about the archive as well as its files).

Let's make a quick recap of what we know:



- Most of the game data is stored in .bus files

- These .bus files can store multiple files at once, so they are actually file archives

- The game uses the Gamebryo Engine (or something that is based on it) since the stored files seem to be either NIF / KFM / DDS (if you want to verify that yourself, open other files and search for these three strings).



What's next? We now have to figure out how the client decrypts / loads these files. This is where it gets more interesting.



Static analysis in IDA Pro

Let's go back to the "bin" folder, open pbclient.exe with IDA Pro, and let it analyze the file.

The first thing we will be looking for is imports. What imports exactly? Well, since the client has to open these files, it's likely that it will use APIs such as CreateFile or fopen.

Let's search for the first one, but wait a second:





CreateFileMapping takes a handle to a file, which is typically returned by an API such as CreateFile, so let's cross-reference CreateFileMappingA instead.

The first reference we find is in this function:



(NOTE: For ease of understanding, I will use the decompiler most of the time. Its output should not be fully trusted, but it gives us a general overview of what each function does.)



Code:
char __userpurge sub_XXXXXX@<al>(int a1@<eax>, HANDLE *a2@<edi>, _DWORD *a3, DWORD a4)
{
  char v4;
  HANDLE v5;
  void *v6;
  char v7;
  DWORD v9;
  HANDLE v10;
  char v11;
  LPVOID v12;
  char v13;
  HANDLE hObject; // [sp+Ch] [bp-4h]@1

  v4 = 0;
  v5 = CreateFileA(*(LPCSTR *)(a1 + 20), 0x80000000, 1u, 0, 3u, 0, 0);
  v6 = v5;
  hObject = v5;
  if ( v5 == (HANDLE)-1 )
  {
    v7 = GetLastError();
    sub_40FB04();
    sub_4E7D8B(5, 16, "CreateFileA - Failed (%d)", v7);
    return 0;
  }
  v9 = GetFileSize(v5, 0);
  if ( a4 > 0 && v9 < a4 )
  {
    CloseHandle(v6);
    return 0;
  }
  v10 = CreateFileMappingA(v6, 0, 0x4000002u, 0, v9, 0);
  *a2 = v10;
  if ( !v10 )
  {
    v11 = GetLastError();
    sub_40FB04();
    sub_4E7D8B(5, 16, "CreateFileMappingA - Failed (%d)", v11);
    CloseHandle(hObject);
    return 0;
  }
  v12 = MapViewOfFile(v10, 4u, 0, 0, 0);
  *a3 = v12;
  if ( v12 )
  {
    v4 = 1;
  }
  else
  {
    v13 = GetLastError();
    sub_40FB04();
    sub_4E7D8B(5, 16, "MapViewOfFile - Failed (%d)", v13);
    CloseHandle(*a2);
  }
  CloseHandle(hObject);
  return v4;
}

Let's look at the second reference to CreateFileMappingA, which is in this function:



Code:
char __stdcall sub_52AE36(int a1, char a2, int a3, int a4, int a5, int a6, int a7)
{
  HANDLE v7;
  void *v8;
  int v9;
  DWORD v10;
  const void *v11;
  int v12;
  const void *v13;
  int v14;
  int v15;
  int v16;
  int v17;
  int v18;
  int v19;
  int v20;
  char v21;
  int v22;
  char v23;
  int v24;
  int v26; // [sp-34h] [bp-148h]@15
  int v27; // [sp-30h] [bp-144h]@15
  int v28; // [sp-2Ch] [bp-140h]@15
  int v29; // [sp-28h] [bp-13Ch]@15
  int v30; // [sp-24h] [bp-138h]@15
  char v31; // [sp-20h] [bp-134h]@12
  int v32; // [sp-1Ch] [bp-130h]@12
  int v33; // [sp-18h] [bp-12Ch]@12
  int v34; // [sp-14h] [bp-128h]@12
  int v35; // [sp-10h] [bp-124h]@5
  int v36; // [sp-Ch] [bp-120h]@12
  LPCSTR v37; // [sp-8h] [bp-11Ch]@2
  char *v38; // [sp-4h] [bp-118h]@2
  char v39; // [sp+0h] [bp-114h]@13
  char v40; // [sp+Ch] [bp-108h]@15
  char v41; // [sp+20h] [bp-F4h]@12
  char v42; // [sp+24h] [bp-F0h]@15
  char v43; // [sp+3Ch] [bp-D8h]@15
  char v44; // [sp+50h] [bp-C4h]@12
  int v45; // [sp+68h] [bp-ACh]@12
  char v46; // [sp+74h] [bp-A0h]@15
  int *v47; // [sp+90h] [bp-84h]@15
  int v48; // [sp+94h] [bp-80h]@11
  char v49; // [sp+ACh] [bp-68h]@12
  char v50; // [sp+C4h] [bp-50h]@1
  char v51; // [sp+DCh] [bp-38h]@1
  LPCSTR lpFileName; // [sp+F0h] [bp-24h]@1
  LPCVOID lpBaseAddress; // [sp+F4h] [bp-20h]@6
  HANDLE hObject; // [sp+F8h] [bp-1Ch]@1
  HANDLE hFileMappingObject; // [sp+FCh] [bp-18h]@4
  char v56; // [sp+103h] [bp-11h]@1
  int v57; // [sp+104h] [bp-10h]@4
  int v58; // [sp+110h] [bp-4h]@1

  v58 = 1;
  stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>(&v50, Default, &v56);
  LOBYTE(v58) = 3;
  sub_44B825(&v51);
  LOBYTE(v58) = 4;
  v7 = CreateFileA(lpFileName, 0x80000000, 1u, 0, 3u, 0, 0);
  v8 = v7;
  hObject = v7;
  if ( v7 == (HANDLE)-1 )
  {
    v38 = (char *)GetLastError();
    v37 = lpFileName;
    sub_40FB04(v9);
    sub_4E7D8B(5, 16, "Invalid bus file - filename:(%s), Error:(%d)", (char)v37);
LABEL_3:
    LOBYTE(v58) = 3;
    stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::~basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>(&v51);
    LOBYTE(v58) = 0;
    goto LABEL_21;
  }
  v57 = GetFileSize(v7, 0);
  v10 = GetFileSize(v8, 0);
  hFileMappingObject = CreateFileMappingA(v8, 0, 0x4000002u, 0, v10, 0);
  if ( !hFileMappingObject )
  {
    v38 = (char *)GetLastError();
    sub_40FB04(5);
    sub_4E7D8B(v35, 16, "CreateFileMappingA - Failed (%d)", (char)v38);
    CloseHandle(v8);
    goto LABEL_3;
  }
  v11 = operator new(0x90u);
  lpBaseAddress = v11;
  LOBYTE(v58) = 5;
  if ( v11 )
    v12 = sub_52B811(v11);
  else
    v12 = 0;
  LOBYTE(v58) = 4;
  stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::erase(v12 + 20, *(_DWORD *)(v12 + 40), *(_DWORD *)(v12 + 36));
  sub_40293D(*(_DWORD *)(a1 + 108));
  stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::operator=(v12 + 68, &v51);
  v13 = MapViewOfFile(hFileMappingObject, 4u, 0, 0, 0);
  lpBaseAddress = v13;
  if ( v13 )
  {
    sub_44BA2C(&v48);
    LOBYTE(v58) = 6;
    if ( (unsigned __int8)sub_52B8BA(v12, (void *)v13, v57, a1 + 32, (int)&v48) )
    {
      sub_498F89(&v49);
      LOBYTE(v58) = 7;
      v15 = stlp_std::locale::locale((stlp_std::locale *)&v57);
      LOBYTE(v58) = 8;
      sub_42E050(v15);
      LOBYTE(v58) = 7;
      stlp_std::locale::~locale((stlp_std::locale *)&v57);
      v16 = stlp_std::locale::locale((stlp_std::locale *)&v57);
      LOBYTE(v58) = 9;
      sub_50C2A7(&v49, v16);
      LOBYTE(v58) = 7;
      stlp_std::locale::~locale((stlp_std::locale *)&v57);
      v38 = (char *)v12;
      v57 = (int)&v32;
      stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>(&v32, &v49);
      LOBYTE(v58) = 7;
      v17 = sub_52B518(&v41, v31, v32, v33, v34, v35, v36, v37);
      LOBYTE(v58) = 11;
      stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>(&v44, v17);
      LOBYTE(v58) = 12;
      v45 = *(_DWORD *)(v17 + 24);
      LOBYTE(v58) = 13;
      v56 = *(_BYTE *)(sub_52B4C8(&v44) + 4);
      LOBYTE(v58) = 11;
      stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::~basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>(&v44);
      LOBYTE(v58) = 7;
      stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::~basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>(&v41);
      if ( v56 )
      {
        sub_498D74(&v46);
        LOBYTE(v58) = 16;
        v19 = stlp_std::locale::locale((stlp_std::locale *)&v57);
        LOBYTE(v58) = 17;
        sub_42E050(v19);
        LOBYTE(v58) = 16;
        stlp_std::locale::~locale((stlp_std::locale *)&v57);
        v20 = stlp_std::locale::locale((stlp_std::locale *)&v57);
        LOBYTE(v58) = 18;
        sub_50C2A7(&v46, v20);
        LOBYTE(v58) = 16;
        stlp_std::locale::~locale((stlp_std::locale *)&v57);
        v57 = (int)&v33;
        stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>(&v33, &v49);
        LOBYTE(v58) = 19;
        v47 = &v26;
        stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>(&v26, &v46);
        LOBYTE(v58) = 16;
        v22 = sub_52B564(&v43, v21, v26, v27, v28, v29, v30, v31);
        LOBYTE(v58) = 21;
        stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>(&v40, v22);
        LOBYTE(v58) = 22;
        stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>(&v42, v22 + 24);
        LOBYTE(v58) = 24;
        v23 = *(_BYTE *)(sub_4E75FD(&v40) + 4);
        LOBYTE(v58) = 21;
        sub_49324B(&v40);
        LOBYTE(v58) = 16;
        sub_52B331(&v43);
        if ( v23 )
        {
          LOBYTE(v58) = 7;
          stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::~basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>(&v46);
          LOBYTE(v58) = 6;
          stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::~basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>(&v49);
          UnmapViewOfFile(lpBaseAddress);
          CloseHandle(hFileMappingObject);
          CloseHandle(hObject);
          LOBYTE(v58) = 4;
          sub_402821(&v48);
          LOBYTE(v58) = 3;
          stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::~basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>(&v51);
          LOBYTE(v58) = 0;
          stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::~basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>(&v50);
          v58 = -1;
          sub_402821(&a2);
          return 1;
        }
        v38 = "std::make_pair(strLeaf, str) - Failed";
        v37 = (LPCSTR)16;
        v36 = 5;
        sub_40FB04(v24);
        sub_4E7D8B(v36, (int)v37, v38, v39);
        UnmapViewOfFile(lpBaseAddress);
        CloseHandle(hFileMappingObject);
        CloseHandle(hObject);
        LOBYTE(v58) = 7;
        stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::~basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>(&v46);
      }
      else
      {
        v38 = "std::make_pair(str, pPkg) - Failed";
        v37 = (LPCSTR)16;
        v36 = 5;
        sub_40FB04(v18);
        sub_4E7D8B(v36, (int)v37, v38, v39);
        UnmapViewOfFile(lpBaseAddress);
        CloseHandle(hFileMappingObject);
        CloseHandle(hObject);
      }
      LOBYTE(v58) = 6;
      stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::~basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>(&v49);
    }
    else
    {
      sub_40FB04(v14);
      sub_4E7D8B(5, 16, "Pkg Load Failed", v39);
      UnmapViewOfFile(v13);
      CloseHandle(hFileMappingObject);
      CloseHandle(hObject);
    }
    LOBYTE(v58) = 4;
    sub_402821(&v48);
  }
  else
  {
    v38 = (char *)GetLastError();
    sub_40FB04(5);
    sub_4E7D8B(v35, 16, "MapViewOfFile - Failed (%d)", (char)v38);
    CloseHandle(hFileMappingObject);
    CloseHandle(hObject);
  }
  LOBYTE(v58) = 3;
  stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::~basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>(&v51);
  LOBYTE(v58) = 0;
LABEL_21:
  stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::~basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>(&v50);
  v58 = -1;
  sub_402821(&a2);
  return 0;
}

Before we continue, let's clear something up first. This line appears often and can look very cryptic (it is the first call in the function body, right after v58 = 1;):

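Code: stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>(...)

Let's split it up. The first part is the namespace. The stlp_std name suggests the client was built with STLport, and it plays the same role as std:

Code: stlp_std::

The second part is the class the function belongs to:

Code: basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::

basic_string is a class template, which the standard library declares like this:

Code: template < class charT, class traits = char_traits<charT>, // basic_string::traits_type class Alloc = allocator<charT> // basic_string::allocator_type > class basic_string;

In our case the char type is used as the charT parameter, which means this would do the same thing:

Code: stlp_std::basic_string<char>::

The standard library also defines a typedef for exactly that specialization:

Code: typedef basic_string<char> string;

So the call can be renamed to something much shorter:

Code: stlp_std::string::ctor_0(...) //std::string constructor, I added "_0" because std::string has different constructors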





(NOTE: I recommend you do this for every constructor/function of std::string, which is also why I am numbering them. It makes the decompiler output clearer.)
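If you want to convince yourself that the long name and the short one really are the same type, the compiler can check it for you. This sketch uses the regular std namespace rather than STLport's stlp_std (an assumption on my part that the two behave the same way here), and LongName is just an alias I made up.

```cpp
#include <memory>
#include <string>
#include <type_traits>

// The fully spelled-out form, as the decompiler prints it (with std instead of stlp_std).
using LongName = std::basic_string<char, std::char_traits<char>, std::allocator<char>>;

// The default template arguments make these three spellings one and the same type.
static_assert(std::is_same<LongName, std::basic_string<char>>::value,
              "default template arguments fill in traits and allocator");
static_assert(std::is_same<LongName, std::string>::value,
              "std::string is a typedef for basic_string<char>");
```

If either static_assert failed, the file would not compile, so a successful build is the whole test.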



Back to the actual function! What interests us next is the call to MapViewOfFile (v13 = MapViewOfFile(hFileMappingObject, 4u, 0, 0, 0);). If you don't know what it does exactly:

https://msdn.microsoft.com/en-us/lib...(v=vs.85).aspx

In this case v13 will contain the base address of the mapped view (the whole file is mapped into memory). It is then passed as the second parameter to this function:

Code: sub_52B8BA(v12, (void *)v13, v57, a1 + 32, (int)&v48);



Code:
char __stdcall sub_52B8BA(int a1, void *Src, int a3, int a4, int a5)
{
  int v5;
  void *v6;
  int v7;
  int v8;
  int v9;
  int v10;
  int v11;
  char *v12;
  int v13;
  int v14;
  int v15;
  int v16;
  char v17;
  int v19;
  int v20;
  int v21;
  int v22;
  int v23;
  int v24;
  char v25;
  rsize_t v26;
  int v27;
  int v28; // [sp-3Ch] [bp-130h]@4
  int v29; // [sp-38h] [bp-12Ch]@4
  int v30; // [sp-34h] [bp-128h]@4
  int v31; // [sp-30h] [bp-124h]@4
  int v32; // [sp-2Ch] [bp-120h]@4
  int v33; // [sp-28h] [bp-11Ch]@4
  char v34; // [sp-24h] [bp-118h]@4
  char v35; // [sp-20h] [bp-114h]@4
  int v36; // [sp-1Ch] [bp-110h]@4
  char v37; // [sp-18h] [bp-10Ch]@4
  char *v38; // [sp-Ch] [bp-100h]@3
  rsize_t v39; // [sp-8h] [bp-FCh]@3
  char *v40; // [sp-4h] [bp-F8h]@2
  char v41; // [sp+10h] [bp-E4h]@4
  char v42; // [sp+50h] [bp-A4h]@4
  char *v43; // [sp+9Ch] [bp-58h]@12
  char v44; // [sp+A0h] [bp-54h]@3
  char v45; // [sp+B8h] [bp-3Ch]@4
  char v46; // [sp+D0h] [bp-24h]@12
  char v47; // [sp+D4h] [bp-20h]@12
  int v48; // [sp+D8h] [bp-1Ch]@2
  char *v49; // [sp+DCh] [bp-18h]@3
  int *v50; // [sp+E0h] [bp-14h]@8
  int v51; // [sp+E4h] [bp-10h]@3
  int v52; // [sp+F0h] [bp-4h]@3

  v5 = a1;
  v6 = Src;
  if ( Src )
  {
    v48 = 0;
    memcpy((void *)(a1 + 108), Src, 0x21u);
    v40 = *(char **)(v5 + 104);
    v8 = sub_40CF95(v7);
    sub_50B9A2(v8, (char *)v6 + 33, 0x110u, v40);
    v9 = *(_DWORD *)(v5 + 104);
    if ( v9 )
    {
      v10 = *(_DWORD *)(v9 + 264);
      v40 = *(char **)(v5 + 104);
      v39 = v5 + 20;
      v38 = &v44;
      *(_DWORD *)v5 = v10;
      v11 = sub_50C22C(v38, v39, v40);
      v52 = 0;
      stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::operator=(a5, v11);
      v52 = -1;
      sub_402821(&v44);
      sub_4DC6D9();
      v12 = *(char **)(*(_DWORD *)(v5 + 104) + 268);
      v13 = 276 * *(_DWORD *)(*(_DWORD *)(v5 + 104) + 268) + 305;
      v49 = v12;
      v51 = v13;
      if ( !v12 )
      {
        sub_498D74(&v45);
        v52 = 1;
        v14 = stlp_std::locale::locale((stlp_std::locale *)&Src);
        LOBYTE(v52) = 2;
        sub_42E050(v14);
        LOBYTE(v52) = 1;
        stlp_std::locale::~locale((stlp_std::locale *)&Src);
        v15 = stlp_std::locale::locale((stlp_std::locale *)&Src);
        LOBYTE(v52) = 3;
        sub_50C2A7(&v45, v15);
        LOBYTE(v52) = 1;
        stlp_std::locale::~locale((stlp_std::locale *)&Src);
        a3 = 0;
        v51 = 0;
        v49 = &v34;
        BYTE3(Src) = 0;
        BYTE3(a1) = 1;
        sub_52C06B((char *)&a1 + 3, (char *)&Src + 3, &a3, v5 + 68);
        v29 = v16;
        v28 = v16;
        LOBYTE(v52) = 4;
        Src = &v28;
        stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>(&v28, &v45);
        LOBYTE(v52) = 1;
        sub_52BFEC(&v41, v17, v28, v29, v30, v31, v32, v33, v34, v35, v36, v37);
        LOBYTE(v52) = 6;
        sub_52C09F(&v42);
        LOBYTE(v52) = 7;
        sub_52BCA1(&v42);
        LOBYTE(v52) = 6;
        sub_50CA47(&v42);
        LOBYTE(v52) = 1;
        sub_50CA47(&v41);
        sub_52BCF1(a5);
        v52 = -1;
        stlp_std::string::dtor(&v45);
        return 1;
      }
      a1 = 0;
      if ( !v12 )
        return 1;
      a5 = 276;
      while ( 1 )
      {
        v50 = (int *)operator new(0x114u);
        v52 = 8;
        if ( v50 )
        {
          v19 = sub_52B7F1();
          v13 = v51;
          v20 = v19;
        }
        else
        {
          v20 = 0;
        }
        v52 = -1;
        v40 = (char *)v20;
        v39 = 276;
        v38 = (char *)Src + v48 + 305;
        v21 = sub_40CF95(Src);
        sub_50B9A2(v21, v38, v39, v40);
        if ( v20 )
        {
          v48 = a5;
          sub_49C45F(&v44, v20);
          v52 = 9;
          sub_498D74(&v45);
          LOBYTE(v52) = 10;
          v22 = stlp_std::locale::locale((stlp_std::locale *)&v47);
          LOBYTE(v52) = 11;
          sub_42E050(v22);
          LOBYTE(v52) = 10;
          stlp_std::locale::~locale((stlp_std::locale *)&v47);
          v23 = stlp_std::locale::locale((stlp_std::locale *)&v46);
          LOBYTE(v52) = 12;
          sub_50C2A7(&v45, v23);
          LOBYTE(v52) = 10;
          stlp_std::locale::~locale((stlp_std::locale *)&v46);
          v50 = (int *)(v13 + *(_DWORD *)(v20 + 264));
          v43 = &v34;
          sub_52C06B(v20 + 273, v20 + 272, &v50, v5 + 68);
          v29 = v24;
          v28 = v24;
          LOBYTE(v52) = 13;
          v50 = &v28;
          stlp_std::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>::basic_string<char,stlp_std::char_traits<char>,stlp_std::allocator<char>>(&v28, &v45);
          LOBYTE(v52) = 10;
          sub_52BFEC(&v41, v25, v28, v29, v30, v31, v32, v33, v34, v35, v36, v37);
          LOBYTE(v52) = 15;
          sub_52C09F(&v42);
          LOBYTE(v52) = 16;
          sub_52BCA1(&v42);
          LOBYTE(v52) = 15;
          sub_50CA47(&v42);
          LOBYTE(v52) = 10;
          sub_50CA47(&v41);
          v26 = *(_DWORD *)(v5 + 96);
          v40 = &v44;
          v39 = v26;
          if ( v26 == *(_DWORD *)(v5 + 100) )
          {
            sub_52BDBB(v39, v40);
          }
          else
          {
            sub_52C0F4(v39, v40);
            *(_DWORD *)(v5 + 96) += 24;
          }
          LOBYTE(v52) = 9;
          stlp_std::string::dtor(&v45);
          v52 = -1;
          sub_402821(&v44);
          v13 = v51;
        }
        v27 = *(_DWORD *)(v20 + 264) + *(_DWORD *)(v20 + 268);
        v40 = (char *)v20;
        if ( v13 + v27 > (unsigned int)a3 )
          break;
        operator delete(v40);
        ++a1;
        a5 += 276;
        if ( a1 >= (unsigned int)v49 )
          return 1;
      }
      operator delete(v40);
    }
  }
  return 0;
}

As we already know, "Src" is the address of our mapped view.

Code: memcpy((void *)(a1 + 108), Src, 0x21u); //Src: address of mapped file view



The next interesting function is only three lines below:

Code: sub_50B9A2(v8, (char *)v6 + 33, 0x110u, v40); //v6 = Src, 0x110 = 272 bytes

Code:
int __stdcall sub_50B9A2(int a1, void *Src, rsize_t DstSize, void *Dst)
{
  memcpy_s(Dst, DstSize, Src, DstSize);
  return sub_50B91E(a1, Dst, DstSize);
}





We still don't know what v8 is though. It is set by a function:

Code:
int __thiscall sub_40CF95(void *this)
{
  if ( !dword_F57C24 )
    sub_40D5EF();
  return dword_F57C24;
}

dword_XXXX are global variables. First, it checks whether dword_F57C24 is null. If it is, it calls a function (the class' constructor) that sets its value. This pattern is useful when you only need one instance of a class throughout the whole program.
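In source form, that pattern probably looked something like the sketch below. This is my guess at the general shape, not the game's actual code: KeyHolder and instance() are hypothetical names, and the virtual destructor is only there because the real class appears to have a vtable.

```cpp
// A lazily created single instance, reached through a static accessor.
class KeyHolder {  // hypothetical name
public:
    static KeyHolder* instance() {
        if (!s_instance) {                 // mirrors: if ( !dword_F57C24 )
            s_instance = new KeyHolder();  // mirrors: sub_40D5EF();
        }
        return s_instance;                 // mirrors: return dword_F57C24;
    }

    virtual ~KeyHolder() = default;

private:
    KeyHolder() = default;  // only instance() may construct the object

    static KeyHolder* s_instance;  // plays the role of dword_F57C24
};

KeyHolder* KeyHolder::s_instance = nullptr;
```

Every call to KeyHolder::instance() returns the same pointer, which is why the decompiled accessor can be called from anywhere without passing the object around.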

If you know about virtual function tables and take a look at the constructor code, you will notice the class actually has one. It would be nice to see if we could get RTTI info from that (if RTTI is enabled, that is). We will see about it later.



We now know that v8 is definitely a class. Let's leave it for now, go back to 50B9A2 and take a look at the 50B91E function:

Code:
_BYTE *__stdcall sub_50B91E(int a1, int a2, unsigned int a3)
{
  unsigned int v3;
  unsigned int v4;
  _BYTE *result;

  v3 = 0;
  v4 = 0;
  if ( a3 )
  {
    do
    {
      result = (_BYTE *)(v4 + a2);
      *result ^= *(_BYTE *)(a1 + v3++ + 116);
      if ( v3 >= 0x1000 )
        v3 = 0;
      ++v4;
    }
    while ( v4 < a3 );
  }
  return result;
}

Code:
_BYTE *__stdcall sub_50B91E(int a1, int pDestBuffer, unsigned int uiSize)
{
  unsigned int v3;
  unsigned int v4;
  _BYTE *result;

  v3 = 0;
  v4 = 0;
  if ( uiSize )
  {
    do
    {
      result = (_BYTE *)(v4 + pDestBuffer);
      *result ^= *(_BYTE *)(a1 + v3++ + 116);
      if ( v3 >= 0x1000 )
        v3 = 0;
      ++v4;
    }
    while ( v4 < uiSize );
  }
  return result;
}

Code: *result ^= *(_BYTE *)(a1 + v3++ + 116);

This might be a bit hard to understand at first, which is why we're going to switch tools and see it in action now.
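Before switching tools, here is the same loop written as plain C++ with a tiny made-up key, so you can follow it by hand. xor_with_cyclic_key is a hypothetical name, and the key length is a parameter here, whereas the game hardcodes 0x1000.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// XORs every byte of `data` with the next key byte, restarting at the
// beginning of the key whenever its end is reached.
void xor_with_cyclic_key(std::vector<std::uint8_t>& data,
                         const std::vector<std::uint8_t>& key) {
    if (key.empty()) {
        return;
    }
    std::size_t key_index = 0;                 // v3 in the decompiled loop
    for (std::size_t i = 0; i < data.size(); ++i) {  // i plays the role of v4
        data[i] ^= key[key_index++];
        if (key_index >= key.size()) {         // the game compares against 0x1000
            key_index = 0;
        }
    }
}
```

With the key { 0x01, 0x02, 0x03 }, the bytes { 0x10, 0x20, 0x30, 0x40, 0x50 } become { 0x11, 0x22, 0x33, 0x41, 0x52 }. Running the function a second time with the same key gives the original bytes back, because XOR undoes itself.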



Dynamic analysis with OllyDbg



Load pbclient.exe in OllyDbg and let it analyze it. Once that's done, make sure you're in the right module by clicking the "E" button and selecting the "pbclient.exe" module.

We're going to place a breakpoint at the first parameter to the CreateFileA call (remember? It's the one in 0x52AE36), which will be our file name.

(NOTE: DON'T remove breakpoints after you set them, we will probably need them later)

It's at 0x52AE82:





Let's start the game now! If it crashes, I advise you download the Stealth64 plugin if you're on Windows 7 64 bit. I enabled all options except the ones in the "Misc" section, if that helps.

If everything went fine, it should now break here:





We now know it will be working with Anim.bus (also, we have the confirmation it uses this function). Let's place another breakpoint right after the call to MapViewOfFile to get the address of our mapped view (return values are typically stored in EAX):





(by the way, make sure Anim.bus is not opened anywhere, else it might fail!)

If it succeeded, the address of the mapped view will be in the EAX register. Right-click it in the Registers window and click "Follow in Dump":





Let's go directly to 0x50B91E (where the XORs happen) and place another breakpoint there, then resume the game and the breakpoint should be hit again.



Use F8 to step through the instructions until this one:

Code: 0050B931 |. 8A540A 74 |MOV DL,BYTE PTR DS:[EDX+ECX+74] ; 74h => 116

In more understandable pseudocode, it may look like

Code: uint8_t DL = a1->xor_array[ECX]; //xor_array starts 74h into a1, and ECX is incremented after each iteration
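To make the 74h offset more concrete, here is a sketch of what the object behind a1 could look like. CryptObject and key_byte are names I made up, and the first 0x74 bytes are simply treated as unknown since we haven't looked at them.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout of the object passed as a1.
struct CryptObject {
    std::uint8_t unknown[0x74];      // vtable pointer and other members, not analyzed here
    std::uint8_t xor_array[0x1000];  // the 4096 key bytes read by [EDX+ECX+74]
};

// The array must start exactly 0x74 (116) bytes into the object.
static_assert(offsetof(CryptObject, xor_array) == 0x74, "key array starts at +74h");
static_assert(0x74 == 116, "74h is 116 in decimal");

// Equivalent of MOV DL,BYTE PTR DS:[EDX+ECX+74], with wrap-around at 0x1000.
inline std::uint8_t key_byte(const CryptObject& obj, std::size_t index) {
    return obj.xor_array[index % 0x1000];
}
```

The modulo stands in for the "if ( v3 >= 0x1000 ) v3 = 0;" check of the decompiled loop.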

Code: 0050B935 |. 03C6 |ADD EAX,ESI

Follow EAX in the dump, set a breakpoint right after the loop (POP ESI), resume the game and look at the dump window:





It seems to have decrypted the buffer! But that's not everything:





We skipped 33 bytes and decrypted 272. But if we take a closer look at Anim.bus in the Data folder, it tells us that Anim.bus' size is 305 bytes. And it happens that 272 + 33 is...

305! The whole Anim.bus file was decrypted.



So, what does the decrypted buffer tell us? Not much, except we can now read what looks like a folder name.

You can now write a simple application that does exactly what the game does. Use CreateFile, CreateFileMapping and MapViewOfFile to map your file into memory and decrypt the buffer (remember the starting position is the buffer + 33 bytes!). To decrypt it you will also need the whole array of bytes used as XOR keys, which is 4096 bytes long (4 KB). It's very easy to get: close and run the game again, and once you hit this instruction:

Code: 0050B931 |. 8A540A 74 |MOV DL,BYTE PTR DS:[EDX+ECX+74] ; 74h => 116

follow EDX + 74h in the dump and copy the 0x1000 bytes that start there.

Here it is if you want to try it out but are having trouble with it:

http://pastebin.com/skZKAx27



My decryption function looks like this:

Code:
bool DecryptData(uint8_t* puiBuffer, const size_t uiSize)
{
    if (!uiSize)
    {
        return false;
    }

    for (size_t i{ 0 }, j{ 0 }; i < uiSize; ++i)
    {
        puiBuffer[i] ^= bus::crypt::XOR_ARRAY[j++];

        // Wrap around once the end of the 4096-byte key is reached
        if (j >= XOR_ARRAY_SIZE)
        {
            j = 0;
        }
    }

    return true;
}

With that in mind, we can rename them. From here on, sub_50B9A2 is called CopyAndDecrypt:









If we encounter them again we won't need to waste time looking into them since we already know what they do.





I believe a quick recap of what we know so far is in order:

- Most of the game data is stored in .bus files

- These .bus files can store multiple files at once, so they are actually file archives

- The game uses the Gamebryo Engine since the stored files seem to be either NIF / KFM / DDS

- The game client uses the CreateFile, CreateFileMapping and MapViewOfFile functions to map a whole .bus file into memory

- It then decrypts 272 bytes starting from the mapped view's base address + 33 bytes
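The steps in this recap can be sketched in portable C++. To keep it short I read the whole file with std::ifstream instead of mapping it like the game does, which is a deliberate swap on my part. read_file, decrypt_first_block and the two constants are made-up names, and the key is whatever you dumped from the game.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

constexpr std::size_t kFirstBlockOffset = 33;  // skip the zeroed bytes
constexpr std::size_t kFirstBlockSize = 272;   // 0x110 bytes are decrypted

// Reads a whole file into memory (returns an empty vector if it can't be opened).
std::vector<std::uint8_t> read_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::vector<std::uint8_t>(std::istreambuf_iterator<char>(in),
                                     std::istreambuf_iterator<char>());
}

// Copies the 272 bytes at offset 33 and XORs them with the key.
// Returns an empty vector if the file is too short or the key is empty.
std::vector<std::uint8_t> decrypt_first_block(const std::vector<std::uint8_t>& file_bytes,
                                              const std::vector<std::uint8_t>& key) {
    if (key.empty() || file_bytes.size() < kFirstBlockOffset + kFirstBlockSize) {
        return {};
    }
    const std::uint8_t* begin = file_bytes.data() + kFirstBlockOffset;
    std::vector<std::uint8_t> block(begin, begin + kFirstBlockSize);
    for (std::size_t i = 0; i < block.size(); ++i) {
        block[i] ^= key[i % key.size()];  // same effect as resetting the index at the key's end
    }
    return block;
}
```

Typical use would be decrypt_first_block(read_file("Anim.bus"), key), where key holds the 4096 dumped bytes. The original file bytes are left untouched, just like the game copies before decrypting.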



Let's get back to the function that calls CopyAndDecrypt. First, it checks if the destination buffer exists, else it will simply return 0. Let's skip directly to this line:

Code: v11 = *(char **)(*(_DWORD *)(v5 + 104) + 268); //dest_buffer + 268

Code: v11 = *reinterpret_cast<uint32_t*>(dest_buffer + 268); //this does the same



Let's analyze the next line:

Code: v12 = 276 * *(_DWORD *)(*(_DWORD *)(v5 + 104) + 268) + 305;

Code: v12 = (276 * v11) + 305;

Now, we can clearly see there are two different structures. Let me explain in case you don't see them:

- The first structure (let's call it S1) is 272 bytes in size and contains a string in form of a char array, which is a folder name. The last 4 bytes of the structure (offset 268) are an integer (v11).

- The second structure (let's call it S2) is 276 bytes in size, and a single .bus file seems to be able to store several of them. If we look at the code above, we can safely assume there are at least 'v11' S2 structures in the .bus file.

- The list of S2 structures starts at offset 305. This is because the first structure occupies 272 bytes, on top of the first 33 bytes of the file (which are always 0's).





We now have a much better idea of how a .bus file's metadata is laid out:

OFFSET 0: Zeroed bytes (Size: 33)

OFFSET 33: S1 structure (Size: 272)

OFFSET 305: First S2 structure (Size: 276)

OFFSET 581 (305 + 276): Second S2 structure (Size: 276)

...and so on.
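The offsets in this layout are easy to get wrong by hand, so here is a small sketch that lets the compiler do the arithmetic. The constant and function names are mine, the numbers come straight from the layout above.

```cpp
#include <cstddef>

constexpr std::size_t kZeroPrefixSize = 33;  // zeroed bytes at the start of the file
constexpr std::size_t kS1Size = 272;         // first structure
constexpr std::size_t kS2Size = 276;         // each following structure

// Offset of the first S2 structure: 33 + 272.
constexpr std::size_t first_s2_offset() {
    return kZeroPrefixSize + kS1Size;
}

// Offset of the S2 structure with the given zero-based index.
constexpr std::size_t s2_offset(std::size_t index) {
    return first_s2_offset() + index * kS2Size;
}

// Total size of the metadata for a given number of S2 structures.
// This is the same value as "(276 * v11) + 305" in the decompiled code.
constexpr std::size_t metadata_size(std::size_t s2_count) {
    return s2_offset(s2_count);
}

static_assert(first_s2_offset() == 305, "S2 list starts at offset 305");
static_assert(s2_offset(1) == 581, "second S2 structure is at 305 + 276");
static_assert(metadata_size(0) == 305, "a file without S2 structures is 305 bytes of metadata");
```

metadata_size(0) being 305 matches the size of Anim.bus exactly.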



It should now be clearer why Anim.bus is only 305 bytes in size: There are no S2 structures at all in the file.



Let's continue our analysis:



Code:
if ( !v11 )
{
  sub_498D74(&v45);
  v52 = 1;
  v13 = stlp_std::locale::locale((stlp_std::locale *)&Src);
  LOBYTE(v52) = 2;
  sub_42E050(v13);
  LOBYTE(v52) = 1;
  stlp_std::locale::~locale((stlp_std::locale *)&Src);
  v14 = stlp_std::locale::locale((stlp_std::locale *)&Src);
  LOBYTE(v52) = 3;
  sub_50C2A7(&v45, v14);
  LOBYTE(v52) = 1;
  stlp_std::locale::~locale((stlp_std::locale *)&Src);
  a3 = 0;
  v51 = 0;
  v49 = &v33;
  BYTE3(Src) = 0;
  BYTE3(a1) = 1;
  sub_52C06B(&v51, (int)&v33, (_BYTE *)&a1 + 3, (_BYTE *)&Src + 3, &a3, v5 + 68);
  v28 = v15;
  v27 = v15;
  LOBYTE(v52) = 4;
  Src = &v27;
  std::string::ctor_1(&v27, &v45);
  LOBYTE(v52) = 1;
  sub_52BFEC(&v41, v16, v27, v28, v29, v30, v31, v32, v33, v34, v35, v36);
  LOBYTE(v52) = 6;
  sub_52C09F(&v42);
  LOBYTE(v52) = 7;
  sub_52BCA1(&v42);
  LOBYTE(v52) = 6;
  sub_50CA47(&v42);
  LOBYTE(v52) = 1;
  sub_50CA47(&v41);
  sub_52BCF1(a5);
  v52 = -1;
  stlp_std::string::dtor(&v45);
  return 1;
}

But wait... would this REALLY interest us? Anim.bus was fully decrypted already and all it seemed to contain was a folder name. We don't want to have a perfect understanding of how the client does everything, we just need an understanding of how the client decrypts .bus files and parses them so that we can reproduce it and extract them without the help of the client.



With that in mind, the next step is to find a file in which the following value is not 0:

Code: *reinterpret_cast<uint32_t*>(dest_buffer + 268);

- First, remove the breakpoint you set at 0x50B91E.

- We will now place a breakpoint at 0x52B94F. This is the instruction that initializes v11. If [EAX + 0x10C] (10Ch: 268) is 0, then we know it won't help us.

- Press F9 to resume the game and it should hit the CreateFileA breakpoint again. Copy the file name just in case. Press F9 again until you land on the breakpoint you just set at 0x52B94F. As said above, if [EAX + 0x10C] (EAX holds the address of our S1 structure) is 0, we can skip it.

- Continue to press F9 until [EAX + 0x10C] isn't 0 anymore.



If you've done everything correctly, [EAX + 0x10C] will be equal to 0x47 (or 71) at some point, and the mapped .bus file should be called "Anim_Mob_Zombie_Melee_Zako.bus". According to the conclusions we drew earlier, this means there are 71 S2 structures in the file, starting at offset 305.

Since v11 is not 0 anymore, these instructions will be executed instead:

Code:
a1 = 0;
if ( !v11 )
  return 1;
a5 = 276;
while ( 1 )
{
  v52 = (int *)operator new(0x114u);
  v54 = 8;
  if ( v52 )
  {
    v19 = sub_52B7F1((int)v19);
    v12 = v53;
    v20 = v19;
  }
  else
  {
    v20 = 0;
  }
  v54 = -1;
  dest_buffer = (char *)v20;
  v41 = 276;
  v40 = (char *)Src + v50 + 305;
  v39 = Src;
  v21 = sub_40CF95();
  CopyAndDecrypt(v21, v40, v41, dest_buffer);
  if ( v20 )
  {
    v50 = a5;
    sub_49C45F(&v46, v20);
    v54 = 9;
    sub_498D74(&v47);
    LOBYTE(v54) = 10;
    v22 = stlp_std::locale::locale((stlp_std::locale *)&v49);
    LOBYTE(v54) = 11;
    sub_42E050(v22);
    LOBYTE(v54) = 10;
    stlp_std::locale::~locale((stlp_std::locale *)&v49);
    v23 = stlp_std::locale::locale((stlp_std::locale *)&v48);
    LOBYTE(v54) = 12;
    sub_50C2A7((int)&v47, v23);
    LOBYTE(v54) = 10;
    stlp_std::locale::~locale((stlp_std::locale *)&v48);
    v52 = (int *)(v12 + *(_DWORD *)(v20 + 264));
    v45 = &v35;
    sub_52C06B((_DWORD *)(v20 + 268), (int)&v35, (_BYTE *)(v20 + 273), (_BYTE *)(v20 + 272), &v52, v5 + 68);
    v30 = v24;
    v29 = v24;
    LOBYTE(v54) = 13;
    v52 = &v29;
    std::string::ctor_1(&v29, &v47);
    LOBYTE(v54) = 10;
    v26 = (_DWORD *)sub_52BFEC((int)&v43, v25, v29, v30, v31, v32, v33, v34, v35, v36, v37, v38);
    LOBYTE(v54) = 15;
    sub_52C09F(v26, (int)&v44);
    LOBYTE(v54) = 16;
    sub_52BCA1(&v44);
    LOBYTE(v54) = 15;
    sub_50CA47(&v44);
    LOBYTE(v54) = 10;
    sub_50CA47(&v43);
    v27 = *(_DWORD *)(v5 + 96);
    dest_buffer = &v46;
    v41 = v27;
    if ( v27 == *(_DWORD *)(v5 + 100) )
    {
      sub_52BDBB(v41, dest_buffer);
    }
    else
    {
      sub_52C0F4(v41, dest_buffer);
      *(_DWORD *)(v5 + 96) += 24;
    }
    LOBYTE(v54) = 9;
    stlp_std::string::dtor(&v47);
    v54 = -1;
    sub_402821(&v46);
    v12 = v53;
  }
  v28 = *(_DWORD *)(v20 + 264) + *(_DWORD *)(v20 + 268);
  dest_buffer = (char *)v20;
  if ( v12 + v28 > (unsigned int)a3 )
    break;
  operator delete(dest_buffer);
  ++a1;
  a5 += 276;
  if ( a1 >= (unsigned int)v51 )
    return 1;
}
operator delete(dest_buffer);

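The first thing we notice is that sizeof(S2) (276, or 0x114) bytes are allocated on the heap. sub_52B7F1 seems to be a constructor, let's take a look:

Code:
int __usercall sub_52B7F1@<eax>(int a1@<esi>)
{
  *(_BYTE *)(a1 + 272) = 0;
  *(_BYTE *)(a1 + 273) = 0;
  memset((void *)a1, 0, 261u);
  return a1;
}

This gives us some information about the S2 structure: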

OFFSET 272: Unknown byte (Size : 1)

OFFSET 273: Unknown byte (Size : 1)



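Since we already renamed the CopyAndDecrypt function, it clearly appears in the decompiler output, so we can modify variable names accordingly:

Code:
v55 = -1;
dest_buffer = (char *)v21;
size = 276;
curr_src_buffer = (char *)Src + v51 + 305;
v40 = Src;
v22 = sub_40CF95();
CopyAndDecrypt(v22, curr_src_buffer, size, dest_buffer);

This line is the most important:

Code: curr_src_buffer = (char *)Src + v51 + 305;

Src is the address of the mapped file view. We know that 305 is the offset to the start of the S2 structures. As for v51, if we look earlier in the code, it is initialized to 0.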

Great, this seems to be decrypting an S2 structure! Let's switch to OllyDbg now. Step through the code until the arguments for the calls are pushed onto the stack:

Code:
0052BACE |. 57             |PUSH EDI
0052BACF |. 68 14010000    |PUSH 114
0052BAD4 |. 8D8401 3101000>|LEA EAX,DWORD PTR DS:[ECX+EAX+131]
0052BADB |. 50             |PUSH EAX
0052BADC |. 51             |PUSH ECX
0052BADD |. E8 B314EEFF    |CALL pbclient.0040CF95
0052BAE2 |. 59             |POP ECX                          ; |
0052BAE3 |. 50             |PUSH EAX                         ; |Arg1
0052BAE4 |. E8 B9FEFDFF    |CALL pbclient.0050B9A2           ; \pbclient.0050B9A2
0052BAE9 |. 85FF           |TEST EDI,EDI

Grab EDI's value and follow it in the dump window, this is our destination buffer. Now we simply need to step through the code until the call to CopyAndDecrypt (pbclient.0050B9A2) is executed.

Let's take a look at the dump window after the call:

SS_1.png



Couldn't be any clearer in my opinion! We previously discovered that unencrypted models / textures / animations were stored in .bus files. This S2 structure stores the file name for one of the files in the archive. And since we also know there can be multiple S2 structures in one archive, this means there must be one for each file. And how do we know how many S2 structures there are?

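Code: v11 = *reinterpret_cast<uint32_t*>(dest_buffer + 268); //S1 structure

Right! We can now rename v11 to "file_count" because that's exactly what it is. There were no files stored in Anim.bus, and that explains why file_count was set to 0.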



Keep on stepping through until you reach:

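Code:
0052BB62 |. 8B87 08010000  |MOV EAX,DWORD PTR DS:[EDI+108] ; + 264
0052BB68 |. 03C6           |ADD EAX,ESI

We now know the structure has 4 bytes of data (an integer) that may interest us, at offset 264. It is set to 0 in this structure though.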

EAX contains the result of this expression (we've seen it earlier, remember?):

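Code:
v12 = (276 * v11) + 305;
//Which we now know is...
v12 = (sizeof(S2) * file_count) + sizeof(S1) + 33;

What is v12 exactly? It's the offset to the end of the list of S2 structures. Let's add it to our .bus metadata layout: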

[METADATA]

OFFSET 0: Zeroed bytes (Size: 33)

OFFSET 33: S1 structure (Size: 272)

OFFSET 305: First S2 structure (Size: 276)

OFFSET 581 (305 + 276): Second S2 structure (Size: 276)

...and so on...

OFFSET v12: End of S2 structures
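The layout above boils down to two small formulas. Here is a minimal sketch of them; the constant and function names (kZeroPrefix, s2_offset, s2_list_end) are made up for this example and do not come from the client:

```cpp
#include <cstddef>
#include <cstdint>

// Sizes taken from the metadata layout above.
constexpr std::size_t kZeroPrefix = 33;  // zeroed bytes at the start of every .bus file
constexpr std::size_t kS1Size     = 272; // the single S1 structure
constexpr std::size_t kS2Size     = 276; // one S2 structure

// Offset of the S2 structure with the given index, from the start of the file.
constexpr std::size_t s2_offset(std::size_t index)
{
    return kZeroPrefix + kS1Size + index * kS2Size;
}

// Offset of the end of the S2 list (what the decompiler calls v12).
constexpr std::size_t s2_list_end(std::uint32_t file_count)
{
    return s2_offset(file_count);
}

static_assert(s2_offset(0) == 305, "first S2 structure");
static_assert(s2_offset(1) == 581, "second S2 structure (305 + 276)");
```

For Anim.bus, s2_list_end(0) is 305, which is exactly the size of that file.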



Let's continue to step through the code until we reach:

Code:
0052BB75 |. 8D43 44        |LEA EAX,DWORD PTR DS:[EBX+44]
0052BB78 |. 50             |PUSH EAX
0052BB79 |. 8D45 EC        |LEA EAX,DWORD PTR SS:[EBP-14]
0052BB7C |. 50             |PUSH EAX
0052BB7D |. 8D87 10010000  |LEA EAX,DWORD PTR DS:[EDI+110] ;+ 272
0052BB83 |. 50             |PUSH EAX
0052BB84 |. 8D87 11010000  |LEA EAX,DWORD PTR DS:[EDI+111] ;+ 273
0052BB8A |. 50             |PUSH EAX
0052BB8B |. 8D87 0C010000  |LEA EAX,DWORD PTR DS:[EDI+10C] ;+ 268
0052BB91 |. E8 D5040000    |CALL pbclient.0052C06B

Interesting, it's using some of our S2 structure's data and passing their addresses as arguments to the call. Let's step into pbclient.0052C06B (F7):

Code:
0052C06B /$ 55             PUSH EBP
0052C06C |. 8BEC           MOV EBP,ESP
0052C06E |. 51             PUSH ECX
0052C06F |. 8B00           MOV EAX,DWORD PTR DS:[EAX]
0052C071 |. FF75 14        PUSH DWORD PTR SS:[EBP+14]
0052C074 |. 8365 FC 00     AND DWORD PTR SS:[EBP-4],0
0052C078 |. 8906           MOV DWORD PTR DS:[ESI],EAX
0052C07A |. 8B45 08        MOV EAX,DWORD PTR SS:[EBP+8]
0052C07D |. 8A00           MOV AL,BYTE PTR DS:[EAX]
0052C07F |. 8846 04        MOV BYTE PTR DS:[ESI+4],AL
0052C082 |. 8B45 0C        MOV EAX,DWORD PTR SS:[EBP+C]
0052C085 |. 8A00           MOV AL,BYTE PTR DS:[EAX]
0052C087 |. 8846 08        MOV BYTE PTR DS:[ESI+8],AL
0052C08A |. 8B45 10        MOV EAX,DWORD PTR SS:[EBP+10]
0052C08D |. 8B00           MOV EAX,DWORD PTR DS:[EAX]
0052C08F |. 8D4E 10        LEA ECX,DWORD PTR DS:[ESI+10]
0052C092 |. 8946 0C        MOV DWORD PTR DS:[ESI+C],EAX
0052C095 |. FF15 442DCD00  CALL DWORD PTR DS:[<&stlport.5.2.??0?$ba>; [email protected][email protected]@[email protected]@[email protected]@[email protected]@[email protected]@[email protected]@@Z
0052C09B |. 8BC6           MOV EAX,ESI
0052C09D |. C9             LEAVE
0052C09E \. C3             RETN

This seems to be setting up another structure. Do we care? For now, not really... we can make our own additional structures if needed, but we are mainly interested in what these fields actually mean, and that structure may help us, so let's keep it in mind.

We can now add more information to our S2 structure:

[S2 STRUCTURE]

OFFSET 0: File name (Size: 264 (it could be less, but it's just so that it fits for now))

OFFSET 264 [EDI + 0x108]: Unknown integer (Size: 4)

OFFSET 268 [EDI + 0x10C]: Unknown integer (Size: 4)

OFFSET 272 [EDI + 0x110]: Unknown byte (Size: 1)

OFFSET 273 [EDI + 0x111]: Unknown byte (Size: 1)



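Wait a minute... if you check each of these offsets, you will notice there's a value at offset 268: 0xA53F (Decimal: 42303). And if we skip to this line:

Code: v28 = *(_DWORD *)(v20 + 264) + *(_DWORD *)(v20 + 268);

What does this mean? There's also a value at offset 264! The values at these two offsets are added together, and if v12 plus that sum is greater than "a3" (which is the third argument passed to the current function), then the loop breaks.

Now, what could a3 be? Let's just find cross references to the current function we're in (sub_52B8BA), and we land back here:

Code: if ( (unsigned __int8)sub_52B8BA(v12, (void *)v13, v57, a1 + 32, (int)&v48) )

a3 being the third parameter, we want to know what v57 is. And if we scroll up a bit...

Code: v57 = GetFileSize(v7, 0);

Self-explanatory, isn't it? v57 is simply the size of the .bus file, and the size of the file is passed as the third argument to sub_52B8BA.

We can update our parameter names:

Code: char __stdcall sub_52B8BA(int a1, void *Src, int uiFileSize, int a4, int a5)

Scrolling back down, we now see this instead:

Code:
v28 = *(_DWORD *)(v20 + 264) + *(_DWORD *)(v20 + 268);
dest_buffer = (char *)v20;
if ( v12 + v28 > (unsigned int)uiFileSize )

Also, what we know about the current iteration so far: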

ITERATION 0: OFFSET 264: 0 - OFFSET 268: 42303



And if we look at the rest of the loop:

Code:
++a1;
a5 += 276;
if ( a1 >= (unsigned int)v51 )
  return 1;

If you scroll up, you will notice that v51 is actually our file count.

We now know that the function will return if either:

- v12 + [S2 + 264] + [S2 + 268] > File size

- i (a1) >= file count
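The first exit condition is really a sanity check on the entry. A small sketch of the same test, written the other way around (true when the entry is fine), could look like this. The function name entry_fits and its parameter names are my own, only the arithmetic comes from the decompiled code:

```cpp
#include <cstdint>

// Returns true when the S2 entry passes the client's check, i.e. when
// v12 + [S2 + 264] + [S2 + 268] is NOT greater than the file size.
// list_end is v12; value_264 and value_268 are the two integers read from the entry.
bool entry_fits(std::uint32_t list_end,
                std::uint32_t value_264,
                std::uint32_t value_268,
                std::uint32_t file_size)
{
    // Sum in 64 bits so a corrupt entry cannot wrap around and pass the test.
    const std::uint64_t end = std::uint64_t{list_end} + value_264 + value_268;
    return end <= file_size;
}
```

The 64-bit sum is an addition of mine. The client appears to add plain 32-bit values.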



Also, a5 is incremented by sizeof(S2) (276). If we scroll up, we can see that it was previously initialized to the same value (276).



Since a1 gets incremented during every iteration, we could see it as a for loop:

Code:
//pseudocode
for(size_t i{ 0 }, j{ 0 }; i < file_count; ++i, j += sizeof(S2))
{
    /* Decrypt data, ... */
    if(v12 + value_at_offset264 + value_at_offset268 > uiFileSize)
    {
        break;
    }
}

Let's update what we know about the iteration:

ITERATION 0: OFFSET 264: 0 - OFFSET 268: 42303 - a5: 552 (276 + 276)



What's going to matter the most now is the second iteration.

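Code:
dest_buffer = (char *)v20;
v41 = 276;
v40 = (char *)Src + v50 + 305;

If you check the code, you will see that v50 (v51 in the renamed listing above) was initialized to 0 at the beginning of the function, and that it takes a5's value inside the loop. a5 was 276 when that happened during the first iteration, so this time the source pointer lands on the second S2 structure (305 + 276 = 581).

Let's go back to OllyDbg and keep stepping until we reach the end of the loop and these instructions:

Code:
0052BACE |. 57             |PUSH EDI
0052BACF |. 68 14010000    |PUSH 114
0052BAD4 |. 8D8401 3101000>|LEA EAX,DWORD PTR DS:[ECX+EAX+131]
0052BADB |. 50             |PUSH EAX
0052BADC |. 51             |PUSH ECX
0052BADD |. E8 B314EEFF    |CALL pbclient.0040CF95
0052BAE2 |. 59             |POP ECX                          ; |
0052BAE3 |. 50             |PUSH EAX                         ; |Arg1
0052BAE4 |. E8 B9FEFDFF    |CALL pbclient.0050B9A2           ; \pbclient.0050B9A2

Once again, EDI is our destination buffer, so grab its value and follow it in the dump: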





As expected, another file name. What we're more interested in though is the values at offset 264 and 268. Remember, first iteration:

ITERATION 0: OFFSET 264: 0 - OFFSET 268: 42303 - a5: 552 (276 + 276)



Now if we check the values at these offsets again, we get:

ITERATION 1: OFFSET 264: 42303 - OFFSET 268: 12125



Is this starting to make more sense? If you repeat those loops over and over again, you should clearly see a pattern.

OFFSET 264: The position of the beginning of the file in the archive

OFFSET 268: The file's length



We can update our S2 structure!

OFFSET 0: File name (Size: 264 (it could be less, but this is more convenient for us))

OFFSET 264 [EDI + 0x108]: Position in the file (Size: 4)

OFFSET 268 [EDI + 0x10C]: File length (Size: 4)

OFFSET 272 [EDI + 0x110]: Unknown byte (Size: 1)

OFFSET 273 [EDI + 0x111]: Unknown byte (Size: 1)
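To make the updated layout concrete, here is a minimal sketch that pulls the three known fields out of one decrypted 276-byte S2 buffer. S2Fields and read_s2 are names I made up, and the sketch assumes the name is a NUL-terminated string and that the integers are little-endian, which is what you would expect from an x86 client:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical holder for the fields we have identified so far.
struct S2Fields
{
    std::string   name;     // OFFSET 0
    std::uint32_t position; // OFFSET 264
    std::uint32_t length;   // OFFSET 268
};

// Decode a little-endian 32-bit integer byte by byte (works on any host).
inline std::uint32_t read_u32_le(const std::uint8_t* p)
{
    return static_cast<std::uint32_t>(p[0])
         | static_cast<std::uint32_t>(p[1]) << 8
         | static_cast<std::uint32_t>(p[2]) << 16
         | static_cast<std::uint32_t>(p[3]) << 24;
}

// raw must already be decrypted.
inline S2Fields read_s2(const std::array<std::uint8_t, 276>& raw)
{
    S2Fields out{};
    // The name stops at the first NUL byte, or at 264 bytes if there is none.
    const void* nul = std::memchr(raw.data(), 0, 264);
    const std::size_t name_len =
        nul ? static_cast<std::size_t>(static_cast<const std::uint8_t*>(nul) - raw.data()) : 264;
    out.name.assign(reinterpret_cast<const char*>(raw.data()), name_len);
    out.position = read_u32_le(raw.data() + 264);
    out.length   = read_u32_le(raw.data() + 268);
    return out;
}
```

With the first entry of Anim_Mob_Zombie_Melee_Zako.bus, position would come out as 0 and length as 42303.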



It wouldn't make sense if the position was 0 in the archive though, it has to be relative to something... And if you have a good memory, you may remember this:

[METADATA]

OFFSET 0: Zeroed bytes (Size: 33)

OFFSET 33: S1 structure (Size: 272)

OFFSET 305: First S2 structure (Size: 276)

OFFSET 581 (305 + 276): Second S2 structure (Size: 276)

...and so on...

OFFSET v12: End of S2 structures



It is highly likely that v12 is the beginning of the actual file data! We have enough information to verify this right now:

- We know the .bus file "Anim_Mob_Zombie_Melee_Zako.bus" contains 71 files

- The size of an S2 structure is 276, so 276 * 71 = 19596

- We have to add the size of the S1 structure: 19596 + 272 = 19868

- And finally, we have to add the 33 zeroed bytes: 19868 + 33 = 19901
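If you'd rather let the compiler do the arithmetic, the same check can be sketched like this. data_start and absolute_position are hypothetical helper names, and the second assertion uses the position of the second file (42303) we read from its S2 structure:

```cpp
#include <cstdint>

// Start of the stored file data: zeroed bytes + S1 + every S2 structure.
constexpr std::uint32_t data_start(std::uint32_t file_count)
{
    return 33 + 272 + 276 * file_count;
}

// Position in the .bus file of a stored file, given the position from its S2 structure.
constexpr std::uint32_t absolute_position(std::uint32_t file_count, std::uint32_t position)
{
    return data_start(file_count) + position;
}

static_assert(data_start(71) == 19901, "start of the first stored file (0x4DBD)");
static_assert(absolute_position(71, 42303) == 62204, "start of the second stored file (0xF2FC)");
```

Those are the two positions to jump to in the hex editor.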



Let's open Anim_Mob_Zombie_Melee_Zako.bus in HexEdit and go to that position. Use this field to type in the position you'd like to go to:





You should land here:





This is indeed the beginning of a file. We know that file is 42303 bytes in size, so let's add it to 19901:





Another file!



It may not seem too obvious right now, but we have everything we need to make an extractor. Let me show you something you may have thought of already:



Code:
#pragma pack(push, 1)
//S1 Structure + 33 bytes
struct FILE_HEADER
{
    char data[33]; //0x0000
    char folder_path[256]; //0x0021
    char pad[12]; //0x0121
    uint32_t file_count; //0x012D
}; //Size: 0x131 (305)
#pragma pack(pop)

Code:
#pragma pack(push, 1)
//S2 Structure
struct FILE_INFO
{
    char path[256]; //0x0000
    char pad[8]; //0x0100
    uint32_t offset; //0x0108
    uint32_t length; //0x010C
    uint8_t unk_01; //0x0110
    bool encrypted; //0x0111
    uint8_t unk_03; //0x0112
    uint8_t unk_04; //0x0113
}; //Size: 0x114 (276)
#pragma pack(pop)

I will now use these names instead of S1 and S2, it should be clearer.
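A cheap way to make sure structures like these match the sizes and offsets we found is to let the compiler check them. This is only a sketch: it repeats the two structures with a fixed-width uint32_t for the file count (an assumption on my part, so the size stays 305 on 64-bit builds too) and packs both of them:

```cpp
#include <cstddef>
#include <cstdint>

#pragma pack(push, 1)
struct FILE_HEADER
{
    char          data[33];         // zeroed bytes
    char          folder_path[256]; // start of S1
    char          pad[12];
    std::uint32_t file_count;       // S1 + 268
};

struct FILE_INFO
{
    char          path[256];
    char          pad[8];
    std::uint32_t offset;    // S2 + 264
    std::uint32_t length;    // S2 + 268
    std::uint8_t  unk_01;    // S2 + 272
    bool          encrypted; // S2 + 273
    std::uint8_t  unk_03;
    std::uint8_t  unk_04;
};
#pragma pack(pop)

// Fail the build if the layout drifts from what we saw in the debugger.
static_assert(sizeof(FILE_HEADER) == 305, "33 zeroed bytes + 272-byte S1");
static_assert(offsetof(FILE_HEADER, file_count) == 33 + 268, "file_count is at S1 + 268");
static_assert(sizeof(FILE_INFO) == 276, "S2 is 276 bytes");
static_assert(offsetof(FILE_INFO, offset) == 264, "position field");
static_assert(offsetof(FILE_INFO, length) == 268, "length field");
static_assert(offsetof(FILE_INFO, encrypted) == 273, "second flag byte");
```

Without the packing, the compiler would be free to pad FILE_HEADER so that file_count is aligned, and the structure would no longer line up with the file.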

So, how would we make a program that extracts files from a .bus archive? It may not be completely clear how, so let me explain it step by step:



- Map the whole file into memory using MapViewOfFile

- Decrypt the S1 structure which is located at the mapped view's base address + 33. The file_count value will tell you how many FILE_INFO structures are in the archive. There are as many FILE_INFO structures as there are files.

- If the file count isn't 0, get the offset to the beginning of the list of FILE_INFO structures, which we know is:

Code: sizeof(FILE_HEADER); //305

- Read every FILE_INFO structure into a list (base is a const uint8_t* pointing to the start of the mapped view):

Code:
std::vector<FILE_INFO> v_info;
for (uint32_t i{ 0 }; i < header.file_count; ++i)
{
    //NOTE: the entry is still encrypted in the mapped view,
    //decrypt the copy (like CopyAndDecrypt does) before using its fields
    auto fi = *reinterpret_cast<const FILE_INFO*>(base + sizeof(FILE_HEADER) + (i * sizeof(FILE_INFO)));
    v_info.push_back(fi);
}

- The file data starts right after the last FILE_INFO structure, at:

Code: sizeof(FILE_HEADER) + (file_count * sizeof(FILE_INFO))

- Finally, copy each file out of the archive:

Code:
const size_t file_data_offset{ sizeof(FILE_HEADER) + (header.file_count * sizeof(FILE_INFO)) };
size_t processed_size{ 0 };
for (const auto& it : v_info)
{
    auto src_buffer = base + file_data_offset + processed_size;
    auto dest_buffer = std::make_unique<uint8_t[]>(it.length);
    memcpy_s(dest_buffer.get(), it.length, src_buffer, it.length);
    /* The current file data will now be contained in dest_buffer.
       You can implement your own function that dumps it to a file and call it here. */
    processed_size += it.length;
    //Also, don't forget error checking, etc! I am only showing this as an example
}
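If you want to try the copying step without mapping a real file, here is a minimal, portable sketch that works on an archive already loaded into a std::vector. Entry and slice_files are hypothetical names, the entries are assumed to be decrypted already, and it uses the position stored in each entry instead of a running total, which should give the same result if the files are stored back to back like the two we checked:

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical, already-decrypted view of one FILE_INFO entry.
struct Entry
{
    std::uint32_t offset; // relative to the end of the FILE_INFO list
    std::uint32_t length; // size of the stored file in bytes
};

// Returns one byte vector per entry, copied out of the archive image.
std::vector<std::vector<std::uint8_t>> slice_files(
    const std::vector<std::uint8_t>& archive,
    const std::vector<Entry>& entries)
{
    // 305 = sizeof(FILE_HEADER), 276 = sizeof(FILE_INFO)
    const std::size_t data_start = 305 + 276 * entries.size();

    std::vector<std::vector<std::uint8_t>> files;
    files.reserve(entries.size());
    for (const Entry& e : entries)
    {
        const std::size_t begin = data_start + e.offset;
        // Reject entries that would read past the end of the archive.
        if (begin > archive.size() || e.length > archive.size() - begin)
            throw std::out_of_range("FILE_INFO entry points outside the archive");
        const auto first = archive.begin() + static_cast<std::ptrdiff_t>(begin);
        files.emplace_back(first, first + static_cast<std::ptrdiff_t>(e.length));
    }
    return files;
}
```

Each returned vector can then be written to disk under the path stored in its FILE_INFO.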





You can recreate the whole directory structure with this information.

Assuming you're going to do that, the data folder's root should look like this after extraction.

Hi everyone,The following is a tutorial I wrote about 3 years ago. I found it very recently while going through old files and figured it would be helpful to some people here. I chose to leave it mostly unedited, so I hope you don't mind my past writing style / way of explaining things. There may also be some inconsistencies here and there.------------------------------------------------Hello,I haven't seen a lot of threads discussing the reversal of unknown file formats. People generally tend to use available extractors, however sometimes no public information is available for the format in question (for example, when the developer company uses their own format to protect their files).Formats can be very different, hence the importance of the title: this is simply an example. The aim of this tutorial is to show you some of the basic steps to work with an unknown format.An understanding of the following languages is required:- C++- x86 assembly (Intel syntax)I will attempt to make it as easy as possible for anyone to follow, but make sure you're not completely clueless about the above.I will be using the following tools:- HexEdit (or an hex editor of your choice)- OllyDBG (with the Stealth64 plugin, choose whatever works best for you)- IDA Pro (I'm using 6.8) with the Hex-Rays DecompilerI also assume you're familiar with each of those tools.We're going to be analyzing the file format Brawl Busters uses for protecting its data. Brawl Busters is an action-combat game that is not being published anywhere anymore, luckily the setup is still available for download - you can find it very easily with a simple Google search.Throughout the tutorial, I will also provide some pseudocode to show basic steps of making an extractor based on what we will have analyzed.After downloading the game, open its installation folder, and then the "bin" folder. 
You should see a file called "pbclient.exe" : that's the game client.I've uploaded a patched version of pbclient.exe, which is the one I have been using. I've simply patched some conditional jumps to prevent XTrap from loading. In the bin folder, rename pbclient.exe to anything you like and put the patched pbclient.exe in there.Once that's done, we should have the same installation folder. Time to get started!Taking a look at the root folder, the most logical thing to do would be to find out where the actual game data is stored. This isn't hard to find at all, since there is a "Data" folder, so let's open it and see what's inside:This does look like game data indeed! If you scroll down, you will also see some config files (.ini) as well as a few images (Splash.png for example).Most of the files in this folder have a weird .bus extension and seem to store most of the UI data. And that's pretty much what we will be interested in.For starters, what could .bus mean? The game is called "Brawl Busters" so it's very likely that ".bus" stands for ".busters".We haven't done much so far, but we already know that the game stores most of its data in .bus files, and that it definitely looks like a proprietary format (Google won't give you any information).So, what do they look like? This is where the hex editor will be useful. I always use it before using Notepad, because it's highly likely that the file can't be read as a plain text file. Let's open a small one, such as "Anim.bus":That doesn't tell us much... it looks like the data is encrypted.Let's open a larger one, say, "Anim_Mob_Zombie_Unique_Heavy.bus":This isn't really telling us much either... except one similarity: the first 33 bytes of each file are zeroed. If we open more .bus files, it turns out the first 33 bytes are always filled with 0's. This might be a useful detail, so keep it in mind.There is another detail: If you scroll down to 0x4ED0, you will notice some plain text:This is way more interesting already! 
If we Google "Gamebryo File Format, Version 20.6.0.0", we can see that it's a known format called "NIF" (it stands for NetImmerse File, but that isn't important).NIF files seem to be models, and their textures are loaded using DDS files. If we search "DDS" in HexEdit, we also get some results such as "texname="FX_GhostTrail_A_00.dds". This means some bus files are also storing DDS files.Searching for "Gamebryo" again returns several results. What conclusions can we draw from this? Well, .bus files seem to be archives, that is, they store multiple files at once. The data before the file data is most probably encrypted metadata (information about the archive as well as its files).Let's make a quick recap of what we know:- Most of the game data is stored in .bus files- These .bus files can store multiple files at once, so they are actually file archives- The game uses the Gamebryo Engine (or something that is based on it) since the stored files seem to be either NIF / KFM / DDS (if you want to verify that yourself, open other files and search for these three strings).What's next? We now have to figure out how the client decrypts / loads these files. This is where it gets more interesting.Let's go back to the "bin" folder, open pbclient.exe with IDA Pro, and let it analyze the file.The first thing we will be looking for is imports. What imports exactly? 
Well, since the client has to open these files, it's likely that it will use APIs such as CreateFile or fopen.Let's search for the first one, but wait a second:CreateFileMapping takes a handle to a file which is typically returned by an API such as CreateFile, let's cross-reference that API instead.The first reference we find is in this function:(NOTE: For ease of understanding, I will use the decompiler most of the time, its output should not be fully trusted but it gives us a general overview of what the function does)This function seems to create a file based on a supplied file name (a1 is a string, more on that later). The a3 parameter will contain the base address of the mapped file (MapViewOfFile) if no error occured.Let's look at the second reference to CreateFileMappingA, which is in this function:Guess we're in luck! The client seems to be mapping bus files using CreateFileMappingA and MapViewOfFile. We can even see some error strings.Before we continue, let's clear something up first. This line appears often and can seem to be very cryptic (line 56):Let's break it down:The game client doesn't use the C++ STL directly: it uses stlport, and this is why stlp_std appears often. This is pretty much like std:: .If we google basic_string, we will stumble upon this:As you can see, "traits" and "Alloc" are default parameters, so we can omit them.In our case the char type is used as the charT parameter, which means this would do the same thing:What if we google it?Hopefully this cleared it up for you! That whole function could be reduced to:Right click it in IDA Pro, choose "Rename global item" and change it to "stlp_std::string::ctor_0". You should now see this instead:(NOTE: I recommend you do this for every every constructor/function of std::string, that's also why I am numbering them. It makes the decompiler output clearer)Back to the actual function! What interests us next is the call to MapViewOfFile (line 100). 
If you don't know what it does exactly:In this case v13 will contain the base address of the mapped view (the whole file is mapped into memory). It is then passed as the second parameter to this function:Let's have a look at the function:NOTE: You probably have noticed, but I am openly ignoring the rest of the arguments (and a lot of other functions). I will only talk about them if needed.As we already know, "Src" is the address of our mapped view.0x21h is 33 in decimal. 33... doesn't this remind you of something? That's right: the first 33 bytes of every .bus files are always filled with 0's. This will copy 33 bytes at the address a1 + 0x108.The next interesting function is only three lines below:0x114h is 272 in decimal, and it's passing the address of the source buffer + 33 (skipping all the zeroes). Let's look at what it does:A lot clearer already, isn't it? We now know that 272 bytes will be processed. If we go back to the previous function, we can now rename some variables:We still don't know what v8 is though. It is set by a function:This very much looks like the Singleton design pattern. If you're unfamiliar with it, I suggest you google it, but here's a brief explanation:dword_XXXX are global variables. First, it checks if dword_F57C24 is not null. If not, it will call a function (the class' constructor) that will set its value. This is useful when you only need one instance of a class throughout the whole program.If you know about virtual function tables and take a look at the constructor code, you will notice the class actually has one. It would be nice to see if we could get RTTI info from that (if RTTI is enabled, that is). We will see about it later.We now know that v8 is definitely a class. Let's leave it for now, go back to 50B9A2 and take a look at the 50B91E function:Looks like this is it! 
First, let's rename the last two parameters since we already know what they are:The most important line is definitely this one:XOR is commonly used in encryption / decryption processes. As you can see, it's going byte by byte into an array located at a1 + 116 + v3 and XORing every single byte of the encrypted buffer with that array. If v3 is equal to 4096 (0x1000h), then it is reset to 0. The XORing goes on until v4 (incremented at the end of each iteration) is equal to or higher than the uiSize parameter.This might be a bit hard to understand at first, that's why we're going to switch tools and see it in action now.Load pbclient.exe in OllyDbg and let it analyze it. Once that's done, make sure you're in the right module by clicking the "E" button and selecting the "pbclient.exe" module.We're going to place a breakpoint at the first parameter to the CreateFileA call (remember? It's the one in 0x52AE36), which will be our file name.(NOTE: DON'T remove breakpoints after you set them, we will probably need them later)It's at 0x52AE82:Let's start the game now! If it crashes, I advise you download the Stealth64 plugin if you're on Windows 7 64 bit. I enabled all options except the ones in the "Misc" section, if that helps.If everything went fine, it should now break here:We now know it will be working with Anim.bus (also, we have the confirmation it uses this function). 
Let's place another breakpoint right after the call to MapViewOfFile to get the address of our mapped view (return values are typically stored in EAX):(by the way, make sure Anim.bus is not opened anywhere, else it might fail!)If it succeeded, the address of the mapped view will be in the EAX register, right click it in the Registers window and click "Follow in Dump":Let's go directly to 0x50B91E (where the XORs happen) and place another breakpoint there, then resume the game and the breakpoint should be hit again.Use F8 to step through the instructions until this one:Here, ECX is used as the counter for the array. It's added to EDX, which is a1 (remember, a1 is a class). Then the offset to the beginning of the array is added (74h / 114).In more understandable pseudocode, it may look likeLet's look at the line below:EAX holds the address of the destination buffer. It's also incremented at the end of each iteration until 'uiSize' bytes have been XORed.Follow EAX in the dump, set a breakpoint right after the loop (POP ESI), resume the game and look at the dump window:It seems to have decrypted the buffer! But that's not everything:We skipped 33 bytes and decrypted 272. But if we take a closer look at Anim.bus in the Data folder, it tells us that Anim.bus' size is 305 bytes. And it happens that 272 + 33 is...305! The whole Anim.bus file was decrypted.So, what does the decrypted buffer tell us? Not much, except we can now read what looks like a folder name.You can now write a simple application that does exactly what the game does. Use CreateFile, OpenFileMapping and MapViewOfFile to map your file into memory and decrypt the buffer (remember the starting position is the buffer + 33 bytes!). To decrypt it you will also need the whole array of bytes used as XOR keys, which is 4096 bytes long (4KB). 
It's very easy to get it: close and run the game again, and once you hit that instruction:Look at the value of EDX, add 114 to it, that's the address of the beginning of the array. Follow it in the dump window, add 4096 to that address, that's the address of the end of the array. You just need to copy all the bytes between these two addresses and format everything in Notepad++.Here it is if you want to try it out but are having trouble with it:My decryption function looks like this:Let's get back to IDA. We have clearly identified the purposes of two functions. One copies memory and calls the decryption function, and the other is the decryption function itself.With that in mind, we can rename them:If we encounter them again we won't need to waste time looking into them since we already know what they do.I believe a quick recap of what we know so far is in order:- Most of the game data is stored in .bus files- These .bus files can store multiple files at once, so they are actually file archives- The game uses the Gamebryo Engine since the stored files seem to be either NIF / KFM / DDS- The game client uses the CreateFile, OpenFileMapping and MapViewOfFile functions to map a whole .bus file into memory- It then decrypts 272 bytes starting from the mapped view's base address + 33 bytesLet's get back to the function that calls CopyAndDecrypt. First, it checks if the destination buffer exists, else it will simply return 0. Let's skip directly to this line:The first thing it does is getting the address of the destination buffer, it then looks at the integer value 0x10C (268d) bytes into that address. This might make it clearer for you:In Anim.bus' case, v11 will be 0.Let's analyze the next line:We can reduce it to:305.. Rings a bell? Yep, that was the size in bytes of Anim.bus.Now, we can clearly see there are two different structures. 
Let me explain in case you don't see them:- The first structure (let's call it S1) is 272 bytes in size and contains a string in form of a char array, which is a folder name. The last 4 bytes of the structure (offset 268) are an integer (v11).- The second structure (let's call it S2) is 276 bytes in size and a single .bus file seems to be able to store a several ones. If we look at the code above, we can safely assume there are at least 'v11' S2 structures in the .bus file.- The list of S2 structures start at the offset 305. This is because the first structure occupies 272 bytes, add on top of that the first 33 bytes of the file (they are always 0's).We now have a much better idea of how a .bus file's metadata is laid out:OFFSET 0: Zeroed bytes (Size: 33)OFFSET 33: S1 structure (Size: 272)OFFSET 305: First S2 structure (Size: 276)OFFSET 581 (305 + 272): Second S2 structure (Size: 276)...and so on.It should now be clearer why Anim.bus is only 305 bytes in size: There are no S2 structures at all in the file.Let's continue our analysis:This is the part that will interest us for now, because since v11 is equal to 0, this will be executed and the function will then return.But wait... would this REALLY interest us? Anim.bus was fully decrypted already and all it seemed to contain was a folder name. We don't want to have a perfect understanding of how the client does everything, we just need an understanding of how the client decrypts .bus files and parses them so that we can reproduce it and extract them without the help of the client.With that in mind, the next step will be to find a file whose in which the following:...would not return 0. It's not a tedious task at all:- first, remove the breakpoint you set at 0x50B91E.- We will now place a breakpoint at 0x52B94F. This is the instruction that initializes v11. If [EAX + 0x10C] (10Ch: 268) is 0, then we know it won't help us.- Press F9 to resume the game and it should hit the CreateFileA breakpoint again. 
Copy the file name just in case. Press F9 again until you land on the breakpoint you just set at 0x52B94F. As said above, if [EAX + 0x10C] (EAX holds the address of our S1 structure) is 0, we can skip it.- Continue to press F9 until [EAX + 0x10C] isn't 0 anymore.If you've done everything correctly, [EAX + 0x10C] will be equal to 0x47 (or 71) at some point, and the mapped .bus file should be called "Anim_Mob_Zombie_Melee_Zako.bus". According to the conclusions we drew earlier, this means there are 71 S2 structures in the file, starting at offset 305.Since v11 is not 0 anymore, these instructions will be executed instead:First thing we immediately notice is, sizeof(S2) (276) bytes are allocated on the heap. sub_52B7F1 seems to be a constructor, let's take a look:This gives us some information about the S2 structure:OFFSET 272: Unknown byte (Size : 1)OFFSET 273: Unknown byte (Size : 1)Since we already renamed the CopyAndDecrypt function it clearly appears in the decompiler output, so we can modify variable names accordingly:This line is the most important:Src is the address of the mapped file view. We know that 305 is the offset to the start of the S2 structures. As for v51, if we look earlier in the code, it is initialized to 0.Great, this seems to be decrypting a S2 structure! Let's switch to OllyDbg now. Step through the code until the arguments for the calls are pushed onto the stack:Grab EDI's value and follow it in the dump window, this is our destination buffer. Now we simply need to step through the code until the call to CopyAndDecrypt (pbclient.0050B9A2) is executed.Let's take a look at the dump window after the call:Couldn't be any clearer in my opinion! We previously discovered that unencrypted models / textures / animations were stored in .bus files. This S2 structure stores the file name for one of the files in the archive. And since we also know there can be multiple S2 structures in one archive, this means there must be one for each file. 
And how do we know how many S2 structures there are?Right! We can now rename v11 to "file_count" because that's exactly what it is. There were no files stored in Anim.bus, and that explains why file_count was set to 0.Keep on stepping through until you reach:We now know the structure has 4 bytes of data (an integer) that may interest us, at offset 264. It is set to 0 in this structure though.EAX contains the result of this expression (we've seen it earlier, remember?):What is v12 exactly? It's the offset to the end of the list of S2 structures. Let's add it to our .bus metadata layout:[METADATA]OFFSET 0: Zeroed bytes (Size: 33)OFFSET 33: S1 structure (Size: 272)OFFSET 305: First S2 structure (Size: 276)OFFSET 581 (305 + 272): Second S2 structure (Size: 276)...and so on...OFFSET v12: End of S2 structuresLet's continue to step through the code until we reach:Interesting, it's using some of our S2 structure's data and passing their addresses as arguments to the call. Let's step into pbclient.0052C06B (F7):This seems to be setting up another structure. Do we care? For now, not really... we can make our own additional structures if needed, but we are mainly interested in what these fields actually mean, and that structure may help us, so let's keep it in mind.We can now add more information to our S2 structure:[S2 STRUCTURE]OFFSET 0: File name (Size: 264 (it could be less, but it's just so that it fits for now))OFFSET 264 [EDI + 0x108]: Unknown integer (Size: 4)OFFSET 268 [EDI + 0x10C]: Unknown integer (Size: 4)OFFSET 272 [EDI + 0x110]: Unknown byte (Size: 1)OFFSET 273 [EDI + 0x111]: Unknown byte (Size: 1)Wait a minute... if you check every of these offset, you will notice there's a value at offset 268: 0xA53F (Decimal: 42303). And if we skip to this line:What does this mean? There's also a value at offset 264! 
The values at these two offsets are added together and if they're greater than "a3" (which is the third argument passed to the current function), then it breaks.

Now, what could a3 be? Let's just find cross references to the current function we're in (sub_52B8BA), and we land back here:

a3 being the third parameter, we want to know what v57 is. And if we scroll up a bit...

Self-explanatory, isn't it? v57 is simply the size of the .bus file, and the size of the file is passed as the third argument to sub_52B8BA.

We can update our parameter names:

Scrolling back down, we now see this instead:

Also, what we know about the current iteration so far:

ITERATION 0: OFFSET 264: 0 - OFFSET 268: 42303

And if we look at the rest of the loop:

If you scroll up, you will notice that v51 is actually our file count.

We now know that the function will return if either:

- [S2 + 264] + [S2 + 268] > File size
- i (a1) >= file count

Also, a5 is incremented by sizeof(S2) (276). If we scroll up, we can see that it was previously initialized to the same value (276).

Since a1 gets incremented during every iteration, we could see it as a for loop:

Let's update what we know about the iteration:

ITERATION 0: OFFSET 264: 0 - OFFSET 268: 42303 - a5: 552 (276 + 276)

What's going to matter the most now is the second iteration.

If you check the code, you will see that v51 was initialized to 0 at the beginning of the function, and then it takes a5's value (which was 552).

Let's go back to OllyDbg and keep stepping until we reach the end of the loop and these instructions:

Once again, EDI is our destination buffer, so grab its value and follow it in the dump.

As expected, another file name. What we're more interested in though is the values at offsets 264 and 268. Remember, first iteration:

ITERATION 0: OFFSET 264: 0 - OFFSET 268: 42303 - a5: 552 (276 + 276)

Now if we check the values at these offsets again, we get:

ITERATION 1: OFFSET 264: 42303 - OFFSET 268: 12125

Is this starting to make more sense?
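To keep track of what the loop checks, its two exit conditions can be written down as a small function. This is a rough reconstruction under assumptions, not the decompiler output: the names are hypothetical, the entries are assumed to be already decrypted and stored back to back, and the integers are read as little-endian (the game is an x86 binary).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kS2Size = 276; // sizeof(S2), from the allocation seen earlier

// Reads a 32-bit little-endian integer from p.
std::uint32_t ReadU32(const unsigned char* p)
{
    return static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Walks the S2 entries and returns how many were accepted. The walk stops
// when i reaches file_count, or as soon as [S2 + 264] + [S2 + 268] exceeds
// file_size, which mirrors the two conditions listed above.
std::size_t CountAcceptedEntries(const unsigned char* entries,
                                 std::size_t file_count,
                                 std::uint64_t file_size)
{
    assert(entries != nullptr || file_count == 0);

    std::size_t i = 0;
    for (; i < file_count; ++i) {
        const unsigned char* s2 = entries + i * kS2Size;
        const std::uint64_t field_264 = ReadU32(s2 + 264);
        const std::uint64_t field_268 = ReadU32(s2 + 268);
        if (field_264 + field_268 > file_size) {
            break; // the sum points past the end of the .bus file
        }
    }
    return i;
}
```

The sums are computed in 64 bits so that two large 32-bit values cannot wrap around and slip past the check.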
If you repeat those loops over and over again, you should clearly see a pattern.

OFFSET 264: The position of the beginning of the file in the archive
OFFSET 268: The file's length

We can update our S2 structure!

[S2 STRUCTURE]
OFFSET 0: File name (Size: 264 (it could be less, but this is more convenient for us))
OFFSET 264 [EDI + 0x108]: Position in the file (Size: 4)
OFFSET 268 [EDI + 0x10C]: File length (Size: 4)
OFFSET 272 [EDI + 0x110]: Unknown byte (Size: 1)
OFFSET 273 [EDI + 0x111]: Unknown byte (Size: 1)

It wouldn't make sense if the position was 0 in the archive though, it has to be relative to something... And if you have a good memory, you may remember this:

[METADATA]
OFFSET 0: Zeroed bytes (Size: 33)
OFFSET 33: S1 structure (Size: 272)
OFFSET 305: First S2 structure (Size: 276)
OFFSET 581 (305 + 276): Second S2 structure (Size: 276)
...and so on...
OFFSET v12: End of S2 structures

It is highly likely that v12 is the beginning of the actual file data! We have enough information to verify this right now:

- We know the .bus file "Anim_Mob_Zombie_Melee_Zako.bus" contains 71 files
- The size of an S2 structure is 276, so 276 * 71 = 19596
- We have to add the size of the S1 structure: 19596 + 272 = 19868
- And finally, we have to add the 33 zeroed bytes: 19868 + 33 = 19901

Let's open Anim_Mob_Zombie_Melee_Zako.bus in HexEdit and go to that position. Use this field to type in the position you'd like to go to:

You should land here:

This is indeed the beginning of a file. We know that file is 42303 bytes in size, so let's add it to 19901:

Another file!

It may not seem too obvious right now, but we have everything we need to make an extractor. Let me show you something you may have thought of already:

I will now use these names instead of S1 and S2, it should be clearer.

So, how would we make a program that extracts files from a .bus archive?
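A good starting point is to write the layout down in code and let the compiler check the arithmetic. The following is a minimal sketch under assumptions: FILE_INFO and path are names this tutorial uses for the S2 structure, while position, length, the unknown_* fields, the trailing unexplored bytes and both helper functions are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// One entry of the file table, as recovered so far (276 bytes).
struct FILE_INFO {
    char          path[264];     // file name / path, NUL-terminated
    std::uint32_t position;      // offset 264: relative to the start of the file data
    std::uint32_t length;        // offset 268: size of the file in bytes
    std::uint8_t  unknown_272;   // offset 272: meaning not determined
    std::uint8_t  unknown_273;   // offset 273: meaning not determined
    std::uint8_t  unexplored[2]; // offsets 274-275: not examined (assumption)
};
static_assert(sizeof(FILE_INFO) == 276, "must match the observed allocation size");
static_assert(offsetof(FILE_INFO, position) == 264, "position is at [EDI + 0x108]");
static_assert(offsetof(FILE_INFO, length) == 268, "length is at [EDI + 0x10C]");

constexpr std::uint64_t kZeroedPrefix = 33;  // zeroed bytes at the start of the archive
constexpr std::uint64_t kS1Size       = 272; // size of the S1 structure
constexpr std::uint64_t kFileInfoListOffset = kZeroedPrefix + kS1Size; // 305

// Offset of the first byte of file data (the value called v12 earlier).
constexpr std::uint64_t DataOffset(std::uint32_t file_count)
{
    return kFileInfoListOffset + sizeof(FILE_INFO) * static_cast<std::uint64_t>(file_count);
}
static_assert(DataOffset(71) == 19901, "matches the manual calculation for 71 files");

// Absolute offset of one file's first byte inside the archive.
inline std::uint64_t AbsoluteOffset(std::uint32_t file_count, const FILE_INFO& info)
{
    assert(file_count > 0); // an empty archive has no FILE_INFO to resolve
    return DataOffset(file_count) + info.position;
}
```

With 71 files, an entry whose position is 42303 resolves to 19901 + 42303 = 62204, which is where the second file was found in the hex editor.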
It may not be completely clear how, so let me explain it step by step:

- Map the whole file into memory using MapViewOfFile.
- Decrypt the S1 structure, which is located at the mapped view's base address + 33. The file_count value will tell you how many FILE_INFO structures are in the archive. There are as many FILE_INFO structures as there are files.
- If the file count isn't 0, get the offset to the beginning of the list of FILE_INFO structures, which we know is 33 + 272 = 305.
- Loop through all the FILE_INFO structures. You could do something along the lines of:
- Calculate the offset to the beginning of the file data, which from the layout above is 305 + 276 * file_count.

You can use the vector to read each file now. For example:

If you're wondering where to extract each file exactly, have a look at the "path" field in the FILE_INFO structure. You can recreate the whole directory structure with this information.

Assuming you're going to do that, the data folder's root should look like this after extraction.

Last edited by Laooo; 18th April 2019 at 10:55 PM. Reason: I should consider reading the rules carefully before posting...