Debugging/Reversing NT System Binaries

Author: AlexIonescu
Created: Tuesday, May 16 2006 00:13.09 CDT

Here are some tips I thought I'd share in a blog entry. Some of these may seem fairly obvious, but I've come across many reverse engineers who are not aware of the wealth of resources available for easier NT reversing and debugging. Feel free to message me any additional resources so that I may add them.



1) Checked builds.

This is your first priority. If you are reversing a retail binary, STOP NOW. You are missing out on a wealth of debugging messages, assertions and easier-to-read code. Here are some of the advantages of checked builds:



* Mostly FREE for ANY NT OS. That's right, if you want to compare code across NT versions, you don't need to carry your 15 CDs of every version released (or worse, beg around the Internet). NT Service Packs contain all the core system files you're likely to reverse, and their checked builds are free to download. Granted, you will be missing out on the retail versions, but now you don't need to buy Windows 2003 to reverse a Windows 2003 binary.

* Much, much, much easier code to read. Checked builds are not built with OMAP, the compiler technology which splits functions into chunks and reorganizes them for better CPU caching. That means that functions are linear and a breeze to reverse.

* Debug prints. These are just awesome. Microsoft developers are telling you what's going on in their code, so you don't have to guess. Sometimes you can even find warnings (e.g. "This will crash if the user sends RTL_FOO!!"), unfixed bugs, etc. Some binaries have entire built-in dumping functions, such as Dbg_DumpSomeStructure, which will graphically print out some huge structure that you don't need to reverse anymore. Debug prints can also give you valuable flag names, constants, etc.

* Assertions. Good Microsoft code (especially core/system-level) is filled with assertions. These assertions are actually C code, and more often than not will give you the name of a flag, structure member, or other symbolic names which are not public. While reversing a file once, I was able to find the name (and thus function) of about 18 fields out of a 25-field structure, merely by reading the assertions.

* Run-time profiling, debugging, or other helpful functions. If you are feeling curious, you can actually try using a checked build live on your system (I recommend replacing only the specific binary or set of binaries you care about, however). Coupled with WinDBG, this could give you new ways to analyze the binary, create complex debug logs, and even use built-in profiling/timing code if your reversing project is performance-related, or if you're just curious. Again, only in a checked build.

* Tracing and protection. This applies more to testing your code, but checked builds also enable many tracing options in the kernel, which can be useful for reversing. For example, you can track a heap block, or any kernel object, and see a list of all acquires/releases, creators and users, which can sometimes be more useful than putting a memory breakpoint on a structure.
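As a rough illustration of the debug-print and assertion points above, here is a minimal Python sketch that pulls printable strings out of a binary blob and harvests anything that looks like a structure member access or an ALL_CAPS constant. The sample bytes, the `Foo` structure and the `FOO_FLAG_INITIALIZED` name are all made up for the example, and the regular expressions are simple heuristics that will produce false positives on real files (paths, format strings and so on).

```python
import re

def ascii_strings(blob, min_len=6):
    """Return runs of printable ASCII at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(blob)]

# Member accesses such as "Thread->ApcState" or "Header.Size".
MEMBER = re.compile(r"\b([A-Za-z_]\w*)(?:->|\.)([A-Za-z_]\w*)")
# ALL_CAPS identifiers with at least one underscore: likely flags/constants.
CONSTANT = re.compile(r"\b[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)+\b")

def harvest_names(blob):
    """Collect (structure, member) pairs and constant names from a blob."""
    members, constants = set(), set()
    for text in ascii_strings(blob):
        members.update(MEMBER.findall(text))
        constants.update(CONSTANT.findall(text))
    return sorted(members), sorted(constants)

# Hypothetical data: two assertion expressions surrounded by non-text bytes.
sample = (b"\x00\x01(Foo->RefCount > 0)\x00\x90\x90"
          b"(Foo->Flags & FOO_FLAG_INITIALIZED) != 0\x00")

members, constants = harvest_names(sample)
print(members)
print(constants)
```

On a real checked binary you would read the file with `open(path, "rb").read()` and review the output by hand; the point is only that the names are sitting there in plain text.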



OK, so where to get them? A good place for up-to-date links is on OSR's site:

http://www.osronline.com/article.cfm?id=259



2) PDBs (Symbols).

Perhaps I should've put this first, because it really is even more basic, but I'm going at this in logical order. PDBs. Symbols. Debug Databases. Whatever you want to call them, you should not be reversing without them. In their most basic form, they will give you the internal name of every function in your binary (except statics), as well as global variables. With an OMAP-binary, they also contain special information to link chunked functions together. This means that your call 080854 just became call AdvapipGenerateHash, making your job a lot easier. With something like HAL or the kernel, PDBs also contain a wealth of structures not publicly documented in the WDK/PSDK. Unfortunately, IDA doesn't parse them, but if you use the pdbPlus plugin (available on the site), IDA will automatically add them to its structure database.
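To make the "call 080854 becomes call AdvapipGenerateHash" step concrete, here is a small Python sketch of nearest-symbol lookup. It assumes you already have a list of (address, name) pairs from somewhere; it does not parse a PDB, and it assumes each function is one contiguous range, so it ignores the chunked-function case that OMAP introduces. The second symbol name and both addresses are hypothetical.

```python
import bisect

class SymbolTable:
    """Maps an address to 'Name' or 'Name+0xoffset' using sorted start addresses."""

    def __init__(self, symbols):
        # symbols: iterable of (start_address, name) pairs
        self._entries = sorted(symbols)
        self._starts = [start for start, _ in self._entries]

    def lookup(self, address):
        # Find the last symbol whose start address is <= address.
        index = bisect.bisect_right(self._starts, address) - 1
        if index < 0:
            return hex(address)  # below the first known symbol
        start, name = self._entries[index]
        offset = address - start
        return name if offset == 0 else f"{name}+{offset:#x}"

# Hypothetical symbol list for illustration.
table = SymbolTable([
    (0x080854, "AdvapipGenerateHash"),
    (0x080900, "AdvapipSomeOtherRoutine"),
])

print("call", table.lookup(0x080854))
print("jmp ", table.lookup(0x080860))
```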



3) WinDBG.

The Debugging Tools for Windows (Windows Debugger/WinDBG) is an extremely valuable tool, not for its disassembler, but for the myriad of extensions that it provides, which also have built-in code to dump structures which are unavailable anywhere else. For example, two of its extensions are able to dump CSR_PROCESS and CSR_THREAD, which are the structures used by CSRSS, and not documented anywhere. Again, having access to structures and symbolic/flag/constant names can go a long way toward understanding what a function does.



4) Information.

Now that you're all set up with the tools and binary, there is one more thing you should do: learn, read, and get acquainted with what you're going to debug/reverse. Read all the documentation available, browse internet sites, see what others have discovered. But please don't post excitedly that "omfg, if you set fs:18h+874h & 0x5 >> 3 you get a bugcheck", when you haven't taken the time to understand what fs:18h is in the first place, what member of the TEB 874h is, and what the 0x5 flag's symbolic name/meaning is. Because you might as well have discovered that setting NtCurrentTeb()->CrashMode & PS_CRASH_IMMEDIATELY crashes, which isn't really interesting to know, or at the very least, makes a lot more sense if presented that way.
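Presenting a raw value symbolically is mostly a table lookup once you know the names. The Python sketch below decodes a flag value into the names of its set bits. The flag table is entirely invented: PS_CRASH_IMMEDIATELY is the joke name from the paragraph above, and PS_CRASH_ON_EXIT and both bit values are hypothetical.

```python
# Hypothetical flag names and bit values, for illustration only.
CRASH_MODE_FLAGS = {
    0x1: "PS_CRASH_IMMEDIATELY",
    0x4: "PS_CRASH_ON_EXIT",
}

def describe_flags(value, names):
    """Render a flag value as 'NAME_A | NAME_B', keeping unknown bits as hex."""
    parts = [name for bit, name in sorted(names.items()) if value & bit]
    known_mask = 0
    for bit in names:
        known_mask |= bit
    unknown = value & ~known_mask
    if unknown:
        parts.append(f"{unknown:#x}")  # bits with no name yet
    return " | ".join(parts) if parts else "0"

print(describe_flags(0x5, CRASH_MODE_FLAGS))
print(describe_flags(0xD, CRASH_MODE_FLAGS))
```

Leftover bits are printed in hex rather than dropped, so a write-up shows exactly which parts of the value are still not understood.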



Bonus: 5) Code Quality.

Apart from not producing un-symbolized crap like the example above, it's a good idea to:



* Comment your code. Properly. Extensively.

If I see some discovery or exploit that is poorly commented, I'm not going to assume you were lazy, since you spent all this time reversing it. I'm going to assume you don't really know why your code is doing what it's doing.



* Portability.

More often than not, low-level NT code is completely unportable. While this is by design in many cases, it wouldn't hurt to try remaining as compatible as possible. Don't do system calls by using sysenter directly. Yes, yes, you look cool and it's 2 cycles faster, but it'll also not run on my service pack. And again, don't hard-code offsets. Use actual structures/headers, which may be versioned so that if I want a compatible version on 2003, I can just build it by using NTDDI_VERSION.
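The same "structures, not offsets" idea carries over to analysis scripts. Here is a minimal Python sketch using ctypes that picks a structure layout by OS version instead of hard-coding where a field lives. The FOO_INFO structure and both of its layouts are hypothetical; the two version constants follow the NTDDI_WINXP and NTDDI_WS03 values from the Windows SDK headers.

```python
import ctypes

NTDDI_WINXP = 0x05010000
NTDDI_WS03 = 0x05020000

# Hypothetical structure whose layout changed between versions.
class FOO_INFO_XP(ctypes.LittleEndianStructure):
    _fields_ = [("Size", ctypes.c_uint32),
                ("Flags", ctypes.c_uint32)]

class FOO_INFO_WS03(ctypes.LittleEndianStructure):
    _fields_ = [("Size", ctypes.c_uint32),
                ("Reserved", ctypes.c_uint32),  # new field shifts Flags
                ("Flags", ctypes.c_uint32)]

LAYOUTS = {NTDDI_WINXP: FOO_INFO_XP, NTDDI_WS03: FOO_INFO_WS03}

def read_flags(raw, version):
    """Read Flags by name; the layout for the given version supplies the offset."""
    layout = LAYOUTS[version]
    return layout.from_buffer_copy(raw).Flags

# Hypothetical raw dumps of the structure on each version.
xp_blob = bytes.fromhex("08000000" "10000000")
ws03_blob = bytes.fromhex("0c000000" "00000000" "10000000")

print(hex(read_flags(xp_blob, NTDDI_WINXP)), FOO_INFO_XP.Flags.offset)
print(hex(read_flags(ws03_blob, NTDDI_WS03)), FOO_INFO_WS03.Flags.offset)
```

Code written against "offset 4" would silently read Reserved on the newer layout; code written against the field name keeps working once the right layout is selected.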



* Thread safety, multi-processor.

Again, much low-level NT code seems to be saying "oh well, I'm hacking sh*t anyways, who cares if I do it badly and I don't respect actual coding methodology". Before you publish your code, try to test it on a variety of systems. Try to profile it and stress it. Identify potential race conditions and fix them. Make sure your code is thread-safe and multiprocessor compatible. Just because you're running on a uniprocessor machine doesn't mean everyone else is. There are a great number of things you need to start worrying about in NT kernel mode when you're on a multi-processor machine. Use "volatile" in C when needed. Use Interlocked operations when required. Don't change a pointer that could be read by another thread at the same time. Don't do CPU-level modifications without synchronizing them across all CPUs (learn about IPI). Each CPU has its own IDT, GDT, etc. Remember that before you only hook one.
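For readers who have not hit this class of bug yet, here is a small Python sketch of why an unsynchronized read-modify-write loses updates and how making it atomic fixes it. This is only an analogy: Python has no Interlocked API, so a `threading.Lock` stands in for what an interlocked increment would do in C on NT, and the first function replays the bad interleaving by hand so the result is deterministic.

```python
import threading

def lost_update_demo():
    """Replay the bad interleaving: both CPUs read before either writes."""
    shared = 0
    cpu0_read = shared      # CPU 0 reads 0
    cpu1_read = shared      # CPU 1 reads 0 before CPU 0 has written
    shared = cpu0_read + 1  # CPU 0 writes 1
    shared = cpu1_read + 1  # CPU 1 also writes 1: one increment is lost
    return shared

class Counter:
    """Counter whose read-modify-write is made atomic with a lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value

def hammer(workers=4, per_worker=1000):
    """Increment one shared counter from several threads and return the total."""
    counter = Counter()

    def work():
        for _ in range(per_worker):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value

print(lost_update_demo())
print(hammer())
```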



* 64-bit

Again, just because you don't have a 64-bit machine doesn't mean you shouldn't make your code as compatible with 64-bit as possible. Sure, that's sometimes impossible, but at least use /Wp64 so you get warned about obvious 64-bit incompatibilities and broken code. Minimize your use of assembly if possible. Version 14 of MSC (in the WDK or MSVC 2005) has many intrinsics that are portable when recompiled, including stuff like getting the return address, reading eflags, setting/reading/writing fs/gs/dr*/cr*, etc.
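The most common thing that kind of warning catches is a pointer stored in a 32-bit integer. The Python sketch below uses ctypes fixed-width types to show the effect: a 32-bit field (the size of ULONG on both 32-bit and 64-bit Windows) silently drops the upper half of a 64-bit address, while a 64-bit field (the size of ULONG_PTR on 64-bit Windows) keeps it. The addresses are hypothetical.

```python
import ctypes

def survives_round_trip(pointer_value, storage_type):
    """True if the value comes back unchanged after being stored in storage_type."""
    return storage_type(pointer_value).value == pointer_value

# Hypothetical addresses for illustration.
kernel_ptr_64 = 0xFFFFF80012345678
kernel_ptr_32 = 0x80501234

# ctypes integer types do no overflow checking; extra bits are discarded.
truncated = ctypes.c_uint32(kernel_ptr_64).value

print(hex(truncated))
print(survives_round_trip(kernel_ptr_64, ctypes.c_uint32))
print(survives_round_trip(kernel_ptr_64, ctypes.c_uint64))
print(survives_round_trip(kernel_ptr_32, ctypes.c_uint32))
```

The 32-bit address survives either way, which is exactly why this kind of bug goes unnoticed until the code is first run on a 64-bit system.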



That's all I can think of for the moment, and I hope nobody takes offense. All the examples I've given were off the top of my head and I'm not targeting anyone in particular; these are just some considerations.







