antiflag.sys - Writing a kernel driver to remove the LLKHF_INJECTED flag

------------------------------------------------------------------------

#####################################################################

written by UsualSuspect for reddit.com/r/reverseengineering, 11/29/10

May be copied or redistributed as long as the full document as it is

stays intact.

#####################################################################

Introduction

------------

A while ago I came across a game that checked for injected keyboard input. It did so by the probably well-known method of using SetWindowsHookEx() with a hook type of WH_KEYBOARD_LL. When a key is pressed, Windows notifies all hooks and supplies the following data structure:

typedef struct tagKBDLLHOOKSTRUCT {

DWORD vkCode;

DWORD scanCode;

DWORD flags;

DWORD time;

ULONG_PTR dwExtraInfo;

} KBDLLHOOKSTRUCT, *PKBDLLHOOKSTRUCT, *LPKBDLLHOOKSTRUCT;

The only field of interest for us is flags. When keyboard input is generated by a program using SendInput() (among others, as far as I know), Windows sets the LLKHF_INJECTED flag, telling the hook procedure that the input originated from a program. The problem is that if we want to cheat in a game, we usually want to fake input such as mouse movements or key presses. With this hook in place, though, the game can find out we're cheating and possibly ban us. So I had to find a way to remove this flag. There are dozens of spots where this could be done. We could, for example, hook the LowLevelKeyboardProc, remove the flag and pass the data on to the original proc.

The target I was writing the hack for was very careful though and checked each and every function for hooks. It hashed the memory, it made sure system DLLs (such as user32.dll, which is of interest for us) weren't tampered with, and so on. Although all of this took place in user mode and we could simply defeat each detection mechanism in place, I decided to take another route. What I wanted to do was write a kernel driver to deal with this issue and thus remove the flag transparently for user mode applications. This way, I didn't have to deal with any detection mechanism the game implemented now or in the future.

Kernel land unfortunately was completely new to me, as I had restricted my reverse engineering and hacking exclusively to user mode, so I had to find a way to get started. At this point, I'd like to point out that I'm probably wrong about a few assumptions I made and I may state things in this paper that are simply wrong. Feel free to correct me if this is the case. In the end though, my kernel driver worked, so I was satisfied.

Getting started

---------------

The goal was writing a kernel driver to globally remove the LLKHF_INJECTED flag so no game or application was able to distinguish between regular and injected input. Because I had never done kernel stuff I had to read up on it and got the book "Rootkits: Subverting the Windows Kernel" by Hoglund and Butler. There I learned how to set up my build environment, how to create the most basic driver and other important things like that. The book is what got me started and I heartily recommend it to readers wanting to do the same. Although I didn't have any hands-on practice with kernel mode code, I knew about a few things in theory, such as SSDT hooks, interrupt hooks, problems that may arise from paging (meaning memory could be paged out, that is written to the pagefile on the disk, and not be available) et cetera.

With all of this in mind, I set up my environment, which consisted of a plain XP SP2 installed in a VirtualBox VM to which I connected with kd (the kernel debugger from Microsoft found in the Debugging Tools for Windows) using VirtualKD, which is a modification for VMware/VirtualBox that allows for much faster data transfer. You want to do the same, as writing kernel code is tedious and an error usually means a bluescreen. There is no way I would have gone through with it if I didn't have a virtual machine to test my code.

In order to quickly test if I was successful, I wrote a little tool that installs a global low-level keyboard hook and just tells us whether the given input was injected (= flag set) or not, so I didn't have to fire up the game all the time.
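The core check inside such a test tool can be sketched in portable C as follows. This is only an illustration of the flag test, not the tool itself: KbdLlHookData is a hypothetical stand-in for the 32-bit KBDLLHOOKSTRUCT shown above, and is_injected() is a name made up for this sketch. On Windows the real hook procedure would receive a pointer to the real structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value of LLKHF_INJECTED in KBDLLHOOKSTRUCT.flags. */
#define LLKHF_INJECTED 0x00000010u

/* Hypothetical portable mirror of KBDLLHOOKSTRUCT (field order as above). */
typedef struct {
    uint32_t  vkCode;
    uint32_t  scanCode;
    uint32_t  flags;
    uint32_t  time;
    uintptr_t dwExtraInfo;
} KbdLlHookData;

/* Returns 1 if the event carries the injected flag, 0 otherwise. */
static int is_injected(const KbdLlHookData *k)
{
    assert(k != NULL);
    return (k->flags & LLKHF_INJECTED) != 0;
}
```

A hook procedure would call something like is_injected() on every event and print the result, which is all the test tool needs to do.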

Finding the weak spot

---------------------

Now that my setup was running, I needed to find where to "apply the wrench", i.e. where to apply hooks to patch out the LLKHF_INJECTED flag. To do so, I followed the obvious path of checking what SetWindowsHookEx() does. Following xrefs inside it, I quickly found out it jumps to the kernel counterpart of user32.dll, which happens to be win32k.sys, the kernel driver for GUI-related code.

.text:77D3E60D _NtUserSetWindowsHookEx@24 proc near ; CODE XREF: _SetWindowsHookEx(x,x,x,x,x,x)+2Ep

.text:77D3E60D mov eax, 1225h

.text:77D3E612 mov edx, 7FFE0300h

.text:77D3E617 call dword ptr [edx]

.text:77D3E619 retn 18h

.text:77D3E619 _NtUserSetWindowsHookEx@24 endp

SetWindowsHookEx() goes through a few calls and ends up here. What this does is call the kernel, so I fired up IDA and checked win32k.sys. None of this was systematic: I just followed xrefs, looked at the exports/functions (let IDA download symbols!) etc. In the end I found a few interesting functions, for example:

NtUserCallNextHookEx

xxxCallNextHookEx

xxxCallNextHookEx2

and the key function turned out to be

xxxHkCallHook

There I found several calls that looked interesting, and one in particular was:

.text:BF9073E5 call fnHkINLPKBDLLHOOKSTRUCT(x,x,x,x,x)

KBDLLHOOKSTRUCT? Now that sounds exactly like the struct we want to modify! The function itself is rather short and doesn't really do more than call KeUserModeCallback(). KeUserModeCallback() is the function the kernel uses to call (as the name implies) callbacks in user mode.

The call:

.text:BF92A693 lea eax, [ebp+output_len]

.text:BF92A696 push eax ; output_len

.text:BF92A697 lea eax, [ebp+output]

.text:BF92A69A push eax ; output

.text:BF92A69B push 36 ; inputlen

.text:BF92A69D lea eax, [ebp+input]

.text:BF92A6A0 push eax ; input

.text:BF92A6A1 push 2Dh ; Call 77D5894D @ User32.dll

.text:BF92A6A1 ; = DispatchHook()

.text:BF92A6A3 call ds:KeUserModeCallback(x,x,x,x,x)

Thanks to some paper at UNINFORMED, I found the parameters for this function. 0x2D is the number of the user mode callback and it turned out to be user32!DispatchHook(). So if you feel like going the easy way, just modify this function in user mode to patch out the flag and you're done. Of course, this would be too easy.

So what do we do now? There are a handful of functions used in calling hooks. One function, fnHkINLPKBDLLHOOKSTRUCT(), most probably is passed a KBDLLHOOKSTRUCT which should contain our flag. If this is the case, we have found the weak spot where the actual hook struct is passed through to user mode. That means we could hook this function, patch out the flag and the structure would end up in user mode with the LLKHF_INJECTED flag removed. To verify this idea, I fired up WinDbg and tried to look at the code.

kd> u BF92A6A3

win32k!fnHkINLPKBDLLHOOKSTRUCT+0x4c:

bf92a6a3 ?? ???

^ Memory access error in 'u BF92A6A3'

Huh? But this is inside win32k.sys, just show me the call already? As it turns out, this wouldn't be as easy as that. After googling my fingers sore, asking lots of sources and more, someone in /r/reverseengineering enlightened me. The problem is that win32k.sys isn't mapped into all processes. So if I happen to break in a process that doesn't have a GUI, there will be no win32k.sys mapped into the process' memory and I can't disassemble it. The solution is simple. Find a GUI process, set the process context to it and you're good. This is how it's done:

kd> !process 0 0

**** NT ACTIVE PROCESS DUMP ****

...

PROCESS 86565020 SessionId: 0 Cid: 0250 Peb: 7ffdf000 ParentCid: 0178

DirBase: 0bc7e000 ObjectTable: e100d928 HandleCount: 334.

Image: csrss.exe

!process 0 0 shows the running processes. From what I learned later on, csrss.exe is a GUI process, so we can just use its context. To switch, we simply do

kd> .process 86565020

Implicit process is now 86565020

WARNING: .cache forcedecodeuser is not enabled

And try disassembling the call again:

kd> u BF92A6A3

win32k!fnHkINLPKBDLLHOOKSTRUCT+0x4c:

bf92a6a3 ff1534aa98bf call dword ptr [win32k!_imp__KeUserModeCallback (bf98aa34)]

bf92a6a9 8bf0 mov esi,eax

bf92a6ab e88a63edff call win32k!EnterCrit (bf800a3a)

...

Nice! So now, let's place a bp (which can be done without changing the context to a GUI process, btw.) and examine the input buffer passed.

kd> bp bf92a6a3

WARNING: Software breakpoints on session addresses can cause bugchecks.

Use hardware execution breakpoints (ba e) if possible.

kd> g

Breakpoint 1 hit

win32k!fnHkINLPKBDLLHOOKSTRUCT+0x4c:

bf92a6a3 ff1534aa98bf call dword ptr [win32k!_imp__KeUserModeCallback (bf98aa34)]

Alright, right before the call. We know KeUserModeCallback() gets 5 arguments, so let's see what is on the stack. The first should be 0x2D for DispatchHook(), followed by the address to the input buffer.

kd> dd esp L5

ed63faec 0000002d ed63fb0c 00000024 ed63fb38

ed63fafc ed63fb34

which is true. So now let's see what is in the input buffer.

kd> dd ed63fb0c L8

ed63fb0c 000d0000 00000100 00401320 77d26df4

ed63fb1c 00000025 00000012 00000010 00019816

I don't know what the first 4 DWORDs are, but the next 4 DWORDs look promising. My tool sends an injected VK_LEFT, which is 0x25. 0x12 is the scancode (which can differ, it depends on quite a few things from what I read), so the third of these DWORDs should be the flags. If we look up the value of LLKHF_INJECTED, we will see that it is 0x10, which is awesome. So the 7th DWORD in the input buffer (index 6 when counting from 0) is the flags field. Now we know where to apply our hook and where to find the data we need to modify.
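To make the layout concrete, here is a small sketch that decodes the eight DWORDs from the dump above. The names HOOK_HEADER_DWORDS, KbdFields and decode_callback_input() are made up for this illustration, and the assumption that exactly four unknown DWORDs precede the structure comes only from this one XP SP2 capture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LLKHF_INJECTED     0x00000010u
#define HOOK_HEADER_DWORDS 4  /* assumption: 4 unknown DWORDs come first */

typedef struct {
    uint32_t vkCode, scanCode, flags, time;
} KbdFields;

/* Pull the keyboard fields out of a captured callback input buffer. */
static KbdFields decode_callback_input(const uint32_t *dwords, size_t count)
{
    KbdFields f;
    assert(dwords != NULL && count >= HOOK_HEADER_DWORDS + 4);
    f.vkCode   = dwords[HOOK_HEADER_DWORDS + 0];
    f.scanCode = dwords[HOOK_HEADER_DWORDS + 1];
    f.flags    = dwords[HOOK_HEADER_DWORDS + 2];  /* index 6 overall */
    f.time     = dwords[HOOK_HEADER_DWORDS + 3];
    return f;
}

/* The eight DWORDs shown by "dd ed63fb0c L8". */
static const uint32_t captured_input[8] = {
    0x000d0000u, 0x00000100u, 0x00401320u, 0x77d26df4u,
    0x00000025u, 0x00000012u, 0x00000010u, 0x00019816u
};
```

Decoding captured_input gives vkCode 0x25 and flags 0x10, matching what we read off the dump by hand.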

Writing the kernel driver

-------------------------

Writing the kernel driver was obviously the most difficult part for me. Finding the weak spot was (even though it probably was not more than educated guesses) rather easy. Writing code that does the patching though is more difficult. The first issue that arises is the one we encountered before: If our kernel driver needs to modify code inside win32k.sys, we have to run inside a process context of a GUI process. Thus, we need to be able to find a GUI process and eventually attach to it.

Finding a GUI process

---------------------

In theory, finding a GUI process is easy. The process list on Windows is a doubly linked list of EPROCESS structs. If you want to see what it looks like, type

kd> dt nt!_EPROCESS

+0x000 Pcb : _KPROCESS

...

+0x088 ActiveProcessLinks : _LIST_ENTRY

...

+0x130 Win32Process : Ptr32 Void

Two important fields of the structure are listed. The first one is ActiveProcessLinks, which is a structure of two pointers, Flink (forward link) and Blink (backward link). One thing I had to learn the hard way was that the pointers inside the LIST_ENTRY structure point to the ActiveProcessLinks field of the neighbouring entry, and not to the beginning of its _EPROCESS structure. More on that later.

The second important field is Win32Process, which points to some per-process win32k structure if (and only if!) the process is a GUI process. Now we just need to get a pointer to any _EPROCESS structure and walk the doubly linked list until we find one with a Win32Process field different from 0. Getting a pointer to an _EPROCESS structure is easy, we can just use PsGetCurrentProcess().

Without further ado, here is the code:

PEPROCESS FindGUIProcess()

{

ULONG CurProc;

CurProc = (ULONG)PsGetCurrentProcess();

DbgPrint("Starting at %08x\n",CurProc);

while(*(ULONG*)(CurProc+EPROCESS_WIN32PROC_OFFSET) == 0)

{

CurProc = *(ULONG*)(CurProc+EPROCESS_LINK_OFFSET);

CurProc -= EPROCESS_LINK_OFFSET;

DbgPrint("Next proc at %08x\n",CurProc);

}

//CurProc is a GUI proc!

return (PEPROCESS)CurProc;

}

We get ourselves a pointer to the current process and then start the while loop. EPROCESS_LINK_OFFSET is 0x88 for XP SP2 as we could see above. EPROCESS_WIN32PROC_OFFSET is 0x130 as we've seen above, too. If the Win32Process field is 0, the process is not a GUI process, so we need to go to the next entry. We do so by following the Flink field of ActiveProcessLinks to the next process and then subtracting the offset of that field, because as said above, Flink points to the next process' ActiveProcessLinks and NOT to the start of its _EPROCESS structure. We do this as long as Win32Process stays 0. This could possibly lead to an infinite loop, but I doubt this scenario ever happens, as csrss.exe is a GUI process, so I skipped adding checks for the case where we traversed the entire linked list without finding a GUI process. It works, that's what matters.
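For readers who do want the missing termination check, the walk can be sketched in user-mode C against a mock structure. Everything here is hypothetical scaffolding: MockProcess is a tiny stand-in for _EPROCESS, PROCESS_FROM_LINK does what subtracting EPROCESS_LINK_OFFSET does above, and the mock ring contains only process entries (as far as I know the real list also has a separate list head that is not part of any _EPROCESS, which a real driver would have to skip).

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for LIST_ENTRY. */
typedef struct ListEntry {
    struct ListEntry *Flink;
    struct ListEntry *Blink;
} ListEntry;

/* Hypothetical miniature of _EPROCESS with only the fields we need. */
typedef struct {
    int        Pid;
    ListEntry  ActiveProcessLinks;
    void      *Win32Process;
} MockProcess;

/* Step back from the embedded link to the structure that owns it. */
#define PROCESS_FROM_LINK(link) \
    ((MockProcess *)((char *)(link) - offsetof(MockProcess, ActiveProcessLinks)))

/* Walk the ring once; return the first GUI process, or NULL if none. */
static MockProcess *find_gui_process(MockProcess *start)
{
    MockProcess *cur = start;
    assert(start != NULL);
    do {
        if (cur->Win32Process != NULL)
            return cur;
        cur = PROCESS_FROM_LINK(cur->ActiveProcessLinks.Flink);
    } while (cur != start);   /* back at the start: nothing found */
    return NULL;
}
```

The only change compared to the driver code is the do/while condition, which stops after one full lap instead of spinning forever.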

Now that we have a GUI process, we need to attach. This is simple:

KAPC_STATE ApcState;

KeStackAttachProcess(GUIProcess,&ApcState);

//do stuff

KeUnstackDetachProcess(&ApcState);

The next thing to do is "do stuff".

Applying the hook

-----------------

Now that we made sure we're running in a GUI process' context, we have to apply the actual hook. To make life easy, I decided not to hook KeUserModeCallback() itself (it is an export of the kernel rather than a system service, so this would take an inline or import table hook, not an SSDT hook) because then I'd have to check if it's a call for DispatchHook() and, more importantly, we'd be called all the time because this is the single point for the kernel to call into user mode. I didn't want to interfere with that. I wanted to do an inline hook at the one call site we found.

If we look at the disassembly again, the call reads:

kd> u BF92A6A3

win32k!fnHkINLPKBDLLHOOKSTRUCT+0x4c:

bf92a6a3 ff1534aa98bf call dword ptr [win32k!_imp__KeUserModeCallback (bf98aa34)]

It's a total of 6 bytes. FF 15 being the opcode for call, and the following 4 bytes, 0xBF98AA34 (remember, little endian) being the pointer to the location to call. We want to hijack this call.
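As a quick sanity check of that byte layout, the following sketch decodes such a 6-byte instruction and recovers the address of the import slot. decode_indirect_call() is a name invented for this example; it only recognises the FF 15 form discussed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode "FF 15 <addr32>" (call dword ptr [addr32]).
 * On success stores addr32 in *slot and returns 1, otherwise returns 0.
 */
static int decode_indirect_call(const uint8_t code[6], uint32_t *slot)
{
    assert(code != NULL && slot != NULL);
    if (code[0] != 0xFF || code[1] != 0x15)
        return 0;
    /* The operand is stored little endian: lowest byte first. */
    *slot = (uint32_t)code[2]
          | ((uint32_t)code[3] << 8)
          | ((uint32_t)code[4] << 16)
          | ((uint32_t)code[5] << 24);
    return 1;
}
```

Feeding it the bytes ff 15 34 aa 98 bf from the disassembly yields 0xBF98AA34, the import table entry for KeUserModeCallback().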

To do so, we'd fill it with the bytes "E8 DD CC BB AA", which is a call to AddrOfNextInstruction+0xAABBCCDD. But this call is only 5 bytes long, the original is 6 bytes. Therefore, we have to insert a nop, and the offset we have to stamp in for 0xAABBCCDD is relative to this nop.

The hook would be:

char hook[6] = "\xE8\x00\x00\x00\x00\x90";

And the code to calculate the offset relative to the nop and the stamping in then is:

const ULONG HookAddr = 0xBF92A6A3;

*(ULONG*)(hook+1) = (ULONG)WrappedKeUserModeCallback-(HookAddr+5);
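The same calculation, written as a portable sketch that can be tried outside the kernel. build_call_hook() and call_destination() are hypothetical helper names, and addresses are plain 32-bit integers here, so the subtraction deliberately wraps modulo 2^32 just like it does on a 32-bit CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HOOK_LEN 6

/* Build "E8 rel32 90": a relative call to target, padded with one nop. */
static void build_call_hook(uint8_t out[HOOK_LEN], uint32_t hook_addr,
                            uint32_t target)
{
    /* rel32 is relative to the instruction after the 5-byte call. */
    uint32_t rel = target - (hook_addr + 5u);
    assert(out != NULL);
    out[0] = 0xE8;
    out[1] = (uint8_t)(rel & 0xFF);
    out[2] = (uint8_t)((rel >> 8) & 0xFF);
    out[3] = (uint8_t)((rel >> 16) & 0xFF);
    out[4] = (uint8_t)((rel >> 24) & 0xFF);
    out[5] = 0x90;  /* nop fills the 6th byte of the original call */
}

/* Compute where a hook built by build_call_hook() will transfer control. */
static uint32_t call_destination(const uint8_t hook[HOOK_LEN],
                                 uint32_t hook_addr)
{
    uint32_t rel = (uint32_t)hook[1]
                 | ((uint32_t)hook[2] << 8)
                 | ((uint32_t)hook[3] << 16)
                 | ((uint32_t)hook[4] << 24);
    return hook_addr + 5u + rel;
}
```

Round-tripping through call_destination() is a cheap way to catch an off-by-one in the offset before it turns into a bluescreen.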

Now we have the 6 bytes which will overwrite the existing six bytes at 0xBF92A6A3 (=HookAddr). But first, we have to gain the right to write to this address. There are several ways to do that, one being the evil cr0 trick: cr0 is a CPU control register whose WP (write protect) bit decides whether kernel mode code may write to read-only pages. There, you simply clear that bit and may write wherever you desire. This is dirty (I've been told it removes write protection globally!) and because each core has its own cr0 register, you will run into trouble on multi-core machines, so we don't use that.

We do it the proper way:

PMDL pmdlHook;

PVOID MappedHook;

pmdlHook = MmCreateMdl(NULL,(PVOID)HookAddr,6);

MmProbeAndLockPages(pmdlHook,KernelMode,IoModifyAccess);

MappedHook = MmGetSystemAddressForMdlSafe(pmdlHook,HighPagePriority);

MappedHook now points to the same memory as 0xBF92A6A3, the location where we want to write 6 bytes, but with writing rights. Before writing them, we save the original 6 bytes, in case we want to unload the driver (which I recommend for development):

//Save original data

RtlCopyMemory(SavedCode,MappedHook,6);

//Write in our hook

RtlCopyMemory(MappedHook,hook,6);
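The save-then-patch idea, together with the restore you would run on driver unload, can be sketched in plain C like this. HookState, install_hook() and remove_hook() are invented for the illustration, memcpy() stands in for RtlCopyMemory(), and site plays the role of MappedHook.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HOOK_LEN 6

/* Remembers the original bytes so the patch can be undone. */
typedef struct {
    uint8_t saved[HOOK_LEN];
    int     installed;
} HookState;

/* Save the original bytes once, then write the hook over them. */
static void install_hook(HookState *st, uint8_t *site,
                         const uint8_t hook[HOOK_LEN])
{
    assert(st != NULL && site != NULL && hook != NULL);
    if (st->installed)
        return;  /* never overwrite the saved original with our own hook */
    memcpy(st->saved, site, HOOK_LEN);
    memcpy(site, hook, HOOK_LEN);
    st->installed = 1;
}

/* Put the original bytes back, e.g. from the driver's unload routine. */
static void remove_hook(HookState *st, uint8_t *site)
{
    assert(st != NULL && site != NULL);
    if (!st->installed)
        return;
    memcpy(site, st->saved, HOOK_LEN);
    st->installed = 0;
}
```

The installed flag is there so that calling install_hook() twice cannot clobber the saved original bytes with the hook itself.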

What's left is implementing the WrappedKeUserModeCallback() to remove the evil LLKHF_INJECTED flag:

NTSTATUS WrappedKeUserModeCallback (

IN ULONG ApiNumber,

IN PVOID InputBuffer,

IN ULONG InputLength,

OUT PVOID *OutputBuffer,

IN PULONG OutputLength

)

{

PULONG p = InputBuffer;

DbgPrint("Flags: %08x\n",p[6]);

//Remove the injected flag

p[6] &= ~LLKHF_INJECTED;

return OrigKeUserModeCallback(ApiNumber,InputBuffer,InputLength,OutputBuffer,OutputLength);

}

Because this function is only called at the single location we hooked, we don't have to make sure the caller is right and can just assume the input buffer is laid out as described above. The flags are the 7th DWORD, where we simply patch out LLKHF_INJECTED and pass the buffer on to the original KeUserModeCallback(), which will then call DispatchHook() and pass the modified structure to all LowLevelKeyboardProcs. The address of the original KeUserModeCallback() can be read from 0xBF98AA34, as this is its entry in win32k.sys' import table. I just hardcoded the address. This is bad practice, but again, it works (for XP SP2).

That's it. Before any low level keyboard hook is called, the data goes through our code, which makes sure no LLKHF_INJECTED flag is set, so: Mission Accomplished!
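If you prefer not to rely purely on that assumption, the flag removal can be guarded by a length check. The sketch below is plain C that can be tested in user mode; strip_injected_flag() and FLAGS_DWORD_INDEX are hypothetical names, and the index 6 is the XP SP2 value found in the debugger earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LLKHF_INJECTED    0x00000010u
#define FLAGS_DWORD_INDEX 6  /* 7th DWORD of the input buffer (XP SP2) */

/*
 * Clear LLKHF_INJECTED in a callback input buffer.
 * Returns 1 if the flag was set and has been cleared, 0 otherwise
 * (flag not set, or buffer too short to contain the flags field).
 */
static int strip_injected_flag(void *input, uint32_t input_len)
{
    uint32_t *p = (uint32_t *)input;
    assert(input != NULL);
    if (input_len < (FLAGS_DWORD_INDEX + 1) * sizeof(uint32_t))
        return 0;
    if ((p[FLAGS_DWORD_INDEX] & LLKHF_INJECTED) == 0)
        return 0;
    p[FLAGS_DWORD_INDEX] &= ~LLKHF_INJECTED;
    return 1;
}
```

Inside the wrapper this would replace the bare p[6] &= ~LLKHF_INJECTED; line, with InputBuffer and InputLength passed straight through.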

Conclusion

----------

We now have a functioning kernel driver to remove LLKHF_INJECTED flags, so no user land program should be able to distinguish injected input from regular input and therefore we can cheat as much as we want. To do this, we reverse engineered user32.dll as well as its kernel counterpart, win32k.sys, to find a weak spot where the KBDLLHOOKSTRUCT is passed through, so we can patch out the injected flag. After we found the weak spot, we implemented the kernel driver to do just that. To be able to write our hook, we had to find a GUI process and attach to it, otherwise we wouldn't be able to access win32k.sys. Having solved this issue, we wrote a wrapper for KeUserModeCallback() that removes the injected flag and made sure the code flow went through it at the right time.

Last words

----------

Kernel driver development is frustrating. The debugging cycles are longer, because a mistake means a reboot. Documentation is sparse and finding the information you want is difficult. If you spent 30% of your time researching before coding hacks in user mode, you could at least double the research time for kernel mode, from my experience.

Regarding the final code: It's not nice. We use magic numbers that only work for a single release (in this case, again, XP SP2) and we are most probably not thread safe. Especially the RtlCopyMemory() that writes the hook may lead to a bluescreen with the "right" timing. Thing is, it works and it's only a single point of failure. If loading the kernel driver works without a bluescreen, you're good. The above mentioned Rootkits book has a chapter on synchronisation issues and how to handle them, but to be honest, I have been too lazy to write that code so far. If you want to port this code to Win 7, you just need to check WinDbg for the offsets and disassemble win32k.sys for the new weak spot.

I decided against releasing full source code. I'm confident that readers that want to get it done can get it done now. If there are questions or corrections, feel free to contact me!

References

----------

- [1] http://www.awarenetwork.org/etc/beta/?x=1

- "Rootkits: Subverting the Windows Kernel" by Greg Hoglund and James Butler

- source code of other rootkits