Introduction

Suppose your program runs on Linux and is not expected to terminate for a long time, like a UNIX daemon. You want to upgrade it in some simple way without stopping it. One approach is to change a known function so that it does some additional work without compromising its usual behavior. To do that, you inject new code into the running program so that the new code is triggered whenever an existing function is called. The example may be somewhat contrived, but it shows why injecting code into a running program is sometimes needed. The same techniques are also relevant to understanding how viruses inject themselves into running code.

In this article, I explain how to inject a C function into a running program on Linux without terminating it. Along the way we'll look at the Executable and Linkable Format (ELF) used for Linux object files, and at object file sections, symbols and relocations.

Working Example Overview

I will explain step by step the code injection technique using the following simple example. The example consists of 3 components:

Dynamic (shared) library libdynlib.so, built from the dynlib.hpp and dynlib.cpp C++ source files.
Application app, built from the app.cpp source file and linked with the libdynlib.so library.
The injection function, located in the injection.cpp file.

Let us review the code of each component.

extern "C" void print();

The dynlib.hpp header declares the print() function.

#include <stdlib.h>
#include <unistd.h>   // getpid()
#include <iostream>
#include "dynlib.hpp"

using namespace std;

extern "C" void print()
{
    static unsigned int counter = 0;   // survives between calls
    ++counter;
    cout << counter << ": PID " << getpid() << ": In print() " << endl;
}

dynlib.cpp implements the print() function, which prints a counter (incremented on every call), the process ID of the program and a message.

#include <dlfcn.h>
#include <unistd.h>   // sleep()
#include <iostream>
#include "dynlib.hpp"

using namespace std;

int main()
{
    while (1)
    {
        print();                               // resolved through the PLT
        cout << "Going to sleep ..." << endl;
        sleep(3);
        cout << "Waked up ..." << endl;
    }
    return 0;
}

The application app.cpp calls the print() function (from the libdynlib.so dynamic library), then sleeps for a few seconds, and repeats this in an infinite loop.

#include <stdlib.h>

extern "C" void print();

extern "C" void injection()
{
    print();          // call the original function first
    system("date");   // then do the additional work
}

The call to injection() is going to replace the call to print() in the application's main() function. The injection() function first calls the original print() function and then does some additional work. For example, it can run an external executable with the system() function, or simply print the current date as I do here.

Compile and Run the Application

Let us first compile the components with the g++ and gcc compiler drivers.

g++ -ggdb -Wall dynlib.cpp -fPIC -shared -o libdynlib.so
g++ -ggdb app.cpp -ldynlib -L./ -o app
gcc -Wall injection.cpp -c -o injection.o

-rwxr-xr-x 1 gregory ftp 52248 Feb 12 02:05 app
-rw-r--r-- 1 gregory ftp  1088 Feb 12 02:05 injection.o
-rwxr-xr-x 1 gregory ftp 52505 Feb 12 02:05 libdynlib.so

Note that the dynamic library libdynlib.so is compiled and linked with the -fPIC flag, which produces position-independent code, and that the injection object is compiled with the gcc driver. Because the source file has a .cpp extension, gcc still compiles it as C++, which is why the __gxx_personality_v0 symbol shows up in injection.o later; the extern "C" linkage keeps the injection symbol name unmangled. We can now run the app executable.

[lnx63:code_injection] ==> ./app
1: PID 4184: In print()
Going to sleep ...
Waked up ...
2: PID 4184: In print()
Going to sleep ...
Waked up ...
3: PID 4184: In print()
Going to sleep ...

Getting into Debugger

The application app has gone through only a few loop iterations, but let us pretend that it has already been running for a few weeks, so it is now time to inject our new code without terminating the application. We'll use the gdb debugger for the injection. First we attach gdb to the application process 4184 (the PID printed above).

[lnx63:code_injection] ==> gdb app 4184
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
Using host libthread_db library "/lib/tls/libthread_db.so.1".
Attaching to program: /store/fileril104/project/gregory/code_injection/app, process 4184
Reading symbols from /store/fileril104/project/gregory/code_injection/libdynlib.so...done.
Loaded symbols for /store/fileril104/project/gregory/code_injection/libdynlib.so
Reading symbols from /usr/lib/libstdc++.so.6...done.
Loaded symbols for /usr/lib/libstdc++.so.6
Reading symbols from /lib/tls/libm.so.6...done.
Loaded symbols for /lib/tls/libm.so.6
Reading symbols from /lib/libgcc_s.so.1...done.
Loaded symbols for /lib/libgcc_s.so.1
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
0x006e17a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb)

Loading the Injection Code into the Executable Process Memory

As I mentioned above, the injection.o object file is not part of the process image of the app executable. We first need to load injection.o into the process address space. This can be done with the mmap() system call, which maps the injection.o file into the address space of the app process. Let us do it in the debugger.

(gdb) call open("injection.o", 2)
$1 = 3
(gdb) call mmap(0, 1088, 1 | 2 | 4, 1, 3, 0)
$2 = 1073754112
(gdb)

We first open injection.o with O_RDWR (value 2), that is, for reading and writing. We need write access because we will later modify the loaded injection code. The file descriptor returned for the opened file is 3. We then bring the file into the process address space with mmap(). The call takes the requested address (0, which lets the kernel choose), the file size (1088 bytes), the mapping protection PROT_READ | PROT_WRITE | PROT_EXEC (1 | 2 | 4, for reading, writing and executing), the mapping flags (1, which is MAP_SHARED), the opened file descriptor (3) and the file offset (0). It returns the starting address of the mapped file within the process address space: 1073754112. Note that because the mapping is shared, the changes we make to it later are also written back to the injection.o file on disk. We can verify that injection.o was indeed mapped by looking at /proc/[pid]/maps (where pid is the process ID, 4184 in our example), the file that on Linux describes the memory layout of a running process.
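For readers who prefer to see the two gdb calls as ordinary code, here is a minimal sketch of the same open() and mmap() sequence. The helper name map_whole_file is hypothetical and not part of the original example; it takes the file size from fstat() instead of hard-coding 1088, and it lets the caller choose the protection bits.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper (not from the article): maps a whole file with
// MAP_SHARED and returns the base address, or nullptr on failure.
void* map_whole_file(const char* path, int prot, std::size_t* size_out)
{
    assert(path != nullptr && size_out != nullptr);

    int fd = open(path, O_RDWR);              // same mode as open("injection.o", 2)
    if (fd < 0)
        return nullptr;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return nullptr;
    }

    // Address 0 (nullptr) lets the kernel pick the location, as in the gdb call.
    void* base = mmap(nullptr, static_cast<std::size_t>(st.st_size),
                      prot, MAP_SHARED, fd, 0);
    close(fd);                                // the mapping stays valid after close()
    if (base == MAP_FAILED)
        return nullptr;

    *size_out = static_cast<std::size_t>(st.st_size);
    return base;
}
```

Calling map_whole_file("injection.o", PROT_READ | PROT_WRITE | PROT_EXEC, &size) would correspond to the gdb session above. Some systems refuse executable mappings of files on filesystems mounted noexec, in which case the call returns nullptr.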

[lnx63:code_injection] ==> cat /proc/4184/maps
006e1000-006f6000 r-xp 00000000 fd:00 394811     /lib/ld-2.3.4.so
006f6000-006f7000 r-xp 00015000 fd:00 394811     /lib/ld-2.3.4.so
006f7000-006f8000 rwxp 00016000 fd:00 394811     /lib/ld-2.3.4.so
006ff000-00824000 r-xp 00000000 fd:00 394812     /lib/tls/libc-2.3.4.so
00824000-00825000 r-xp 00124000 fd:00 394812     /lib/tls/libc-2.3.4.so
00825000-00828000 rwxp 00125000 fd:00 394812     /lib/tls/libc-2.3.4.so
00828000-0082a000 rwxp 00828000 00:00 0
00832000-00853000 r-xp 00000000 fd:00 394813     /lib/tls/libm-2.3.4.so
00853000-00855000 rwxp 00020000 fd:00 394813     /lib/tls/libm-2.3.4.so
0096e000-00975000 r-xp 00000000 fd:00 394816     /lib/libgcc_s-3.4.6-20060404.so.1
00975000-00976000 rwxp 00007000 fd:00 394816     /lib/libgcc_s-3.4.6-20060404.so.1
00978000-00a38000 r-xp 00000000 fd:00 45535      /usr/lib/libstdc++.so.6.0.3
00a38000-00a3d000 rwxp 000bf000 fd:00 45535      /usr/lib/libstdc++.so.6.0.3
00a3d000-00a43000 rwxp 00a3d000 00:00 0
08048000-08049000 r-xp 00000000 00:34 30468731   /store/fileril104/project/gregory/code_injection/app
08049000-0804a000 rwxp 00000000 00:34 30468731   /store/fileril104/project/gregory/code_injection/app
0804a000-0806b000 rwxp 0804a000 00:00 0
40000000-40001000 r-xp 00000000 00:34 30468725   /store/fileril104/project/gregory/code_injection/libdynlib.so
40001000-40002000 rwxp 00000000 00:34 30468725   /store/fileril104/project/gregory/code_injection/libdynlib.so
40002000-40003000 rwxp 40002000 00:00 0
40003000-40004000 rwxs 00000000 00:34 30468724   /store/fileril104/project/gregory/code_injection/injection.o
4000f000-40011000 rwxp 4000f000 00:00 0
bfffe000-c0000000 rwxp bfffe000 00:00 0
ffffe000-fffff000 ---p 00000000 00:00 0

You can verify that /store/fileril104/project/gregory/code_injection/injection.o starts at address 0x40003000 (decimal 1073754112) and ends at address 0x40004000 within the process address space. The mappings of the other dynamic libraries are also shown in the output above. We now have all the components loaded in the process memory.
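The same check can be done programmatically. The sketch below parses one line of /proc/[pid]/maps and extracts the address range and permission string; the MapRange struct and the parse_maps_line function are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical types and names, introduced only for this illustration.
struct MapRange {
    unsigned long start;   // first address of the mapping
    unsigned long end;     // one past the last address
    char perms[5];         // e.g. "rwxs" plus the terminating NUL
};

// Parses the leading "start-end perms" fields of a /proc/[pid]/maps line.
// Returns true when all three fields were read.
bool parse_maps_line(const char* line, MapRange* out)
{
    assert(line != nullptr && out != nullptr);
    return std::sscanf(line, "%lx-%lx %4s",
                       &out->start, &out->end, out->perms) == 3;
}
```

Feeding it the injection.o line from the listing yields start 0x40003000, which is the decimal value 1073754112 that mmap() returned, a length of one 4096-byte page, and a fourth permission character of 's' for the shared mapping.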

Relocations

Now it is time to inspect the ELF executable of the application from the inside. We'll use readelf, a Linux utility that displays information about ELF files (that is, any object file, library or executable on Linux). We look at the symbol relocations in the app executable; the one we are interested in is the relocation for the print() function call.

[lnx63:code_injection] ==> readelf -r app

Relocation section '.rel.dyn' at offset 0x5ec contains 2 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
08049d58  00001706 R_386_GLOB_DAT    00000000   __gmon_start__
08049d60  00000305 R_386_COPY        08049d60   _ZSt4cout

Relocation section '.rel.plt' at offset 0x5fc contains 13 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
08049d24  00000107 R_386_JUMP_SLOT   0804868c   print
08049d28  00000207 R_386_JUMP_SLOT   0804869c   _ZNSt8ios_base4InitC1E
08049d2c  00000507 R_386_JUMP_SLOT   080486ac   _ZStlsISt11char_traits
08049d30  00000607 R_386_JUMP_SLOT   080486bc   _ZNSolsEPFRSoS_E
08049d34  00000707 R_386_JUMP_SLOT   08048664   _init
08049d38  00000807 R_386_JUMP_SLOT   080486dc   sleep
08049d3c  00000907 R_386_JUMP_SLOT   080486ec   _ZNKSsixEj
08049d40  00000b07 R_386_JUMP_SLOT   080486fc   _ZNKSs4sizeEv
08049d44  00000c07 R_386_JUMP_SLOT   0804870c   __libc_start_main
08049d48  00000d07 R_386_JUMP_SLOT   08048ae4   _fini
08049d4c  00001307 R_386_JUMP_SLOT   0804872c   _ZSt4endlIcSt11char_tr
08049d50  00001507 R_386_JUMP_SLOT   0804873c   __gxx_personality_v0
08049d54  00001607 R_386_JUMP_SLOT   0804874c   _ZNSt8ios_base4InitD1E

As you can see, the relocation for the print symbol is located at the absolute (virtual) address (offset) 0x08049d24 in the app executable, and its type is R_386_JUMP_SLOT. The relocation address is an absolute virtual address in the executable image once it has been loaded into memory. Note that this relocation resides in the .rel.plt section of the executable. PLT stands for Procedure Linkage Table, the table that provides indirect calls to functions. When you call such a function you do not jump directly to its code; you first jump to an entry in the PLT, and from the PLT you jump to the actual function code.

This indirection is necessary when you call a function that resides in a dynamic library (libdynlib.so in our example), because you do not know in advance at what address in the process address space the dynamic libraries will be loaded, or in which dynamic library the required function (print() in our example) will be found first. This knowledge becomes available only when the application is loaded into memory, and it is the job of the dynamic linker (ld-linux.so on Linux) to resolve the relocations so that the requested function is called correctly. In our example the dynamic linker loads libdynlib.so into the process address space, finds the address of print() in the library and stores that address at the relocation address 0x08049d24. (By default the dynamic linker fills in such a slot lazily, on the first call to the function; our application has already called print() several times, so the slot is resolved.)
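The Info column in the readelf output packs two values into one 32-bit word: the symbol table index in the upper 24 bits and the relocation type in the lower 8 bits. As a small sketch, the following code decodes it with the ELF32_R_SYM and ELF32_R_TYPE macros from the glibc header <elf.h>. The two function names are hypothetical helpers, and note that <elf.h> spells the constant R_386_JMP_SLOT while readelf prints R_386_JUMP_SLOT.

```cpp
#include <elf.h>      // ELF32_R_SYM, ELF32_R_TYPE, R_386_* (glibc header)
#include <cstdint>
#include <string>

// Hypothetical helper: readelf-style name for a few i386 relocation types.
std::string reloc_type_name(std::uint32_t info)
{
    switch (ELF32_R_TYPE(info)) {             // low 8 bits of Info
        case R_386_32:       return "R_386_32";
        case R_386_PC32:     return "R_386_PC32";
        case R_386_COPY:     return "R_386_COPY";
        case R_386_GLOB_DAT: return "R_386_GLOB_DAT";
        case R_386_JMP_SLOT: return "R_386_JUMP_SLOT";
        default:             return "other";
    }
}

// Hypothetical helper: index of the referenced symbol in the symbol table.
std::uint32_t reloc_symbol_index(std::uint32_t info)
{
    return ELF32_R_SYM(info);                 // upper 24 bits of Info
}
```

For the print entry, Info is 0x00000107, which decodes to symbol index 1 and type 7 (R_386_JUMP_SLOT), matching the Type column that readelf prints.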

Our goal is to replace the address of the print() function with the address of function injection() from the injection.o object file that was not initially included in the executable process image when it started running.

More information on the ELF format, relocations and the dynamic linker can be found in the Executable and Linkable Format (ELF) document.

We can check that the address 0x08049d24 currently contains the address of the print() function.

(gdb) p & print
$4 = (void (*)(void)) 0x40000be8 <print>
(gdb) p/x * 0x08049d24
$5 = 0x40000be8
(gdb)

The address of the injection() function can be found by running readelf -s (displays object file symbol table) on the injection.o file.

[lnx63:code_injection] ==> readelf -s injection.o

Symbol table '.symtab' contains 13 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 FILE    LOCAL  DEFAULT  ABS injection.cpp
     2: 00000000     0 SECTION LOCAL  DEFAULT    1
     3: 00000000     0 SECTION LOCAL  DEFAULT    3
     4: 00000000     0 SECTION LOCAL  DEFAULT    4
     5: 00000000     0 SECTION LOCAL  DEFAULT    5
     6: 00000000     0 SECTION LOCAL  DEFAULT    6
     7: 00000000     0 SECTION LOCAL  DEFAULT    8
     8: 00000000     0 SECTION LOCAL  DEFAULT    9
     9: 00000000    25 FUNC    GLOBAL DEFAULT    1 injection
    10: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND system
    11: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND print
    12: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND __gxx_personality_v0

The injection function (symbol) is located at offset 0 in the .text section (section index 1 in the Ndx column) of the injection.o object file. The .text section itself starts at offset 0x000034 in the file.

[lnx63:code_injection] ==> readelf -S injection.o
There are 13 section headers, starting at offset 0x104:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        00000000 000034 000019 00  AX  0   0  4
  [ 2] .rel.text         REL             00000000 000418 000018 08     11   1  4
  [ 3] .data             PROGBITS        00000000 000050 000000 00  WA  0   0  4
  [ 4] .bss              NOBITS          00000000 000050 000000 00  WA  0   0  4
  [ 5] .rodata           PROGBITS        00000000 000050 000005 00   A  0   0  1
  [ 6] .eh_frame         PROGBITS        00000000 000058 000038 00   A  0   0  4
  [ 7] .rel.eh_frame     REL             00000000 000430 000010 08     11   6  4
  [ 8] .note.GNU-stack   NOTE            00000000 000090 000000 00      0   0  1
  [ 9] .comment          PROGBITS        00000000 000090 000012 00      0   0  1
  [10] .shstrtab         STRTAB          00000000 0000a2 00005f 00      0   0  1
  [11] .symtab           SYMTAB          00000000 00030c 0000d0 10     12   9  4
  [12] .strtab           STRTAB          00000000 0003dc 00003b 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)
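The Off column that readelf -S prints can also be read directly from the file image. The following is a sketch only: find_section_offset is a hypothetical helper, it relies on the Elf32_Ehdr and Elf32_Shdr structures from the glibc header <elf.h>, and it assumes that the host byte order matches the file (little-endian for i386).

```cpp
#include <elf.h>      // Elf32_Ehdr, Elf32_Shdr, ELFMAG (glibc header)
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns the file offset of the section called `wanted`
// in a 32-bit ELF image held in memory, or -1 if it is absent or malformed.
long long find_section_offset(const unsigned char* image, std::size_t len,
                              const char* wanted)
{
    if (len < sizeof(Elf32_Ehdr)) return -1;
    if (std::memcmp(image, ELFMAG, SELFMAG) != 0) return -1;
    if (image[EI_CLASS] != ELFCLASS32) return -1;

    Elf32_Ehdr eh;
    std::memcpy(&eh, image, sizeof eh);
    if (eh.e_shentsize != sizeof(Elf32_Shdr)) return -1;
    if (eh.e_shstrndx >= eh.e_shnum) return -1;
    if (eh.e_shoff > len ||
        (len - eh.e_shoff) / sizeof(Elf32_Shdr) < eh.e_shnum) return -1;

    // Copy one section header out of the image (avoids unaligned access).
    auto header = [&](unsigned i) {
        Elf32_Shdr sh;
        std::memcpy(&sh, image + eh.e_shoff + i * sizeof sh, sizeof sh);
        return sh;
    };

    // Section names live in the string table section indexed by e_shstrndx.
    const Elf32_Shdr names = header(eh.e_shstrndx);
    if (names.sh_offset > len || names.sh_size > len - names.sh_offset) return -1;
    const char* strtab = reinterpret_cast<const char*>(image + names.sh_offset);

    const std::size_t wanted_len = std::strlen(wanted) + 1;  // include the NUL
    for (unsigned i = 0; i < eh.e_shnum; ++i) {
        const Elf32_Shdr sh = header(i);
        if (sh.sh_name < names.sh_size &&
            names.sh_size - sh.sh_name >= wanted_len &&
            std::memcmp(strtab + sh.sh_name, wanted, wanted_len) == 0) {
            return sh.sh_offset;
        }
    }
    return -1;
}
```

Applied to the mapped injection.o image with the name ".text", a lookup like this would be expected to return the 0x34 shown in the listing above, and ".rodata" the 0x50 used later when resolving relocations.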

Replacing the print() Function with injection() Function

Recall that the injection.o file was loaded into the process memory at address 0x40003000 (see above). The final absolute address of the injection() function within the process is therefore 0x40003000 + 0x000034 = 0x40003034.

We now set this address into the print() function relocation address 0x08049d24 .

(gdb) set * 0x08049d24 = 0x40003000 + 0x000034
(gdb)

At this point, we have successfully replaced the call to print() with a call to the injection() function.
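As an analogy only, and not the real PLT mechanism, the effect of overwriting the slot can be modelled with an ordinary function pointer. Every name in this sketch is hypothetical: print_slot stands in for the word at 0x08049d24, and call_print_via_slot() stands in for the call that main() makes through the PLT.

```cpp
#include <cassert>
#include <string>
#include <vector>

// All names here are hypothetical stand-ins used to model the idea.
std::vector<std::string> trace;                 // records what ran, in order

void original_print() { trace.push_back("print"); }

// Stands in for the slot at 0x08049d24 that the PLT entry jumps through.
void (*print_slot)() = original_print;

// Stands in for the call to print() in main(): always goes through the slot.
void call_print_via_slot()
{
    assert(print_slot != nullptr);
    print_slot();
}

// Stands in for injection(): calls the original directly, not via the slot,
// so patching the slot does not make it call itself.
void injected()
{
    original_print();
    trace.push_back("extra work");
}

// Stands in for the gdb "set" command that overwrites the slot.
void patch_slot() { print_slot = injected; }
```

Before patch_slot() a call through the slot records only "print"; afterwards the same call site records "print" followed by "extra work", while the caller's code is unchanged. The real injection() must likewise reach print() by its actual address, which is what the relocation fix-ups in the next section arrange.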

Resolving injection() Function Relocations

However, some work remains. The code of the injection() function is not ready to run yet, because it has three unresolved relocations.

[lnx63:code_injection] ==> readelf -r injection.o

Relocation section '.rel.text' at offset 0x418 contains 3 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00000009  00000501 R_386_32          00000000   .rodata
0000000e  00000a02 R_386_PC32        00000000   system
00000013  00000b02 R_386_PC32        00000000   print

Relocation section '.rel.eh_frame' at offset 0x430 contains 2 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00000011  00000c01 R_386_32          00000000   __gxx_personality_v0
00000024  00000201 R_386_32          00000000   .text

The first relocation, .rodata, points to the "date" string constant stored in the .rodata read-only data section. The second, system, refers to the system() function call, and the third, print, refers to the print() function call. Note that all three relocations reside in the .rel.text section, which means that their offsets are relative to the beginning of the .text section.

We resolve all the above three relocations manually and set appropriate addresses to these three memory locations. The addresses of these relocations within the executable process address space are calculated by summing up:

The starting address of injection.o (0x40003000) within the process address space.
The starting offset of the .text section (0x000034) within the injection.o object file.
The relocation offset relative to the .text section (0x00000009 for .rodata, 0x0000000e for system and 0x00000013 for print).

Note that the system and print relocations are of type R_386_PC32. This means that the value stored at the relocation location is calculated relative to the program counter (PC), that is, relative to the address of the relocation location itself. An R_386_PC32 relocation also requires that the value stored at the relocation location before resolution (the addend) be added to the resolved address. The R_386_32 relocation for .rodata is absolute, and likewise adds the addend to its resolved address.
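The two rules can be written compactly in the notation of the i386 ABI, where S is the resolved symbol address, A is the addend and P is the address of the relocation location: R_386_32 stores S + A, and R_386_PC32 stores S + A - P. The sketch below uses hypothetical function names and plugs in the addresses from this example.

```cpp
#include <cstdint>

// Hypothetical helper names; the formulas are those of the i386 ABI.

// Address of a relocation location inside the mapped object file.
constexpr std::uint32_t reloc_site(std::uint32_t map_base,
                                   std::uint32_t section_offset,
                                   std::uint32_t r_offset)
{
    return map_base + section_offset + r_offset;
}

// R_386_32: absolute address, S + A.
constexpr std::uint32_t resolve_abs32(std::uint32_t S, std::int32_t A)
{
    return S + static_cast<std::uint32_t>(A);
}

// R_386_PC32: PC-relative displacement, S + A - P (wraps modulo 2^32).
constexpr std::uint32_t resolve_pc32(std::uint32_t S, std::int32_t A,
                                     std::uint32_t P)
{
    return S + static_cast<std::uint32_t>(A) - P;
}
```

For system, P is reloc_site(0x40003000, 0x34, 0x0e) = 0x40003042, and resolve_pc32(0x733650, -4, P) gives 0xC073060A. The addend of -4 is there because the CPU adds the displacement to the address of the next instruction, P + 4, so P + 4 + 0xC073060A wraps around to 0x733650, the address of system().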

(gdb) p & system
$7 = (<text> *) 0x733650 <system>        // Address of the system() function
(gdb) p * (0x40003000 + 0x000034 + 0x0000000e)
$8 = -4                                  // Addend of the system relocation
(gdb) set * (0x40003000 + 0x000034 + 0x0000000e) = 0x733650 - (0x40003000 + 0x000034 + 0x0000000e) - 4
(gdb) p & print
$9 = (void (*)(void)) 0x40000be8 <print> // Address of the print() function
(gdb) p * (0x40003000 + 0x000034 + 0x00000013)
$10 = -4                                 // Addend of the print relocation
(gdb) set * (0x40003000 + 0x000034 + 0x00000013) = 0x40000be8 - (0x40003000 + 0x000034 + 0x00000013) - 4
(gdb) p * (0x40003000 + 0x000034 + 0x00000009)
$11 = 0                                  // Addend of the .rodata relocation
(gdb) set * (0x40003000 + 0x000034 + 0x00000009) = 0x40003000 + 0x000050
                                         // 0x000050 is the offset of the .rodata
                                         // section within the injection.o object file

We have now resolved all three relocations in the injection() function code, and we are done. We exit the debugger. The application continues running and now does the additional work of printing the current date.

(gdb) quit
The program is running. Quit anyway (and detach it)? (y or n) y
Detaching from program: /store/fileril104/project/gregory/code_injection/app, process 4184
[lnx63:code_injection] ==>               // The application execution continues
Waked up ...
Thu Feb 12 20:09:40 IST 2009
4: PID 4184: In print()
Going to sleep ...
Waked up ...
Thu Feb 12 20:09:43 IST 2009
5: PID 4184: In print()
Going to sleep ...
Waked up ...
Thu Feb 12 20:09:46 IST 2009
6: PID 4184: In print()
Going to sleep ...
Waked up ...
Thu Feb 12 20:09:49 IST 2009
7: PID 18138: In print()
Going to sleep ...
Waked up ...

That's it.

Conclusion

I have shown how to inject a C function into a running program on Linux without terminating the program. Note that the process memory manipulations demonstrated here are allowed only for processes that you own or for which you have the appropriate permissions.

History