Last week Jim posted a comment asking about reverse engineering the firmware for some Chinese routers with the intention of extracting the Web files and translating them to English.

Although I usually work with Linux-based firmware, this sounded interesting, so I thought I’d investigate. Although I wasn’t able to completely recover the Web files, the process of reversing a file system format seemed like a good subject for discussion.

The firmware image contains none of the normal file systems found in Linux firmware, no identifiable compression formats, and the only intelligible strings in the image are the names of the Web files themselves:

embedded@ubuntu:~/Mercury/110523$ strings -n 8 mw150rv5.bin
...
owowowowowowowowowowowowowowowow
o0common.js
lcss_help.css
(css_main.css
custom.js
$0help.js
%0menu.js
(dbanner.htm
18bottom1.htm
bottom2.htm
logo.htm
AssignedIpAddrListHelpRpm.htm
9@BackNRestoreHelpRpm.htm
<,ChangeLoginPwdHelpRpm.htm
?`DateTimeCfgHelpRpm.htm
DMZHelpRpm.htm
DomainFilterHelpRpm.htm
FireWallHelpRpm.htm
KhFixMapCfgHelpRpm.htm
L2tpCfgHelpRpm.htm
...

The string ‘owowowowow…’ is unusual, and long enough that it’s probably not random. Let’s take a look at the hex dump:

embedded@ubuntu:~/Mercury/110523$ hexdump -C mw150rv5.bin | less
000f5b70 00 00 00 00 6f 77 6f 77 6f 77 6f 77 6f 77 6f 77 |....owowowowowow|
000f5b80 6f 77 6f 77 6f 77 6f 77 6f 77 6f 77 6f 77 6f 77 |owowowowowowowow|
000f5b90 6f 77 6f 77 00 00 00 01 00 00 00 8c 00 00 6f 30 |owow..........o0|
000f5ba0 63 6f 6d 6d 6f 6e 2e 6a 73 00 00 00 00 00 00 00 |common.js.......|
000f5bb0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000f5bc0 00 00 00 00 00 00 00 00 00 00 05 bb 00 00 1a 6c |...............l|
000f5bd0 63 73 73 5f 68 65 6c 70 2e 63 73 73 00 00 00 00 |css_help.css....|
000f5be0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000f5bf0 00 00 00 00 00 00 00 00 00 00 01 e0 00 00 20 28 |.............. (|
000f5c00 63 73 73 5f 6d 61 69 6e 2e 63 73 73 00 00 00 00 |css_main.css....|
000f5c10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000f5c20 00 00 00 00 00 00 00 00 00 00 02 28 00 00 22 08 |...........(..".|
000f5c30 63 75 73 74 6f 6d 2e 6a 73 00 00 00 00 00 00 00 |custom.js.......|
000f5c40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000f5c50 00 00 00 00 00 00 00 00 00 00 00 fd 00 00 24 30 |..............$0|
000f5c60 68 65 6c 70 2e 6a 73 00 00 00 00 00 00 00 00 00 |help.js.........|
000f5c70 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000f5c80 00 00 00 00 00 00 00 00 00 00 03 31 00 00 25 30 |...........1..%0|
000f5c90 6d 65 6e 75 2e 6a 73 00 00 00 00 00 00 00 00 00 |menu.js.........|
000f5ca0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000f5cb0 00 00 00 00 00 00 00 00 00 00 08 d3 00 00 28 64 |..............(d|
000f5cc0 62 61 6e 6e 65 72 2e 68 74 6d 00 00 00 00 00 00 |banner.htm......|
000f5cd0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000f5ce0 00 00 00 00 00 00 00 00 00 00 02 e0 00 00 31 38 |..............18|

It looks like the ‘owowow’ string is followed by a list of all the Web files in the firmware. This isn’t just an array of strings, however; there is additional data included. The pseudo-structure for the data layout appears to be:

[‘owowowow’ string]

[12 bytes of data]

[file string, null-padded to 40 bytes]

[8 bytes of data]

[file string, null-padded to 40 bytes]

[8 bytes of data]

[file string, null-padded to 40 bytes]

[8 bytes of data]

etc…

This looks like it could be a simple file system, but searching Google for the ‘owowow’ string didn’t turn up anything interesting. So it is either custom, undocumented, or we are completely off track. The latter seems unlikely, so let’s try to identify the file system structure.
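As a quick sanity check of the layout above, here is a minimal Python sketch (not part of the original analysis, which used hexdump by eye) that finds the ‘owow’ run and walks 48-byte slots, stopping at the first slot that doesn't look like a null-padded name. The constant and function names are my own labels, and the demo bytes are copied from the hex dumps in this post rather than read from the firmware file.

```python
MAGIC = b"ow" * 16          # the 32-byte 'owow...' run seen in the hex dump
HEADER_EXTRA = 12           # the 12 not-yet-explained bytes after the magic
NAME_LEN = 40               # null-padded file name field
ENTRY_LEN = NAME_LEN + 8    # name plus 8 bytes of not-yet-explained data


def guess_names(image: bytes):
    """Yield (offset, name) for each slot that looks like a padded file name."""
    start = image.find(MAGIC)
    if start < 0:
        return
    pos = start + len(MAGIC) + HEADER_EXTRA
    while pos + ENTRY_LEN <= len(image):
        field = image[pos:pos + NAME_LEN]
        name, _, padding = field.partition(b"\x00")
        printable = all(0x20 <= c < 0x7F for c in name)
        # Stop at the first slot that is not "printable text + null padding".
        if not name or not printable or padding.strip(b"\x00"):
            break
        yield pos, name.decode("ascii")
        pos += ENTRY_LEN


# Demo on bytes transcribed from the hex dumps (two entries, then file data).
demo = (
    bytes(4) + MAGIC + bytes.fromhex("00000001 0000008c 00006f30")
    + b"common.js".ljust(NAME_LEN, b"\x00") + bytes.fromhex("000005bb 00001a6c")
    + b"css_help.css".ljust(NAME_LEN, b"\x00") + bytes.fromhex("000001e0 00002028")
    + bytes.fromhex("5a000080 00e31800 00000000 0000331e") + bytes(48)
)
for offset, name in guess_names(demo):
    print(hex(offset), name)
```

Run against the whole image (for example `guess_names(open("mw150rv5.bin", "rb").read())`), the walk should end where the name table ends, which is a cheap way to test the 40 + 8 byte guess.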

The last entry appears to be for the WzdWlanRpm.htm file. As with the other strings we saw, this null padded string is followed by 8 bytes of binary data, which ends at offset 0x0F75DF. Immediately after this are the bytes 5A 00 00 80:

embedded@ubuntu:~/Mercury/110523$ hexdump -C mw150rv5.bin | less
000f7580 57 7a 64 57 61 6e 54 79 70 65 52 70 6d 2e 68 74 |WzdWanTypeRpm.ht|
000f7590 6d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |m...............|
000f75a0 00 00 00 00 00 00 00 00 00 00 03 e4 00 02 f6 1c |................|
000f75b0 57 7a 64 57 6c 61 6e 52 70 6d 2e 68 74 6d 00 00 |WzdWlanRpm.htm..|
000f75c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000f75d0 00 00 00 00 00 00 00 00 00 00 12 d2 00 02 fa 00 |................|
000f75e0 5a 00 00 80 00 e3 18 00 00 00 00 00 00 00 33 1e |Z.............3.|
000f75f0 1f 3f 54 6a bb ab 88 57 be 2b 9f 84 e1 82 4e 3c |.?Tj...W.+....N<|
000f7600 6a 17 85 39 7c 94 92 e9 84 58 10 4c 68 fd 92 20 |j..9|....X.Lh.. |

Since there are no other obvious strings in the firmware, we can assume that the files themselves are probably compressed. This makes the bytes 5A 00 00 80 very interesting, because they are very similar to the bytes that usually begin an LZMA stream (its de facto magic bytes), 5D 00 00 80.

Let’s assume that the bytes 5A 00 00 80 are the magic bytes for a compression algorithm. If so, they will probably be at the beginning of all the Web files. Let’s see how many instances of this byte string we can find in the firmware:

embedded@ubuntu:~/Mercury/110523$ hexdump -C mw150rv5.bin | grep '5a 00 00 80' | wc -l
140

Although doing this type of simple search isn’t always completely accurate, there are at least 140 instances of these bytes in the firmware image. Let’s compare that to how many Web files are listed in the firmware:

embedded@ubuntu:~/Mercury/110523$ strings mw150rv5.bin | grep -e '\.js' -e '\.htm' -e '\.css' -e '\.jpg' -e '\.gif' | wc -l
140

This looks encouraging! The bytes 5a 00 00 80 are likely present at the beginning of each file.
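The hexdump-and-grep count can be off in both directions: a match that straddles two 16-byte rows (or the gap hexdump -C leaves between the two 8-byte column groups) is missed, and a row containing two matches is counted once. A byte-level search avoids this. The sketch below is only an illustration; `find_all` and `CANDIDATE_MAGIC` are names I made up, and the demo bytes are synthetic.

```python
def find_all(image: bytes, pattern: bytes) -> list:
    """Return the offset of every occurrence of pattern, overlapping or not."""
    offsets = []
    pos = image.find(pattern)
    while pos != -1:
        offsets.append(pos)
        pos = image.find(pattern, pos + 1)
    return offsets


CANDIDATE_MAGIC = bytes.fromhex("5a000080")

# Synthetic demo: the first match starts at offset 14, so it straddles the
# boundary between hexdump's first and second rows and grep would miss it.
demo = bytes(14) + CANDIDATE_MAGIC + bytes(6) + CANDIDATE_MAGIC + bytes(4)
print([hex(o) for o in find_all(demo, CANDIDATE_MAGIC)])   # ['0xe', '0x18']
```

On the real image, `len(find_all(open("mw150rv5.bin", "rb").read(), CANDIDATE_MAGIC))` gives the exact count, and the offsets themselves are useful later when checking where each file starts.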

If the structures we found earlier are part of a file system, there are two pieces of information that will need to accompany each file name:

Where is the file located?

How big is the file?

First, let’s look at the 12 bytes immediately following the ‘owowow’ string. We’ll assume that these are integer (4 byte) fields, and we’ll cast each 4 byte field as various data types to see if we can make sense of any of them:

embedded@ubuntu:~/Mercury/110523$ binwalk -C -o 0xf5b94 -l 12 -b 4 mw150rv5.bin

DECIMAL    HEX        DESCRIPTION
-------------------------------------------------------------------------------------------------------
1006484    0xF5B94    Hex:                 0x00000001
                      Little Endian Long:  16777216
                      Big Endian Long:     1
                      Little Endian Short: 0
                      Big Endian Short:    0
                      Little Endian Date:  Mon Jul 13 21:20:16 1970
                      Big Endian Date:     Wed Dec 31 16:00:01 1969
1006488    0xF5B98    Hex:                 0x0000008C
                      Little Endian Long:  -1946157056
                      Big Endian Long:     140
                      Little Endian Short: 0
                      Big Endian Short:    0
                      Little Endian Date:  Thu Apr 30 16:49:04 1908
                      Big Endian Date:     Wed Dec 31 16:02:20 1969
1006492    0xF5B9C    Hex:                 0x00006F30
                      Little Endian Long:  812580864
                      Big Endian Long:     28464
                      Little Endian Short: 0
                      Big Endian Short:    0
                      Little Endian Date:  Sun Oct 1 13:54:24 1995
                      Big Endian Date:     Wed Dec 31 23:54:24 1969
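If binwalk isn't handy, the same "cast it every way and see what makes sense" trick is a few lines of Python with the standard `struct` module. This is a rough sketch, not a reimplementation of binwalk; `cast_words` is a made-up helper, and the 12 input bytes are the ones that follow the ‘owowow’ string in the hex dump.

```python
import struct


def cast_words(raw: bytes):
    """Interpret each 4-byte word as a signed big- and little-endian long."""
    rows = []
    for i in range(0, len(raw) - len(raw) % 4, 4):
        word = raw[i:i + 4]
        big = struct.unpack(">i", word)[0]      # most significant byte first
        little = struct.unpack("<i", word)[0]   # least significant byte first
        rows.append((word.hex(), big, little))
    return rows


header_tail = bytes.fromhex("00000001 0000008c 00006f30")
for hex_word, big, little in cast_words(header_tail):
    print(f"0x{hex_word.upper()}  big endian: {big:<6} little endian: {little}")
```

Only the big-endian column gives small, plausible numbers (1, 140, 28464), which is a good hint that the whole structure is big endian.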

The first four bytes are 00 00 00 01. This is likely not an offset or a size field. It could be a version number or something else, but is probably not terribly important.

The second integer field, when cast as a big endian value, is 140 – exactly the number of files that we have!

The last field doesn’t really match up with anything that we’ve seen so far.

It looks like the ‘owowow’ string and these 12 bytes together comprise the header for the file system (for lack of a better name, we’ll call it the OW file system):

struct owfs_header {
    char magic[32];       // 'owowowowowow...'
    uint32_t version;     // version #1
    uint32_t file_count;  // 140 files
    uint32_t unknown;     // ??
};
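For anyone who would rather poke at this from Python, the header structure maps onto a `struct` format string. This is just a sketch under the assumptions made so far (big-endian fields, 32-byte magic); `OwfsHeader` and `parse_header` are my own names, and the sample bytes are transcribed from the hex dump.

```python
import struct
from typing import NamedTuple

# ">" = big endian, no padding; 32s = magic, then three 32-bit unsigned fields.
HEADER = struct.Struct(">32sIII")


class OwfsHeader(NamedTuple):
    magic: bytes
    version: int
    file_count: int
    unknown: int


def parse_header(buf: bytes, offset: int = 0) -> OwfsHeader:
    """Unpack an owfs_header located at `offset` in `buf`."""
    return OwfsHeader(*HEADER.unpack_from(buf, offset))


# Header bytes as they appear at 0x0F5B74 in the hex dump.
sample = b"ow" * 16 + bytes.fromhex("00000001 0000008c 00006f30")
header = parse_header(sample)
print(header.version, header.file_count, hex(header.unknown), HEADER.size)
```

The total header size works out to 44 bytes, which matters once offsets into the image start being added up.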

Next we need to determine what the 8 bytes that follow the file names represent. Since we still need to know the size and location of each file, it stands to reason that these bytes represent those values. Let’s look at the first file entry:

embedded@ubuntu:~/Mercury/110523$ hexdump -C mw150rv5.bin | less
000f5ba0 63 6f 6d 6d 6f 6e 2e 6a 73 00 00 00 00 00 00 00 |common.js.......|
000f5bb0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000f5bc0 00 00 00 00 00 00 00 00 00 00 05 bb 00 00 1a 6c |...............l|

The two values 00 00 05 BB and 00 00 1A 6C likely represent the file size and an offset to the file’s location in the firmware image. Recall that the ‘owowow’ magic bytes indicating the beginning of the file system are located at offset 0x0F5B74. Adding 0x01A6C to 0x0F5B74 gives us 0x0F75E0, the exact location of the first occurrence of the bytes 5A 00 00 80 that we identified earlier:

embedded@ubuntu:~/Mercury/110523$ hexdump -C mw150rv5.bin | less
000f75b0 57 7a 64 57 6c 61 6e 52 70 6d 2e 68 74 6d 00 00 |WzdWlanRpm.htm..|
000f75c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000f75d0 00 00 00 00 00 00 00 00 00 00 12 d2 00 02 fa 00 |................|
000f75e0 5a 00 00 80 00 e3 18 00 00 00 00 00 00 00 33 1e |Z.............3.|
000f75f0 1f 3f 54 6a bb ab 88 57 be 2b 9f 84 e1 82 4e 3c |.?Tj...W.+....N<|
000f7600 6a 17 85 39 7c 94 92 e9 84 58 10 4c 68 fd 92 20 |j..9|....X.Lh.. |
000f7610 3a 3c 5a 8c a8 cf 6a 11 90 0f 44 8a f0 19 14 0d |:<Z...j...D.....|

If the second four-byte value is the file offset, then the first four-byte value is likely the file size. Adding 0x05BB to the file offset 0x0F75E0 gives us 0x0F7B9B, the first byte past the end of the file. That byte is a single 00, apparently padding, and immediately after it (0x0F7B9C) we find the second occurrence of the bytes 5A 00 00 80:

embedded@ubuntu:~/Mercury/110523$ hexdump -C mw150rv5.bin | less
000f7b70 13 34 51 99 04 77 d7 85 21 2f 7f 11 74 df d2 16 |.4Q..w..!/..t...|
000f7b80 f7 fc ef 97 38 81 a0 23 15 e0 f9 6e 50 4b e0 eb |....8..#...nPK..|
000f7b90 57 64 4f f4 11 9e 83 d7 59 ca b4 00 5a 00 00 80 |WdO.....Y...Z...|
000f7ba0 00 8d 08 00 00 00 00 00 00 00 21 14 41 79 b8 3b |..........!.Ay.;|
000f7bb0 d8 aa f3 30 85 4f 7a 2a f3 2c e2 50 44 3e 77 ed |...0.Oz*.,.PD>w.|
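The same arithmetic can be repeated for every entry visible in the first hex dump. The sketch below assumes (and this is my guess from the numbers, not something documented anywhere) that each file starts at the previous file's end rounded up to a 4-byte boundary. The size and offset pairs are transcribed by hand from the dump; `align4` and `check_contiguous` are hypothetical helper names.

```python
FS_START = 0x0F5B74   # offset of the 'owow...' magic in mw150rv5.bin

# (name, size, offset) read from the first hex dump, all big endian.
ENTRIES = [
    ("common.js",    0x05BB, 0x1A6C),
    ("css_help.css", 0x01E0, 0x2028),
    ("css_main.css", 0x0228, 0x2208),
    ("custom.js",    0x00FD, 0x2430),
    ("help.js",      0x0331, 0x2530),
    ("menu.js",      0x08D3, 0x2864),
    ("banner.htm",   0x02E0, 0x3138),
]


def align4(n: int) -> int:
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3


def check_contiguous(entries) -> bool:
    """True if each file starts at the previous file's 4-byte-aligned end."""
    return all(
        align4(prev_off + prev_size) == next_off
        for (_, prev_size, prev_off), (_, _, next_off) in zip(entries, entries[1:])
    )


for name, size, offset in ENTRIES:
    start = FS_START + offset
    print(f"{name:<14} 0x{start:06X} .. 0x{start + size - 1:06X}")
print("contiguous with 4-byte alignment:", check_contiguous(ENTRIES))
```

All seven entries line up under that assumption, which would also account for the lone 00 byte between the first two files.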

This looks good; we can now construct the structure for each file entry in the file system:

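struct owfs_entry {
    char file_name[40];    // null-padded file name
    uint32_t file_size;    // size of the (compressed) file data
    uint32_t file_offset;  // offset from the start of the file system
};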
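Here is the entry structure expressed with Python's `struct` module, purely as an illustration (the names `ENTRY` and `parse_entry` are mine). It also gives one more consistency check: a 44-byte header followed by 140 entries of 48 bytes each ends exactly at 0x1A6C, the offset recorded for the first file.

```python
import struct

HEADER_SIZE = 32 + 4 + 4 + 4        # magic + version + file_count + unknown
ENTRY = struct.Struct(">40sII")     # file_name, file_size, file_offset (big endian)


def parse_entry(buf: bytes, offset: int = 0):
    """Return (name, size, file_offset) for the owfs_entry at `offset`."""
    raw_name, size, file_offset = ENTRY.unpack_from(buf, offset)
    name = raw_name.split(b"\x00", 1)[0].decode("ascii")
    return name, size, file_offset


# First entry, transcribed from the hex dump at 0x0F5BA0.
first = b"common.js".ljust(40, b"\x00") + bytes.fromhex("000005bb 00001a6c")
print(parse_entry(first))                       # ('common.js', 1467, 6764)
print(hex(HEADER_SIZE + 140 * ENTRY.size))      # 0x1a6c
```

That the file data begins right where the name table ends is a nice confirmation that offsets are relative to the start of the file system and not to the start of the firmware image.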

Now that we know how the file system is constructed, we can write a utility to extract the files. I’ve written an unowfs tool to do just that; you can download it here, and use it to extract the files from the OWFS image:

embedded@ubuntu:~/Mercury/110523$ dd if=mw150rv5.bin bs=1006452 skip=1 of=owfs.img
0+1 records in
0+1 records out
204240 bytes (204 kB) copied, 0.018024 s, 11.3 MB/s
embedded@ubuntu:~/Mercury/110523$ gcc -Wall unowfs.c -o unowfs
embedded@ubuntu:~/Mercury/110523$ ./unowfs owfs.img
Extracting 140 files from OWFS version 1 image...
common.js [1467]
css_help.css [480]
css_main.css [552]
custom.js [253]
help.js [817]
menu.js [2259]
banner.htm [736]
bottom1.htm [242]
bottom2.htm [243]
logo.htm [832]
AssignedIpAddrListHelpRpm.htm [747]
BackNRestoreHelpRpm.htm [817]
ChangeLoginPwdHelpRpm.htm [899]
DateTimeCfgHelpRpm.htm [426]
DMZHelpRpm.htm [626]
DomainFilterHelpRpm.htm [1121]
FireWallHelpRpm.htm [623]
FixMapCfgHelpRpm.htm [451]
L2tpCfgHelpRpm.htm [910]
LanArpBindingHelpRpm.htm [1005]
LanArpBindingListHelpRpm.htm [1167]
LanDhcpServerHelpRpm.htm [1018]
LanMacFilterHelpRpm.htm [1071]
MacCloneCfgHelpRpm.htm [608]
ManageControlHelpRpm.htm [606]
MiscHelpRpm.htm [920]
NetworkCfgHelpRpm.htm [600]
PeanutHullDdnsHelpRpm.htm [497]
PingHelpRpm.htm [464]
PPPoECfgAdvHelpRpm.htm [973]
PPPoECfgFailAuthReasonHelpRpm.htm [1315]
PPPoECfgFailOtherReasonHelpRpm.htm [866]
PPPoECfgFailResponseReasonHelpRpm.htm [1216]
PPPoECfgHelpRpm.htm [1047]
PptpCfgHelpRpm.htm [909]
QoSCfgSOHOHelpRpm.htm [1447]
RestoreDefaultCfgHelpRpm.htm [665]
SpecialAppHelpRpm.htm [1466]
StaticRouteTableHelpRpm.htm [395]
SysRebootHelpRpm.htm [423]
SystemStatisticHelpRpm.htm [797]
UpnpCfgHelpRpm.htm [589]
VirtualServerHelpRpm.htm [1397]
WanDhcpPlusCfgHelpRpm.htm [839]
WanDynamicIpCfgHelpRpm.htm [819]
WanDynamicIpCfgHelpRpm_8021X.htm [1210]
WanIpFilterHelpRpm.htm [1403]
WanStaticIpCfgHelpRpm.htm [664]
WanStaticIpCfgHelpRpm_8021X.htm [1080]
WlanAdvHelpRpm.htm [684]
WlanMacFilterHelpRpm.htm [1202]
WlanNetworkHelpRpm.htm [981]
WlanSecurityHelpRpm.htm [1260]
WlanStationHelpRpm.htm [414]
WlanWpsChkModeHelpRpm.htm [849]
WlanWpsHelpRpm.htm [1133]
arc.gif [70]
bgColor.jpg [418]
bg_title1.jpg [217]
empty.gif [53]
logo.jpg [703]
minus.gif [67]
plus.gif [72]
pw.gif [68]
sp.gif [67]
str_err.js [5910]
str_menu.js [1378]
AdvScrRpm.htm [2049]
AssignedIpAddrListRpm.htm [899]
AuthError.htm [952]
BakNRestoreRpm.htm [1054]
ChangeLoginPwdRpm.htm [1359]
confUploadErrorRpm.htm [713]
DateTimeCfgRpm.htm [2677]
DiagnosticRpm.htm [2209]
DMZRpm.htm [1042]
DomainFilterAdvRpm.htm [1473]
DomainFilterRpm.htm [1670]
errorPage.htm [753]
FireWallRpm.htm [1161]
FirmwareUpdateTemp.htm [1405]
FixMapCfgAdvRpm.htm [1392]
FixMapCfgRpm.htm [1793]
Index.htm [639]
L2TPCfgRpm.htm [3490]
LanArpBindingAdvRpm.htm [1292]
LanArpBindingFindRpm.htm [2294]
LanArpBindingListRpm.htm [1241]
LanArpBindingRpm.htm [2002]
LanDhcpServerRpm.htm [1528]
LanMacFilterAdvRpm.htm [1257]
LanMacFilterRpm.htm [1692]
MacCloneCfgRpm.htm [1554]
ManageControlRpm.htm [1665]
MenuRpm.htm [1055]
MiscShowRpm.htm [1079]
NetworkCfgRpm.htm [1347]
PeanutHullDdnsRpm.htm [2026]
PingIframeRpm.htm [1553]
popupSiteSurveyRpm.htm [1821]
PPPoECfgAdvRpm.htm [1890]
PPPoECfgRpm.htm [4169]
PPTPCfgRpm.htm [3514]
QoSCfgSOHORpm.htm [3319]
restart.htm [1833]
RestoreDefaultCfgRpm.htm [792]
SoftwareUpgradeRpm.htm [1293]
SpecialAppAdvRpm.htm [2088]
SpecialAppRpm.htm [1713]
StaticRouteTableAdvRpm.htm [1181]
StaticRouteTableRpm.htm [1560]
StatusRpm.htm [4094]
SysRebootRpm.htm [733]
SystemLogRpm.htm [972]
SystemStatisticRpm.htm [2383]
UpnpCfgRpm.htm [1299]
VirtualServerAdvRpm.htm [1934]
VirtualServerRpm.htm [1702]
WanDhcpPlusCfgRpm.htm [2231]
WanDynamicIpCfgRpm.htm [3307]
WanDynamicIpCfgRpm_8021X.htm [2099]
WanIpFilterAdvRpm.htm [1474]
WanIpFilterRpm.htm [2202]
WanStaticIpCfgRpm.htm [2685]
WanStaticIpCfgRpm_8021X.htm [1827]
WlanAdvRpm.htm [2108]
WlanMacFilterAdvRpm.htm [1302]
WlanMacFilterRpm.htm [2242]
WlanNetworkRpm.htm [6083]
WlanSecCheck.htm [658]
WlanSecurityRpm.htm [5439]
WlanStationRpm.htm [1377]
WlanWpsChkModeRpm.htm [1622]
WlanWpsRpm.htm [1602]
WzdEndRpm.htm [1366]
WzdPPPoERpm.htm [1137]
WzdStartRpm.htm [1007]
WzdStaticIpRpm.htm [1185]
WzdWanTypeRpm.htm [996]
WzdWlanRpm.htm [4818]
Extracted 140 files to ./owfs-root/
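For readers who would rather not compile anything, the extraction logic is small enough to sketch in Python. To be clear, this is not the unowfs source; it is a rough equivalent built only from the two structures worked out above, and it assumes the input begins at the ‘owow’ magic (as owfs.img does after the dd step). `extract_owfs` is a name I made up.

```python
import struct

HEADER = struct.Struct(">32sIII")   # magic, version, file_count, unknown
ENTRY = struct.Struct(">40sII")     # file_name, file_size, file_offset
MAGIC = b"ow" * 16


def extract_owfs(image: bytes) -> dict:
    """Return {file_name: raw (still compressed) bytes} for an OWFS image."""
    magic, _version, file_count, _unknown = HEADER.unpack_from(image, 0)
    if magic != MAGIC:
        raise ValueError("image does not start with the 'owow...' magic")
    files = {}
    for index in range(file_count):
        entry_pos = HEADER.size + index * ENTRY.size
        raw_name, size, offset = ENTRY.unpack_from(image, entry_pos)
        # Offsets are relative to the start of the file system image.
        if offset + size > len(image):
            raise ValueError(f"entry {index} points outside the image")
        name = raw_name.split(b"\x00", 1)[0].decode("ascii")
        files[name] = image[offset:offset + size]
    return files
```

Calling `extract_owfs(open("owfs.img", "rb").read())` returns the files in memory; when writing them to disk, it is worth passing each name through `os.path.basename` first, since the names come straight from an untrusted image.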

Although we now have the files extracted from the firmware image, they are still compressed. Unfortunately, the file utility doesn’t recognize the extracted files, and Googling for ‘5A 00 00 80’ didn’t turn up anything useful. Given the lack of strings in the firmware image, it is likely that the remainder of the firmware is also compressed, making code analysis impossible without first decompressing the data (there appear to be no standard compression or archive headers elsewhere in the firmware).

Since I don’t have one of these routers myself, this is where I’ve stopped. Looking at some internal images of these devices, it appears that they do have a serial port. Simply observing the debug output from the boot loader and OS during start up may reveal some hints on where to go from here, and if nothing else it should be possible to dump the SPI flash chip in order to get the boot loader code.

So if anyone can shed some further light on how these files are compressed, let me know!

UPDATE:

ghjm and insn left comments regarding the compression used for these files. They are LZMA:23 and can be decompressed with p7zip/lzmadec:

embedded@ubuntu:~/Mercury/110523/owfs-root$ for FILE in *; do mv $FILE $FILE.7z && p7zip -d $FILE.7z; done
...
7-Zip (A) 9.04 beta Copyright (c) 1999-2009 Igor Pavlov 2009-05-30
p7zip Version 9.04 (locale=en_US.utf8,Utf16=on,HugeFiles=on,1 CPU)
Processing archive: UpnpCfgRpm.htm.7z
Extracting UpnpCfgRpm.htm
Everything is Ok
Size: 3882
Compressed: 1299
7-Zip (A) 9.04 beta Copyright (c) 1999-2009 Igor Pavlov 2009-05-30
p7zip Version 9.04 (locale=en_US.utf8,Utf16=on,HugeFiles=on,1 CPU)
Processing archive: VirtualServerAdvRpm.htm.7z
Extracting VirtualServerAdvRpm.htm
Everything is Ok
Size: 3882
Compressed: 1299
...
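In hindsight the bytes at the start of each file read cleanly as a legacy .lzma header: one properties byte, a 4-byte little-endian dictionary size, and an 8-byte little-endian uncompressed size. The sketch below decodes the header bytes shown in the hex dump at 0x0F75E0 and, as an alternative to p7zip, decompresses with Python's standard `lzma` module. I have not run it against the actual firmware files, so treat it as an assumption that they decode this way; the function names are mine.

```python
import lzma
import struct


def describe_lzma_header(blob: bytes) -> dict:
    """Decode the 13-byte header of a legacy .lzma ("LZMA alone") stream."""
    props, dict_size, unpacked_size = struct.unpack_from("<BIQ", blob, 0)
    pb, rest = divmod(props, 45)     # props = (pb * 5 + lp) * 9 + lc
    lp, lc = divmod(rest, 9)
    return {"lc": lc, "lp": lp, "pb": pb,
            "dict_size": dict_size, "unpacked_size": unpacked_size}


def decompress_web_file(blob: bytes) -> bytes:
    """Decompress one extracted file, assuming it is a legacy .lzma stream."""
    return lzma.decompress(blob, format=lzma.FORMAT_ALONE)


# First 13 bytes of common.js as seen at 0x0F75E0.
first_header = bytes.fromhex("5a 00 00 80 00 e3 18 00 00 00 00 00 00")
print(describe_lzma_header(first_header))
```

Read this way, 0x5A is simply lc=0, lp=0, pb=2 (the more familiar 0x5D is lc=3), and the dictionary size is 2^23 bytes, which would fit the "LZMA:23" in the comments if the 23 refers to the dictionary size exponent.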

Thanks guys!

UPDATE #2:

After some further probing, I’ve concluded that this firmware is definitely VxWorks based. Whether this file system is exclusive to VxWorks I can’t say for sure, but I wouldn’t be surprised if it was.

UPDATE #3:

It looks like Ruben from IOActive has a little more insight into this file system. As suspected it is a VxWorks pseudo file system, called MemFS (aka, Wind River management file system). Thanks to Ruben and Sergio!