This blog post deals with the Legu packer, an Android protector developed by Tencent that is currently one of the state-of-the-art solutions to protect APK DEX files. The packer is updated frequently, and this blog post focuses on versions 4.1.0.15 and 4.1.0.18.

Some functions of the packer's native library are obfuscated, but dynamic instrumentation with Frida and QBDI makes their analysis straightforward.

The main logic of the packer is located in the native library libshell-super.2019.so, which unpacks and loads the protected DEX files from the resources.

An application protected with Legu is composed of two native libraries, libshell-super.2019.so and libshella-4.1.0.XY.so, as well as raw binary files embedded in the resources of the APK.

Internals

The original DEX files are stored in the assets/0OO00l111l1l file, along with the information required to unpack them.

The following figure lays out the structure of this file.

The first part of the assets/0OO00l111l1l file contains the original DEX files, with one file per classes<N>.dex of the original APK when it uses the multi-DEX feature. These DEX files are not exactly the original ones, as their Dalvik bytecode has been NOP-ed by Legu. Therefore, a dump of these files only reveals the class names, not the code logic.
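To check whether a dumped method body has been wiped, one can test whether it contains only nop instructions. The sketch below assumes that a NOP-ed body is filled with the Dalvik nop code unit 0x0000, a detail this post does not spell out. The helper name is ours and is not part of the packer.

```python
def is_nopped(insns: bytes) -> bool:
    """Return True if `insns` is a non-empty run of 16-bit 0x0000 code units.

    Assumption: Legu wipes a method body with the plain Dalvik `nop`
    instruction, which is encoded as the code unit 0x0000.
    """
    if not insns or len(insns) % 2 != 0:
        # Dalvik instructions are made of 16-bit code units.
        return False
    return not any(insns)


# A body that was wiped versus a body that still holds real instructions
# (const/4 v0, 0 followed by return v0).
wiped_body = bytes(8)
real_body = bytes.fromhex("12000f00")
print(is_nopped(wiped_body), is_nopped(real_body))
```

Running such a check over every method body of a dumped DEX file gives a quick estimate of how much of the code is missing.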

Next comes what we call a hashmap, which links a class name (e.g. Lcom/tencent/mmkv/MMKV;) to an offset in the data block located in the third part of the file. This data block contains the original Dalvik bytecode of the methods.

The first part, which contains the altered DEX files, is compressed with NRV. The second part, the hashmap, is also compressed with NRV, but the packer adds a layer of encryption through a slightly modified version of XTEA. Finally, the last part is compressed and encrypted with the same algorithms as the second one.
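As a reference point for the encryption layer, here is a minimal sketch of the standard XTEA block cipher. It is not Legu's variant: the packer uses a slightly modified version whose changes are not covered here, so this code would not decrypt a real packed file as is. The little-endian word order and the default of 32 rounds are assumptions made for illustration.

```python
import struct

DELTA = 0x9E3779B9
MASK = 0xFFFFFFFF


def xtea_encrypt_block(block: bytes, key: bytes, rounds: int = 32) -> bytes:
    """Encrypt one 8-byte block with standard XTEA (16-byte key)."""
    v0, v1 = struct.unpack("<2I", block)   # assumed little-endian words
    k = struct.unpack("<4I", key)
    s = 0
    for _ in range(rounds):
        v0 = (v0 + ((((v1 << 4) ^ (v1 >> 5)) + v1) ^ (s + k[s & 3]))) & MASK
        s = (s + DELTA) & MASK
        v1 = (v1 + ((((v0 << 4) ^ (v0 >> 5)) + v0) ^ (s + k[(s >> 11) & 3]))) & MASK
    return struct.pack("<2I", v0, v1)


def xtea_decrypt_block(block: bytes, key: bytes, rounds: int = 32) -> bytes:
    """Decrypt one 8-byte block with standard XTEA (16-byte key)."""
    v0, v1 = struct.unpack("<2I", block)
    k = struct.unpack("<4I", key)
    s = (DELTA * rounds) & MASK
    for _ in range(rounds):
        v1 = (v1 - ((((v0 << 4) ^ (v0 >> 5)) + v0) ^ (s + k[(s >> 11) & 3]))) & MASK
        s = (s - DELTA) & MASK
        v0 = (v0 - ((((v1 << 4) ^ (v1 >> 5)) + v1) ^ (s + k[s & 3]))) & MASK
    return struct.pack("<2I", v0, v1)


# Round trip with a made-up key and block.
key = bytes(range(16))
plain = b"LeguTest"
assert xtea_decrypt_block(xtea_encrypt_block(plain, key), key) == plain
```

Since the second and third parts are compressed and then encrypted, an unpacker would run its XTEA-like decryption first and NRV decompression second.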

The hashmap uses a custom structure that we reverse-engineered and described as Kaitai Struct definitions, available here: legu_packed_file.ksy, legu_hashmap.ksy

Its overall layout is shown in the next figure:

Unpacking process

Let's say that the application needs to use the packed Java class Lcom/tencent/mmkv/MMKV;. First, the packer's runtime transforms the class name into an integer with the dvmComputeUtf8Hash() hash function. This integer is then used as an index into the hashmap, whose value is a class_info structure that describes the class in the packed data (blue area in the figure). The first attribute of this structure, utf8_hash, is a copy of the hash value and is used to check that the key/value association is the right one.

The class_info structure then contains the packed method information (yellow area in the figure), with as many entries as there are methods in the original class. Each entry maps the offset of the NOP-ed bytecode in the altered DEX files to the offset of the original bytecode in the data block (red block). Finally, the packer copies the original bytecode into the altered DEX files.

To summarize, the first part contains the original DEX files with the Dalvik bytecode removed (NOP-ed), the last part contains the missing Dalvik bytecode, and the second part bridges the altered DEX files and that bytecode.
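The lookup-and-restore flow can be sketched in Python as follows. The hash follows Dalvik's dvmComputeUtf8Hash() (start at 1, multiply by 31, add each byte, truncate to 32 bits). Everything else is a simplification: apart from utf8_hash, the field names are hypothetical, the hashmap is modeled as a plain dict keyed by the hash rather than with the real on-disk layout, and a single altered DEX buffer stands in for the classes<N>.dex files.

```python
from dataclasses import dataclass
from typing import Dict, List


def dvm_compute_utf8_hash(name: str) -> int:
    """32-bit hash in the style of Dalvik's dvmComputeUtf8Hash()."""
    h = 1
    # Class descriptors are ASCII in practice, so UTF-8 bytes are enough here.
    for byte in name.encode("utf-8"):
        h = (h * 31 + byte) & 0xFFFFFFFF
    return h


@dataclass
class MethodInfo:
    # Hypothetical field names, chosen for illustration only.
    dex_code_off: int  # offset of the NOP-ed bytecode in the altered DEX
    data_off: int      # offset of the original bytecode in the data block
    size: int          # number of bytecode bytes to copy


@dataclass
class ClassInfo:
    utf8_hash: int             # copy of the key, used as a sanity check
    methods: List[MethodInfo]  # one entry per method of the original class


def restore_class(name: str, hashmap: Dict[int, ClassInfo],
                  dex: bytearray, data: bytes) -> bool:
    """Copy the original bytecode of `name` back into the altered DEX."""
    key = dvm_compute_utf8_hash(name)
    info = hashmap.get(key)
    if info is None or info.utf8_hash != key:
        return False  # unknown class or mismatching key/value association
    for m in info.methods:
        dex[m.dex_code_off:m.dex_code_off + m.size] = \
            data[m.data_off:m.data_off + m.size]
    return True


# Toy usage: one method whose 4 bytes of bytecode were NOP-ed at offset 4.
cls = "Lcom/tencent/mmkv/MMKV;"
data_block = bytes.fromhex("12000f00")
altered_dex = bytearray(12)
table = {dvm_compute_utf8_hash(cls):
         ClassInfo(dvm_compute_utf8_hash(cls), [MethodInfo(4, 0, 4)])}
assert restore_class(cls, table, altered_dex, data_block)
assert bytes(altered_dex[4:8]) == data_block
```

A static unpacker built on this idea would iterate over every entry of the hashmap instead of waiting for a class to be requested, and write out the patched DEX files once all methods are restored.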