Falcon 0.1.0

Today marks the 0.1.0 release of Falcon. This is Falcon’s first release!

Falcon is a Binary Analysis Platform written in Rust, and released under the Apache License 2.0. It provides the building blocks for implementing analyses such as Symbolic Execution and Abstract Interpretation over binary programs.

Documentation can be found here.

What’s in this release

An expression-based IL

Falcon uses yet-another-IL, called Falcon IL, for analysis. Falcon IL is a cross between RREIL and Binary Ninja’s LLIL. The standard hierarchy of structures in Falcon IL are:

Program - Contains many Functions.

- Contains many Functions. Function - Contains a ControlFlowGraph.

- Contains a ControlFlowGraph. ControlFlowGraph - Contains Blocks (and Edges, not listed here).

- Contains Blocks (and Edges, not listed here). Block - A linear sequence of Instructions.

- A linear sequence of Instructions. Instruction - Provides location to an Operation.

- Provides location to an Operation. Operation - Implements semantics of a program, modifies state. The six operations are assign , store , load , brc , phi , and raise .

- Implements semantics of a program, modifies state. The six operations are , , , , , and . Expression / Scalar / Array / Constant - Building blocks of an Expression.

Elements of the Falcon IL can be grouped into two broad categories: Control-Flow and Semantics. This is a confusing, and perhaps contradictory statement, but best illustrated by comparing Function with ControlFlowGraph , or Instruction with Operation .

A Function provides a location for a ControlFlowGraph in a Program . When we are searching for new locations to analyze, we search for a Function , not a ControlFlowGraph . However, when we analyze the semantics of a program, we analyze the ControlFlowGraph , not the Function .

Likewise, an Instruction provides a location, or position, within a linear sequence in a Block . The semantics of an Instruction are implemented in an Operation . A Symbolic Execution Engine, for example, can be broken into two parts: One that handles Control-Flow, and one that applies semantics over a symbolic state.

I will write more about Falcon IL in the coming weeks, but I have spent time documenting its internals with Rust’s inline documentation.

X86 Translation

Falcon uses capstone for instruction decoding, and lifts X86 programs to Falcon IL. The list of supported instructions can be found here.

Of note:

Falcon IL does not yet support values >64 bits, though this is coming, and we therefor are not lifting SSE/AVX instructions.

Falcon IL has no means of dealing with floating-point instructions. Floating-point instructions are translated to a Raise instruction. This allows you to translate 32-bit libraries which have floating-point instructions, but you can safely ignore them unless encountered during analysis.

A ControlFlowGraph in Falcon has optional conditional edges. Direct conditional branches, such as je , do not emit Operation::Brc (A conditional-branch instruction), but instead place conditions over edges between successor blocks.

A Symbolic Execution Engine

Falcon comes with the building blocks for symbolic execution. An example which ties these blocks together can be found in the test for simple-0, which symbolically executes a contrived x86 Linux application, solves for the inputs, and verifies the results. This takes about a second, so I have made end-to-end symbolic execution a built-in test that takes place on every push to github.

Falcon’s symbolic execution underpinnings are currently using z3 for smt solving. However, Falcon produces smtlib2-formatted input instead of using the z3 API directly, and I have attempted to ensure the solver can be swapped out as needed.

Loading and linking of Elf Binaries

Falcon’s loader currently supports the Elf file format. Falcon will discover all functions for which symbols exist, and allows the user to specify additional function entries as needed. Falcon currently has no means to resolve indirect branches statically, so jump tables and other indirect branches will pose a hurdle for Falcon in this release.

Additionally, in order to symbolically execute our simple-0 example, Falcon can resolve dependencies and relocations between Elfs. This allows us to provide a ld-linux.so and libc.so from an Ubuntu 16.04 i386 environment, and leave Falcon to link these together for static symbolic execution.

The beginnings of platform modelling

Falcon provides a means for modelling platforms. Currently we provide a model only for LinuxX86, with one system call, to work around contrived examples. See the linux_x86 module for an example of what this looks like.

Unsupported: Data-Flow Analysis and Abstract Interpretation

Falcon provides a fixed-point engine for Data-Flow analysis. Implemented with this engine are analyses and transformations, including reaching definitions, UseDef/DefUse chains, dead code elimination, single static assignment, and a intra-procedural value set analysis.

This code requires refactoring, testing, and is currently unsupported. However, here is an example of the main function from the simple-0 example lifted, with dead code elimination and SSA applied:

Link to full-size image

Using Falcon

The see the requirements for using Falcon, please refer to its Dockerfile. Dependencies are required for bindgen to link against capstone, as well as a requirement for the z3 solver. The Dockerfile will get you up and running in Ubuntu Xenial, though I do the majority of my development in OSX.