This post is part of the Automated analysis of Ethereum smart contracts series. Take a look there to learn more about related topics.

In a concrete execution, a program is run on a specific input and a single control flow path is explored. In contrast, symbolic execution [1] [2] can explore multiple paths that a program could take under different inputs. The key idea is to allow a program to take symbolic values as inputs. Execution is performed maintaining for each explored control path a formula that describes the conditions satisfied by the branches taken along that path, and a symbolic memory store that maps variables to symbolic expressions or values. Next we present a C code example, and its symbolic execution tree, along with its symbolic stores and path constraints.

void aFunction(int a, int b) { int x = 1, y = 0; if(a != 0) { y = 3 + x; if(b == 0) { x = 2 * (a + b); } } assert(x - y != 0); }

Each execution state, labeled with an uppercase letter, shows the statement to be executed, the symbolic store σ, and the path constraints π. Leaves are evaluated against the condition in the assert statement. This example was taken from [1].

The main goal of this technique, in the context of software testing, is to explore as many different program paths as possible in a given amount of time; then with each path we can generate a set of concrete input values that execute it, to finally check for the presence of errors. From a test generation perspective, it allows the creation of high coverage test suites, while from a bug finding perspective, it provides developers with a concrete input that triggers the bug, which can be used to confirm and debug the error independently of the symbolic execution tool that generated it.

Given a path constraint, meaning the boolean formula mentioned, the generation of concrete input values that satisfy it is typically based on an off-the-shelf satisfiability modulo theories solver, or SMT solver. Constraint solving is one of the main challenges faced by symbolic execution, SMT solvers can scale to complex combinations of constraints over hundreds of variables, however, constructs such as non linear arithmetic pose a major obstacle to efficiency and decidability.

Another challenge of this technique is state space explosion. Language constructs such as loops might exponentially increase the number of execution paths. It is thus unlikely that a symbolic execution engine can exhaustively explore all the possible paths within a reasonable amount of time. There are two key approaches that have been used to address this problem: heuristically prioritizing the exploration of the most promising paths, and using sound program analysis techniques to reduce the complexity of the path exploration [2].

Classical symbolic execution cannot deal with external code not traceable by the executor, nor with complex constraints solving. To cope with these issues concrete and symbolic execution can be mixed into concolic execution.

One popular concolic execution approach, known as dynamic symbolic execution [1], is to have concrete execution drive symbolic execution. After choosing an arbitrary input to begin with, it executes the program both concretely and symbolically. Whenever the concrete execution takes a branch, the symbolic execution is directed toward the same branch and the constraints extracted from the branch condition are added to the current set of path constraints. As a consequence, the symbolic engine does not need to invoke the constraint solver to decide whether a branch condition is satisfiable. In order to explore different paths, the path conditions given by one or more branches can be negated to generate a new input.

There are several approaches for bug finding in smart contracts that rely on symbolic execution. Oyente [3] was one of the first published approaches, it focuses on known security vulnerabilities. It was later extended as Gasper [4] to detect gas costly patterns, Osiris [5] to focus on integer errors, and SASC [6] to generate a diagram of invocation relationship and to find potential logic risks. Pakala [7] aims at efficiently handling multiple transaction counterexamples, it symbolically executes once each public function that produces a state change, to then join resulting states in all the possible orders up to a bound. It only looks for vulnerabilities related to sending ether from the contract under study to an address in control of an attacker.

Another tool that relies on symbolic execution is Mythril [8], it wasn’t formally published but it’s open source, it was informally introduced at HITBSecConf 2018 [9]. Later on, MythX [10] integrated Mythril with Harvey (a fuzzer that will be discussed in one of the following sections of the series), and a new static analysis tool: Maru [11], along with Maestro, a component that syncs them all. Maestro also estimates the confidence of each vulnerability found according to what each tool reported. MythX is a private tool accessible through a paid subscription. Both Mythril and MythX only search for predefined security properties.

teEther

teEther [12] takes advantage of a common static analysis technique called slicing to reduce the state space explosion. A program slice is a subset of the original program statements relevant to a particular computation.

teEther’s goal is to check whether a contract is vulnerable to arbitrary ether transfers, thus it only cares about possible execution paths that end up in specific instructions executed with parameters under an attacker’s control. These instructions, referred to as “critical instructions”, are: call , callcode , delegatecall , and selfdestruct .

To take advantage of an unprotected delegatecall or callcode , for example, a selfdestruct statement could be placed on the callee.

This tool first explores the contract to find these critical instructions, then calculates slices that reach them, and finally performs symbolic execution on the slices to potentially generate transaction sequences that allow arbitrary ether transfers.

Another approach that implements a similar technique is sCompile [13], but we won’t expand on it because it’s neither published nor has a public implementation.

Manticore

Manticore [14] is an open source program developed by Trail of Bits [15], a company that specializes in software security audits. It’s a dynamic symbolic execution framework for analyzing binaries and an arbitrary number of interacting Ethereum smart contracts.

Its core engine is the source of its flexibility; it implements a generic platform-agnostic symbolic execution engine that makes few assumptions about the underlying execution model. State exploration also contributes to its flexibility, because it implements a variety of heuristics to select the next state to explore and to determine when a state must be concretized.

Users are encouraged to create modular, event-based analyses using Manticore’s plugin system. Arbitrary points within the application can broadcast various symbolic execution events that can be handled by an event subscriber outside of the tool.

The authors affirm that Manticore achieves on average 66% code coverage with a default analysis. This analysis was performed on 100 contracts taken from mainnet, but they didn’t specify how and why those specific contracts were chosen.

References

[1] R. Baldoni, E. Coppa, D. C. D’Elia, C. Demetrescu, and I. Finocchi, “A survey of symbolic execution techniques,” ACM Comput. Surv., vol. 51, no. 3, 2018.

[2] C. Cadar and K. Sen, “Symbolic execution for software testing: Three decades later.” 2010.

[3] L. Luu, D.-H. Chu, H. Olickel, P. Saxena, and A. Hobor, “Making smart contracts smarter,” in Proceedings of the 2016 ACM SIGSAC conference on computer and communications security, pp. 254–269, ACM, 2016.

[4] T. Chen, X. Li, X. Luo, and X. Zhang, “Under-optimized smart contracts devour your money,” in 2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 442–446, IEEE, 2017.

[5] C. F. Torres, J. Schütte, et al., “Osiris: Hunting for integer bugs in ethereum smart contracts,” in Proceedings of the 34th Annual Computer Security Applications Conference, pp. 664–676, ACM, 2018.

[6] E. Zhou, S. Hua, B. Pi, J. Sun, Y. Nomura, K. Yamashita, and H. Kurihara, “Security assurance for smart contract,” in 2018 9th IFIP International Conference on New Technologies, Mobility and Security (NTMS), pp. 1–5, IEEE, 2018.

[7] Pakala’s repository. Accessed: 2019-10-03.

[8] Mythril’s repository. Accessed: 2019-10-22.

[9] B. Mueller, “Smashing ethereum smart contracts for fun and real profit.” Accessed: 2019-10-22.

[10] Mythx’s official website. Accessed: 2019-10-22.

[11] “The tech behind mythx smart contract security analysis". Accessed: 2019-11-28.

[12] J. Krupp and C. Rossow, “teether: Gnawing at ethereum to automatically exploit smart contracts,” in 27th {USENIX} Security Symposium ({USENIX} Security 18), pp. 1317–1333, 2018.

[13] J. Chang, B. Gao, H. Xiao, J. Sun, and Z. Yang, “scompile: Critical path identification and analysis for smart contracts,” arXiv preprint arXiv:1808.00624, 2018.

[14] M. Mossberg, F. Manzano, E. Hennenfent, A. Groce, G. Grieco, J. Feist, T. Brunson, and A. Dinaburg, “Manticore: A user friendly symbolic execution framework for binaries and smart contracts.”, jul 2019.

[15] Trail of Bits’ official website. Accessed: 2019-09-12.