In part 1 of this series we covered why Py-EVM was created.

In the next few posts we’ll take a look at some of the architecture of Py-EVM. In this post we’ll start at the lowest level of the EVM and look at how opcodes are implemented.

Opcode Primer

The code that the EVM executes, often referred to as bytecode, is made up of many individual opcodes. An opcode is the smallest unit of computation. Opcodes are normally referenced by their mnemonic name but are canonically represented by a hexadecimal number.

ADD : 0x01

PUSH32 : 0x7f

DELEGATECALL : 0xf4

EVM execution involves iterating over some bytecode, executing each opcode until execution is complete. The specifics of how each opcode executes are what define the baseline functionality of the EVM. Higher level languages like Solidity compile down to EVM bytecode, translating the code you’ve written into a list of opcodes that represent the same application logic.

The Ethereum yellow paper defines approximately 130 opcodes.
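To make the relationship between bytecode and opcodes concrete, here is a minimal sketch of a disassembler that walks a byte string and names each opcode it finds. It is not part of Py-EVM; the `disassemble` function and the `MNEMONICS` table are hypothetical names made up for this illustration, and only a handful of opcodes are listed.

```python
# A few opcode values; the table is deliberately incomplete.
MNEMONICS = {
    0x00: "STOP",
    0x01: "ADD",
    0xf4: "DELEGATECALL",
}


def disassemble(bytecode):
    """Walk the bytecode and return a list of (mnemonic, immediate) pairs."""
    instructions = []
    pc = 0  # program counter: index of the next opcode byte
    while pc < len(bytecode):
        op = bytecode[pc]
        pc += 1
        immediate = b""
        if 0x60 <= op <= 0x7f:
            # PUSH1..PUSH32 are followed by 1..32 bytes of data to push.
            size = op - 0x5f
            name = "PUSH{}".format(size)
            immediate = bytecode[pc:pc + size]
            pc += size
        else:
            name = MNEMONICS.get(op, "UNKNOWN(0x{:02x})".format(op))
        instructions.append((name, immediate))
    return instructions


# PUSH1 0x01, PUSH1 0x02, ADD
program = bytes.fromhex("6001600201")
listing = disassemble(program)
```

Note that the PUSH family is the one case where the bytes following an opcode are data rather than further opcodes, which is why the loop cannot simply treat every byte as an instruction.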

Opcodes in PyEthereum

One of the main departures from the PyEthereum architecture is how Py-EVM handles opcodes. In PyEthereum, the opcode logic is defined within the vm.py module (found here). Each opcode is a branch of a single extensive if/elif/else statement that is a bit over 400 lines long and covers each of the roughly 130 EVM opcodes. While this design is pragmatic and efficient, it is by no means modular or extensible. Here is a small excerpt.

if op == 'STOP':
    return peaceful_exit('STOP', compustate.gas, [])
elif op == 'ADD':
    stk.append((stk.pop() + stk.pop()) & TT256M1)
elif op == 'SUB':
    stk.append((stk.pop() - stk.pop()) & TT256M1)
elif op == 'MUL':
    stk.append((stk.pop() * stk.pop()) & TT256M1)
elif op == 'DIV':
    s0, s1 = stk.pop(), stk.pop()
    stk.append(0 if s1 == 0 else s0 // s1)
...

To add a new opcode, another branch must be added to this body of code. There are also additional if/else statements for each hard fork protocol change. The result is a very complex module that does not lend itself well to extension, modification or experimentation.

Opcodes in Py-EVM

In Py-EVM, each opcode is a single function. Here is what the function for the ADD opcode looks like.

def add_op(computation):
    computation.gas_meter.consume_gas(3, reason='ADD')
    left, right = computation.stack.pop(
        num_items=2,
        type_hint=constants.UINT256,
    )
    result = (left + right) & constants.UINT_256_MAX
    computation.stack.push(result)

Opcode functions take a single argument, the computation object, which exposes APIs for all actions that opcodes may need to perform, such as stack manipulation, reading account state or consuming gas.
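Because an opcode function only talks to the computation object, it can be exercised in isolation with a small stand-in. The sketch below is an illustration under stated assumptions: `StubGasMeter`, `StubStack` and `StubComputation` are hypothetical classes that mimic only the calls the ADD function makes, and the `constants` namespace is a stand-in for the real constants module. None of this is Py-EVM's actual implementation.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the constants module referenced by add_op.
constants = SimpleNamespace(UINT256="uint256", UINT_256_MAX=2**256 - 1)


class StubGasMeter:
    """Tracks remaining gas; a stand-in, not Py-EVM's gas meter."""

    def __init__(self, start_gas):
        self.gas_remaining = start_gas

    def consume_gas(self, amount, reason):
        if amount > self.gas_remaining:
            raise RuntimeError("out of gas: " + reason)
        self.gas_remaining -= amount


class StubStack:
    """A list-backed stack; type_hint is accepted but ignored here."""

    def __init__(self):
        self.values = []

    def push(self, value):
        self.values.append(value)

    def pop(self, num_items, type_hint):
        # Pops from the top of the stack, most recently pushed item first.
        return tuple(self.values.pop() for _ in range(num_items))


class StubComputation:
    def __init__(self, start_gas):
        self.gas_meter = StubGasMeter(start_gas)
        self.stack = StubStack()


def add_op(computation):
    # Same body as the ADD opcode function shown in this post.
    computation.gas_meter.consume_gas(3, reason='ADD')
    left, right = computation.stack.pop(
        num_items=2,
        type_hint=constants.UINT256,
    )
    result = (left + right) & constants.UINT_256_MAX
    computation.stack.push(result)


computation = StubComputation(start_gas=10)
computation.stack.push(2**256 - 1)
computation.stack.push(1)
add_op(computation)  # the sum wraps around to 0 and 3 gas is consumed
```

The masking with UINT_256_MAX is what gives ADD its 256-bit wrap-around behavior, and the stub makes that easy to check without any blockchain state.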

Constructing a VM

Now let’s look at how opcodes are composed together into a VM.

from evm import VM

ExampleVM = VM.configure(
    opcodes={
        0x01: add_op,
    },
    ...
)

This example creates a VM with a single opcode. The VM opcodes are specified as a mapping from opcode number to the function containing the opcode logic.
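One way to picture what such a mapping buys you is a toy VM whose execution loop is nothing more than a dictionary lookup. The sketch below is an assumption-laden illustration, not Py-EVM's code: `ToyVM`, its `configure` signature (including the `name` parameter) and its `execute` method are hypothetical, and the opcode functions operate on a plain Python list rather than a computation object.

```python
UINT_256_MAX = 2**256 - 1


def add(stack):
    stack.append((stack.pop() + stack.pop()) & UINT_256_MAX)


def mul(stack):
    stack.append((stack.pop() * stack.pop()) & UINT_256_MAX)


class ToyVM:
    """A hypothetical, heavily simplified VM; not Py-EVM's VM class."""

    opcodes = {}

    @classmethod
    def configure(cls, name, opcodes):
        # Build a new subclass that carries its own opcode table.
        return type(name, (cls,), {"opcodes": dict(opcodes)})

    def execute(self, bytecode, initial_stack=()):
        stack = list(initial_stack)
        for op in bytecode:
            try:
                opcode_fn = self.opcodes[op]
            except KeyError:
                raise ValueError("invalid opcode 0x{:02x}".format(op)) from None
            opcode_fn(stack)
        return stack


# A VM that only knows ADD (0x01).
ExampleVM = ToyVM.configure("ExampleVM", {0x01: add})

# Extending it with MUL (0x02) is a new mapping, not a new elif branch.
ExtendedVM = ExampleVM.configure(
    "ExtendedVM",
    {**ExampleVM.opcodes, 0x02: mul},
)

added = ExampleVM().execute(bytes([0x01]), [2, 3])
combined = ExtendedVM().execute(bytes([0x01, 0x02]), [4, 2, 3])
```

In this sketch, adding or replacing an opcode never touches the execution loop, which is the property the if/elif/else approach lacks.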

What’s in a VM?

The term VM in the Py-EVM context is used to refer to a single set of rules for a given period of the blockchain. For example, at the time of writing this post, the public Ethereum mainnet has four rule sets:

The initial Frontier rules

The Homestead fork rules

The DAO fork rules

The Anti-DOS fork rules