Ethereum Wallet in a Trusted Execution Environment / Secure Enclave

Weeve creates secure IoT Wallets for Ethereum, leveraging new cutting-edge Security Technologies

Introduction

Over the last few weeks the Weeve team has received a lot of interest from the community in how we implemented the Ethereum Wallet in our weeveOS. The weeveOS is an open-source Operating System optimized for IoT-to-Ethereum applications (future releases will add support for additional blockchain technologies). It leverages cutting-edge security mechanisms to protect the Ethereum wallet against cyberattacks (GitHub). With the weeveOS the project aims to implement secure and trusted IoT oracles for the blockchain.

You might ask yourself why it is necessary to have a wallet solution in a Trusted Execution Environment (TEE), sometimes also referred to as a Trusted Enclave. A wallet is basically a software application to manage digital currencies. Because software will always be an “easy” target for cyber attackers and because this wallet will contain your private keys, we need to ensure that this piece of software runs in the most secure environment available. With the TEE approach we are able to shield the wallet software in a highly secure and trusted way without the need for additional hardware. In other words, cars, charging stations, containers, or fridges can now have their own IoT Ethereum wallet and autonomously participate in financial activities, like buying food or energy or selling shipping space, without the risk that the wallets fall prey to cyber theft.

In this article I give an overview of how we implemented the Ethereum Wallet as a trusted application, or more simply as a wallet in C. Before getting into details I briefly introduce the concept of a Trusted Execution Environment (TEE), describe the Ethereum Virtual Machine (EVM) and detail how the generation of a transaction within a TEE works.

Trusted Execution Environments

A TEE offers an execution space that provides higher security guarantees than a commodity operating system. Among its security features are (1) isolated execution, (2) secure storage, for example for cryptographic key material, and (3) remote attestation of the device configuration, to name just a few. Our weeveOS implements ARM’s version of a TEE, called TrustZone. A separation of the CPU’s privileges into a secure and a non-secure world, with different interrupts in both worlds, provides us with a multitude of tools to considerably raise the security bar. TrustZone can be seen as the security extension of ARM, just as SGX is the security extension of Intel. Both hardware extensions offer (almost) identical security hardening capabilities for Operating Systems and differ mainly in their target domain: Intel targets PCs and servers, while ARM dominates the sector of embedded/IoT devices. Both technologies have the isolation of programs in common. Intel calls the principle a Trusted Enclave, ARM calls it a Trusted Execution Environment.

In the following we stick to the ARM jargon. TrustZone divides a computing environment (e.g., memory areas, CPU privileges) into two isolated compartments known as the secure and non-secure world:

The Non-Secure World is the “normal” environment of the CPU, where no higher privileges are available. Only the necessary rights and interrupts are allocated, to ensure that no unauthorized access to the secure world is possible.

The Secure World is the main feature of ARM’s TrustZone architecture. It contains the secure operating system, a secure monitor, cryptographic interfaces, secure storage and trusted applications.

Diagram showing the non-secure and secure worlds of the weeveOS. The implementation is available in our GitHub.

On its own, the TrustZone hardware is insufficient to reap all the advantages of a Trusted Execution Environment. In addition to the hardware, one needs a TEE-empowering Operating System to activate the full security potential. That’s why the Weeve team developed the weeveOS. In fact, a TEE comprises (at least) two Operating Systems. The first implements the normal world and the second one implements the secure world. Our approach is to choose a lean secure world OS and a rich normal world OS. The argument is as follows: by reducing the complexity of the secure OS, the odds of it containing an exploitable vulnerability are lower than for an OS with a comprehensive kernel and middle layer. As illustrated, three TEE modules augment the system architecture of a commodity OS, helping with the execution of programs in the secure world, known as trusted applications:

TEE Client: A client offers interfaces for native applications to interact with Trusted Applications in the secure OS.

TEE Driver: A driver gives the privileged normal OS the capability of making use of the TrustZone hardware security technologies.

TEE Monitor: A (secure) monitor ensures that only valid and well-formed values are allowed as inputs for the secure world, while being the supervisor for everything that may happen in both worlds. All communication to and from the secure world goes through the monitor.

The whole weeveOS machinery is necessary to realize security functionalities that commodity IoT Operating Systems typically fail to provide.

Another important point to mention is that Ethereum is based on a virtual machine and not merely on a ledger database, as is the case in the Bitcoin network, for example. Because of this, Ethereum smart contracts allow the execution of commands, not only transactions (which are a special case). A wallet is the client software used to interact with the EVM.

With the TEE Ethereum Wallet we are able to create transactions the EVM can parse and execute. More precisely, the TEE Ethereum Wallet sends a command to the EVM (via our Gateway, which is connected with the EVM). In order not to break anything, the commands have to be well formatted. The delicate obstacle we had to overcome in the implementation was coping with the considerably limited programming opportunities the secure world offers. The reason is the secure monitor, which by definition permits only limited interaction with the secure world.

Why Ethereum Wallets matter

But why is there a need for an Ethereum Wallet, and especially a wallet in a trusted application, when there are enough free alternative solutions out there? The answer is quite simple: these solutions are firstly not built in C and secondly not built in a TEE. A huge advantage over classical hardware wallets is that there is no need for extra hardware, yet thanks to the TEE paradigm we keep the advantages of a hardware wallet. Because the wallet is software, changes to the crypto algorithms or adaptations for other wallets (e.g. multi-wallets) can be implemented with ease. Another advantage is that no extra human interaction is needed to use the wallet. That is particularly attractive because IoT devices are machines that can now autonomously participate in a fair exchange of digital assets without user intervention.

Creating an Ethereum Address

The first thing we need for our Wallet is an Ethereum Address and our public and private key pair. This key pair is generated for the Elliptic Curve Digital Signature Algorithm (ECDSA), which is parameterized with the secp256k1 curve. Because the wallet is built as a trusted application we can save our key pair in secure storage without hesitation. The public key is then used to generate our Ethereum Address.

An Ethereum Address is basically nothing other than the last 20 bytes of the hashed public key. So we take our public key and hash it with the Keccak-256 hashing algorithm (the pre-standardization variant of SHA3-256 that Ethereum uses). The output will be 32 bytes long. We then simply drop the first 12 bytes to get our Ethereum Address.

For example (output is printed in hex representation):

hash(public_key) = 45c1442cbfeebe459c91636455c2d462b27e49df96ae923a131bdf81548e3b2e

Then our Ethereum Address would be:

55c2d462b27e49df96ae923a131bdf81548e3b2e
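The truncation step can be sketched in a few lines of Python. This is only an illustration of "keep the last 20 bytes", not our C implementation: the function name address_from_hash is made up for this example, and the hash is taken directly from the example above rather than computed, since Python's standard hashlib.sha3_256 implements the standardized SHA-3 and would not give the same digest as Ethereum's Keccak-256.

```python
# Hash value copied from the example above (hex representation, 32 bytes).
public_key_hash = bytes.fromhex(
    "45c1442cbfeebe459c91636455c2d462b27e49df96ae923a131bdf81548e3b2e"
)


def address_from_hash(digest: bytes) -> str:
    """Return the last 20 bytes of a 32-byte hash as a hex string.

    Hypothetical helper for illustration; the hashing itself is not shown.
    """
    if len(digest) != 32:
        raise ValueError("expected a 32-byte hash")
    # Dropping the first 12 bytes is the same as keeping the last 20.
    return digest[-20:].hex()


address = address_from_hash(public_key_hash)
print(address)
```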

Well done, with our key pair and the Ethereum Address we have created our secure Wallet. The next step is to generate transactions. Here comes the exciting stuff…

Creating a transaction

RLP encoding

By using MQTTS to communicate with our clients and our Gateway, we can push the needed transaction parameters (like gas price, gas limit, etc.) directly into our trusted application. This will ensure that no normal world application can manipulate these parameters.

To push the transaction parameters to the desired client, the Gateway packs all negotiated parameters into a JSON string and pushes it back via MQTTS. The trusted application on that client parses the JSON string and generates a structure with all the necessary values. We use these values to create our transaction. Basically, we take every parameter from our structure and encode all values with the Recursive Length Prefix (RLP) encoding so the Ethereum network can handle our transactions.
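As a rough sketch of this parsing step, the following Python snippet turns a JSON string into a structure. Everything specific here is an assumption made for illustration: the key names, the choice of hex strings as values, and the TxParams and parse_params names are hypothetical, and the real message format used by our Gateway may differ.

```python
import json
from dataclasses import dataclass


@dataclass
class TxParams:
    """Hypothetical structure holding the negotiated transaction values."""
    nonce: int
    gas_price: int
    gas_limit: int
    to: bytes
    value: int
    data: bytes
    chain_id: int


def parse_params(message: str) -> TxParams:
    """Parse a JSON string (assumed layout: hex strings per field)."""
    raw = json.loads(message)
    return TxParams(
        nonce=int(raw["nonce"], 16),
        gas_price=int(raw["gas_price"], 16),
        gas_limit=int(raw["gas_limit"], 16),
        to=bytes.fromhex(raw["to"]),
        value=int(raw["value"], 16),
        # Assumption: the trade id is carried as plain text bytes.
        data=raw["data"].encode("ascii"),
        chain_id=int(raw["chain_id"], 16),
    )


# Example message; the layout is invented for this sketch.
message = json.dumps({
    "nonce": "00",
    "gas_price": "01",
    "gas_limit": "0186a0",
    "to": "f125691d24a6b5cdcb87a89cb825fdf4487c2a34",
    "value": "0401",
    "data": "fe3d61b7cd6eff03698c0303b056e0fe3116206e",
    "chain_id": "04",
})
params = parse_params(message)
print(params)
```

Rejecting a message with a missing key (here via the KeyError raised by the dictionary lookup) is the kind of strictness that suits a trusted application, since every input from outside the secure world should be treated as untrusted.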

The RLP encoding is defined as follows:

For a single byte whose value is between 0x00 and 0x7f, the byte itself is its RLP encoding. Note that the integer 0 is represented as the empty byte string, which encodes to 0x80.

For any other string of 0 to 55 bytes, the encoding consists of a length prefix plus the string itself. The length prefix is the length of the string plus the value (the offset) 0x80.

For example, the string “hello world” has a length of 11 (or hex 0x0b), so the first byte of the RLP encoding is 0x80 + 0x0b = 0x8b. Concatenating it with the string gives us:

“hello world” = 0x8b, 0x68, 0x65, 0x6c, 0x6c, 0x6f, 0x20, 0x77, 0x6f, 0x72, 0x6c, 0x64

If the string is longer than 55 bytes, the encoding starts with a single byte whose value is 0xb7 plus the number of bytes needed to store the string length. After that come the actual string length and finally the string itself.

Imagine a string of 1024 “s” characters:

“sss…” = 0xb9, 0x04, 0x00, 0x73, 0x73, 0x73, …

The 0x73s are the string itself. The 0x04, 0x00 define the length of the string (1024 = 0x0400). Finally, the first byte is 0xb7 plus the length of that second part (0xb9 = 0xb7 + 0x02).

If we encode the whole payload and the payload is a list, we determine the total length of the RLP encoded list items and check whether it is between 0 and 55 bytes. If so, the encoding consists of a single byte, 0xc0 plus the length of the payload, followed by the RLP encoded payload.

Last rule: if we have a payload list and its length is more than 55 bytes, the RLP encoding consists of a single byte, 0xf7 plus the number of bytes needed to store the payload length, followed by the payload length and then the RLP encoded payload.
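The rules above can be sketched in a few lines of Python. This is a minimal illustration under our own naming (encode_length and rlp_encode are names chosen for this example), not the C helper used inside the trusted application. It accepts byte strings and (possibly nested) lists of byte strings.

```python
def encode_length(length: int, offset: int) -> bytes:
    """Build the prefix for a string (offset 0x80) or a list (offset 0xc0)."""
    if length <= 55:
        return bytes([offset + length])
    # Long form: offset + 55 (0xb7 or 0xf7) plus the size of the length field,
    # followed by the length itself in big-endian order.
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item) -> bytes:
    """RLP-encode a byte string or a list of encodable items."""
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item  # a single small byte is its own encoding
        return encode_length(len(item), 0x80) + item
    if isinstance(item, list):
        payload = b"".join(rlp_encode(element) for element in item)
        return encode_length(len(payload), 0xC0) + payload
    raise TypeError("only bytes and lists are supported in this sketch")


print(rlp_encode(b"hello world").hex())    # the short string example
print(rlp_encode(b"s" * 1024)[:3].hex())   # prefix of the long string example
print(rlp_encode(b"").hex())               # empty string, i.e. the value 0
```

The two offsets differ only in their base value, which is why one helper can serve both the string and the list rules.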

Naturally, we created helper functions that take care of the RLP encoding steps.

v, r, s values

The v value determines which chain we want to publish our transaction on and is the hex representation of a decimal number (i.e. if we want to work on chain id “4”, our v value is 0x04 in the first step).

The r and s values later hold the signature, split into 32 bytes each. For the first step, however, we just set them to 0 (as mentioned, the RLP encoding will change this to 0x80 in our transaction).

Necessary transaction values

Furthermore a transaction needs the following fields filled:

Gas price: The gas price determines how much Wei the creator is willing to pay per unit of gas to get the transaction processed.

Gas limit: With the gas limit we define the maximum amount of gas a single transaction may consume, and thereby an upper limit on how much we pay for it.

To: This field holds the Ethereum Address of the recipient.

Nonce: The nonce is a counter value that can only be used once; it shows how many transactions have already been sent from the Wallet.

Value: Here we decide how much Ether we want to transfer.

Chain id: As mentioned before, this value is placed in the v field and defines the chain id used.

Data [optional]: The data field is the only optional field in the transaction. We define here a trade id to trace the transaction in the Ethereum network.

Order of transaction values

Next we want to create our transaction. This is simply done by encoding all the above-mentioned values with our RLP encoding and concatenating everything. It is very important to concatenate the values in the right order, because Ethereum is unable to parse transactions with a different field order.

Nonce + gas_price + gas_limit + to + value + data + v + r + s => transaction

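Before using this concatenated string as a transaction, we need to encode the whole payload. In terms of the RLP rules, the payload is a list of items, so the list offsets apply: 0xc0 for a payload of up to 55 bytes, and 0xf7 plus the size of the length field for a longer one. The payload of the example below is 74 bytes (0x4a) long, so its encoding starts with 0xf8, 0x4a.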

So from our values (written in hex representation):

nonce = “00”

gas_price = “01”

gas_limit = “0186a0”

to = “f125691d24a6b5cdcb87a89cb825fdf4487c2a34”

value = “0401”

data = “fe3d61b7cd6eff03698c0303b056e0fe3116206e”

v = “04” (we use the chain id 4)

r = “00”

s = “00”

We get an encoded transaction:

0xf84a8001830186a094f125691d24a6b5cdcb87a89cb825fdf4487c2a34820401a866653364363162376364366566663033363938633033303362303536653066653331313632303665048080
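To make the assembly step concrete, here is a hedged Python sketch that rebuilds the encoded transaction from the example values. The helper names (_prefix, rlp) are ours and the encoder is a compact stand-in for our C helper functions. Two details are read off the encoded output rather than stated in the text, so treat them as assumptions: integer fields are encoded without leading zero bytes (so 0 becomes the empty string, 0x80), and the trade id in the data field is encoded as its 40 ASCII characters rather than as 20 raw bytes.

```python
def _prefix(length: int, offset: int) -> bytes:
    """Length prefix for strings (offset 0x80) and lists (offset 0xc0)."""
    if length <= 55:
        return bytes([offset + length])
    size = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(size)]) + size


def rlp(item) -> bytes:
    """Compact RLP encoder for integers, byte strings and lists."""
    if isinstance(item, int):
        # Integers are stored big-endian without leading zeros; 0 becomes b"".
        item = item.to_bytes((item.bit_length() + 7) // 8, "big")
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item
        return _prefix(len(item), 0x80) + item
    payload = b"".join(rlp(element) for element in item)
    return _prefix(len(payload), 0xC0) + payload


# Field order matters: nonce, gas price, gas limit, to, value, data, v, r, s.
fields = [
    0x00,                                                      # nonce
    0x01,                                                      # gas price
    0x0186A0,                                                  # gas limit
    bytes.fromhex("f125691d24a6b5cdcb87a89cb825fdf4487c2a34"),  # to
    0x0401,                                                    # value
    b"fe3d61b7cd6eff03698c0303b056e0fe3116206e",               # data (ASCII)
    0x04,                                                      # v = chain id
    0x00,                                                      # r placeholder
    0x00,                                                      # s placeholder
]

unsigned_tx = rlp(fields)
print("0x" + unsigned_tx.hex())
```

Swapping any two entries of the fields list still yields valid RLP, but no longer a transaction the network can interpret, which is why the order is fixed.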

Now that we have our encoded transaction, we are ready to get it signed.

Signing a Transaction

After we encode the transaction to our desired format we can start signing this message.

The signing algorithm takes our encoded transaction and produces a hash from it (again, the hash algorithm used is Keccak-256). This hash and the private key are used to generate our signature. The signature itself consists of a pair (r,s) where each element is 32 bytes long.

Encode it again

Finally we take the signature, split the pair into two 32-byte strings and move them into our structure: r goes into the r field and s into the s field.

The v value will (as mentioned in the description of the transaction) determine the used chain and is used to recover the public key from the signature later.

Recovering a public key

To recover a public key from an Elliptic Curve Digital Signature Algorithm (ECDSA) signature it is mandatory that we know

a) what curve is used with that algorithm

b) what hash function is used and

c) what message was signed

Luckily we know all of these, but that is not yet enough to recover the public key uniquely.

If we try to recover it with these parameters alone, we get two candidate public keys: one belonging to the “positive” y coordinate and one to the “negative” one. To uniquely recover the public key we also need a recovery id, which selects the right candidate. Ethereum stores this recovery id in the v field by recalculating v as:

chain_id*2+35+(0 or 1)

The 0 or 1 selects between the two candidates. It is the parity of the y coordinate of the curve point computed during signing (the point whose x coordinate becomes r): we simply read the last bit of that y coordinate and check whether it is 0 or 1.
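The formula is small enough to sketch directly in Python. The function name compute_v and the sample y coordinate below are made up for illustration; only the formula itself comes from the text above.

```python
def compute_v(chain_id: int, y_parity: int) -> int:
    """Combine the chain id and the recovery bit into the v value."""
    if y_parity not in (0, 1):
        raise ValueError("y_parity must be 0 or 1")
    return chain_id * 2 + 35 + y_parity


# Hypothetical y coordinate; only its last bit matters for the recovery id.
y = 0x1F3A6C20
y_parity = y & 1

v = compute_v(chain_id=4, y_parity=y_parity)
print(hex(v))
```

For chain id 4 the two possible results are 43 (0x2b) and 44 (0x2c), so a reader of the transaction can tell both the chain and the recovery bit from v alone.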

So, for our previous example transaction we get:

0xf88a8001830186a094f125691d24a6b5cdcb87a89cb825fdf4487c2a34820401a8666533643631623763643665666630333639386330333033623035366530666533313136323036652ba0d5c8eaca6a7bc128065bf7ed3fa053863cdff64b8910f1325099eb89328877fba073fc3184cf468d163296543433c31213ebcf83b3a076a8765461e81c79b1ebc7
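To see where v, r and s ended up, the signed transaction can be taken apart again. The following Python sketch is a deliberately limited decoder written for this article (it only handles a flat list of byte strings, and the name rlp_decode_flat_list is ours); it is not part of the wallet itself.

```python
def rlp_decode_flat_list(encoded: bytes) -> list:
    """Decode an RLP list whose items are all byte strings (no nesting)."""
    first = encoded[0]
    if first < 0xC0:
        raise ValueError("not an RLP list")
    if first <= 0xF7:
        pos, length = 1, first - 0xC0
    else:
        size = first - 0xF7
        length = int.from_bytes(encoded[1:1 + size], "big")
        pos = 1 + size
    end = pos + length
    if end != len(encoded):
        raise ValueError("length prefix does not match input size")
    items = []
    while pos < end:
        byte = encoded[pos]
        if byte < 0x80:                      # single byte, its own encoding
            start, size = pos, 1
        elif byte <= 0xB7:                   # short string
            start, size = pos + 1, byte - 0x80
        elif byte <= 0xBF:                   # long string
            len_size = byte - 0xB7
            start = pos + 1 + len_size
            size = int.from_bytes(encoded[pos + 1:start], "big")
        else:
            raise ValueError("nested lists are not supported in this sketch")
        items.append(encoded[start:start + size])
        pos = start + size
    return items


# The signed example transaction from above, without the 0x prefix.
signed_tx = bytes.fromhex(
    "f88a8001830186a094f125691d24a6b5cdcb87a89cb825fdf4487c2a34820401"
    "a866653364363162376364366566663033363938633033303362303536653066653331313632303665"
    "2ba0d5c8eaca6a7bc128065bf7ed3fa053863cdff64b8910f1325099eb89328877fb"
    "a073fc3184cf468d163296543433c31213ebcf83b3a076a8765461e81c79b1ebc7"
)

nonce, gas_price, gas_limit, to, value, data, v, r, s = rlp_decode_flat_list(signed_tx)
v_int = int.from_bytes(v, "big")
chain_id, recovery_bit = (v_int - 35) // 2, (v_int - 35) % 2
print(hex(v_int), chain_id, recovery_bit, r.hex(), s.hex())
```

Inverting the v formula this way gives back both the chain id and the recovery bit, which is exactly the information needed to pick the right public key.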

With the updated v, r, s values we encode the whole transaction again as before. The output is then passed to our normal world application and can be pushed to the Ethereum chain.

Conclusion: Success for Ethereum-chain Secure wallets

Thanks to our advances in creating this trusted application inside of the secure world, we are now able to securely create Wallets and transactions for the Ethereum-chain. This allows us to counter cyber attacks against IoT devices by utilizing the security features of a TEE (Trusted Execution Environment).

If you made it this far, firstly congratulations, and secondly, we are sure you either have a few questions or would like to chat with us about it. Contact us on Gitter and take a look at our GitHub here or join the conversation on Telegram.