Blockchains are a rapidly expanding area of research, implementation and experimentation. But not all of the ideas being explored enrich the blockchain as a base for a secure and democratisable medium of exchange.

In particular, I'm not convinced of the wisdom of adding Turing complete programming languages to blockchains. It converts the blockchain from a distributed, immutable time-series ledger of asset ownership (which it's genius at) into a distributed virtual machine whose code can never be modified.

If we know one thing about software it's that most of it is broken. The methodological innovations in software development of this century—the Agile methodologies—take as their starting premise that almost all software is an approximation, probably buggy and imperfectly adapted to the changing commercial environment. So, instead of trying to get it right first time, Agile acknowledges that this is a pipe-dream and that we should instead focus on how to rapidly change and adapt.

But what if you couldn't change your software once you'd released it? Worse, what if you could never decommission, refactor or replace it!? Well, that's a smart contract.

Even more seriously, adding the ability to execute arbitrary code fundamentally weakens the security of the blockchain at the level of its architecture; and potentially opens the door to a systemic exploit that would allow control of all nodes of the network simultaneously. I explain how further on.

Ethereum

Ethereum was created by Vitalik Buterin, a then 19-year-old Russian-Canadian developer. He had previously proposed adding a generalised scripting language to Bitcoin. When that idea didn't gain consensus, he created Ethereum, which he described as “a decentralised mining network and software development platform rolled into one.” Its contract language is Solidity, an ECMAScript-like language that compiles to opcodes executed on the Ethereum Virtual Machine (EVM).

That's a super-interesting idea but it produces quite a different beast than a distributed ledger.

Clever and creative innovations all find their niches somewhere. Ethereum is definitely both, and it will find its place in the ecosystem. But I don't expect that place to be as the global smart-contracts ledger it's being proposed for. It's a vehicle for speculation, and its value derives entirely from that. Do we even call it a distributed ledger any more? It's more like a distributed VM with a pay-per-opcode model; in some sense it resembles AWS Serverless, with the state transitions being persisted into the underlying ledger. Any blockchain ledger is really just a time-ordered sequence of state transitions for the assets under management, with guaranteed eventual consistency.
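That last point can be made concrete. Below is a minimal Python sketch of a ledger as a time-ordered sequence of state transitions (the types and names are my own illustration, not any platform's API): balances are never stored, only derived by replaying the immutable history.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Transfer:
    """One immutable ledger entry: a time-stamped asset transfer."""
    timestamp: int
    sender: str
    recipient: str
    amount: int

def replay(transfers):
    """Derive current balances by applying transfers in time order.

    The ledger itself is never mutated; balances are just a view
    over the ordered history of state transitions.
    """
    balances = {}
    for t in sorted(transfers, key=lambda t: t.timestamp):
        balances[t.sender] = balances.get(t.sender, 0) - t.amount
        balances[t.recipient] = balances.get(t.recipient, 0) + t.amount
    return balances

ledger = [
    Transfer(1, "mint", "alice", 100),
    Transfer(2, "alice", "bob", 30),
    Transfer(3, "bob", "carol", 10),
]
print(replay(ledger))  # {'mint': -100, 'alice': 70, 'bob': 20, 'carol': 10}
```

The point of the sketch is that the ledger does one thing: it records ordered, immutable transitions; everything else is a derived view.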

It's a fundamental distinction that divides blockchain schemes into those that are solely decentralised ledgers and those that offer generalised code execution. My argument is that the former are safer, more powerful and, ultimately, more useful.

The root issue is that smart contracts are a vector for penetration and fraud; they carry a cost to implement, and they weaken what was the tight atomicity of a blockchain transfer. They are a weakness that a blockchain or cryptocurrency is better off without. Without ultra-high security guarantees, one of the primary raisons d'être for a blockchain disappears.

The DAO Attack and Parity Multisig theft

Last year, Ethereum made headlines when a vulnerability was exploited to steal ETH 3.6M (at today's value, €603M). More recently, last month, a simple bug in a smart contract resulted in the loss of ETH 150,000 (€28.5M), the theft having been enabled by “a vulnerability in the Parity multisig smart contract.” There's a superb blow-by-blow analysis of what happened here. I especially like Haseeb's observation that bugs in smart contracts can happen to anyone:

So who were the crackpot developers who wrote this? They should’ve known better, right? The developers here were a cross-collaboration between the Ethereum foundation (literally the creators of Ethereum), the Parity core team, and members of the open-source community. It underwent extensive peer review. This is basically the highest standard of programming that exists in the Ethereum ecosystem.

Multisig (multi-signature) signing supports asset transfers that require m of n signatories to approve before a transaction proceeds. In Ethereum, multisig is implemented as a smart contract written in the platform's Solidity programming language, and that contract is what the hacker was able to exploit.

The trouble is that in Ethereum/ETH (unlike NEM/XEM) multisig was implemented as a smart contract and not as part of the base API. Implementing commonly used, smarter transaction types, like multisig, in the blockchain API instead would have reduced the possibility of bugs and exploits, and avoided their execution overhead.
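To illustrate what an m-of-n primitive encapsulates, here is a deliberately simplified Python sketch. It is not NEM's or Ethereum's actual implementation, and real systems verify cryptographic signatures; approval here is stubbed out as set membership, purely to show the control flow a platform primitive would hide behind its API.

```python
class MultisigAccount:
    """Toy m-of-n multisig: a transaction becomes executable only
    once at least m of the n registered signatories have approved it.

    Real implementations verify cryptographic signatures; here,
    approval is just a set-membership check, to show the control
    flow a platform primitive would encapsulate.
    """
    def __init__(self, signatories, m):
        assert 1 <= m <= len(signatories)
        self.signatories = set(signatories)
        self.m = m
        self.approvals = {}  # tx_id -> set of approving signatories

    def approve(self, tx_id, signer):
        if signer not in self.signatories:
            raise PermissionError(f"{signer} is not a signatory")
        self.approvals.setdefault(tx_id, set()).add(signer)
        return self.is_executable(tx_id)

    def is_executable(self, tx_id):
        return len(self.approvals.get(tx_id, set())) >= self.m

acct = MultisigAccount({"alice", "bob", "carol"}, m=2)
acct.approve("tx1", "alice")       # 1 of 2 approvals: not yet executable
print(acct.approve("tx1", "bob"))  # 2 of 2 approvals: prints True
```

Shipping this logic as an opaque, audited primitive means every user gets the same vetted code path, instead of each contract author re-implementing (and potentially mis-implementing) it in a script.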

But security isn't the only disadvantage of adding a programming language to a blockchain. The problems are legion, including one that potentially opens the floodgates to a system-wide vulnerability that could compromise all assets stored in the blockchain:

Smart contract code can't be modified, bug-fixed, version-controlled or replaced. Once code is committed to a block in the chain it is there forever. Most software is buggy; that's the norm, and it's no different for code committed to the blockchain.

Always read the fine print before signing. The security of your smart contract depends not only on the underlying cryptographic security of the blockchain but also on the code of the smart contract itself. It's possible to have a vulnerability in your own contract, one you wrote, may even be aware of, but can't change. The exact behaviour of a smart contract depends solely upon its code, so institutional players would need to review that code first. Is that a realistic expectation? Banks struggle to perform 100% error-free settlement even when it's only simple transfers. To be really sure, for large multi-million-dollar transfers (of which an investment bank's trading desk would do hundreds a day), you would want to review the opcodes the compiler generated as well. This is an insurmountable obstacle for institutional adoption, and for use as an alternative for interbank settlements; it wouldn't be feasible. And it should make people nervous. Look at the DAO attack, where the attacker is now claiming that his actions were legitimate and that he should be allowed to keep the 'stolen' money, because the smart contract functioned exactly as it was written (albeit not as the authors intended) and a smart contract is also a legal contract that should be honoured. It's a cheeky angle, but it may even have some legal standing. In a multi-jurisdictional world, it will be a valid argument somewhere! :)

You'll need a code lawyer. No one can think of all eventualities. For interbank transfers, investment banks and their counter-parties have documentation departments: contract legal specialists who review the binding master agreements between parties (the "documentation"). Except that now the binding contract won't be a 200-page legal asset-transfer document; it will be expressed in the smart contract. Those documents are vast for a reason: they encompass a wide range of outcomes, from force majeure circumstances to more commonplace defaults. Trying to express all of those outcomes in code, to the satisfaction of a financial institution's legal department, is unlikely to be possible. The complex relationships between transacting parties, and their channels for problem resolution, already have established practices, and they should remain external to the blockchain, which should only reflect previously mutually-agreed asset transfers.

Performance constraints. Bitcoin has already exceeded its execution capacity; its top execution rate is only a few transactions per second. Last year transaction volumes exceeded capacity, pushing confirmation times out from 10 to 43 minutes and causing other transactions to fail. And that's without doing any extra work beyond the computationally expensive mining of blocks. Imagine adding to that the execution of arbitrary attached code before transactions can be validated and blocks confirmed. The Ethereum platform doesn't impose a bound on computational complexity, effectively opening up a DoS attack vector on the platform itself (though exploiting it will cost you real-world money, because code execution is charged per opcode).

Reliance on off-chain external triggers. It's inevitable that much of the state that smart contracts depend on will exist off-chain. Via external APIs, smart contracts will test the current state of the wider environment (has this company defaulted? has the price of an underlying security crossed a floor or ceiling?). Those APIs are now effectively part of the contract and should be subject to the same scrutiny, because each of them is a potential attack vector for causing the smart contract to malfunction. You can't draw a line around a contract like that.

Reputation management. When you publish code to the chain, you expose what would have been a private contract between two parties to external scrutiny. Any bugs, omissions or mistakes are committed permanently to the public record of the chain. For some organisations this is a genuine reputation-management concern: if you're a technology thought-leader in the world of cryptocurrencies and digital asset management, your potentially embarrassing learning experiences will be there for everyone to see. It's like having a photo of your younger self, drunk and acting like an idiot, on social media, and being unable to delete it. The Ethereum Foundation and the Parity core team are experiencing this now: a simple, understandable error that any of us could have made has resulted in the loss of tens of millions of clients' dollars, causing reputational damage to them and to the platform, to say nothing of the hundreds of millions of dollars wiped off the capitalisation of the ETH currency.
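The pay-per-opcode model mentioned above can be sketched with a toy interpreter (a hypothetical mini instruction set of my own invention, not the EVM's real opcodes or fee schedule). Nothing about the program itself guarantees termination; only the metering halts a malicious loop, and an attacker willing to pay the fees imposes that wasted work on every node that replays the code.

```python
class OutOfGas(Exception):
    pass

def run(program, gas_budget, fee_per_op=1):
    """Execute a toy opcode list, charging a fee per opcode.

    Supported ops (a hypothetical mini-ISA): ("PUSH", n), ("ADD",),
    ("JUMP", target). Returns (stack, gas_used). Raises OutOfGas
    when the budget is exhausted -- which is the only thing that
    stops an infinite loop from running forever.
    """
    stack, pc, gas = [], 0, 0
    while pc < len(program):
        gas += fee_per_op
        if gas > gas_budget:
            raise OutOfGas(f"exceeded budget of {gas_budget}")
        op = program[pc]
        if op[0] == "PUSH":
            stack.append(op[1]); pc += 1
        elif op[0] == "ADD":
            b, a = stack.pop(), stack.pop(); stack.append(a + b); pc += 1
        elif op[0] == "JUMP":
            pc = op[1]
    return stack, gas

stack, used = run([("PUSH", 2), ("PUSH", 3), ("ADD",)], gas_budget=10)
print(stack, used)  # prints [5] 3

try:
    run([("JUMP", 0)], gas_budget=1000)  # an infinite loop
except OutOfGas:
    print("loop halted only by the gas limit")
```

Contrast this with a fixed-function transfer, whose cost is known and constant; once arbitrary code is accepted, the platform must meter every instruction just to stay alive.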

Most of those are serious problems, which would be substantial hurdles requiring mitigation and offsetting against the value of smart contracts' power. But there is one aspect that potentially undermines the entire system.

Crashing or penetrating the Ethereum Virtual Machine (EVM)

If I were a hacker wanting to compromise the Ethereum system, I would focus on compromising the EVM. It's a juicy target: if you can generate opcodes that cause the EVM to crash or escape its sandbox, then you have opened the door to controlling all the EVMs. Smart contract code executes on all nodes in the network (22,632 of them, as of the time of writing), each a clone of the others.

The Ethereum nodes form a massively distributed network for the synchronisation of data and code. They all run the same software (the Ethereum VM itself), on the same data and with the same smart contract code. That's a vulnerable and precarious arrangement! In this, they are a bit like bananas...

The familiar yellow banana we have all eaten (the so-called Cavendish dessert banana) has only existed for 181 years! A single plant, the result of a random genetic mutation, was discovered in 1836 by the Jamaican Jean Francois Poujot, who found one of his trees bearing yellow fruit rather than green or red plantains. All yellow bananas since have been propagated by vegetative division from that one plant and so share identical genes. The weakness is that when a pest (insect or fungus) strikes a banana plantation, it can wipe out, and has wiped out, entire geographical regions, because there's no genetic diversity to stop it. This is a serious problem, because bananas are a vital staple food for over half a billion people. And it's a weakness the EVMs share, being, by design, perfect clones of each other. The banana analogue is serious enough that there's a gene bank in Belgium (far from the tropics) storing genetic backup copies for emergencies; there's nothing like that for Ethereum catastrophes (as we've seen).

Because the Ethereum nodes are perfect clones, if you could control one EVM you could control all of them; and then you would have consensus across the nodes to rewrite the ledger as you wish. That's like having control over The Matrix.

This is an inherent vulnerability of the architecture that cannot be mitigated: you have a distributed data store that replicates data and code to all network nodes; add the ability to trigger execution of client-supplied code, and an exploit penetrates all nodes on the network simultaneously. It is a fundamental design flaw.

This is not far-fetched at all. It sounds alarmist to suggest you could take control of a VM whose code is open to public scrutiny, but consider the JVM. The Java Virtual Machine is the most battle-hardened virtual machine that has ever existed; it has set the gold standard for secure sandboxing over a history of more than two decades. Nothing else in our industry has that stature. It operates much as the EVM does: a sandboxed virtual machine that executes user-supplied opcodes (bytecodes, in Java parlance). As with the EVM, the source code for both the compiler and the runtime VM is openly available, allowing anyone to scour it for vulnerabilities. The JVM has a sophisticated multi-stage bytecode verifier; but, despite that, holes have been found over time.

There are many ways to crash a JVM. Beyond infinite loops and recursion, you can use external transfers of control like JNI, use reflection to create corrupt in-memory representations of objects, or exhaust the stack or heap. There are also some weird and wonderful glitches that no one could have anticipated. To widespread surprise, it was discovered that merely parsing the magic number 2.2250738585072012e-308 would hang the JVM in an infinite loop. The following line would hang all the then-current and antecedent versions of the JVM:

Double.parseDouble("2.2250738585072012e-308")

This was discovered in 2011 (Java was first released in the primordial history of 1996), i.e., when Java was 15 years old (ancient in dog years). Yet, in all that time, this one line of code would hang every version of the JVM. That should make us all nervous about committing sole arbitration of asset ownership to a system with a virtual machine in it!

So, imagine that you found an equivalent example for the EVM. Think what you could do with it! You could put it into a smart contract, send it some Ether and bring down the entire network. No one would know how it had been done, or by whom, so you could send it some Ether again just as soon as the network came back up.

Even that would be a comparatively benign exploit, as it would merely disable the network. To steal currency, what you'd really be looking for is a way to jolt the EVM out of its sandbox and get it to execute arbitrary code, loaded into an array, at the level of machine instructions. This attack vector is a type of buffer-overflow exploit, one of the most common forms of compromise. For Ethereum, achieving it would mean controlling all EVMs (and therefore all assets under their management) simultaneously. Gulp.

Wherever there's the ability to transfer control to code a hacker has written, there's the potential to take over the application itself. There are broad categories of exploits (catalogued at www.owasp.org), including SQL injection and XPath/XML injection, that gain control of the enclosing application by getting it to execute attacker-supplied code. Both are a risk even where the application had no intention of executing user code at all. In the case of smart contracts, that intention is explicit and invited! The fundamental weakness this highlights is that having your application execute externally-supplied code is inherently insecure; at least, if history is any guide.
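The SQL injection case is easy to demonstrate with Python's standard-library sqlite3 (the table and inputs below are invented for illustration). Splicing user input into the query string lets the attacker's text execute as part of the query; a parameterised query keeps it as inert data:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
db.execute("INSERT INTO users VALUES ('alice', 0), ('root', 1)")

attacker_input = "nobody' OR is_admin = 1 --"

# Vulnerable: the input is spliced into the SQL text itself, so the
# OR clause becomes part of the query and matches the admin row.
unsafe = db.execute(
    f"SELECT name FROM users WHERE name = '{attacker_input}'"
).fetchall()
print(unsafe)  # [('root',)]

# Safe: the placeholder passes the input as an opaque value, so the
# whole string is treated as one (non-matching) name.
safe = db.execute(
    "SELECT name FROM users WHERE name = ?", (attacker_input,)
).fetchall()
print(safe)    # []
```

SQL injection is dangerous because code and data share one channel; a smart contract platform makes that sharing the whole point of the design.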

The lesson of our industry's history is that whenever we allow applications to execute client-side code inside themselves, no matter how carefully sandboxed, whether in a browser or a virtual machine, there is always something we didn't consider, and vulnerabilities are inevitably found and exploited. We're human and imperfect; our great successes in technology are the result of continuous refinement and of moving past our inevitable errors. Smart contracts leave our errors carved in stone. The only architecturally secure design is to eschew client-side code execution in mission-critical software.

Safe uses for smart contract blockchains

Ethereum has become a global phenomenon and it's too late to put that genie back in its bottle. The underlying technology Vitalik Buterin created has enormous utility value, and I can picture it finding a niche in internal implementation projects, where an organisation's enterprise products use it as an architectural device for guaranteeing consistent state that spans business rules across multiple divisions. It could be useful for that, as an implementation detail of a private, internal system. But exposing a distributed VM outside the company's network? I can't yet picture how that makes sense (other than riding the speculative wave skywards) or delivers an advantage that isn't greatly outweighed by the many disadvantages. By offering code execution, the Ethereum network nodes are in the same security class as web browsers: they present the same attack surface, that of getting an application controlling a resource to execute your code.

Code execution breaks the very guarantees that made blockchains compellingly interesting technology in the first place. I'm not sure that smart contracts, in the form of free-form codified scripts, are the solution here. In principle, I'm attracted (who wouldn't be?) to the idea of being able to express any arbitrary set of transformations in code, but in practice we all know that code is always imperfect. My personal Damascus Road experience came years ago, when I was following the C++ standards committee, WG21. The people on that committee are some of the smartest, most experienced and most cautious computer scientists in the world, yet something as simple as a smart pointer would go through almost endless iterations of revision; it was truly incredible to see the creativeness (and perversity) of people in finding ways to break even the most rigorous proposal. I'm now of the view that we should structure our processes around agility, so we can quickly detect, fix and re-deploy imperfect code. That's impossible with smart contracts, and it makes me nervous.

A more secure scheme is to limit the blockchain solely to being a distributed ledger of immutable transactions. The need for more complex transaction types, like multisig, will become apparent over time; these can be implemented in the API as opaque platform primitives rather than as potentially error-prone user scripts. This is the strategy NEM adopted, and it carries none of the risks or costs of smart contracts. We need a secure base that does one thing well. We can later build highly complex authentication, coordination and event-triggered processes on top of that secure base without weakening it, but only if we constrain the underlying blockchain to being a distributed, eventually consistent, time-series datastore, which is what it started as. It's a time-honoured technique in computer science, akin to the seven layers of the OSI network model: each layer does one thing and does it well. Smart contracts conflate multiple concerns and, when that happens, security is the first victim.

Smart contracts effectively break the blockchain's guarantees of security, bounded complexity, atomicity and determinacy. They are a fascinating parallel technology that has emerged from the explosion of interest in, and uptake of, blockchains, but their best use case, I believe, isn't (and can't be) solving the challenge of low-latency, high-scalability asset transfers for decentralised ledgers.