The last two years have seen businesses of all sizes look to adopt blockchain technology, from Walmart to Pfizer. However, as with the introduction of any new technology, it is not an easy task. To make the leap into enterprise, blockchain must keep in mind the needs of business and the restrictions around its implementation.

We spoke to Kirill Ivkushkin, Chief Architect at Insolar, a blockchain platform for business, about the challenges of building blockchain technology in a way that meets business needs.

Can you give us a bit of background around the problems with blockchain you’re looking to solve?

I’ve been working on blockchain technology since 2011. It was not initially the main focus for me. I was working at Sberbank, where we did a lot of blockchain-related experiments — some of which went public, most of which did not.

I also worked in the so-called Association of FinTech. Here we were working to apply blockchain-related technologies to different aspects of FinTech. We built something called Masterchain, which is a version of Ethereum that’s been adapted to Russian cryptography standards. We even got some legislation changed to enable some of the projects to run.

Based on my experience, I tried to figure out the actual problem with blockchain technology when you try to put it into real use. I found that it’s less about the technology itself and more about the framework around it. It’s not the general things, like making the technology scalable or raising the number of transactions per second (which is not the biggest issue, as some say); it’s more about how to wrap it in a way that business can use.

Think about putting a document on the blockchain. If you cannot do that, then you have a secondary system that you have to run and maintain. The secondary system is painful for a number of reasons. First, let’s say you have a document. You want to automate something and then put the document on an offline system. When you then do something on a blockchain, every node will have to go to this offline system and check if the document is there. That means that you have too many nodes hitting the single server and it will die.

A more complex problem exists if this is a document you can act on — for instance, you can read XML fields. Again, if it needs to be used offline, you cannot act on it. When this happens, the blockchain stops acting like a system for automating and becomes purely a system for registering, with automation outside of the blockchain. Furthermore, if you already have lots of legacy systems and try to add an additional system, then you multiply your pain, because you run the previous system, a blockchain system, and a few more.

You also have a problem in that you need a different skill set from your developers. It’s not only about the language to use, because to write something for the blockchain today you have to be skilled in distributed systems. You have to understand how to ensure the ordering between different contracts, and so on. You don’t have this problem in Ethereum, for instance, but only because on Ethereum you run all the complex logic on the same single node, and then it’s simply a case of comparing the result with the results of others. This isn’t the case in the really complicated systems, which work like services that may not know what’s inside one another.

Developers therefore have to be really skilled and know more than you can afford to pay them for. IT resources are very expensive today and they are only becoming more expensive — to hire someone with a very specific skill set, the price gets exorbitant. If you want to build something real on the blockchain, for most companies, the developers they have now — and there may be thousands — are useless.

What are you trying to address at Insolar?

What we tried to address at Insolar was, of course, scaling — and scaling not just by number of transactions, but by the size of the objects or size of transactions on the chain, so you can keep the normal objects without going outside. We also wanted to be able to run complex and long operations. For example, if you called one contract, this contract could go out to the market and find another provider, which is another contract, and it can be repeated multiple times. This makes it impossible to run within a pulse (blocktime) period.

We also wanted to address cost of ownership. This is not only about the technologies we use inside, but also about the kind of developers that are needed to write the code for the Insolar platform and to maintain it. We are still in the early stages, but we understand how to make any blockchain app simple enough that you can treat it as if you were calling an enterprise service. For example, you say ‘I want to call this service’, and you can validate later on if something went wrong, but in general, at the level of the contract, you just write what you call and you don’t care what’s happening behind the scenes. That’s going to make your work easier as a developer of a smart contract, because the platform can provide things like ordering, sequencing and isolation for you.

Can you explain the logic behind the architecture of the Insolar platform and its use of multiple chains?

As I mentioned, TPS is not a core measurement. The core goal was to make the system scale in different dimensions by adding nodes. Let’s say you have more computation-heavy contracts; then you just add one type of node, which we call virtual or calculation nodes. If you have very large documents on the network, then you add other nodes called ‘light material nodes’, which work like temporary caching nodes, a kind of short-term storage.

If your network lives for a long time, but you still have roughly the same amount of users, then you don’t need to increase the number of nodes in the first two categories. Instead, you have to grow long-term storage nodes. In this case, we wanted the system to be adjustable to the different use cases. This adjustability is achieved by increasing the nodes and nothing else. That’s the key behind the architecture.

To enable this, we had to do some tricks with how the blockchain system works. The core one concerns consensus: in the average blockchain system, you try to establish a consensus across all the nodes, or across some part of the nodes if you look at sharding. However, you choose the nodes for the consensus based on some rule, like Ethereum is doing now, where they have a beacon chain and periodically vote to determine which shard will be managed by which nodes. We divided this problem into two or three stages. For the first one, we decided that for the system to be the most efficient, the only way is to ensure the simple case: there is only one node working for one contract at a time. You then don’t have any problems with parallel execution, excessive computation and so on. To do that, all nodes have to have a consistent view of what’s going on in the system. So if I have Contract A, then all nodes should know who is responsible for Contract A at this moment.

You can do it like Hyperledger Fabric and put a kind of equals sign between the contract, the node, and the participant. This is a possible solution, but it’s a problem because you have to manage all the scalability issues, including scaling your own computing power, yourself, since you are the node. Then, if you have multiple contracts running on your behalf, you have to have a very powerful node. That’s a problem you’ve got to solve by yourself.

Another problem is that being a member in such a project requires you to own the node. The problem here is that SMEs don’t really want to have an IT department, because IT is expensive. It’s not like the cloud, because if you look at what Hyperledger is doing, they actually force you to buy a node on Bluemix and worry about this node, or entrust IBM to worry about this node. This is wrong. You still have some kind of IT and you have to understand what you are doing. The idea was to allow most users to be able to use the system and to provide the scalability, without owning the node. We solved the problem of assigning one node by separating so-called network consensus and transactional consensus.

So, what happens in the system? Periodically, let’s say approximately every 10 or 15 seconds, the whole network agrees which nodes are currently considered active. We have some tricks, but what we use is more or less BFT-like. For a network of up to 1,000 nodes, it is not a leader-driven BFT. For a bigger one, we have to use some kind of leadership model, because the communication cost of BFT grows quadratically with the number of nodes, so if you run a thousand nodes or more then you have to do some tricks anyway. But within a thousand nodes, we have more or less fair BFT.
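The growth in BFT communication cost described above can be illustrated with a rough count. This sketch assumes a simplified model in which every node sends one message to every other node in a round; real BFT protocols have several phases and different constant factors, and the function name is hypothetical rather than part of any Insolar API.

```python
def all_to_all_messages(node_count: int) -> int:
    """Messages in one round if every node messages every other node.

    Simplified illustrative model, not a real BFT protocol.
    """
    if node_count < 0:
        raise ValueError("node_count must not be negative")
    return node_count * (node_count - 1)


if __name__ == "__main__":
    # Ten times more nodes means roughly a hundred times more messages.
    for n in (10, 100, 1000):
        print(n, all_to_all_messages(n))
```

Under this model, going from 100 to 1,000 nodes raises the per-round message count from 9,900 to 999,000, which is why larger networks tend to need a leader or a committee.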

At the moment the system agrees which nodes are considered active for the next 10 seconds, it also agrees on some randomness. Based on these two inputs, a list of active nodes and a random number, every node can calculate which node is responsible for each object in the system. This solves the first problem: we guarantee that for one object, only one node is responsible at a time. We call this node the executor. The assignment changes dynamically, so in the next 10 seconds the executor for a given object may be a different node. This reduces the chance to control the behaviour of the object, but of course if you use only one node, this node can cheat. For example, you could put your own wallet on the network, become responsible for that wallet, and add extra money to it without anyone being able to check. So that’s not good.
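The executor assignment described above can be sketched as a deterministic function of the active node list and the shared randomness. This is a minimal illustration and not Insolar's actual algorithm: the use of SHA-256, the modulo mapping, and the names `select_executor`, `entropy` and `object_id` are all assumptions made for the example.

```python
import hashlib


def select_executor(active_nodes, entropy: bytes, object_id: str) -> str:
    """Pick the single node responsible for object_id during one pulse.

    Every node that knows the same active list and the same entropy
    computes the same answer, so no extra voting round is needed.
    """
    ordered = sorted(set(active_nodes))  # identical ordering on every node
    if not ordered:
        raise ValueError("at least one active node is required")
    digest = hashlib.sha256(entropy + b"|" + object_id.encode("utf-8")).digest()
    return ordered[int.from_bytes(digest, "big") % len(ordered)]


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c", "node-d"]
    print(select_executor(nodes, b"pulse-1-entropy", "wallet-42"))
    print(select_executor(nodes, b"pulse-2-entropy", "wallet-42"))
```

Because the entropy changes with every pulse, the mapping from object to executor is reshuffled each period, although a given object can still land on the same node twice in a row by chance.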

The next step is to say that there is a result produced by the executor node — for instance, the result is two plus two equals four. But then, when the next round of the network consensus comes, there will be a new source of randomness. Then, based on the new randomness that was not known to the executors, the network selects the validator or a few validators for the results of the executors.
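Validator selection from the next round's randomness can be sketched in the same spirit. The ranking scheme below (hash each candidate together with the new entropy and take the lowest scores) is an assumption chosen for illustration, as is the decision to exclude the executor from its own validation; the source does not specify either detail, and `select_validators` is a hypothetical name.

```python
import hashlib


def select_validators(active_nodes, next_entropy: bytes, object_id: str,
                      executor: str, count: int):
    """Choose up to `count` validators using entropy from the NEXT pulse.

    The executor could not know next_entropy while it was computing,
    so it could not predict who would check its result.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    # Assumption: the executor does not validate its own result.
    candidates = sorted(set(active_nodes) - {executor})

    def score(node: str) -> bytes:
        payload = next_entropy + b"|" + object_id.encode("utf-8") + b"|" + node.encode("utf-8")
        return hashlib.sha256(payload).digest()

    return sorted(candidates, key=score)[:count]


if __name__ == "__main__":
    nodes = ["n1", "n2", "n3", "n4", "n5"]
    print(select_validators(nodes, b"next-pulse", "wallet-42", "n3", 3))
```

If fewer candidates exist than requested, the sketch simply returns all of them; a real system would need a policy for that case.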

Here, we of course will spend more effort than just with a single node, because one node did the calculation, but then a few nodes have to validate the calculation. But this design enables a unique feature which is not available on any other platform right now — of course, soon I believe others will start doing something like this.

The feature is that we can assign the number of validators, so literally choose the size of the consensus based on the value at risk. So if you have a contract that, for instance, bought a candy, and it’s one box in the 10 seconds you’ve spent, then you don’t need a thousand validators because the overall cost of this transaction will be excessive. If you bought a candy, maybe just one validator is enough, but if you bought a car or a property in, let’s say, Las Vegas or in San Francisco, then you now have like a thousand validators. The transaction can absorb that cost, because for you it will be less than about 0.01% of the purchase price. That’s unique because, based on this architecture, we can enable the contract to publish a metric which will choose the number of validators.
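As a rough sketch of how a contract-published metric might map value at risk to a validator count: the rule below (one validator per 1,000 units of value, at least one, capped at 1,000) is entirely hypothetical and only mirrors the candy-versus-property example; the interview does not give the actual formula.

```python
def validator_count(value_at_risk: float, unit: float = 1000.0,
                    max_validators: int = 1000) -> int:
    """Hypothetical rule: one validator per `unit` of value at risk.

    Always at least one validator, never more than max_validators.
    """
    if value_at_risk < 0:
        raise ValueError("value_at_risk must not be negative")
    return max(1, min(max_validators, int(value_at_risk // unit)))


if __name__ == "__main__":
    for label, value in (("candy", 1), ("car", 30_000), ("property", 2_000_000)):
        print(label, validator_count(value))
```

With these assumed thresholds a candy gets one validator, a 30,000 car gets 30, and a 2,000,000 property hits the cap of 1,000.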

Another thing that is different here is that in most blockchain systems on the market now, the validators are actually equivalent to the executor — they all do the same work and then they choose the option which is provided by the majority. So if you have ten validators and eight of them say that two plus two equals four, then the answer is four. But in terms of trying to build something for business, this is not the correct course. In business, when you write an agreement, it’s not the majority that decides — the majority is actually auditing, they are not deciding. The idea is that only the executor can provide an answer, and the validators can either agree or disagree, but they cannot provide an answer.

Why? Because if you do things right, then two plus two must be four. There is a chance that you may get a different answer if you have something like, let’s say, a hardware failure. But that doesn’t mean the answer should be decided by majority: validators either agree or disagree, and maybe one validator may be missing or give a wrong answer. For enterprises, a disagreement can even be an indicator that something went wrong and the contract should not proceed, but should be escalated. In this case, we see not only dynamic consensus per transaction, but also two major types of validation. The first one we call all-or-nothing. This means that all validators must agree. Maybe one can give no answer, perhaps due to failure, but if at least one disagreement is provided by validators, then it’s a subject for extensive investigation and escalation, maybe even to the courts. If something is calculated wrongly, then there is a problem with the contract and it cannot proceed normally.

For public use of the Insolar platform, there can be a majority rule, so ⅔ of validators have to agree with the executor. If they disagree on the answer, then the executor is punished and the transaction has to be rolled back. After that, it depends on the logic of the contract how to handle this: either retry, or return it to the user and say ‘OK, we were not able to process this because something happened in the system.’
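The two validation modes described above can be sketched as simple verdict functions. This is an illustration under stated assumptions, not Insolar's implementation: the names `Vote`, `all_or_nothing` and `two_thirds_majority` are hypothetical, the tolerance of one missing answer is taken from the interview, and treating "⅔" as "at least two thirds" and escalating when too many answers are missing are both assumptions.

```python
from enum import Enum


class Vote(Enum):
    AGREE = "agree"
    DISAGREE = "disagree"
    NO_ANSWER = "no_answer"  # e.g. the validator failed or timed out


def all_or_nothing(votes, max_missing: int = 1) -> str:
    """Enterprise mode: any disagreement escalates instead of being outvoted."""
    votes = list(votes)
    if any(v is Vote.DISAGREE for v in votes):
        return "escalate"
    agreed = sum(v is Vote.AGREE for v in votes)
    missing = sum(v is Vote.NO_ANSWER for v in votes)
    # Assumption: too many silent validators is also treated as a problem.
    if agreed == 0 or missing > max_missing:
        return "escalate"
    return "accept"


def two_thirds_majority(votes) -> str:
    """Public mode: accept if at least 2/3 of validators agree, else roll back."""
    votes = list(votes)
    if not votes:
        raise ValueError("at least one vote is required")
    agreed = sum(v is Vote.AGREE for v in votes)
    return "accept" if agreed * 3 >= len(votes) * 2 else "rollback"
```

Note that in both functions validators only confirm or reject the executor's result; they never supply an alternative answer, which matches the auditing role described in the interview.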

The challenge is how to achieve scalability in a way that still keeps this distribution and security. We do this in a similar way to mobile networks: initially all mobile phones used individual frequencies, and now they use time sharing and so on. We do a form of time sharing, so our first network consensus is applicable for all nodes, but we don’t decide on the calculation; we only decide on who has the right to participate for the next 10 seconds. Then we do all the individual calculations, which saves on the effort spent on calculation. On the next round, when a new randomness (entropy) appears, we do the validation, but we can do this validation in proportion to the value or importance of the transactions that the system handled.

How do you achieve the randomness?

If we talk about enterprises, then the randomness is probably not a big issue. You just agree, and we have a protocol, which we call the Pulse protocol, that allows you to generate randomness without anyone being able to manipulate it during generation. If it’s on a public network, then we can connect our system to Ethereum or whatever.

We can use the same protocol as we use for enterprise. The only limitation on performance is that randomness generation is limited by the number of nodes participating, because again we use pure BFT for randomness generation. So, if you have a network of ten thousand nodes, then the randomness can only be generated by something like a hundred nodes out of the ten thousand.

What are the main challenges about working with businesses?

The challenge is figuring out the most efficient areas to introduce new technologies. The problem is that if the business is running, there are systems in place already, and there is a cost relating to these systems.

If you introduce something new, it’s not like you can wipe out the past and create something different in its place in one day — it’s a very slow process, taking a year or more (a year is very optimistic). It’s better to find a spot where the business does not have expensive systems in place, or for a task that is relatively new and does not have an existing IT infrastructure.

Why did you choose Java and Golang as the languages for developers?

Over 30% of the resources of bigger enterprises are Java. Golang, on the other hand, has a memory-managed model, which is relatively simple for developers, and you also have very stable response times.

If you write in Java, then when you write the core product you have to use lots of tricks to ensure that the response time is more or less stable. I’ve been building real-time and low-latency systems in Java and I know that it’s tricky. Golang is much simpler, and we decided to use it as a starting point. There is a joke about having 16 different systems doing the same thing in different ways, so we need to build a 17th to replace all of them. Our idea was to introduce something new. The most-used system now for enterprise experimentation is Hyperledger Fabric. A year or two ago it was Ethereum, but now it’s Hyperledger, and the core of Hyperledger is written in Golang. We looked at the market and decided that if we want to be more successful, we need to think about compatibility now. One of our goals is to be compatible with Hyperledger chaincode. It’s not that we can run Hyperledger-compiled code on our platform, but we expect to be able to take chaincode and run it on our platform with minimal adaptation.

To find out more about Insolar, you can email business@insolar.io.
