Recently I had discussions with several people involved in the blockchain world. To my surprise, none of them was sure what a public blockchain is. They were also confused about terms like PoS/dPoS, finality and many other things. These people are by no means new to the blockchain space; in fact they are traders and crypto fans, and some are also smart contract developers. I thought it was clear what a public blockchain is, because we talk about them so often. After some googling it became obvious to me that the term is interpreted differently by different people and that more than one definition exists. So I have decided to give my own definition and present the three categories of blockchain. Besides these, I am going to clarify many other things regarding blockchain architecture.

Before continuing, I must say that I don’t have the perfect definition at hand. Most important of all, I am not against any cryptocurrency (with some exceptions which I am not going to share anyway). They all (try to) fit a purpose. Please use this post for constructive critique, as a reference point for learning, or at least as a guide to the architecture of blockchains. Don’t use this info to decide whether to buy, sell or hold (hodl?) coin X. The reason is that no one can be sure in which direction cryptoassets are going to move. But I am sure you already know this.

OK, let’s start:

The three categories of blockchain are: public, consortium and private. But there are many different characteristics that determine the outcome in terms of security, cost (node running cost and block creation cost), degree of decentralization, transaction speed, block creation time and others.

The most important of these are:

1) Which consensus algorithm are they based on (PoW, PoS, dPoS, PoA)?

2) What is the maximum number of nodes they can have?

3) Is there finality on block creation?

4) Do they favor Availability over Consistency or the other way around?

5) Can they run smart contracts? Do they have an isolated VM (Virtual Machine)? Is the VM Turing complete or not?

If you think that I listed the above sorted by importance, you are mistaken: the order is random. I will start with (4), as it is the most fundamental for understanding blockchain (BC) architectures. From the discussions I had, I also found out that it is the least known one. That surprised me a bit, as it is the first thing to consider for someone who wants to design a blockchain.

Back to the basics: the CAP theorem (Consistency, Availability, Partition Tolerance), a theorem from Eric Brewer. This theorem states that in a distributed system we, unfortunately, can’t have all three at the same time. But let me be more clear about what these terms mean for a blockchain (more precisely, for a network or distributed system):

a) Partition Tolerance: The blockchain continues to exist and operate even if connections between nodes are broken.

b) Consistency: If you ask many nodes the same question at the same time, you will get the same answer (for example: block height, account balance etc.).

c) Availability: That’s even simpler. If you ask, you always get an answer, even a wrong one.

Because we can’t have all three (laws of nature can’t change, and neither can proven theorems) we must choose between (b) and (c) and give up the other. If someone asks why we don’t keep both and drop (a), we must remind them that the blockchain should continue to operate under any conditions, including net splits or other communication disruptions between nodes. Because (a) can’t be dropped, we can drop only (b) or (c).

To make this clearer, imagine people discussing on a chat channel. If they are all connected without problems, then both (b) and (c) hold, because they discuss, vote and agree on the next step. So whenever you ask you will get an answer (Availability), which will also be the correct one according to their agreement or protocol (Consistency).

But what happens if there is a net split, or a network so bad that some (or many) people can’t connect to the chat? In this case, the result depends on the agreement they made beforehand. If their protocol says that they can take a decision without waiting for the latecomers, then you always get an answer from them, because they have already decided (Availability). But it is not necessarily the “correct” one, because you may have asked a group that is a minority and has decided without the others, leading to different results. Asking the same question to the other group (which can’t communicate with the first) at the same time, you will probably get a different answer. There is no Consistency.

But let’s say that they want everyone to participate, so their protocol demands that all members, or at least the vast majority, are present and have a say. In that case they wait until (almost) all the nodes have contributed to the decision. Then whichever node we ask (if it is honest, that is) gives the same answer, so the system has Consistency. But if we ask while they are still waiting for the latecomers, we get no answer, as there is no decision yet. So there is no Availability.
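The chat analogy can be sketched in a few lines of Python. This is only a toy model, not a real consensus protocol: the `Group` class, the two policy names, the channel size and the "strict majority" rule are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

TOTAL_MEMBERS = 10  # assumed size of the whole chat channel


@dataclass
class Group:
    """One side of a net split: the members that can still talk to each other."""
    members: int          # how many members are reachable inside this group
    local_decision: str   # what this group would decide on its own
    policy: str           # "availability" or "consistency"

    def ask(self) -> Optional[str]:
        if self.policy == "availability":
            # Answer right away, even if most members are unreachable.
            return self.local_decision
        # Consistency: answer only if a strict majority took part in the decision.
        if 2 * self.members > TOTAL_MEMBERS:
            return self.local_decision
        return None  # still waiting for the latecomers, so no answer


def ask_both_sides(policy: str) -> Tuple[Optional[str], Optional[str]]:
    # A split into a group of 4 and a group of 6 that cannot communicate.
    minority = Group(members=4, local_decision="block A", policy=policy)
    majority = Group(members=6, local_decision="block B", policy=policy)
    return minority.ask(), majority.ask()


print(ask_both_sides("availability"))  # both sides answer, but they disagree
print(ask_both_sides("consistency"))   # the minority side gives no answer
```

With the availability policy you always get an answer but the two sides contradict each other; with the consistency policy every answer you get is the same one, but the minority side stays silent.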

This is a good point to say that whatever decision the blockchain architect makes on Availability versus Consistency restricts the remaining options. So you can’t choose completely independently between consensus algorithm, number of nodes, Availability or Consistency model etc. For example, you can’t have finality with PoW and Availability.

Enough with (4), let’s move to (3). The finality property means that whenever a block is created (a decision is made) it is final. There is no need to worry about reversed blocks or rewritten history. Finality is also very good for cross-chain transactions and second layers. Maybe now you understand why many supporters of Raiden and Plasma want finality for Ethereum. Even if not at every block, finality is a necessity for them.

So finality looks very good, right? Well, not so fast. Finality is much more dangerous and also requires a lower share of voting power for an attacker to launch a successful attack. To get finality you must also choose a specific architecture. So we have:

a) More dangerous. If an attacker launches a successful attack at some point, he can COMPLETELY replace the chain data, not just reverse a few blocks, all the way back to the genesis block or at least to the latest checkpoint hardcoded in the protocol. After that, the damage is final and cannot be undone.

b) To achieve finality you need the PBFT (Practical Byzantine Fault Tolerance) algorithm or some variant of it. The attacker needs a little more than 1/3 of the total voting power (whatever that is). In non-PBFT-like protocols the attacker needs at least 1/2. I am not claiming that getting 1/3 is easy, of course. But it is easier in theory than 1/2.

c) Because of PBFT (or any xBFT variation) each node must be connected to all others. That leads to n*(n-1)/2 connections, meaning O(n²) connections between the nodes. In practice that limits the total number of nodes (to a two-digit number, i.e. fewer than 100). In other consensus algorithms, which don’t demand connectivity with all other nodes, there is actually no limit on the number of nodes the chain can have. The nodes can freely connect or disconnect without any impact on the “health” of the blockchain. But more details on these later.
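To put numbers on points (b) and (c), here is a small sketch. It assumes the simplest possible setting, where every node has exactly one unit of voting power, and it uses the classic BFT bound n ≥ 3f + 1 for f faulty nodes. The function names are mine, chosen only for this example.

```python
def max_byzantine_faults(n: int) -> int:
    """Largest f with n >= 3f + 1, the classic bound for BFT protocols."""
    return (n - 1) // 3


def bft_attack_threshold(n: int) -> int:
    """Smallest number of colluding voters that exceeds what BFT tolerates."""
    return max_byzantine_faults(n) + 1


def majority_attack_threshold(n: int) -> int:
    """Smallest number of voters that is strictly more than half of n."""
    return n // 2 + 1


def full_mesh_links(n: int) -> int:
    """Connections needed when every node is connected to every other node."""
    return n * (n - 1) // 2


# Columns: nodes, BFT attack threshold, majority attack threshold, connections
for n in (4, 40, 100, 1000):
    print(n, bft_attack_threshold(n), majority_attack_threshold(n), full_mesh_links(n))
```

Going from 100 to 1000 nodes multiplies the number of connections by roughly a hundred, which is the practical reason behind the small node counts in xBFT-style chains.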

Before I move to the next topic, I have to say something more about finality. The block creation time is really meaningful only in systems that have finality. Of course, the shorter the better (as long as it doesn’t have any negative impact, e.g. orphaned blocks). BUT for systems that don’t have finality the block time is not important. This is a fact that many people do not realize. Remember, in a system without finality, blocks can be reversed. So to be sure about a transaction (or, more correctly, a state) one must wait long enough to be sure (in practice, not in theory) that no one can reverse it (or that, if someone can, the cost is astronomical).

Example: let’s say we have two blockchains that do not have the finality property. The first has a block time of five minutes, the other of thirty seconds. So the second is ten times faster, isn’t it? Well, actually no. Let’s say both of these chains use the same PoW algorithm. The really faster one is the one with the higher total hashing power: for example, the five-minute blockchain can be much safer if it has much more hashing power than the second one. Accepting a transaction after the first block on the first chain can then be safer than accepting it on the second chain even after 30 blocks have passed (i.e. 15 minutes). So the time to wait until you accept a transaction depends on hashing power (or other factors in PoS) and NOT on block creation time.
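A rough way to see this is to compare the amount of work buried on top of a transaction instead of the number of blocks. The sketch below assumes a constant network hash rate, the same PoW algorithm on both chains, and that the cost of rewriting history is approximated by the total work to redo. The hash rates are made-up numbers in arbitrary units, and the function names are hypothetical.

```python
import math


def work_behind(hash_rate: float, block_time_s: float, confirmations: int) -> float:
    """Expected work (hash rate x time) spent on the blocks covering a transaction."""
    return hash_rate * block_time_s * confirmations


def confirmations_needed(target_work: float, hash_rate: float, block_time_s: float) -> int:
    """Blocks to wait until at least target_work has been piled on a transaction."""
    return math.ceil(target_work / (hash_rate * block_time_s))


# Assumed numbers: the slow chain has ten times the hash rate of the fast one.
slow_chain = work_behind(hash_rate=10, block_time_s=300, confirmations=1)
fast_chain = work_behind(hash_rate=1, block_time_s=30, confirmations=30)

print(slow_chain, fast_chain)
print(confirmations_needed(slow_chain, hash_rate=1, block_time_s=30))
```

Under these assumptions one block on the slow chain carries more work than thirty blocks on the fast one, and the fast chain would need far more confirmations to match it. Change the hash rates and the conclusion changes with them, which is exactly the point: the waiting time follows hashing power, not block time.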

The above does not apply to blockchains that have finality. In these systems the block creation time is important, as the created blocks are final.

It’s about time to talk about the number of nodes in the blockchain. The most obvious thing to notice is this: a blockchain can operate with 4 nodes, 40 nodes, 1000 nodes etc. But which one can be considered decentralized? Or a really public blockchain? Or, if we accept that decentralization is not black and white but has degrees, which one can be considered “more” decentralized? The answer is obvious: the one with the most nodes. Consequently, the architecture should allow a very high number of nodes (ideally unlimited). We must be clear here about what counts as a node, for public chains at least. To be considered a node, it must have permissionless write access, that is, a chance to participate in forming the next block, i.e. do mining (PoW), minting (PoS) or voting (dPoS).

And finally, we arrive at the “heart”, or better the “brain”, of the blockchain: the consensus algorithm. Very briefly:

- PoW “burns” electrical power, which raises the cost of block formation and is also unfriendly to the environment. There is an external cost for the chain. It has a big plus though: it creates scarcity. This is very important when we talk about currency. In non-PoW protocols someone can launch a chain for “free”, and anyone can copy it without cost. So where does the value of the currency come from? A good analogy is gold versus fiat money. Yes, you can extract more gold from physical mines, but you must pay a heavy cost. On the other hand, printing fiat money has a low (near zero) cost.

- When I refer to PoS I mean “pure” PoS, that is, no representatives. Anyone can try to “win” the next block directly. Basic differences compared to PoW include: no external cost (which is good but also bad) and an unknown level of security. In PoW the level of security is known: it depends on burned resources (electricity, hardware). If an attacker wants to reverse the chain to a depth of N blocks, he must burn the same electricity or a little more (if he is really lucky, a little less). But for PoS this is a question mark. By unknown, I don’t necessarily mean lower. We just don’t know.

- Delegated PoS (dPoS) leads to a completely different architecture with completely different security problems. From a security perspective the highest danger comes from DDoS attacks on the nodes, as this is the easiest form of attack. Even by bringing down only some of the nodes, the attacker can stall the chain. dPoS faces special security problems that don’t exist in other protocols and, vice versa, it is free of some vulnerabilities that exist in PoS or PoW.
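To make the “pure” PoS bullet a bit more concrete, here is a toy lottery in which every account can win the next block directly, with a chance proportional to its stake. The proportional rule, the account names and the stake amounts are assumptions for illustration only; real PoS protocols differ in many details.

```python
import random
from typing import Dict


def pick_block_producer(stakes: Dict[str, int], rng: random.Random) -> str:
    """Pick the winner of the next block, weighted by stake (no representatives)."""
    accounts = list(stakes)
    weights = [stakes[account] for account in accounts]
    return rng.choices(accounts, weights=weights, k=1)[0]


def win_counts(stakes: Dict[str, int], rounds: int, seed: int = 0) -> Dict[str, int]:
    """Run the lottery for a number of rounds and count the blocks won per account."""
    rng = random.Random(seed)  # seeded only to make the run repeatable
    counts = {account: 0 for account in stakes}
    for _ in range(rounds):
        counts[pick_block_producer(stakes, rng)] += 1
    return counts


stakes = {"alice": 60, "bob": 30, "carol": 10}  # made-up stake amounts
print(win_counts(stakes, rounds=10_000))
```

Nothing external is burned to win a block here, which is the “no external cost” property from the bullet above, for better and for worse.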

Of course, I can’t give a full analysis of each consensus type here.

Finally, about (5): if the platform can run smart contracts, more complexity is involved. A virtual machine is needed to provide code execution but also complete isolation from the chain code, protecting it from attacks via smart contracts or from bugs. The VM should accept many languages, since compilers can compile them to VM bytecode.
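As a very rough illustration of what isolation means, here is a toy stack machine. A contract can only touch its own stack, and a step budget stops programs that would otherwise loop forever. The opcodes, the `run` function and the “gas” style step limit are assumptions for this sketch and do not describe the VM of any particular chain.

```python
from typing import List, Tuple

Instruction = Tuple[str, int]  # (opcode, argument); the argument is ignored by ADD


class OutOfGas(Exception):
    """Raised when a program uses up its step budget."""


def run(program: List[Instruction], gas_limit: int) -> List[int]:
    """Execute a program on a private stack, charging one unit of gas per step."""
    stack: List[int] = []
    pc = 0            # program counter
    gas = gas_limit
    while pc < len(program):
        if gas == 0:
            raise OutOfGas(f"stopped at instruction {pc}")
        gas -= 1
        op, arg = program[pc]
        pc += 1
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "JUMP":
            pc = arg  # jumps make loops possible, hence the gas limit
        else:
            raise ValueError(f"unknown opcode {op}")
    return stack


print(run([("PUSH", 2), ("PUSH", 3), ("ADD", 0)], gas_limit=10))
```

A program such as `[("JUMP", 0)]` would spin forever on its own; with a gas limit it ends in `OutOfGas` instead of stalling the node that runs it.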

PoS and dPoS can provide an advantage over PoW for chains that have smart contract capability, not a technical one but one regarding costs: the fees can be much lower because of the very low cost of block creation.

The most common architectures are:

PoW+Availability, no finality, a plethora of nodes; high transaction cost, truly decentralized, slow, scarcity.

PoS+Availability, no finality, a plethora of nodes; low transaction cost, truly decentralized, slow.

PoS+Consistency, finality, many nodes; low transaction cost, decentralized, moderate speed.

dPoS+Consistency, finality, few nodes; low transaction cost, low block creation time, high speed, no real decentralization.

To be honest, dPoS leads to consortium chains and not public ones; consortium chains are chains where only a few nodes have write access for block creation. These blockchains are ideal for establishing trust between a small number of owners (for example, big banks that want to transfer digital assets).

Finally, there are private chains, where a single entity can control the chain, as it controls the registry authority. They usually use PoA or a PBFT variant. They can even use an FT (Fault Tolerant) protocol that isn’t Byzantine, meaning it considers all participants honest. They are like distributed databases, but they involve cryptography and are more transparent. Of course they are slower than databases but faster than the other blockchain types. To be honest, their purpose isn’t clear to me.

Public chains can build trust between any parties; consortium chains build trust only between the owners of the nodes that have write access/special permission to form the blocks (alter the state); private chains can’t build any trust (my personal opinion).

I promised to give my own definition of public blockchains. After looking into this, I decided that the easiest and clearest definition comes from their basic property. So my definition is:

A blockchain is public if it is unstoppable.

Simple, elegant and short, right? The above means that governments, attackers, the core developers, even the chain creator can’t alter/stop/destroy the chain. If anyone can, by any method, then this blockchain can’t be considered public.

We can go on and on, and for each of the above topics an encyclopedia can be written. But I must stop somewhere.

Let’s have a constructive discussion :)