First, we will talk about multi-signatures: what they are, how they work in Bitcoin, and how they work in Ethereum. Then we’ll talk about threshold signatures and their advantages and disadvantages compared to multi-signatures. And finally, things will go a bit crazy: we’ll do some cryptography and discuss the threshold ECDSA protocol, similar to the way it’s implemented in Keep.

Before we go any further, allow me to say a few words about Keep, because it will help you understand why we started working on threshold signatures. If I had to describe Keep in just one sentence, I would say that Keep is a privacy layer for public blockchains.

Over the last several years, public blockchains have brought us incredible transparency and auditability with censorship resistant and immutable records. However, any smart contract that is published to the blockchain can be easily accessed by competing interests. As a result, when companies consider building applications on the blockchain, privacy is one of the primary issues that arise. Keep is a bridge between public blockchain and privacy which allows contracts to harness the full power of blockchain technology without compromising on security. We can also say that Keep is a kind of a private computer, able to store and process data hidden even from itself.

Let’s start with multi-signatures. “Multi-signature” refers to requiring more than one key to sign a transaction, and it is generally used to divide up responsibility.

A good example of a 1-of-2 multi-signature is a husband and wife’s petty cash joint account, where the signature of either spouse is enough to execute a transaction.

An example of a 2-of-2 multi-signature is a husband and wife’s savings account, where the agreement of both is required to execute a transaction.

An example of a 2-of-3 multi-signature is a parent’s savings account for a child. The kid can spend money with the approval of at least one parent, and the money can be taken away from the kid completely only if both parents agree.

In Bitcoin, multi-signatures work by creating a multi-signature address and, when we create it, we specify what keys are associated with that address and how many of them are needed to sign a transaction. Then, after some time, we create a transaction, people having the keys sign it, and that’s it! In this example, we have created a multi-signature address with three associated keys and at least two of them are needed to provide a signature.
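As a rough illustration of what "specifying the keys and the threshold" means, here is a minimal Python sketch that assembles the raw bytes of a 2-of-3 script of the form `OP_2 <pk1> <pk2> <pk3> OP_3 OP_CHECKMULTISIG`. The helper name `multisig_script` and the placeholder keys are assumptions made for this example; the keys are dummy byte strings, not real curve points, and the sketch does not derive an address.

```python
# Bitcoin Script opcodes: OP_1..OP_16 are encoded as 0x51..0x60.
OP_CHECKMULTISIG = 0xAE

def small_int_opcode(value):
    """Return the opcode that pushes a small integer (1..16)."""
    assert 1 <= value <= 16
    return 0x50 + value

def multisig_script(m, pubkeys):
    """Build `OP_m <pk1> ... <pkn> OP_n OP_CHECKMULTISIG` as raw bytes."""
    assert 1 <= m <= len(pubkeys)
    script = bytes([small_int_opcode(m)])
    for pk in pubkeys:
        assert len(pk) == 33, "this sketch expects compressed public keys"
        script += bytes([len(pk)]) + pk  # one-byte length prefix, then the key
    script += bytes([small_int_opcode(len(pubkeys)), OP_CHECKMULTISIG])
    return script

# Placeholder keys (hypothetical): 33 bytes each, NOT valid curve points.
placeholder_keys = [bytes([0x02]) + bytes([i]) * 32 for i in (1, 2, 3)]
redeem_script = multisig_script(2, placeholder_keys)
print(redeem_script.hex())
```

The threshold (2) and the full list of keys are both visible in the script, which is why the policy is fixed once the address derived from it is in use.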

This is really cool, but there are some limitations.

First of all, the information about which keys were used to redeem the transaction is public. For some businesses this may be a benefit, but for others that need greater anonymity it can be a disadvantage. Also, there are limits on how many keys can be used (a single OP_CHECKMULTISIG accepts at most 20 public keys). And the security policy is fixed once the multi-signature address is created. So, it’s not possible to change your mind after some time and say, “Okay, now I want to have a 2-out-of-3 policy instead of 5-out-of-6.” That requires creating a new multi-signature address with a new security policy and transferring all the funds there.

The way it works in Ethereum is a bit different, because there isn’t anything like a native multi-signature. There is always one address that signs a transaction and pays for the gas. However, there are a bunch of multi-signature wallets, which are basically smart contracts that store tokens and require every issued transaction to meet some requirements, including a specific signing configuration.

Some of the most well-known wallets are the BitGo wallet, the Gnosis wallet, and the Parity wallet. Because they are smart contracts, they obviously allow for greater flexibility than Bitcoin transactions, but they are also harder to implement. And the cost of a bug, if it occurs, is really high. I think everyone here has heard about the Parity hacks.

So, is there a better way? I think the answer is “yes.” And that is threshold signatures.

Multi-signature mechanisms rely on multiple parties having separate, unique keys that they sign a transaction with, and signers provide their signatures one after another. In this picture (above), we have a group of signers on the left side. Each signer has their own secret key that they sign a transaction with. And when the verifier, Bob (on the right side), wants to verify a transaction, he needs to check every single signature separately.

For threshold signatures, the situation is a bit different because we only have one public key, one secret key, and only one signature. In the picture above, we have a group of signers on the left side. Each signer has the same public key and a unique share of the secret key. Signers cooperate using a special communication protocol that does not expose the secret key in order to produce a signature and there’s only one signature that is produced. Now the verifier has to check only one signature in order to say if the transaction is correct or not.

So, what can threshold signatures give us? First, we avoid contract-based multi-signature. With a threshold signature we can have an Ethereum transaction signed by multiple parties without the need to lock our tokens in a smart-contract wallet. I think it’s a huge win for Ethereum. Secondly, threshold signatures are indistinguishable on-chain.

Here’s an example of a threshold transaction that we executed on testnet Bitcoin network with 20 signers. You will notice no difference between this transaction and a standard Bitcoin transaction where there is only one signer. This offers greater anonymity because the information about which shares were used to sign a transaction (i.e. who signed a transaction) does not leak publicly.

Also, threshold signatures are cheaper than multi-sig transactions in terms of validation. Multi-sig requires creating a new signature for each signer and the cost increases with the number of signers. For threshold signatures, there’s only one signature so the cost is fixed in terms of validation.

Threshold signatures also offer better flexibility. First of all, there is no bound on the number of signers. And secondly, it is possible to modify the security policy after some time. It is a single point of failure algorithm, but it is still possible to change your mind and say, “Okay, I want 3-out-of-6 instead of 7-out-of-10.”

But there’s always a catch. The first catch is that, although threshold signatures are not something completely new, most people are not yet familiar with them. This means technology adoption will take some time. Also, the tooling does not exist. For multi-signatures we have a bunch of tools like hardware wallets with nice UI support. We can’t find those for threshold signatures, at least not yet. The other problem is that signers need to stay online. For multi-signatures, you can provide your signature and then go offline for your vacation; in the meantime, someone else comes, provides one more signature, and the transaction gets accepted. For threshold signatures, signers (at least some minimum number of them) need to cooperate in order to produce a signature.

I hope it all sounds exciting to you! Now it’s time to discuss how it actually works under the hood. But first, we need to recall some models and definitions. And those are “additively homomorphic encryption,” “threshold encryption” and “commitment scheme.”

First, additively homomorphic encryption. It has a nice feature: it allows us to operate on ciphertexts. One example of an additively homomorphic encryption scheme is Paillier, for which there exists an efficiently computable “add” operation (the plus sign with the e subscript on the slide) that allows us to add two ciphertexts together.

So, we can have two values, a and b. We can first encrypt those values and then, using this special operation, add those two ciphertexts together. Or we can first, having those two values, add them together and then encrypt the result. When we decrypt we will get exactly the same value in both cases.
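To make the property concrete, here is a minimal toy sketch of Paillier in Python. The primes are deliberately tiny, and the function names (`encrypt`, `decrypt`, `add`) are chosen for this example only; it is a plain single-key Paillier, not the threshold variant used in the protocol, and it is nowhere near secure.

```python
import math
import secrets

# Toy parameters: two well-known primes, far too small for real use.
P, Q = 65537, 2147483647
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)  # this shortcut is valid because the generator is N + 1

def encrypt(m):
    """Enc(m) = (1 + N)^m * r^N mod N^2 for a random r coprime to N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + m * N) % N2 * pow(r, N, N2) % N2

def decrypt(c):
    """Dec(c) = L(c^LAM mod N^2) * MU mod N, where L(u) = (u - 1) // N."""
    return (pow(c, LAM, N2) - 1) // N * MU % N

def add(c1, c2):
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    return c1 * c2 % N2

a, b = 20, 22
encrypt_then_add = decrypt(add(encrypt(a), encrypt(b)))
add_then_encrypt = decrypt(encrypt(a + b))
print(encrypt_then_add, add_then_encrypt)
```

Both paths decrypt to the same plaintext, as long as the sum stays below the modulus `N`.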

The next thing is (t, n) threshold encryption. The easiest way to explain it is to say that we have n players; each one has the same public key, but each player also has its own unique share of the secret key. So, on this slide we have secret_share 1, secret_share 2, secret_share 3, and so on up to secret_share n. If you have a message encrypted under the public key, the players must cooperate, using a special communication protocol that does not expose the secret key, in order to decrypt it.
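The idea of "unique shares of one secret" can be illustrated with Shamir secret sharing, which is one common building block for threshold schemes. This sketch is an assumption for illustration only: the talk does not say which sharing method is used, the names `split` and `reconstruct` are made up here, and a real threshold decryption protocol never reassembles the secret key in one place the way `reconstruct` does.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime, used here as a toy field modulus

def split(secret, t, n):
    """Split `secret` into n shares; any t + 1 of them reconstruct it."""
    # Random polynomial of degree t whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(t)]

    def poly(x):
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME

    return [(i, poly(i)) for i in range(1, n + 1)]

def reconstruct(shares):
    """Recover the constant term by Lagrange interpolation at x = 0."""
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

shares = split(123456789, t=2, n=5)
print(reconstruct(shares[:3]), reconstruct(shares[2:]))
```

Any three of the five shares give back the secret, while two shares alone are not enough to determine it.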

The last thing is a commitment scheme, which lets a party commit to a chosen value while keeping it hidden, and reveal it later. For example, say that Alice and Caroline are playing a coin flipping game. Alice chooses the side of the coin, evaluates a commitment to her choice, and sends the commitment to Caroline. Now Caroline throws the coin into the air and says what the result is. Next, Alice sends a special value called the decommitment key, which allows Caroline to validate the commitment and to see whether the value that Alice initially chose is really what she says it is now.

The decommitment key allows us to validate the commitment, and it also makes it possible to define unconditionally hiding schemes. So, no matter what Caroline does, and no matter how much computing power she has, she cannot guess the value Alice committed to without having the decommitment key. She cannot do it from the commitment alone.
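The commit and reveal flow of the coin flipping game can be sketched in a few lines of Python. Note the swapped-in technique: this example uses a simple hash-based commitment (SHA-256 over a random key and the value), which is only computationally hiding and is not the unconditionally hiding kind of scheme described above. The function names are assumptions for this example.

```python
import hashlib
import secrets

def commit(value: bytes):
    """Return (commitment, decommitment_key) for `value`."""
    key = secrets.token_bytes(32)  # random decommitment key
    commitment = hashlib.sha256(key + value).digest()
    return commitment, key

def verify(commitment: bytes, value: bytes, key: bytes) -> bool:
    """Check that `value` and `key` open the given commitment."""
    return hashlib.sha256(key + value).digest() == commitment

# Alice picks a side and sends only the commitment to Caroline.
commitment, decommitment_key = commit(b"heads")

# Caroline flips the coin and announces the result.
coin = secrets.choice([b"heads", b"tails"])

# Alice reveals her choice and the decommitment key; Caroline checks them.
alice_was_honest = verify(commitment, b"heads", decommitment_key)
alice_won = alice_was_honest and coin == b"heads"
print(alice_was_honest, alice_won)
```

If Alice tried to claim the other side after seeing the coin, `verify` would fail, because she cannot find a key that opens the same commitment to a different value.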

Now it’s time to discuss the protocol. Our implementation is based on the work of Rosario Gennaro, Steven Goldfeder, and Arvind Narayanan from City College of New York and Princeton University, who described it in Threshold-Optimal DSA/ECDSA Signatures and an Application to Bitcoin Wallet Security. We made some small modifications to the protocol, but we don’t have the time to discuss them now. There are two pieces of the protocol: the Key Generation protocol and the Signature protocol. We will go through Key Generation first.

That’s when we do cryptography. Assume we have n signers, each signer is initialized with the additively homomorphic threshold encryption scheme, and this happens in the setup phase:

For now, we will skip how it is done and will return to talk about it briefly at the end.

This may get confusing because we have two types of keys. We have threshold additively homomorphic encryption scheme keys which are initialized in the set-up phase and we also have t-ECDSA key that we use for signing. And this second key is what we are going to generate now. For the additively homomorphic threshold encryption scheme, we just assume that it’s done in the set-up phase.

In the first step, each player chooses a random integer x, which will be used as a secret key share of that player:

x must be smaller than q, where q is the cardinality of the elliptic curve, that is, the number of points the curve has. On all of the slides, q stands for this cardinality. Each player computes y as g to the power of x. This is an elliptic curve operation: basically, we multiply the curve’s generator point by x. The power notation is what you will often find for groups, and since an elliptic curve is a group, we can use it here as well.
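Since the slide is not reproduced here, the two notations can be written side by side. The subscript i for the player index and the capital G for the generator point in additive notation are assumptions of this note, not symbols from the slides.

```latex
x_i \in \mathbb{Z}_q, \qquad
y_i \;=\; g^{x_i} \quad \text{(multiplicative notation)}
\;=\; x_i \cdot G \quad \text{(additive notation, as usually written for elliptic curves)}
```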

Then, each player computes a commitment to this value, and in the second round, publishes this commitment to all of the players in the group:

Next, in the third step, each player reveals x in an encrypted form, it’s α on the slide:

Encryption is done with the additively homomorphic encryption scheme we initialized in the set-up phase. What’s more, each player reveals the public key share, the decommitment key, and a zero-knowledge proof stating that all those values together make sense. This allows the other players to validate the commitment, and also to check that all the values the player has just revealed are consistent with each other.

So, the zero-knowledge proof says that there exists a number x such that curve’s generator point multiplied by x gives point y, and y is public at this moment because it was just revealed, and that if we decrypt the value α that we have just published — it’s the encrypted secret key — we’ll get that number x.

Of course, it’s a zero-knowledge proof, so it is not possible to guess what x really is. What the proof does say is that x lies in the range (-q³, q³), and since q is the cardinality of the elliptic curve, this range is really huge.

In the fourth step, all signers use the add operation of the additively homomorphic threshold scheme to produce the final t-ECDSA key, so all of the encrypted shares of x can be added:

As a result, we get the secret key in an encrypted form. All revealed public shares of y can be added as well, and as a result, we have the public key. The addition operation here is just addition of elliptic curve points, so it’s something simple.
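The public half of this step can be checked with a short Python sketch on secp256k1 (the curve used by Bitcoin and Ethereum). It is a plaintext illustration under stated assumptions: three signers, textbook affine point arithmetic that is not constant time, and secret shares held in one process, which the real protocol never does because the shares are only ever combined in encrypted form.

```python
import secrets

# secp256k1 domain parameters.
P = 2**256 - 2**32 - 977
Q = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return x3, (slope * (x1 - x3) - y1) % P

def scalar_mult(k, point):
    """Double-and-add scalar multiplication (illustration only)."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result

# Each signer picks a secret share x_i and publishes y_i = x_i * G.
secret_shares = [secrets.randbelow(Q - 1) + 1 for _ in range(3)]
public_shares = [scalar_mult(x, G) for x in secret_shares]

# The group public key is the sum of the public shares.
public_key = None
for y in public_shares:
    public_key = point_add(public_key, y)

# It matches the key that the (never assembled) sum of secrets would produce.
print(public_key == scalar_mult(sum(secret_shares) % Q, G))
```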

The second piece is Signature — let’s go through the signature algorithm.

We have α, which is a t-ECDSA private key in encrypted form shared between all the signers, and we have y which is a t-ECDSA public key; it’s just a point on the elliptic curve.

In the first round each party draws a random integer ρ:

Encrypts this value with additively homomorphic threshold encryption scheme:

And multiplies the secret ECDSA key (α on the slide) by this random value:

This is possible because the addition operation of the additively homomorphic encryption scheme also gives us multiplication of a ciphertext by a known number, which is just repeated addition. Each signer publishes a commitment to those values and, in the second round, reveals them along with a zero-knowledge proof stating that they make sense:

We skip the construction of the zero-knowledge proof for this presentation.
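The multiplication step from round 1 can be illustrated with the same kind of toy Paillier as before: raising a ciphertext to a known number multiplies the hidden plaintext by that number. This is a sketch with assumed names (`scalar_mul`, `alpha`, `rho`) and tiny, insecure parameters; it uses plain single-key Paillier rather than the threshold scheme.

```python
import math
import secrets

# Toy Paillier parameters: well-known primes, far too small for real use.
P, Q = 65537, 2147483647
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)

def encrypt(m):
    """Enc(m) = (1 + N)^m * r^N mod N^2 for a random r coprime to N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + m * N) % N2 * pow(r, N, N2) % N2

def decrypt(c):
    """Dec(c) = L(c^LAM mod N^2) * MU mod N, where L(u) = (u - 1) // N."""
    return (pow(c, LAM, N2) - 1) // N * MU % N

def scalar_mul(c, k):
    """Multiply the hidden plaintext by a known integer k (repeated addition)."""
    return pow(c, k, N2)

x = 1234            # stands in for the secret key, only known in encrypted form
alpha = encrypt(x)  # the encrypted key
rho = 5678          # a signer's random value, known to that signer
v = scalar_mul(alpha, rho)  # encryption of rho * x, computed without knowing x
print(decrypt(v))
```

The signer who computes `v` never sees `x`; only the ciphertext `alpha` and the signer's own `rho` are needed.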

So, after the second round players join the revealed shares together, using the addition operation defined by homomorphic encryption scheme.

The same commit-reveal pattern is used in the 3rd and 4th round:

On the right side we have all the parameters that have been evaluated so far, and all players have the same values.

In the third round each party draws a random integer kᵢ:

And a random integer cᵢ:

As before, q stands for the cardinality of the elliptic curve, so the number of points on the elliptic curve.

Each party computes rᵢ as g to the power of kᵢ, so we basically multiply the curve’s generator point by kᵢ:

Each party computes the parameter w, which is k times ρ plus c times q:

Again, q is the cardinality of the elliptic curve, and we can compute w on encrypted values because we use additively homomorphic threshold encryption.

At the end, each party commits to all those parameters, and in round 4 the generated parameters are revealed, along with a zero-knowledge proof stating that they make sense together:

For now we skip the zero-knowledge proof.

Having all those parameters from all the group members, we can add them together, just like we did after round 2. We sum up all k shares, all c shares, and all w shares. What’s more, we evaluate the parameter r as a sum of all rᵢ shares, and we use a special hash function:

This is actually the function that we know from the standard ECDSA protocol: it takes the x coordinate of the point modulo the elliptic curve order.

And after round 4, all parameters on the right side are shared by all signers in the group. Now we need to do some discrete mathematics magic to produce a signature, using all those parameters we have evaluated so far. Since we operate on encrypted data, the signature will also be encrypted, but this is something we will deal with in the final round, round 6.

The very first thing we need to do is execute a threshold decryption mechanism, so that all the players decrypt the parameter w and assign this value to η:

Next, we compute one more parameter called Ψ, which is the multiplicative inverse of η modulo q, where q is again the cardinality of the elliptic curve:

Then, having m, which is the hash of the message we are signing (or a hash of the transaction), we start evaluating the signature with the following equation:

Ψ is the value we have just evaluated, and u, r and v are the parameters jointly evaluated by all the signers in previous rounds.

So, since u is an encrypted ρ, and v is an encrypted ρ multiplied by the secret ECDSA key, we can do the following transformation:

And if we replace Ψ with the value it represents, we will get the following equation:

And finally, if we eliminate ρ, we get this:

I guess most of you know what this equation is: it’s the equation for the standard ECDSA signature, where k is the cryptographically secure random integer, m is the message hash, x is our secret ECDSA key, and r is the x coordinate of the curve’s generator point multiplied by k, taken modulo q. So it’s an equation from the standard ECDSA protocol.
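Since the slides with the equations are not reproduced here, the chain of transformations described in the last few paragraphs can be reconstructed roughly as follows. The symbols E and D for encryption and decryption, the subscript e on the homomorphic operations, and σ for the encrypted signature are notational assumptions of this note; the relations themselves follow the text: u encrypts ρ, v encrypts ρ multiplied by the secret key x, η is the decrypted w, and Ψ is the inverse of η modulo q.

```latex
\begin{aligned}
u &= E(\rho), \qquad v = E(\rho x), \qquad
\eta = D(w) = k\rho + cq, \qquad
\Psi = \eta^{-1} \bmod q = (k\rho)^{-1} \bmod q \\[4pt]
\sigma &= \Psi \times_e \big[(m \times_e u) +_e (r \times_e v)\big] \\
D(\sigma) &\equiv \Psi\,(m\rho + r\rho x) \\
          &\equiv (k\rho)^{-1}\,\rho\,(m + xr) \\
          &\equiv k^{-1}\,(m + xr) \pmod{q}
\end{aligned}
```

The term cq vanishes modulo q, which is why it can be added to w to mask kρ without affecting the final signature.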

All those equations were done on ciphertexts, so at the end our signature is also encrypted:

We need to deal with it, which we do in the 6th round, where all the players execute a threshold decryption mechanism to learn the value of s. And the decrypted value s and parameter r evaluated in round 4 together make the signature:
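To sanity-check the algebra, the whole signing flow can be replayed on plaintext values in Python and the result verified as an ordinary ECDSA signature on secp256k1. This is only a sketch under strong assumptions: there is no encryption, no commitments and no zero-knowledge proofs, all shares live in one process, the point arithmetic is textbook and not constant time, and the variable names simply mirror the symbols used in the text.

```python
import hashlib
import secrets

# secp256k1 domain parameters.
P = 2**256 - 2**32 - 977
Q = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return x3, (slope * (x1 - x3) - y1) % P

def scalar_mult(k, point):
    """Double-and-add scalar multiplication (illustration only)."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result

def ecdsa_verify(pub, m, r, s):
    """Standard ECDSA verification of (r, s) over message hash m."""
    if not (0 < r < Q and 0 < s < Q):
        return False
    w = pow(s, -1, Q)
    pt = point_add(scalar_mult(m * w % Q, G), scalar_mult(r * w % Q, pub))
    return pt is not None and pt[0] % Q == r

def summed_shares(count=3):
    """Sum of `count` random shares, one per simulated signer."""
    return sum(secrets.randbelow(Q - 1) + 1 for _ in range(count))

x = summed_shares()            # secret key (encrypted as alpha in the protocol)
y = scalar_mult(x % Q, G)      # public key
rho = summed_shares()          # rounds 1 and 2
k, c = summed_shares(), summed_shares()  # rounds 3 and 4
r = scalar_mult(k % Q, G)[0] % Q         # x coordinate of R, modulo q
eta = k * rho + c * Q          # round 5: the decrypted w
psi = pow(eta, -1, Q)          # multiplicative inverse of eta modulo q
m = int.from_bytes(hashlib.sha256(b"example transaction").digest(), "big") % Q
s = psi * (m * rho + r * rho * x) % Q    # round 6: the decrypted signature
print(ecdsa_verify(y, m, r, s))
```

A verifier running `ecdsa_verify` has no way of telling that `s` was assembled from several shares, which matches the on-chain indistinguishability discussed earlier.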

I know that this is not easy to understand the first time you see it. My intention was not to give you a full, deep understanding of the protocol, but rather to go through all the steps so that you have some idea of how it works under the hood.

So, there are two considerations.

Actually, this relates to the question that was asked: a single point of failure in the signature protocol is unacceptable. With Key Generation it’s easy, because we can detect if one party misbehaves. We have the zero-knowledge proofs and the commitments. If one party misbehaves during the Key Generation protocol, we can just drop the result, replace the misbehaving party with someone else, and start the protocol again. With Signature, it’s not that easy, because we should be able to produce a signature even if one party misbehaves. As long as we have t + 1 honest players, where t is the threshold (the parameter that we set when we generate the additively homomorphic threshold encryption scheme), we can still produce a signature. So, we need some minimum number of honest players and then everything is fine.

Another important consideration is how those keys are generated. Generally, we have two possible approaches here. The first is a trusted dealer set-up, where one party generates the keys and delivers them to all other players in the group; that trusted dealer should then forget all those keys and erase them from memory. The second possibility is to execute some kind of dealerless protocol, in which all group members generate the keys together. There’s also a third approach that will probably be presented tomorrow by Steven. Two papers show possible algorithms for this set-up: Distributed Paillier Cryptosystem without Trusted Dealer, and A Generalisation, a Simplification and some Applications of Paillier’s Probabilistic Public-Key System.

So, does it really work? Yes, we’ve signed a number of Ethereum and Bitcoin transactions. But it’s important to understand that this solution is really generic so we can use any elliptic curve here.

So, if you’re a wallet company or provide institutional custodial solutions, or even if you want to protect your own personal wallet, then you can really benefit from t-ECDSA, especially on chains that do not support multi-signatures. So, for example, as a cryptocurrency exchange, you can have your Ethereum keys stored on multiple servers or even some of the shares stored offline, so this should reduce the possibility of an attack.

We hope you enjoyed this cryptography journey!

If you would like to connect with Piotr Dyraga directly, you can reach him on Twitter, Github, Keybase, and LinkedIn.
