📖 Introduction

Without a doubt, we live in a data-driven society in which data has become a more valuable resource than gold or oil. Seriously, consider the amount of personal data that we share online every single minute. Location, feelings, preferences, passwords, messages… and the list keeps growing.

Fortunately for us, modern symmetric and asymmetric cryptography makes it possible to protect our data against malicious adversaries attempting to eavesdrop on our communication channels. But what about the data controllers, the guys we legitimately send the data to? How can consumers make sure that their data is not mishandled or abused? One sure way is to refuse to send data in the first place. But in reality, it’s not that simple. It’s an exchange: we trade a bit of privacy for some kind of service that they provide, right? Still, on many occasions, this exchange is not a fair one from the consumer’s perspective, since data handlers ask for way more data than what’s actually needed.

Yet again, cryptography may have a solution for this. What if I told you that it’s possible to avoid sharing your data after all? For instance, instead of sending a complete salary overview and job details to a letting agency for a credit check, you send only a proof that you earn more than 40k per annum. This is exactly what Zero-Knowledge Proofs (ZKPs) provide! (It becomes slightly more complicated than that in practice for a whole lot of other reasons, like how we trust the origin of the data, but you get the idea 😅)

Although there are many ways to construct a ZKP, in this post I will give a walkthrough (with full code snippets) on how to create a ZKP for a Sudoku puzzle solution using only hash functions as commitments. ZKPs can be quite daunting to understand initially, so I truly believe that a Sudoku puzzle is a good example for understanding how ZKPs work at a very high level. Plus, Sudoku is something that most people are familiar with. But let’s cut to the chase.

💡 Background Knowledge

Zero-Knowledge Proofs

Zero-Knowledge protocols date back to the 80s, when they were first proposed by Shafi Goldwasser, Silvio Micali and Charles Rackoff at MIT [2]. More precisely, they described an interactive proof system in which one party (the ‘Prover’) proves to a second party (the ‘Verifier’) that a statement holds.

Contextualising this into the usual Alice/Bob example, we can consider the following scenario: Alice stumbles upon an online competition with a puzzle to be solved and a prize of £100. She asks for Bob’s help and they agree to split the prize if either of them solves it. Not much later, Bob claims that he has solved the problem and Alice asks him to share the solution. However, Bob is reluctant to share it because he thinks that Alice may submit the solution by herself without splitting the prize. Consequently, he is looking for a secure way to prove to Alice that he has the solution without sharing it directly with her. Yeah, you guessed right! A ZKP is exactly what he is after!

But let’s define more thoroughly what is meant by the term “secure way”. In general, a ZKP must provide the following properties (at least with very high confidence!):

Completeness: Every valid proof must be accepted by the verifier.

Soundness: Every invalid proof must be rejected by the verifier.

Zero-Knowledge: The verifier does not learn any information about the statement other than the fact that it is true.

Note that this is a very basic definition of interactive ZKPs. There is a wide variety of more sophisticated interactive/non-interactive ZKPs, but for the present walkthrough this definition suffices.

Commitment Schemes

Commitment schemes are a crucial ingredient of ZKPs and frankly, of cryptography in general. In simple terms, they are cryptographic primitives that allow a party to commit to a specific value without revealing the actual value while providing a way to reveal the value and verify the commitment in a later phase.

More formally, a commitment scheme C must provide these two properties:

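Hiding: It is hard to distinguish between commitments of different values, i.e. a commitment on its own reveals nothing about the value it hides.

Binding: There should be no way for a person who commits to one value to later claim that they committed to a different one, i.e. nobody can feasibly open the same commitment to two different values.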

One way to create a commitment scheme is by using one-way hash functions and cryptographically secure random bytes. It should be noted that in such a case the security of the scheme is governed by the cryptographic assumptions of the hash function itself (i.e. that it’s truly one-way).
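To make this concrete, here is a minimal sketch of a hash-based commitment in Python. The function names `commit` and `verify`, the choice of SHA-256 and the 32-byte nonce length are my own assumptions for illustration, not a standard API.

```python
import hashlib
import secrets


def commit(value: bytes, nonce_len: int = 32) -> tuple[bytes, bytes]:
    """Commit to `value`; returns (commitment, nonce).

    Assumption: commitment = SHA-256(nonce || value) with a fixed-length
    random nonce, so the concatenation is unambiguous.
    """
    nonce = secrets.token_bytes(nonce_len)  # cryptographically secure random bytes
    commitment = hashlib.sha256(nonce + value).digest()
    return commitment, nonce


def verify(commitment: bytes, nonce: bytes, value: bytes) -> bool:
    """Recompute the hash from the revealed (nonce, value) and compare."""
    expected = hashlib.sha256(nonce + value).digest()
    return secrets.compare_digest(expected, commitment)


# Commit phase: only `c` is published, `nonce` and the value stay private.
c, nonce = commit(b"my secret value")

# Reveal phase: publishing (nonce, value) lets anyone check the commitment.
print(verify(c, nonce, b"my secret value"))  # the honest opening
print(verify(c, nonce, b"another value"))    # an attempt to change the value
```

The random nonce is what gives the hiding property here: without it, anyone could simply hash every candidate value and compare against the commitment.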

To add more clarity, let’s walk through an example with the usual suspects. Alice and Bob decide to play a game of rock, paper, scissors digitally. They both make their choices and exchange them so that a winner can be decided. Naturally, in a digital world, one of them has to share their pick first, which puts that player at a disadvantage, since the other player can simply change their own choice after seeing it. This is exactly the kind of problem that commitment schemes solve!

Instead, they can each create a commitment to their choice and share the commitment rather than the actual choice! For instance:

Let S be the set:

S = {“Rock”, “Paper”, “Scissors”}

Alice and Bob randomly pick Pᴬ and Pᴮ from S respectively, and each also generates a fresh sequence of random bytes, Rᴬ and Rᴮ. Now they calculate (|| represents concatenation):

Cᴬ = sha256(Rᴬ || Pᴬ) and Cᴮ = sha256(Rᴮ || Pᴮ)

Alice shares Cᴬ with Bob and Bob shares Cᴮ with Alice. Note that by now, both have committed to their values.

Finally, they share their original choices and random bytes, Pᴬ, Rᴬ and Pᴮ, Rᴮ. With this information, each party can verify the other’s commitment by hashing R || P and checking that the result equals the commitment received earlier. Based on their picks the winner can be decided, and neither of them could have altered their initial choice since the hashes wouldn’t match.
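The whole exchange can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the helper names (`commit`, `opens`, `winner`), the `BEATS` table and the 32-byte length of R are hypothetical, and both players run in the same process rather than over a network.

```python
import hashlib
import secrets

CHOICES = ("Rock", "Paper", "Scissors")
# BEATS[x] is the choice that x defeats.
BEATS = {"Rock": "Scissors", "Paper": "Rock", "Scissors": "Paper"}


def commit(pick: str) -> tuple[bytes, bytes]:
    """Return (C, R) where C = sha256(R || P) and R is 32 fresh random bytes."""
    r = secrets.token_bytes(32)
    return hashlib.sha256(r + pick.encode()).digest(), r


def opens(c: bytes, r: bytes, pick: str) -> bool:
    """Check that the revealed (R, P) pair matches the commitment C."""
    if pick not in CHOICES:
        return False
    return secrets.compare_digest(hashlib.sha256(r + pick.encode()).digest(), c)


def winner(pick_a: str, pick_b: str) -> str:
    """Decide the round from Alice's and Bob's revealed picks."""
    if pick_a == pick_b:
        return "Draw"
    return "Alice" if BEATS[pick_a] == pick_b else "Bob"


# Phase 1 (commit): each player picks, commits, and sends only C to the other.
p_a, p_b = secrets.choice(CHOICES), secrets.choice(CHOICES)
c_a, r_a = commit(p_a)
c_b, r_b = commit(p_b)

# Phase 2 (reveal): each player sends (P, R) and the other checks it against C.
if opens(c_a, r_a, p_a) and opens(c_b, r_b, p_b):
    print(f"Alice: {p_a}, Bob: {p_b} -> {winner(p_a, p_b)}")
else:
    print("Commitment mismatch: someone changed their pick")
```

Since S has only three elements, the random bytes are doing all the hiding: without R, the other player could hash the three possible picks and read the choice straight off the commitment.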

🧩 Sudoku ZKP

It’s time for the main part of this article. Sudoku is a very well-known puzzle that is also known to be in NP (in fact, NP-complete [4]), and it has been proven that there is a ZKP for any problem in NP [1]. A Sudoku ZKP is by no means something new, but I have yet to find an intuitive and clear explanation of the protocol with code examples, so at the very least that is what this article aims to provide. The protocol described here is an implementation of the very interesting work by Gradwohl et al. [3], so for more formal details you can refer there. Interestingly, in their paper they also describe a physical protocol to perform the proof using a deck of cards, which is fun if you want to demonstrate a ZKP physically, but let’s stick to the digital proof for now.

Before jumping into the code let’s see a high-level plain-English description of the protocol itself.

Alice wants to prove to Bob that she has a solution to a Sudoku puzzle, but Bob doesn’t believe her. Assume she holds the following puzzle and solution.

To avoid confusion let’s follow the proof in steps:

1. Alice creates a permutation of the Sudoku digits, which is effectively a one-to-one mapping for each digit, e.g. 1 -> 3, 2 -> 8, …

2. Additionally, she generates a random byte sequence (nonce) for each sudoku cell. This leads to 81 random nonces.