Overview

The representation of one’s identity has evolved over time from centralized to federated to user-centric to self-sovereign. Most systems started out centralized, with identity attributes maintained centrally in silos. Even within the same organization, there are cases where each department maintains its own employee data independently. Not just Internet giants like Facebook, Google, Amazon and Microsoft, but governments too maintain identity information centrally.

Aadhaar, a brainchild of the Indian government to secure its residents’ fundamental right to an unforgeable identity, is the largest collection of biometric data in the world. It has captured the fingerprints and iris scans of more than 1 billion residents of India. In addition to biometrics, Aadhaar also collects the name, date of birth, gender, address, and (optionally) mobile number and email of every resident (citizenship is not required), and stores them centrally against the corresponding fingerprints and iris patterns.

In the USA, the Integrated Automated Fingerprint Identification System (IAFIS) run by the FBI is another large repository of biometrics, which includes fingerprints, facial images, and other physical characteristics, including height, weight, hair and eye color, and even scars and tattoos. The centrally managed database has more than 70 million criminal records alongside 34 million civil records, available to law enforcement agents on a 24x7x365 basis.

Equifax, a consumer credit reporting agency, was recently hacked, and the social security numbers (SSNs) and other personal data of more than 143 million American residents were exposed. Not just Equifax: any centralized system is a honeypot waiting to be hacked. It’s just a matter of time.

Centralized systems next evolved to support identity federation. For example, in the late 1990s, Microsoft came up with an initiative called Microsoft Passport to share the user data stored with Microsoft with other relying-party web sites. Google started to share user data via OpenID/SAML 2.0 and today does so via OpenID Connect. Facebook started with Facebook Connect and today supports OAuth 2.0 for user data sharing. Microsoft supports SAML 2.0, OpenID Connect and WS-Federation. Federation didn’t make centralized systems any better: all the user data are still managed centrally by the corresponding companies. The one good thing it did was make it possible for other relying parties to consume user data managed by those centralized identity providers, without each of them having to manage that data itself.

Then came the user-centric identity paradigm. In 2005 Kim Cameron, a distinguished identity architect at Microsoft, came up with the seven laws of identity with support from the community. This laid the foundation for the user-centric identity paradigm, where the user sits in the middle of an identity transaction, between the identity provider and the relying party. The identity provider shares the user data with the relying party only after the user consents. Even today, 12 years later, this is one of the primary requirements of the GDPR in the EU. The user-centric identity model gave some control to the user, but that control is quite limited, and the user data are still managed centrally.

Self-sovereign identity is the next phase of the identity evolution. It is about giving total control of user data to the users themselves. You own your identity; no one else does. There won’t be one central authority managing millions of user records. It’s a decentralized model, where each user decides how they want to manage their own data.

Why Blockchain?

Blockchain-based implementations have evolved a lot since blockchain’s inception in 2008 with bitcoin. Melanie Swan, in her book Blockchain: Blueprint for a New Economy, identifies three generations of blockchain. Blockchain 1.0, the first generation, is about currency: the deployment of cryptocurrencies in applications related to cash, such as currency transfer, remittance, and digital payment systems. Blockchain 2.0 is about contracts: the entire slate of economic, market, and financial applications using the blockchain that are more extensive than simple cash transactions, such as stocks, bonds, futures, loans, mortgages, titles, smart property, and smart contracts. Blockchain 3.0 is about blockchain applications beyond currency, finance, and markets, particularly in the areas of government, health, science, literacy, culture, and art.

Blockchain is no longer overshadowed by cryptocurrencies. The completely decentralized, trustless, immutable architecture proposed by blockchain is the foundation for most of the identity and naming systems built on top of it. Let’s discuss below the key use cases in the identity domain that blockchain solves.

Zooko’s Triangle

Zooko’s triangle states that any naming system can fulfill only two of the following three desirable properties at any given time:

Secure, i.e. globally unique: There is only one, unique and specific entity to which the name maps.

Decentralized: There is no centralized authority for determining the meaning of a name.

Memorable: Names are arbitrarily chosen strings short enough for humans to memorize.

For example, your Gmail username is unique and memorable, but centrally owned by Google, so it is not decentralized. A public key that you generate is globally unique and decentralized (you do not need to deal with any central authority to generate your own public key), but not memorable. A nickname that you pick is decentralized and memorable, but not unique.

Unique, Decentralized, Human Meaningful Identifiers

How do we break Zooko’s triangle to create a unique, decentralized, memorable identifier? This is one of the puzzles solved by blockchain-based identity and naming systems. How useful is this, and why do we need such an identifier?

DNS (Domain Name System) is the best example. It’s a distributed but centrally governed system, overseen by ICANN (Internet Corporation for Assigned Names and Numbers). The US government was able to reassign the management of the country-code TLDs of Afghanistan and Iraq during wartime. WikiLeaks was blocked in the USA after the disclosure of diplomatic cables. Many social networking sites are blocked in China. In every part of the world, governments take control of the DNS whenever they need to. How about building a completely decentralized DNS, one that cannot be governed by any central authority? This is where we need to break Zooko’s triangle to create unique, memorable domain names in a completely decentralized manner. As we proceed with this blog post we will learn how systems like Namecoin and Blockstack address this need in their implementations.

Another key pillar of the Internet is the PKI (public key infrastructure), which is based on certificate authorities (CAs). A secured domain name has to be associated with a certificate, which is issued by a root certificate authority or an intermediary. It is not easy to become a root certificate authority: there are fewer than a hundred CAs globally. Once again, governments can control these CAs, and they do have the power to ask a CA to revoke a certificate. I doubt this has happened openly in the past, but I don’t see anything that prevents it from happening in the future. In other words, CAs are centrally managed entities. DNS and CAs, the two key pillars of the Internet, are both centrally managed. How do we build a decentralized PKI for the Internet? The primary task of a CA is to associate a public certificate with a domain name, so that the client knows that the certificate used to encrypt the channel between the client and the server belongs to the corresponding domain name. This is another strong use case for blockchain. Once you establish an identity for your web site in a completely decentralized manner, the blockchain itself can associate a key pair with the corresponding name. Let’s get into the details as we proceed. Both Namecoin and Blockstack support decentralized DNS and PKI with their implementations. The Blockstack browser, unveiled at the Consensus 2017 conference last May, lets you surf the new decentralized web (this feature may not be fully supported yet, but hopefully will be soon).
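To make the idea of a ledger binding a name to a key a little more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not how Namecoin or Blockstack actually work: the NameLedger class, the first-come-first-served rule, and the placeholder key bytes are all hypothetical, and a plain in-memory list stands in for the blockchain.

```python
import hashlib
from typing import Optional


class NameLedger:
    """Toy append-only ledger that binds names to public-key fingerprints.

    Hypothetical stand-in for a blockchain: entries are only ever appended,
    and the first registration of a name wins.
    """

    def __init__(self) -> None:
        self._entries = []  # list of (name, fingerprint) tuples, in order

    def register(self, name: str, public_key: bytes) -> bool:
        """Bind name to the key's fingerprint; refuse if the name is taken."""
        if self.resolve(name) is not None:
            return False
        fingerprint = hashlib.sha256(public_key).hexdigest()
        self._entries.append((name, fingerprint))
        return True

    def resolve(self, name: str) -> Optional[str]:
        """Return the fingerprint bound to name, or None if unregistered."""
        for entry_name, fingerprint in self._entries:
            if entry_name == name:
                return fingerprint
        return None


ledger = NameLedger()
alice_key = b"alice-public-key-bytes"      # placeholder, not a real key
mallory_key = b"mallory-public-key-bytes"  # placeholder, not a real key

first = ledger.register("alice.example", alice_key)     # name is free
second = ledger.register("alice.example", mallory_key)  # name already taken
```

The name is memorable, the binding is unique because a second registration is refused, and no central authority had to approve it. In a real blockchain the "first wins" rule would be enforced by consensus among the nodes rather than by a single object.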

Know Your Customer (KYC)

Know your customer (KYC) is the process of a business identifying and verifying the identity of its clients. The term is also used to refer to the bank and anti-money laundering (AML) regulations which govern these activities. Know your customer processes are also employed by companies of all sizes for the purpose of ensuring their proposed agents, consultants, or distributors are anti-bribery compliant. Banks, insurers and export creditors are increasingly demanding that customers provide detailed anti-corruption due diligence information [ref]. A 2016 survey by Thomson Reuters revealed that financial firms spend up to $500 million annually on KYC globally. In addition, 89 percent of corporate customers had not had a good KYC experience, and 13 percent had changed their financial institution relationship as a result. Over time, KYC has become an expensive, time-consuming, painful process.

Every bank and financial institution performs the KYC process individually, and uploads the validated information and documents to a central registry that stores digitized data tagged with a unique identification number for each customer. By using this reference number, banks can access the stored data to perform due diligence whenever customers request a new service within the same banking relationship, or from another bank. SWIFT launched the SWIFT KYC Registry in December 2014, and more than 2000 banks have already enrolled with it. It does not use blockchain, and neither does the other large KYC registry, KYC.com [ref].

Blockchain is not a place to store artifacts. If we do so, once again we create a large repository of artifacts, stored and duplicated in each node of the blockchain. Even storing encrypted artifacts on a blockchain is not good practice. Anything stored in a blockchain is immutable. A hacker may not be able to decrypt the artifacts today, but with advances in quantum computing they possibly will in the future. Blockchain will not replace, as it is, what the SWIFT KYC Registry or KYC.com do.

What are the challenges in the current KYC process? The Putting Theory into Practice report by Goldman Sachs highlights that the lack of data “mutualization” between banks leads to duplicate effort in client on-boarding. When a new client relationship is formed, financial institutions conduct a thorough customer due-diligence (CDD) process in accordance with “know your customer” (KYC) regulations. In most jurisdictions, banks are required to independently vet prospective accounts even when the account has already been vetted by another bank. It is estimated that proper KYC due diligence can cost $15k-$50k per client.

There are multiple blockchain implementations that help make the KYC process more efficient and productive. Let me explain one common approach followed by most of them. The customer first has to visit each authority, prove who he/she is, get a signed attestation token, and have it recorded in the blockchain. For example, you visit the department of motor vehicles (DMV), get a signed digital copy of your driving license, and store the signed copy along with the DMV’s public key in an app (a KYC app) on your mobile device. In the meantime, the DMV also calculates the hash of the signed document and commits it to the blockchain. In the same way you can get your birth certificate, house lease document, car lease document and employment confirmation all signed by the corresponding authorities, and get the hash of each record added to the blockchain. Now when you want to start the KYC process with another entity, you simply share the details you want to share through your KYC app. This entity validates the signature with the public key attached to each record, and verifies the hash of the record against the blockchain (provided that the corresponding public keys are trusted).
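The hash-commitment half of this flow can be sketched in a few lines of Python. This is a simplified illustration under assumptions of mine, not any particular product’s implementation: a Python set stands in for the blockchain, the function names and the sample license fields are hypothetical, and the signature check is left out because the Python standard library has no public-key signing.

```python
import hashlib
import json

# Stand-in for the blockchain: a set of committed document hashes.
ledger_hashes = set()


def document_hash(document: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the document."""
    canonical = json.dumps(document, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def issuer_commit(document: dict) -> str:
    """What the issuing authority (e.g. the DMV) does: commit only the hash."""
    digest = document_hash(document)
    ledger_hashes.add(digest)
    return digest


def relying_party_check(document: dict) -> bool:
    """What the entity running KYC does: recompute the hash and look it up."""
    return document_hash(document) in ledger_hashes


# Hypothetical sample data, for illustration only.
license_doc = {"type": "driving_license", "holder": "Jane Doe", "number": "D1234567"}
issuer_commit(license_doc)

genuine_ok = relying_party_check(license_doc)

# Any change to the shared document produces a hash that was never committed.
tampered_doc = dict(license_doc, holder="John Doe")
tampered_ok = relying_party_check(tampered_doc)
```

Only the hash ever reaches the ledger; the document itself stays in the customer’s KYC app and is shared with a relying party only when the customer chooses to.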

There are many initiatives to build KYC solutions on top of blockchain, on both permissionless and permissioned blockchains. ShoCard and Civic enable KYC via identity proofing. Deloitte’s Smart ID platform follows a similar approach. KYC-Chain uses the blockchain and smart contracts to provide a platform for opening accounts online, while complying with laws and regulations. It employs Ethereum and will work primarily via the use of “trusted gatekeepers,” who can be any individual or legal entity permitted by law to authenticate KYC documents, for example, notaries public, people of diplomatic status, lawyers, governments, etc. A trusted gatekeeper would perform an individual check on a user’s ID using KYC-Chain’s platform and authenticate them. The verified files would be stored in a distributed database system, which can later be retrieved by the trusted gatekeeper, or the user, to demonstrate with certainty that the ID is genuine [ref].

iSignthis provides automated KYC identity proofing by using real-time electronic verification of regulated payment instruments. They recently partnered with Coinify, a Denmark-based blockchain currency payment provider, to offer a new service connecting blockchain payments, identity verification and credit cards [ref].

The R3 Corda platform (a private, permissioned distributed ledger) runs a KYC registry that is intended to allow identities (of both individuals and entities) to be managed by the identity owners. The identity owners can allow other participants of the Corda platform to access their digital identity for client on-boarding and KYC purposes. R3 is a consortium of over 80 financial institutions and regulators around the world, working to design and deliver distributed ledger technologies for global financial markets.

Reputation

There are many organizations emerging with strategies around reputation. eBay, for example, includes systems for establishing trust between buyers and sellers. Solutions like Disqus work to manage communities and users in the comment space. TrustCloud and Traity measure helpfulness and compassion, and provide features such as background checks on service providers and insurance to help organizations in the sharing economy to respond with greater assistance when a rental or trade situation goes wrong [ref].

Klout purports to measure social influence. Organizations like Yelp and RipOffReport have risen up to give consumers a way to praise good performance or to sound off when service is bad or deals go awry [ref]. Uber and Lyft both have their own rating systems. Both rate the driving experience, but an Uber driver cannot carry his/her Uber rating to Lyft, or vice versa. All these reputation systems function in silos and are centrally managed.

There are multiple initiatives out there that use blockchain to break these silos in a completely decentralized manner. Open Reputation is an open source decentralized platform that maps identity and reputation onto the Internet-of-things. Using blockchain, everyone and everything has an identity that begins pseudonymous and gathers encrypted shareable reputation, enabling everyone to maintain a fully customizable balance with privacy [ref].

Here is an interesting discussion with Andreas Antonopoulos (one of the top bitcoin evangelists) on blockchain-based reputation systems. He does not like the idea, and thinks it’s a wrong application of blockchain, mostly due to social concerns. Blockchain is immutable, and anything written to it will be there forever. If you earned some negative reputation for something you did as a kid, it will stay in the blockchain forever, and he argues this gives a person little chance to change. I disagree. This is not an issue with blockchain itself, but with how you use it. Some systems use blockchain to store raw data, while others use it to reference data stored somewhere else under the control of the user, and to protect its integrity. Reputation systems on blockchain can follow the latter approach: you store the reputation data somewhere else (under your control), and only share it with others with your consent.

Auditability

Centralized identity systems are not going to disappear in the next ten years or so. Even ten years is very optimistic. Like Aadhaar in India (which we discussed earlier), many governments focus on building centralized identity systems for citizens/residents, and they are building very powerful ecosystems around these identity systems. Finance, healthcare, education and all the other key domains will build systems relying on this government-issued, centrally managed identity (or identifier).

If the government wants to, or if someone can bribe a government official, the data stored in these systems can be changed. For example, in the Aadhaar system someone may change the phone number or the email address recorded against a person’s Aadhaar number. As a result, all of that person’s communications with banks, educational institutes and all the other Aadhaar-dependent institutes will go to the wrong email address or phone number. Surely Aadhaar would be generating audit records for such changes, but those audit records too are managed centrally, so anyone with access can delete them to go undetected.

How can we make Aadhaar better with blockchain? The Aadhaar system can publish all the changes against each user record to the blockchain. There is no need to reveal the data or the Aadhaar number: we can take the hash of both and record it in the blockchain. Now, whoever receives data from the Central Identities Data Repository (CIDR) can validate the hash of that data against the hash stored in the blockchain. If they match, we know that there has been no internal tampering with the data. If not, someone has played with it. In this way, each individual can monitor changes happening against their Aadhaar record and immediately question the authorities. This will make Aadhaar more transparent, even though all the user records are maintained centrally. It also gives the opportunity to develop tools around blockchain that notify individuals whenever there is a change to their records.
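A minimal Python sketch of this auditing idea follows. It reflects my own assumptions rather than anything Aadhaar or CIDR actually does: the AuditChain class, the sample identifier and the record fields are hypothetical, and a simple hash-chained list stands in for a real blockchain.

```python
import hashlib
import json

GENESIS = "0" * 64  # previous-hash value used for the first block


def record_fingerprint(aadhaar_number: str, record: dict) -> str:
    """Hash of the identifier and the record together; neither is revealed."""
    payload = json.dumps({"id": aadhaar_number, "record": record}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class AuditChain:
    """Toy hash chain: each block's hash covers the previous block's hash."""

    def __init__(self) -> None:
        self.blocks = []

    def publish(self, fingerprint: str) -> None:
        prev = self.blocks[-1]["hash"] if self.blocks else GENESIS
        block_hash = hashlib.sha256((prev + fingerprint).encode("ascii")).hexdigest()
        self.blocks.append({"fingerprint": fingerprint, "prev": prev, "hash": block_hash})

    def contains(self, fingerprint: str) -> bool:
        return any(b["fingerprint"] == fingerprint for b in self.blocks)

    def is_intact(self) -> bool:
        """Recompute every link; any edited or removed block breaks the chain."""
        prev = GENESIS
        for block in self.blocks:
            expected = hashlib.sha256((prev + block["fingerprint"]).encode("ascii")).hexdigest()
            if block["prev"] != prev or block["hash"] != expected:
                return False
            prev = block["hash"]
        return True


chain = AuditChain()
uid = "0000-0000-0000"  # made-up identifier, for illustration only

original = {"phone": "555-0100", "email": "resident@example.com"}
updated = {"phone": "555-0199", "email": "resident@example.com"}
chain.publish(record_fingerprint(uid, original))
chain.publish(record_fingerprint(uid, updated))

# Data received from the central repository matches a published change.
published_ok = chain.contains(record_fingerprint(uid, updated))

# A silent change that was never published has no matching fingerprint.
silent = {"phone": "555-0666", "email": "resident@example.com"}
silent_ok = chain.contains(record_fingerprint(uid, silent))

intact_before = chain.is_intact()
chain.blocks[0]["fingerprint"] = record_fingerprint(uid, silent)  # rewrite history
intact_after = chain.is_intact()
```

The chaining is what separates this from a centrally stored audit log: rewriting or deleting an old entry changes every hash after it, so the edit is detectable by anyone holding a copy of the chain.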

Here is another interesting piece of research that uses blockchain to protect the integrity of logs. It presents a proposal and implementation for making the integrity proofs of secure logs immutable within the bitcoin blockchain.