Mining Process:

Bitcoin transactions (e.g., the BTC address 18nJ9bxLTZL5VxSbtfLw12uxtJsooYBhSi sending 3.41 BTC to the BTC address 1E4f8Exg8rtuW1EdDVFw7qkX4MELLBcrqd) are announced to the decentralized network of machines running the bitcoin software in a fully peer-to-peer manner, with no central authority or trusted third party acting as an intermediary.

The way this works is that each machine in the network has a list of IP addresses of other nodes that it can directly communicate with to compare the state of the network.

Given the “6 degrees of separation” phenomenon, it takes a surprisingly small number of direct peer-to-peer connections to indirectly wire up every machine to every other machine in the global network.
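To make this concrete, here is a toy sketch of an announcement spreading from peer to peer. It is not the real bitcoin networking code: the network size (2,000 nodes), the 8 peers chosen per node, and the function names `build_network` and `flood` are all illustrative assumptions.

```python
import random
from collections import deque


def build_network(num_nodes, peers_per_node, seed=0):
    """Give every node a few randomly chosen peers; connections work both ways."""
    rng = random.Random(seed)
    peers = {node: set() for node in range(num_nodes)}
    for node in range(num_nodes):
        chosen = set()
        while len(chosen) < peers_per_node:
            candidate = rng.randrange(num_nodes)
            if candidate != node:
                chosen.add(candidate)
        for peer in chosen:
            peers[node].add(peer)
            peers[peer].add(node)
    return peers


def flood(peers, origin):
    """Pass an announcement from peer to peer; return {node: hops needed}."""
    hops = {origin: 0}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for peer in peers[node]:
            if peer not in hops:
                hops[peer] = hops[node] + 1
                queue.append(peer)
    return hops


network = build_network(num_nodes=2000, peers_per_node=8)
hops = flood(network, origin=0)
print(len(hops), "nodes reached; the farthest is", max(hops.values()), "hops away")
```

Even though each node only talks to a handful of peers, the announcement should reach the whole toy network in just a few hops.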

Transactions accumulate in a pool of pending transactions (i.e., ones that have not yet been included in a block), called the mempool.

Here, full nodes on the bitcoin network (we will use this term interchangeably with the shorter “nodes”) verify that the transactions in the mempool are valid. Full nodes are simply computers that have the entire bitcoin blockchain history stored locally on disk and can therefore trace transactions through their entire history. A transaction is valid if and only if the following requirements are met:

That the transaction for a given bitcoin address is signed with the sender’s private key, representing a promise that cannot later be repudiated by the sender, similar to a physical signature on a legal document.

That the sending bitcoin address actually contains the required unspent BTC (this is verified by tracing the history of each coin through the entire blockchain history) and those coins are not involved in a pending transaction in the mempool.
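The two requirements above can be sketched roughly as follows. This is a simplified model, not how the bitcoin software is actually written: the `Transaction` class, the `is_valid` function, and the two balance dictionaries are hypothetical, amounts are in satoshis (1 BTC = 100,000,000 satoshis), and the real cryptographic signature check is replaced by a plain true/false stand-in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    sender: str            # sending BTC address
    receiver: str          # receiving BTC address
    amount_satoshis: int   # quantity transacted
    signature_ok: bool     # stand-in for a real private-key signature check


def is_valid(tx, unspent, pending):
    """Apply both requirements to one transaction.

    unspent: address -> confirmed unspent balance found in the blockchain
    pending: address -> amount already committed to transactions in the mempool
    """
    if not tx.signature_ok:
        return False  # requirement 1: must be signed by the sender
    available = unspent.get(tx.sender, 0) - pending.get(tx.sender, 0)
    return 0 < tx.amount_satoshis <= available  # requirement 2: enough free coins


unspent = {"18nJ9bxLTZL5VxSbtfLw12uxtJsooYBhSi": 500_000_000}
pending = {}
tx = Transaction(
    sender="18nJ9bxLTZL5VxSbtfLw12uxtJsooYBhSi",
    receiver="1E4f8Exg8rtuW1EdDVFw7qkX4MELLBcrqd",
    amount_satoshis=341_000_000,  # 3.41 BTC
    signature_ok=True,
)
print(is_valid(tx, unspent, pending))  # True
```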

Bitcoin miners, which are generally different machines than the full nodes, begin the mining process by generating a file containing the specific details of hundreds of valid transactions, adding more transactions to the file until they have reached the current limit of 1 megabyte worth of transaction data.
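A rough sketch of this "fill the file until it is full" step is shown below. The function name `fill_block`, the dummy 250-byte transactions, and the simple first-come ordering are assumptions made for illustration; the sketch only shows the size limit being respected.

```python
MAX_BLOCK_BYTES = 1_000_000  # the 1 megabyte limit described above


def fill_block(pending_transactions, max_bytes=MAX_BLOCK_BYTES):
    """Add serialized transactions to a block until no more fit."""
    block, used = [], 0
    for raw_tx in pending_transactions:
        if used + len(raw_tx) > max_bytes:
            continue  # too big for the remaining space; a smaller one may still fit
        block.append(raw_tx)
        used += len(raw_tx)
    return block, used


# 5,000 dummy transactions of 250 bytes each (1.25 MB in total).
mempool = [bytes(250) for _ in range(5000)]
block, used = fill_block(mempool)
print(len(block), used)  # 4000 1000000
```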

Miners then do a series of repetitive calculations using a hash function:

Hashes, which are a central idea in bitcoin that underpins the whole system, are produced by a kind of secure “one way function” that generates a fixed-length number, usually written in hexadecimal, as output for input data of any length.

That means that it doesn’t matter whether the input data is the string “Bitcoin is cool” or a single text file containing all of the text in the Library of Congress: you will always get as the output a 256-bit number, written as 64 hexadecimal digits.
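A quick way to see the fixed output length is to hash a very short input and a very long one using Python's standard `hashlib` module. The 5-million-byte input below is just an arbitrary stand-in for "a lot of text".

```python
import hashlib

short_input = "Bitcoin is cool".encode("utf-8")
long_input = b"x" * 5_000_000  # arbitrary large input

short_hash = hashlib.sha256(short_input).hexdigest()
long_hash = hashlib.sha256(long_input).hexdigest()

# Both hashes are 64 hexadecimal digits (256 bits), whatever the input size.
print(len(short_hash), len(long_hash))  # 64 64
```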

What makes a hash function secure is that it is easy to compute the output of the hash function for a given input (called simply the hash of that input), but it is nearly impossible to recover the input data that would lead to a specified hash (i.e., the output of running the hash function on that input data).

The bitcoin system is built using the SHA256 hash function.

To illustrate what this hash function looks like, the following is the SHA256 hash of the input data represented by the string “Bitcoin”:

b4056df6691f8dc72e56302ddad345d65fead3ead9299609a826e2344eb63aa4

If we were to change even a single letter — say we add an “s” to the end of “Bitcoin” — then this is the new SHA256 hash that results for the string “Bitcoins”:

aa0921d24d095df038a0c0a32eb0d644f1882e3a0a3d8814175c4e1cebbf84fb

Notice that the hash has completely changed despite the minor change in the input. This phenomenon is known as the avalanche effect, and it is closely tied to the whole point of a hash function: it is easy to calculate the hash in one direction (i.e., to compute the hexadecimal numbers above), but it is virtually impossible to go in the reverse direction, that is, to come up with the input data that will result in a specified hash. Because of this property, hash functions are called “one-way” functions: it’s easy to fall through the door, but very tough to get back out the same way! (They are sometimes loosely called “trap door” functions as well, although in cryptography that term is reserved for functions that can be reversed by someone who holds a secret key.)
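The comparison above can be reproduced with Python's `hashlib`, and the avalanche effect can be measured by counting how many of the 256 output bits differ between the two hashes. The helper names `sha256_hex` and `differing_bits` are made up for this sketch.

```python
import hashlib


def sha256_hex(text):
    """Return the SHA256 hash of a string as 64 hexadecimal digits."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a, hex_b):
    """Count the bit positions in which two hexadecimal hashes differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


hash_one = sha256_hex("Bitcoin")
hash_two = sha256_hex("Bitcoins")
print(hash_one)
print(hash_two)
print(differing_bits(hash_one, hash_two), "of 256 bits differ")
```

For a well-behaved hash function, roughly half of the output bits are expected to flip when the input changes slightly.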

It is so hard to find the input data that corresponds to a given output hash because, as soon as you change one thing in the input data, the resulting hash changes completely, which prevents you from making any gradual progress toward the desired hash.

Indeed, the only way to find input data that will result in the desired hash is to try lots and lots of different inputs and then compute the hash of them and check to see if the hashes match the hash you are looking for.
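The guess-and-check search can be sketched as follows. Matching a full 256-bit hash this way is hopeless, so this toy version only tries to match the first 3 hexadecimal digits of a target hash; the function name `find_partial_match` and the `guess-N` candidate strings are illustrative assumptions.

```python
import hashlib
from itertools import count


def find_partial_match(target_hex, prefix_len):
    """Try inputs one by one until a hash shares the target's first digits."""
    prefix = target_hex[:prefix_len]
    for attempt in count():
        candidate = f"guess-{attempt}"
        digest = hashlib.sha256(candidate.encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return candidate, attempt + 1


target = hashlib.sha256(b"Bitcoin").hexdigest()
candidate, tries = find_partial_match(target, prefix_len=3)
print(f"{candidate!r} matches the first 3 digits after {tries} tries")
```

Each additional hexadecimal digit that must match multiplies the expected number of tries by 16, which is why matching all 64 digits is out of reach.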

Note that there is not a 1-to-1 correspondence between input data strings and hashes: because the input data can be of any finite length, while the output is always of a fixed 256 bit length, there are necessarily multiple inputs that will lead to the same output.

This is known as the pigeon-hole principle in mathematics: if you have N pigeon holes, and greater than N pigeons, and every pigeon needs to go into a hole, then there must be at least 1 hole with more than 1 pigeon in it.
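The pigeon-hole argument can be demonstrated with a deliberately tiny hash. The sketch below keeps only the first byte of the SHA256 output, so there are just 256 possible outputs (the pigeon holes); hashing 257 different inputs (the pigeons) must then produce at least one repeat. The names `tiny_hash` and `find_collision` are invented for this example.

```python
import hashlib


def tiny_hash(text):
    """First byte of the SHA256 hash: only 256 possible outputs."""
    return hashlib.sha256(text.encode("utf-8")).digest()[0]


def find_collision(num_inputs=257):
    """Hash distinct inputs until two of them land on the same output."""
    seen = {}
    for i in range(num_inputs):
        text = f"input-{i}"
        value = tiny_hash(text)
        if value in seen:
            return seen[value], text, value
        seen[value] = text
    return None  # cannot happen when num_inputs exceeds 256


first, second, shared = find_collision()
print(f"{first!r} and {second!r} both hash to {shared}")
```

The same argument applies to the full 256-bit SHA256 output; the difference is that the number of holes is so large that nobody is expected to stumble on a repeat in practice.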

Hashes are one of the critical components of the Bitcoin mining process, which works as follows:

Miners first generate a file of valid transactions as described above, which is called a block.

You can imagine a block as being like a spreadsheet, with each row representing a unique transaction and the various columns representing the different fields of the transaction, such as:

The timestamp the message was first communicated to the bitcoin network;

The IP address of the node first reporting the transaction;

The sender’s BTC address;

The receiving BTC address;

The quantity of BTC transacted;

The transaction fee specified (which goes to the miner of the block in which the transaction was included).

In addition to these normal transactions, you can think of a block as also containing some special rows which we will describe below: the nonce and the hash of the previous block in the chain of blocks (this chain of blocks is usually referred to as the blockchain).
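As a rough picture of this spreadsheet-like structure, the sketch below models a block as a list of transaction rows plus the two special fields. It follows the simplified mental model used in this article rather than bitcoin's real binary format; the `Block` class, its field names, the JSON serialization, and the sample rows are all assumptions made for illustration.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass
class Block:
    transactions: list   # one dictionary per "row" of the spreadsheet
    previous_hash: str   # hash of the previous block in the chain
    nonce: int = 0       # the whole number that miners keep changing

    def hash(self):
        """Hash the whole block twice with SHA256 (the "hash of the hash")."""
        payload = json.dumps(
            {
                "transactions": self.transactions,
                "previous_hash": self.previous_hash,
                "nonce": self.nonce,
            },
            sort_keys=True,
        ).encode("utf-8")
        return hashlib.sha256(hashlib.sha256(payload).digest()).hexdigest()


first_block = Block(
    transactions=[{"sender": "addr-A", "receiver": "addr-B", "amount_btc": 3.41}],
    previous_hash="0" * 64,  # placeholder: the first block has no predecessor
)
second_block = Block(
    transactions=[{"sender": "addr-B", "receiver": "addr-C", "amount_btc": 1.0}],
    previous_hash=first_block.hash(),  # this link is what forms the chain
)
print(second_block.previous_hash == first_block.hash())  # True
```

Because each block stores the hash of the one before it, changing anything in an earlier block changes its hash and breaks the link recorded in every later block.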

The nonce is just some whole number that is appended to the list of valid transactions. Generally, miners set the nonce equal to 0 and then continually increment the nonce by one, replacing the old nonce with the new nonce. After updating the nonce, the miner then calculates the SHA256 hash of the entire block (in practice, this is done twice in a row, taking the “hash of the hash”) to check if it results in a “valid” hash (we will describe next what valid means in this context). The computed hash might look like this, for example:

8a5f808da37e6d7363d729f33ae51576df3d00cc1f6ac48ee1f31d36874fc3cb

Based on the bitcoin protocol that all nodes on the bitcoin network adhere to by voluntarily running compatible versions of the bitcoin software implementing this protocol, a block is deemed to be valid if and only if:

Every transaction included in the block is valid, in the sense described above.

The hash of the block begins with a certain number of zeros.

Given the unpredictable nature of hashing functions, most nonces appended to the end of a block of transactions will result in a hash that does not begin with “0” (each hexadecimal digit can take 16 values, so about 94% of hashes won’t start with a zero), and even fewer hashes, about 1 in 256, will begin with “00”.
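These proportions can be checked empirically by hashing many different inputs and counting how often the result starts with zeros. The sample size of 100,000 and the `block data N` input strings below are arbitrary choices for this sketch.

```python
import hashlib


def fraction_with_prefix(prefix, samples=100_000):
    """Fraction of sampled SHA256 hashes that start with the given digits."""
    hits = 0
    for nonce in range(samples):
        digest = hashlib.sha256(f"block data {nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            hits += 1
    return hits / samples


one_zero = fraction_with_prefix("0")
two_zeros = fraction_with_prefix("00")
print(f"start with '0':  {one_zero:.4f} (1/16  = {1 / 16:.4f})")
print(f"start with '00': {two_zeros:.4f} (1/256 = {1 / 256:.4f})")
```

The measured fractions should land close to 1/16 and 1/256, since each leading hexadecimal digit is zero about one time in sixteen.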

A truly microscopic fraction of nonces will result in hashes for the proposed block that begin with a large number of zeros. If the required number of leading zeros is high (say, 25), then the chances of finding a nonce that will result in such a hash become vanishingly small, to the point where, even if one is able to check billions of hashes per second, it will still take a huge amount of time and resources to find a nonce that works. This is the one part of the process that can’t be “faked” or “gamed” in any way: only lots of computing power and electricity can find these nonces!
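A back-of-the-envelope calculation shows how large the numbers get. It assumes that the 25 zeros are hexadecimal digits and that a single machine checks one billion hashes per second; both figures are illustrative assumptions rather than real network parameters.

```python
HEX_ZEROS = 25                       # required leading zeros (assumed hexadecimal)
HASHES_PER_SECOND = 1_000_000_000    # one billion hashes per second (assumed)
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Each hash has a 1-in-16**25 chance of qualifying, so on average
# about 16**25 attempts are needed before one succeeds.
expected_attempts = 16 ** HEX_ZEROS
expected_years = expected_attempts / HASHES_PER_SECOND / SECONDS_PER_YEAR

print(f"{expected_attempts:.2e} attempts, about {expected_years:.1e} years")
# 1.27e+30 attempts, about 4.0e+13 years
```

Under these assumptions a single machine would need tens of trillions of years on average, which is why such work can only be done in a reasonable time by enormous amounts of hardware working in parallel.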

Assuming a miner is able to discover a nonce that works — that is, a nonce that results in a block hash that begins with a sufficient number of zeros — then the block that includes that nonce is communicated by the discovering miner to other nodes on the bitcoin network in a peer-to-peer manner, where it is then verified. The nature of hash functions makes it quick and easy for the other nodes to verify that the hash of certain input data has specific properties (e.g., that it begins with 25 zeros).
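The contrast between finding a nonce and checking one can be sketched in a few lines. This is a toy version under several assumptions: the block is a short made-up byte string, the nonce is simply appended as text, the difficulty is only 4 leading hexadecimal zeros, and the names `block_hash`, `mine`, and `verify` are invented for the example.

```python
import hashlib


def block_hash(block_data, nonce):
    """Double SHA256 (the "hash of the hash") of the block data plus the nonce."""
    payload = block_data + str(nonce).encode("ascii")
    return hashlib.sha256(hashlib.sha256(payload).digest()).hexdigest()


def mine(block_data, zeros):
    """Start at nonce 0 and increment until the hash has enough leading zeros."""
    nonce = 0
    while not block_hash(block_data, nonce).startswith("0" * zeros):
        nonce += 1
    return nonce


def verify(block_data, nonce, zeros):
    """What other nodes do: compute one hash and inspect its leading digits."""
    return block_hash(block_data, nonce).startswith("0" * zeros)


block_data = b"toy list of valid transactions"
nonce = mine(block_data, zeros=4)  # tens of thousands of attempts on average
print(nonce, block_hash(block_data, nonce))
print(verify(block_data, nonce, zeros=4))  # True
```

The miner has to loop through many nonces, while anyone checking the result only needs to compute the hash once.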