In Part 1 we took a look at the incentives involved in Bitcoin mining and how they are used to guarantee the single transaction history needed to prevent bitcoins from being double spent. In this post we will take a more technical look at the cryptography involved and how it is used to secure the network. As I said previously, Bitcoin is very accessible. While we will be discussing cryptographic concepts, that shouldn’t discourage you from continuing further.
Cryptographic Hash Functions
Before moving forward we should take a moment to learn about hash functions since they are used all throughout the Bitcoin protocol. To put it simply, a hash function is just a mathematical algorithm that takes an input and turns it into an output. For example, suppose we have an algorithm which just adds all the digits in the input string together. If our input is 1234 we would get an output of 10.
1234 ==> 10
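To make that concrete, here is a minimal Python sketch of the toy digit-adding function. The name digit_sum_hash is my own, and this is only an illustration of the idea, not anything Bitcoin actually uses.

```python
def digit_sum_hash(text: str) -> int:
    """Toy 'hash': add up every decimal digit found in the input string."""
    # Characters that are not 0-9 are simply ignored (an assumption of this sketch).
    return sum(int(ch) for ch in text if "0" <= ch <= "9")


print("1234 ==>", digit_sum_hash("1234"))
```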
Simple enough. However, there are certain properties of really good hash functions that make them suitable to use in cryptography. Keep these properties in mind as they are vital to the operation of the Bitcoin protocol.
- It should be very easy to compute an output for any given input, but it should be practically impossible (given current knowledge of mathematics and the state of computers) to compute the input for a given output, even while knowing the mathematical algorithm. Consider that in the above example we can easily compute an output of 10 given the input of 1234, but going in reverse isn’t as easy. In this case there are many possible inputs that could add up to 10 (55, 136, 7111, etc.). However, given the simplicity of our function, one could still find an input that works relatively easily. Some cryptographic hash functions, on the other hand, are believed to remain hard to reverse even for quantum computers.
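- Unlike our example, it should be practically impossible to find two different inputs that produce the same output. If two different inputs do produce the same output, this is called a hash collision. Because outputs have a fixed size while inputs can be anything, collisions must exist in principle, but good cryptographic hash algorithms make them infeasible to find.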
- A hash function should be able to take inputs of variable size and turn them into outputs of a fixed size. For example:
hello ==> 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
goodbye ==> 82e35a63ceba37e9646434c5dd412ea577147f1e4a41ccde1614253187e3dbf9
The output should be the same length regardless of whether the input has 10 characters or 10 thousand characters.
- A tiny change in the input should produce an entirely different output that in no way relates to the original input. Example:
hello world ==> 98c615784ccb5fe5936fbc0cbe9dfdb408d92f0f
Hello World ==> a830d7beb04eb7549ce990fb7dc962e499a27230
Hello World! ==> 8476ee4631b9b30ac2754b0ee0c47e161d3f724c
Hello, World ==> 6782893f9a818abc3da35d745a803d72a660c9f5
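If you want to see the last two properties for yourself, here is a small Python sketch using the standard hashlib module. The helper names sha256_hex and differing_hex_digits are my own. It uses SHA-256 throughout, so its digests are 64 hex characters long and will not match the shorter digests in the list just above, which appear to come from a different hash function.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a string as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_hex_digits(a: str, b: str) -> int:
    """Count the positions at which two equal-length hex strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


# Fixed-size output: 5, 7 and 10,000 characters all give 64 hex digits.
for message in ["hello", "goodbye", "x" * 10_000]:
    print(len(message), "characters ==>", sha256_hex(message))

# Tiny change in the input: count how many of the 64 hex digits change.
changed = differing_hex_digits(sha256_hex("hello world"), sha256_hex("Hello World"))
print(changed, "of 64 hex digits differ")
```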
Bitcoin makes heavy use of the cryptographic hash function SHA256, which stands for Secure Hash Algorithm 256-bit. Incidentally, the SHA algorithms were originally developed by the NSA. You might wonder how we can trust something that came from the NSA. That’s certainly cause to be suspicious. However, the algorithms are public and have been vetted and analyzed by cryptographers who know what they’re doing. The consensus is that they are secure.
Now that we have the preliminaries out of the way we can start focusing in on the protocol. If you read Part 1 you will recall that all Bitcoin transactions are relayed to each of the peers in the network. Miners collect these transactions, perform a number of checks to make sure they’re valid, then add them to their memory pool. It’s at this point that they begin the process of creating a block.
The first step in the process is to hash each transaction in the memory pool using SHA256. The raw transaction data may look something like this:
Once hashed it will look like this:
These hashes are then organized into something called a Merkle Tree or hash tree. If you’re familiar with what an NCAA tournament bracket looks like, you’ll understand this concept. The hashes of the transactions are organized into pairs, each pair is concatenated together, and the result is hashed again. The same is done to each set of outputs until something like a tree is formed (or an NCAA bracket).
In the above example there are only four transactions (tx stands for transaction). A real block will contain hundreds of transactions so the bracket (tree) will be much larger. The hash at the very top of the tree is called the Merkle Root.
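Here is a rough Python sketch of the pairing-and-rehashing process. It is a simplification under my own assumptions: the function names are made up, the transactions are placeholder byte strings, and each step uses a single SHA-256 as described in this post. The real protocol differs in details (for instance, it applies SHA-256 twice at each step), so treat this as an illustration of the tree shape only.

```python
import hashlib
from typing import Sequence


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(transactions: Sequence[bytes]) -> bytes:
    """Hash each transaction, then hash pairs upward until one hash remains."""
    if not transactions:
        raise ValueError("need at least one transaction")
    level = [sha256(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Odd number of hashes: pair the last one with a copy of itself.
            level.append(level[-1])
        # Concatenate each pair and hash it to form the next level up.
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


txs = [b"tx0", b"tx1", b"tx2", b"tx3"]  # placeholder transaction data
print(merkle_root(txs).hex())
```

Changing even one byte of one transaction gives a completely different root, which is the property the rest of this post leans on.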
By the way, don’t worry if you don’t yet understand why transactions are organized into a Merkle tree, we’ll bring it all together soon enough and then it will all “click”.
The Merkle Root of this hash tree is placed into the block’s header along with the hash of the previous block (to be explained later) and a random number called a nonce (also to be explained later). The block header will look something like this:
The block’s header is then hashed with SHA256, producing an output that will serve as the block’s identifier. Now, having done all this, can we go ahead and relay the block to the rest of the network? If you recall the last post, the answer is no. We still need to produce a valid proof of work.
Proof of Work
The Bitcoin protocol sets a target value for a block header’s hash. The output must be less than the specified number. Another way of saying this is that the hash of the block header must start with a certain number of zeros. For example a valid hash may look like this:
Any block whose header does not produce a hash that is less than the target value will be rejected by the network. The target value is adjusted by the protocol every 2,016 blocks (roughly every two weeks) to try to maintain an average block time of 10 minutes.
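The “less than the target” check can be sketched in a few lines of Python. The names meets_target and EXAMPLE_TARGET are mine, the target value is made up for illustration, and the header is just a placeholder byte string. Real Bitcoin differs in the details (it hashes the header twice with SHA-256, for example), so this only shows why “below a target” and “starts with some zeros” are two ways of saying the same thing.

```python
import hashlib


def meets_target(header: bytes, target: int) -> bool:
    """Treat the 256-bit hash as one big number and compare it to the target."""
    digest = hashlib.sha256(header).digest()
    return int.from_bytes(digest, "big") < target


# Made-up target: any hash below 2**244 has its top 12 bits equal to zero,
# which is the same as starting with three zeros when written in hex.
EXAMPLE_TARGET = 1 << 244

print(meets_target(b"placeholder header", EXAMPLE_TARGET))
```

Lowering the target demands more leading zeros and so makes a valid hash rarer, which is how the protocol tunes the difficulty.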
So after you’ve hashed each transaction, hashed the outputs into a hash tree, found the Merkle Root, added it to the block header with the hash of the previous block and a nonce, hashed the header and produced an output that does not start with the correct number of zeros, then what?
This is where the nonce comes in. The nonce is simply a random number that is added to the block header for no other reason than to give us something to increment in an attempt to produce a valid hash. If your first attempt at hashing the header produces an invalid hash, you just add one to the nonce and rehash the header then check to see if that hash is valid.
For example, suppose we wanted to hash “Hello, world!” such that the output started with at least three zeros. You would concatenate it with a nonce, hash it and check the output to see if it’s valid. If it isn’t, you add one to the nonce and try again.
"Hello, world!0" => 1312af178c253f84028d480a6adc1e25e81caa44c749ec81976192e2ec934c64
"Hello, world!1" => e9afc424b79e4f6ab42d99c81156d3a17228d6e1eef4139be78e948a9332a7d8
"Hello, world!2" => ae37343a357a8297591625e7134cbea22f5928be8ca2a32aa475cf05fd4266b7
...
"Hello, world!4248" => 6e110d98b388e77e9c6f042ac6b497cec46660deef75a55ebc7cfdf65cc0b965
"Hello, world!4249" => c004190b822f1669cac8dc37e761cb73652e7832fb814565702245cf26ebb9e6
"Hello, world!4250" => 0000c3af42fc31103f1fdc0151fa747ff87349a4714df7cc52ea464e12dcd4e9
In this example it took 4,251 tries (nonces 0 through 4250) to find a nonce that, when concatenated with “Hello, world!”, produced an output starting with at least three zeros.
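The search itself is just a loop. Here is a minimal Python sketch of it, assuming the hashes in the example are plain SHA-256 of the text with the nonce appended as decimal digits. The function name find_nonce is my own.

```python
import hashlib
from itertools import count


def find_nonce(message: str, prefix: str = "000"):
    """Try nonces 0, 1, 2, ... until the hex digest starts with the prefix."""
    for nonce in count():
        digest = hashlib.sha256(f"{message}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest


nonce, digest = find_nonce("Hello, world!")
print(f'"Hello, world!{nonce}" => {digest}')
```

Notice there is no shortcut here: the only way to find a winning nonce is to keep trying, while anyone can check the winner with a single hash.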
This is Bitcoin mining in a nutshell. Notice the entire block of transactions isn’t rehashed with every attempt, just the header. This is essentially what Bitcoin mining is, just rehashing the block header, over, and over, and over, and over, until one miner in the network eventually produces a valid hash. When he does, he relays the block to the rest of the network. All other miners check his work and make sure it’s valid. If so, they add the block to their local copy of the block chain and move on to finding the next block.
In the old days miners just performed the SHA256 calculations on their laptop’s CPU. However, the more hashes that you can perform per second, the greater the probability that you will mine a block and earn the block reward. CPU mining quickly gave way to GPU mining (graphics processing units), which proved much more efficient at calculating hash functions. In today’s world, miners are using ASICs (application-specific integrated circuits) to mine Bitcoin. Basically, these are purpose-built computer chips that are designed to perform SHA256 calculations and do nothing else. It’s not uncommon to see miners calculating over one trillion hashes per second (a terahash). At present, the total hashing power in the network is about 700 terahashes per second and closing in on one petahash per second.
I should probably also note at this point that the first transaction in each block is referred to as the “coinbase” transaction. This is a transaction where the miner sends himself 25 bitcoins that have just been created “out of thin air”. Because each miner is sending these 25 bitcoins to his own address, the first transaction in each block will differ from miner to miner. Now remember the properties of a cryptographic hash function? If an input changes even in the slightest, the entire output changes. Since the hash of the coinbase transaction at the base of the hash tree is different for each miner, the entire hash tree including the Merkle root will be different for each miner. That means the nonce that is needed to produce a valid block will also be different for each miner.
This is the reason why the Merkle tree is employed after all. The transactions are represented in the header by the Merkle Root so that the entire block of transactions doesn’t need to be rehashed with each attempt (which would make the amount of time needed to hash a block vary with the number of transactions). Any change to a single transaction will cause an avalanche up the hash tree that will ultimately cause the hash of the block to change. Now let’s see how this protects the network from attack.
The hash of each block is included in the header of the next block as such:
If an attacker wants to alter or remove a transaction that is already in the block chain, the alteration will cause the hash of the transaction to change and spark off changes all the way up the hash tree to the Merkle Root. Given the probabilities, it is unlikely a header with the new Merkle Root will produce a valid hash (the proof of work). Hence, the attacker will need to rehash the block header over and over and spend a ton of time finding a correct nonce.
But suppose he does this. Can he just relay his fraudulent block to the network and hope that miners will replace the old block with his new one or, more realistically, that new users will download his fraudulent block? No. The reason is that the hash of each block is included in the header of the next block. If the attacker rehashes block number 100, this will cause the header of block 101 to change, requiring that block to be rehashed as well. A change to the hash of block 101 will cause the header of block 102 to change, and so on all the way through the block chain. Any attempt to alter a transaction already in the block chain requires not only the rehashing of the block containing the transaction, but all subsequent blocks as well. Depending on how deep in the chain the transaction is, it could take a single attacker weeks, months, or years to rehash the rest of the block chain. And as I mentioned in Part 1, as long as the attacker does not control a majority of the processing power in the network, the rest of the network will be adding new blocks to the main chain faster than the attacker can add blocks to his fraudulent chain, guaranteeing that the legitimate chain remains the longest and the attacker’s chain is ignored.
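The chaining can be sketched with a toy block chain in Python. Everything here is hypothetical and simplified: the header is just three fields joined into a string, the Merkle Roots are placeholder labels, and the proof of work is skipped entirely so that only the linking between blocks is shown.

```python
import hashlib


def header_hash(prev_hash: str, merkle_root: str, nonce: int) -> str:
    """Hash a toy header made of the three fields discussed in this post."""
    header = f"{prev_hash}|{merkle_root}|{nonce}".encode("utf-8")
    return hashlib.sha256(header).hexdigest()


def build_chain(merkle_roots):
    """Link blocks together: each header stores the hash of the previous one."""
    chain, prev = [], "00" * 32
    for root in merkle_roots:
        block = {"prev_hash": prev, "merkle_root": root, "nonce": 0}
        prev = header_hash(**block)
        chain.append(block)
    return chain


def first_broken_link(chain):
    """Return the index of the first block whose prev_hash no longer matches."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != header_hash(**chain[i - 1]):
            return i
    return None


chain = build_chain(["root-a", "root-b", "root-c", "root-d"])
print(first_broken_link(chain))           # None: every link matches
chain[1]["merkle_root"] = "tampered"      # alter a transaction in block 1
print(first_broken_link(chain))           # 2: the next block no longer links
```

To hide the tampering, the attacker would have to rewrite the prev_hash in block 2, which changes block 2’s hash and breaks block 3, and so on, redoing the proof of work for every one of those blocks.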
The only exception to the above rule is if the attacker simply gets lucky. As we noted, it takes the entire network an average of 10 minutes to find a valid block. It should take a single attacker with, say, 10% of the processing power in the network an average of 100 minutes to find a valid block (200 minutes at 5%, etc.), but those are just averages. It’s theoretically possible that an attacker could get lucky and mine a block in 1 minute when it’s supposed to take him an average of 100 minutes. If that block contained a double spend, it’s possible the attacker’s fraudulent transaction would get included in the block chain and his legitimate transaction rejected (the rest of the network would think the legitimate transaction is the double spend). The deeper a transaction is in the block chain, however, the more times in a row the attacker would need to get lucky and mine a block before the rest of the network in order to extend his chain longer than the main chain. From a probability standpoint, the chances of such an attack succeeding decrease exponentially with each subsequent block. It’s kind of like winning the lottery a number of times in a row. In the original white paper Satoshi Nakamoto calculated the probabilities that an attacker could get lucky and pull off a double spend. In the following table q is the fraction of the network’s processing power controlled by the attacker, z is the number of blocks the attacker needs to override, and P is the probability that he gets lucky and succeeds.
q=0.1
z=0 P=1.0000000
z=1 P=0.2045873
z=2 P=0.0509779
z=3 P=0.0131722
z=4 P=0.0034552
z=5 P=0.0009137
z=6 P=0.0002428
z=7 P=0.0000647
z=8 P=0.0000173
z=9 P=0.0000046
z=10 P=0.0000012

q=0.3
z=0 P=1.0000000
z=5 P=0.1773523
z=10 P=0.0416605
z=15 P=0.0101008
z=20 P=0.0024804
z=25 P=0.0006132
z=30 P=0.0001522
z=35 P=0.0000379
z=40 P=0.0000095
z=45 P=0.0000024
z=50 P=0.0000006

Solving for P less than 0.1%...
P < 0.001
q=0.10 z=5
q=0.15 z=8
q=0.20 z=11
q=0.25 z=15
q=0.30 z=24
q=0.35 z=41
q=0.40 z=89
q=0.45 z=340
Given the above probabilities we can see that an attacker with 10% of the network’s processing power would have a 0.024% chance of getting lucky and overriding six blocks. This is why it is usually recommended that, if you are selling something expensive, you wait until the transaction is six blocks deep (six confirmations in Bitcoin lingo) before actually handing over the merchandise.
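If you’d like to reproduce numbers like these, the white paper gives a short C routine for the calculation. Below is my own Python port of it, so the function name and layout are mine, while the arithmetic follows the paper’s routine: a Poisson-weighted sum over how far the attacker has progressed while the honest network mines z blocks.

```python
import math


def attacker_success_probability(q: float, z: int) -> float:
    """Probability that an attacker with share q ever catches up from z blocks behind."""
    p = 1.0 - q                # share of the honest network
    lam = z * (q / p)          # expected attacker progress while z honest blocks are mined
    total = 1.0
    for k in range(z + 1):
        # Poisson probability that the attacker has mined exactly k blocks so far.
        poisson = math.exp(-lam)
        for i in range(1, k + 1):
            poisson *= lam / i
        # Subtract the cases where he still fails to close the remaining gap.
        total -= poisson * (1 - (q / p) ** (z - k))
    return total


for z in range(11):
    print(f"q=0.1 z={z} P={attacker_success_probability(0.1, z):.7f}")
```

Calling it with q=0.1 and z=6 gives the 0.024% figure mentioned above.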
Ok, that’s it for now. This post got long in a hurry. I was going to talk a little bit about mining pools, but maybe I’ll save that for a Part 3. I hope you enjoyed these posts and learned something.