Inside Bitcoin - Part 2 - Cryptographic Hashes
If you haven't read the last article, feel free to stop by and give it a read, but skipping it will not, for the most part, hurt your understanding of this one.
Today we are going to explore some of the inner workings of Bitcoin. As we learned in the last article, Bitcoin can be an effective method of anonymous transaction if certain conditions are met. That being said, you should not trust a protocol that you do not understand. Absolute anonymity is impossible, yet sufficient anonymity can be attained by taking certain measures to protect your flow of information, and these measures are nearly impossible to implement if you are technologically illiterate. Therefore, let us dive into the subject matter and learn about Bitcoin!
Below is a basic explanation of hashing and hash functions. If you already know enough about hashing, skip the next few paragraphs until you arrive at the Bitcoin picture.
One of the foundations upon which Bitcoin stands is cryptography, or more specifically, cryptographic hashing. Standard cryptography (encryption) concerns itself with obfuscating a message, a string of characters, in such a way that we can recover the original message by using a special key. With Bitcoin, standard cryptography is important in many aspects, but it is not nearly as essential to the underlying protocol as cryptographic hashing.
Cryptographic hashing is a process where a function takes input data of any size and returns a scrambled block of data called a hash. However, with hashing we do not want to get the original string back. The hash is always exactly the same size, no matter the input data, the same input always produces the same hash, and the original data cannot be recovered from it the way it can with standard cryptography. So how could cryptographic hashing ever be useful?
Following is an example which I will then explain:
We have two files, file1 and file2. Within the files are two very similar strings. The only difference is the capital T in file2. However, this creates drastically different output hashes, as seen above. This drastic difference is called the avalanche effect: changing even a single bit of the input cascades through the function and produces a very different output. So how could this ever be of any use to anybody?
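To make the avalanche effect concrete, here is a minimal Python sketch using the standard hashlib module. The two strings are stand-ins I made up for the contents of file1 and file2 (the originals are not reproduced here), and SHA-256 is an arbitrary choice of hash function. Conveniently, a lowercase t and a capital T differ by exactly one bit in ASCII.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

# Hypothetical file contents; only the case of the first letter differs.
file1 = b"this is a test\n"
file2 = b"This is a test\n"

d1 = hashlib.sha256(file1).digest()
d2 = hashlib.sha256(file2).digest()

print("input bits that differ: ", bit_difference(file1, file2))
print("file1:", d1.hex())
print("file2:", d2.hex())
print("output bits that differ:", bit_difference(d1, d2), "of", len(d1) * 8)

# The output size does not depend on the input size.
big = hashlib.sha256(b"x" * 1_000_000).digest()
print("digest sizes in bytes:", len(d1), len(big))
```

For a well-behaved hash function you should expect roughly half of the output bits to flip, even though only one input bit changed.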
Well, hashes are particularly useful for confirming file integrity after a download. Sites that offer downloads usually place an MD5 hash or a SHA1 hash near the download link. This is called a checksum, which is basically just the official hash of the official file on the site. After you download the file, you hash it yourself and compare the result against the checksum from the download site. Because of the avalanche effect, if even a single byte is off, the output hash will be very different. Therefore, if you get a different hash from the one on the official site, you know that something went wrong with the data you downloaded.
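The checksum comparison described above might look like the following sketch. The function names file_digest and matches_checksum are my own, the file is a throwaway temporary file standing in for a real download, and the "published" checksum is computed locally purely for the demo. On a real site you would copy the checksum from the download page and pass whichever algorithm the site uses.

```python
import hashlib
import os
import tempfile

def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 65536) -> str:
    """Hash a file in chunks so that large downloads need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_checksum(path: str, published: str, algorithm: str = "sha256") -> bool:
    """Compare the file's hash against the checksum copied from the site."""
    return file_digest(path, algorithm) == published.strip().lower()

# Demo: a temporary file plays the role of the downloaded file.
content = b"pretend this is a disk image"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(content)
    path = tmp.name

published = hashlib.sha256(content).hexdigest()  # stand-in for the site's checksum
ok = matches_checksum(path, published)

# Simulate corruption en route by appending a single byte.
with open(path, "ab") as f:
    f.write(b"!")
bad = matches_checksum(path, published)

os.remove(path)
print("intact file matches:   ", ok)
print("corrupted file matches:", bad)
```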
Below is an example from https://www.kali.org/site, our favorite hacking distro. You can see the checksums on the right. By confirming file integrity, we know that our files will run as planned, and we save ourselves trouble if the data was corrupted en route!
So that's all that MD5 and SHA1 mean. They are just different methods (functions) of creating a hash (or sum). MD5 stands for Message Digest 5, created to replace the earlier MD4, and SHA1 stands for Secure Hash Algorithm 1. There are many others, some of which you may have even heard of. So now whenever you see MD5 or SHA1, you will know exactly what they mean!
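As a quick illustration that these really are just different functions applied to the same data, the sketch below runs one message through MD5, SHA1, and SHA-256 using Python's hashlib. The message is an arbitrary example of mine. Note that some locked-down Python builds disable MD5, in which case that line would raise an error.

```python
import hashlib

message = b"I am sending 5 coins to Jim"  # arbitrary example input

sizes = {}
for name in ("md5", "sha1", "sha256"):
    h = hashlib.new(name, message)
    sizes[name] = h.digest_size * 8  # digest_size is in bytes
    print(f"{name:>6}: {sizes[name]:>3} bits  {h.hexdigest()}")
```

Each function has its own fixed output size: 128 bits for MD5, 160 for SHA1, and 256 for SHA-256.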
Now to apply this new knowledge to Bitcoin...
Last article, we learned briefly that Bitcoin is made possible by something called the blockchain. We made reference to the fact that the blockchain acts like a bank ledger, replacing what would be the central bank of Bitcoin. Instead of having a centralized institution to govern the currency, Bitcoin is maintained by every user on the peer-to-peer network.
The most obvious reason we use the blockchain is to prevent double spending. Considering the fact that our digital currency is literally nothing but a message that is practically equivalent to the statement "I am sending 5 Bitcoins to Jim", how could we ever prevent scamming?
I have 5 Bitcoins in my account, but I send the same message to both Bob and Jim! How could we ever solve this fraudulent counterfeiting of currency?
This is why a traditional currency needs a central bank: it governs the printing of legitimate currency and prevents fraudulent transactions that would threaten the stability and integrity of the economy. The bank keeps ledgers of all transactions and maintains account balances to ensure that I cannot send the same 5 dollars to two people. This is where the blockchain comes in...
Whenever we send a Bitcoin transaction, we must broadcast this transaction not just to our recipient but to the entire network. The network then verifies that the transaction is legitimate and appends it to the end of the blockchain. Because there is only one blockchain used by everyone on the Bitcoin network, only one of my two conflicting transactions can be verified. Remember, the currency is transparent and every wallet balance is visible to everybody, so the network will see that I tried to send the same Bitcoins to two people and will reject the second attempt.
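The double-spend check can be sketched with a toy shared ledger. This is a deliberate simplification under my own assumptions: the ToyLedger class and its submit method are hypothetical, there is no network and no signatures, and it tracks plain account balances, whereas the real protocol tracks unspent transaction outputs.

```python
from dataclasses import dataclass, field

@dataclass
class ToyLedger:
    """A single shared record of balances and accepted transactions."""
    balances: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def submit(self, sender: str, recipient: str, amount: int) -> bool:
        """Accept the transaction only if the sender can actually afford it."""
        if amount <= 0 or self.balances.get(sender, 0) < amount:
            return False  # rejected: these coins were already spent
        self.balances[sender] -= amount
        self.balances[recipient] = self.balances.get(recipient, 0) + amount
        self.history.append((sender, recipient, amount))
        return True

ledger = ToyLedger(balances={"me": 5})
first = ledger.submit("me", "Jim", 5)   # accepted
second = ledger.submit("me", "Bob", 5)  # rejected, balance is now 0
print(first, second, ledger.balances)
```

Because everyone consults the same history, whichever transaction is recorded first wins and the duplicate is refused.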
An attacker could still duplicate money with more complicated attacks, such as flooding the network with verification. Also, we need a method to create new bitcoins. This all begins with a concept called proof-of-work.
When I send out my transaction, it will not be verified until someone completes a very difficult, computationally expensive mathematical problem. Basically, it takes a computer a long time to solve and costs its owner in electricity. Once a computer solves the problem, it broadcasts the solution to the network, and every participating node can quickly confirm that it is correct (the problem is hard to solve, but a proposed answer is easy to verify) and add the result to its copy of the blockchain. This process is called mining, and it rewards new Bitcoins to the user who solved the problem. But what is this mathematical problem?
The problem involves hashing the header of your transaction (in the real protocol, the header of a block that bundles many transactions; we use a single transaction here to keep things simple). Say I sent "I am sending 5 coins to Jim"; then the problem is to find a hash of the header string that is small enough. Bitcoin has a global target, which is the maximum value the hash is allowed to have when read as a number. If you produce a hash of the header that is equal to or lower than the target, you have successfully solved the problem.
Using the SHA-256 algorithm (our function, f), we hash our header "I am sending this much to Jim" (h) together with a number called a nonce (n). We keep changing the nonce and rerunning the function until f(h, n) gives us a hash value equal to or less than the target. However, it is not easy to find a successful nonce: the only known strategy is trial and error, which takes a long time and a lot of computational power. The target is also not fixed. The network adjusts it periodically so that a solution is found roughly every ten minutes on average, which means the problem gets harder as more mining power joins the network. Also, it is important to know that the reward for mining a block shrinks over time: it started at 50 Bitcoins, was 25 at the time of writing, and halves roughly every four years.
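The nonce search can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the header is a plain string rather than a real block header, the nonce is appended as 8 bytes, a single round of SHA-256 is used, and the toy target of 2**240 is astronomically easier than anything on the real network (about one hash in 65,536 succeeds). The function names hash_value and mine are mine.

```python
import hashlib
from typing import Optional

def hash_value(header: bytes, nonce: int) -> int:
    """f(h, n): hash the header together with the nonce, read as a big integer."""
    data = header + nonce.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def mine(header: bytes, target: int, max_nonce: int = 10_000_000) -> Optional[int]:
    """Try nonces one by one until the hash value is at or below the target."""
    for nonce in range(max_nonce):
        if hash_value(header, nonce) <= target:
            return nonce
    return None  # gave up without finding a solution

header = b"I am sending 5 coins to Jim"
target = 2 ** 240  # toy target, far easier than the real one

nonce = mine(header, target)
print("winning nonce:", nonce)
print("hash value:   ", hex(hash_value(header, nonce)))
print("target:       ", hex(target))
```

Lowering the target makes the search longer: halving it doubles the expected number of attempts, which is the knob the network turns to control difficulty.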
Once we find a successful value, we broadcast our find to the network. Though it took us a long time to find the correct value, other computers can check that we are correct very quickly. Once we find this value and make our broadcast, the transaction becomes verified as the other nodes in the network add that transaction to their blockchain. We are rewarded with Bitcoins for verifying the transaction, our friend's transaction goes through successfully, and the network updates its blockchain with this new info. How exciting!
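The gap between finding and checking a solution can be shown with a small sketch. As before, the header format, the 8-byte nonce encoding, the single SHA-256 round, and the toy target are my own simplifying assumptions, and hash_value and verify are hypothetical names.

```python
import hashlib

def hash_value(header: bytes, nonce: int) -> int:
    """Hash the header together with the nonce and read the result as an integer."""
    data = header + nonce.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def verify(header: bytes, nonce: int, target: int) -> bool:
    """Checking a claimed solution costs exactly one hash."""
    return hash_value(header, nonce) <= target

header = b"I am sending 5 coins to Jim"
target = 2 ** 244  # toy target: roughly 1 hash in 4096 succeeds

# The miner's side: brute-force search, one hash per attempt.
nonce = 0
while not verify(header, nonce, target):
    nonce += 1
attempts = nonce + 1

# Everyone else's side: a single hash confirms the claim.
print("hashes needed to find the nonce:", attempts)
print("hashes needed to check it: 1 ->", verify(header, nonce, target))
```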
People might wonder, what happens if two transactions are verified at the same time? Two computers each verifying one half of a double-spend attempt is very unlikely, but it is not impossible, especially since it may take a while for a solution to reach the whole network. When this happens, a fork is created in the blockchain. That is a more advanced topic, and we may go into it in a later article.
So now we understand how cryptographic hashing plays an integral role in our Bitcoin protocol! That is all for now, please give tips as to how I may improve my articles in the comments. Thanks for reading, keep learning!