In this tutorial I will show you how to use letter frequencies to break the famous Caesar cipher. I hope it is easy to follow and a helpful first contact with basic cryptanalysis. If you find any mistakes or have questions, feel free to ask me. I would also appreciate your feedback, which I will take into account for the next part.
The Caesar cipher is one of the best-known ciphers today, although it is also one of the least secure. Every Latin letter in the plain text is shifted as many places to the right in the alphabet as the key indicates, wrapping around from "z" to "a". For example, with a key of 3, "hello" becomes "khoor".
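To make the shifting concrete, here is a minimal sketch of the encryption step. The name caesar_encrypt is hypothetical and not part of the tool discussed below, and the sketch assumes that only the lowercase letters "a" to "z" are shifted while everything else is left unchanged.

```python
def caesar_encrypt(text, key):
    """Shift every lowercase Latin letter `key` places to the right, wrapping after 'z'."""
    result = []
    for c in text:
        if "a" <= c <= "z":
            # Map 'a'..'z' to 0..25, add the key, wrap around, map back.
            result.append(chr((ord(c) - 97 + key) % 26 + 97))
        else:
            # Spaces, digits and punctuation are copied unchanged.
            result.append(c)
    return "".join(result)


print(caesar_encrypt("hello world", 3))  # khoor zruog
```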
You may now be wondering why this ingenious cipher is not as widely used today as it deserves. The reason is that it is easy to brute-force, because there are only a few keys: 25, to be exact. Although that alone is enough to break this "cipher", we can add another technique. Frequency analysis uses the frequency of letters in a language to break ciphers that encrypt one letter at a time. We will come to ciphers that work on pairs of letters later.
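A plain brute-force over the 25 keys could look roughly like the sketch below. The names caesar_shift and brute_force are made up for this illustration, and it again assumes lowercase text only.

```python
def caesar_shift(text, key):
    """Shift lowercase letters by `key` places (negative keys shift to the left)."""
    return "".join(
        chr((ord(c) - 97 + key) % 26 + 97) if "a" <= c <= "z" else c
        for c in text
    )


def brute_force(ciphertext):
    """Return every possible decryption, indexed by the key that was tried."""
    return {key: caesar_shift(ciphertext, -key) for key in range(1, 26)}


for key, candidate in brute_force("khoor zruog").items():
    print(key, candidate)  # the line for key 3 reads "hello world"
```

A human still has to read through the 25 lines and pick the one that makes sense.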
Plain brute-forcing is not the most elegant approach, so we will guide the brute-force with frequency analysis. In every language each letter occurs with a different probability. In English, for example, "e" is the most frequently used letter, followed by "t". So we won't just try every key in order; instead we look for the most frequent letter in the encrypted text and assume that it was the "e" in the plain text. We then use this assumption to derive the key and decrypt the text. If the result is not readable, we do the same with the second most frequent letter, and so on...
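The first guess described above can be sketched as follows. The helper names guess_key and caesar_decrypt are assumptions for this example, and collections.Counter is used here for brevity even though the tool itself counts with a plain list.

```python
from collections import Counter


def guess_key(ciphertext, assumed_plain="e"):
    """Guess the key by assuming the most frequent cipher letter stands for `assumed_plain`."""
    counts = Counter(c for c in ciphertext if "a" <= c <= "z")
    if not counts:
        raise ValueError("ciphertext contains no lowercase letters")
    most_frequent = counts.most_common(1)[0][0]
    # The key is the distance from the assumed plain letter to the cipher letter.
    return (ord(most_frequent) - ord(assumed_plain)) % 26


def caesar_decrypt(ciphertext, key):
    """Shift every lowercase letter `key` places back to the left."""
    return "".join(
        chr((ord(c) - 97 - key) % 26 + 97) if "a" <= c <= "z" else c
        for c in ciphertext
    )


ciphertext = "phhw ph khuh dw hohyhq"  # "meet me here at eleven" shifted by 3
key = guess_key(ciphertext)
print(key, caesar_decrypt(ciphertext, key))
```

In this sample "h" is the most frequent cipher letter, so the guess is that "e" was shifted to "h", which gives a key of 3.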
You can find my Python code here. I will try to explain it step by step, but basic Python knowledge is required.
import os
I import this only for the clearing function that follows.
os.system('cls' if os.name=='nt' else 'clear')
This function clears the terminal. It is useless unless you care about the output looking good ;).
input("\nPress ENTER to continue!")
I use this function to shorten my code a bit.
I will skip the following two functions for now because I explain them later on. That brings us to the entry point. I will extend this tool in the following tutorials, so I added a small menu for selecting the different functions.
This is just the menu, so I will skip ahead to the interesting part.
if eingabe == "1":
This part takes the encrypted text and calls the decrypt function.
Now we come to the decryption part in the caesarDecrypt(text) function.
# one counter per letter, index 0 = "a" ... index 25 = "z"
frequencyLetters = [0 for i in range(26)]
for c in text:
    ucode = ord(c)
    ucode -= 97  # 97 is the code point of "a"
    # only count lowercase letters "a" to "z"
    if ucode < 26 and ucode >= 0:
        frequencyLetters[ucode] += 1
Here we store the frequencies of the letters in a list. I use ord() to convert the letters to numbers. Subtracting 97, the code point of "a", maps "a" to "z" onto the indices 0 to 25; play around with Unicode a bit if that is unclear.
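If you want to experiment with the counting step on its own, the sketch below wraps it in a function and adds a ranking of the letters by frequency. The names count_letters and rank_letters are hypothetical and do not appear in the tool.

```python
def count_letters(text):
    """Count how often each lowercase letter occurs; index 0 is 'a', index 25 is 'z'."""
    counts = [0] * 26
    for c in text:
        index = ord(c) - ord("a")  # ord("a") is 97
        if 0 <= index < 26:
            counts[index] += 1
    return counts


def rank_letters(counts):
    """Return the letters that occur at least once, most frequent first."""
    order = sorted(range(26), key=lambda i: counts[i], reverse=True)
    return [chr(i + 97) for i in order if counts[i] > 0]


counts = count_letters("phhw ph khuh dw hohyhq")
print(counts[ord("h") - 97])      # how often "h" occurs
print(rank_letters(counts)[:3])   # the three most frequent letters
```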
for i in range(26):
In the following part I brute-force the encrypted text with the help of frequency analysis. It is self-explanatory, so I will leave you to read through it on your own.
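For readers who prefer a compact version, the whole idea can be sketched as a generator that tries the keys in order of letter frequency. This is not the code from the tool: the name candidates_by_frequency is an assumption, and the sketch simply stops after the first candidate where a real tool would ask the user whether the text is readable.

```python
def candidates_by_frequency(ciphertext, assumed_plain="e"):
    """Yield (key, plaintext) pairs, trying the most frequent cipher letter first."""
    counts = [0] * 26
    for c in ciphertext:
        index = ord(c) - 97
        if 0 <= index < 26:
            counts[index] += 1
    # Letter indices sorted by how often they occur, most frequent first.
    order = sorted(range(26), key=lambda i: counts[i], reverse=True)
    for i in order:
        # Assume cipher letter i stands for `assumed_plain` and derive the key.
        key = (i - (ord(assumed_plain) - 97)) % 26
        plaintext = "".join(
            chr((ord(c) - 97 - key) % 26 + 97) if "a" <= c <= "z" else c
            for c in ciphertext
        )
        yield key, plaintext


for key, plaintext in candidates_by_frequency("phhw ph khuh dw hohyhq"):
    print(key, plaintext)
    break  # a real tool would ask here whether the text is readable
```

Because all 26 letter positions are tried, every possible key comes up eventually; the frequency ranking only decides which ones are tried first.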
I think it is easy to see why this cipher is not secure, but it is a great example of frequency analysis in use. Keep in mind that frequency analysis is only reliable on a long encrypted text, because short texts are more likely to deviate from the typical letter frequencies of the language. Finally, I apologize for the poor layout of the code here, but together with the pastebin code it should be readable enough. If you want to try out this little decryption tool, you can use sites like this for encrypting your plain text.