In this tutorial I will show you how you can use the frequency of letters to break the famous Caesar cipher. I hope it is easy to understand and a helpful introduction to basic cryptanalysis. If you find any mistakes or have questions, feel free to ask me. Finally, I would appreciate your feedback, which I will take into account for the next part.
What Is the Caesar Cipher?
The Caesar cipher is one of the best-known ciphers today, although it is one of the least secure. Every Latin letter in the plain text is shifted as many places to the right in the alphabet as the key indicates, wrapping around from "z" back to "a". For example:
Key: 2
Plain Text
abcdefghijklmnopqrstuvwxyz
Encrypted Text
CDEFGHIJKLMNOPQRSTUVWXYZAB
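The shift shown above can be sketched in a few lines of Python. This is only an illustration: the name caesar_encrypt is my own and not part of the tool discussed later, and the sketch assumes that only lowercase Latin letters are shifted while everything else is passed through unchanged.

```python
import string

ALPHABET = string.ascii_lowercase  # "abcdefghijklmnopqrstuvwxyz"


def caesar_encrypt(plain_text, key):
    """Shift every lowercase Latin letter `key` places to the right."""
    result = []
    for c in plain_text:
        if c in ALPHABET:
            # Wrap around with modulo 26 so that "z" continues at "a".
            shifted = (ALPHABET.index(c) + key) % 26
            result.append(ALPHABET[shifted])
        else:
            # Spaces, digits and punctuation are left as they are.
            result.append(c)
    return "".join(result)


# Reproduces the table above (upper-cased to match its encrypted row).
print(caesar_encrypt("abcdefghijklmnopqrstuvwxyz", 2).upper())
# prints CDEFGHIJKLMNOPQRSTUVWXYZAB
```

Decrypting is the same operation with the negated key, which is why the key is all an attacker needs.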
Why Is the Caesar Cipher Not Useful for Serious Encryption?
You may now be asking why this ingenious cipher is not as widely used today as it deserves. It is not hard to brute-force, because there are only a few keys: 25, to be precise. That alone is bad enough for this "cipher", but we can add another technique for breaking it. Frequency analysis uses the frequency of letters in a language to break ciphers that encrypt one letter at a time. We will come to ciphers that work on pairs of letters later.
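To make the "only 25 keys" point concrete, here is a minimal brute-force sketch. The helper names shift and brute_force are hypothetical and chosen just for this example, and the sketch assumes lowercase text. It simply produces all 25 candidate plain texts and leaves it to the reader to spot the readable one.

```python
def shift(text, key):
    """Shift lowercase letters by `key` places; a negative key shifts left."""
    out = []
    for c in text:
        if "a" <= c <= "z":
            # 97 is ord("a"); modulo 26 wraps around the alphabet.
            out.append(chr((ord(c) - 97 + key) % 26 + 97))
        else:
            out.append(c)
    return "".join(out)


def brute_force(cipher_text):
    """Return (key, candidate plain text) for each of the 25 possible keys."""
    return [(key, shift(cipher_text, -key)) for key in range(1, 26)]


for key, candidate in brute_force("jgnnq yqtnf"):
    print(key, candidate)
```

For this input the line for key 2 reads "hello world"; every other line is gibberish.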
Breaking the Caesar Cipher / Frequency Analysis
I think plain brute-forcing is not the most elegant approach here, so we will guide the brute force with frequency analysis. In every language, each letter occurs with a different probability. For example, in English "e" is the most used letter, followed by "t". So we won't just try every key; instead we look for the most frequent letter in the encrypted text and assume that it was the "e" in the plain text. That assumption gives us a key, which we use to decrypt the text. If the result is not readable, we repeat this with the second most frequent letter, and so on...
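The first step of that idea, deriving a key from the single most frequent letter, might look like the following sketch. The names shift and guess_key are hypothetical, the sample sentence was picked because it contains an unusually large number of "e"s, and the code assumes lowercase English text.

```python
from collections import Counter


def shift(text, key):
    """Shift lowercase letters by `key` places; a negative key shifts left."""
    return "".join(
        chr((ord(c) - 97 + key) % 26 + 97) if "a" <= c <= "z" else c
        for c in text
    )


def guess_key(cipher_text, assumed_plain="e"):
    """Guess the key by assuming the most frequent letter was `assumed_plain`."""
    counts = Counter(c for c in cipher_text if "a" <= c <= "z")
    if not counts:
        raise ValueError("cipher text contains no lowercase letters")
    most_common_letter, _ = counts.most_common(1)[0]
    # The distance between the observed letter and "e" is the key.
    return (ord(most_common_letter) - ord(assumed_plain)) % 26


cipher_text = shift("meet me here when the seven geese sleep", 3)
key = guess_key(cipher_text)
print(key, shift(cipher_text, -key))
```

On a short or unusual text the most frequent letter may not be "e" at all, which is why the guess has to fall back to the next most frequent letter.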
Let Us See an Implementation
You can find my Python code here. I will try to explain it step by step, but a fundamental Python knowledge is required.
import os
I just import this for the following clearing function.
def clear():
    os.system('cls' if os.name=='nt' else 'clear')
This function is used to clear the terminal. It is purely cosmetic, so you can leave it out if you don't care how the output looks ;).
def pause():
    input("\nPress ENTER to continue!")
I use this function to shorten my code a bit.
I skip the following two functions because I explain them later on. So we come to the entry point. I will extend this tool in the following tutorials, so I added a little menu for controlling different functions.
while True:
...
This is just the menu, so I skip to the interesting part.
if eingabe == "1":
...
This part takes the encrypted text and calls the decrypt function.
Now we come to the decryption part in the caesarDecrypt(text) function.
# One counter per letter: index 0 is "a", index 25 is "z".
frequencyLetters = [0 for i in range(26)]
for c in text:
    ucode = ord(c)
    ucode -= 97
    # Only lowercase Latin letters end up in the range 0..25.
    if ucode < 26 and ucode >= 0:
        frequencyLetters[ucode] += 1
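Here we save the frequencies of the letters in a list. I use ord() to convert each letter to its Unicode code point. Subtracting 97, the code point of "a", maps "a" to index 0 and "z" to index 25, so only lowercase letters are counted. Play around with Unicode a bit if this is not clear yet.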
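If you want to play with the code points yourself, a tiny experiment like this one may help. The helper letter_index is hypothetical and only mirrors the range check from the counting loop.

```python
def letter_index(c):
    """Map "a".."z" to 0..25; return None for any other character."""
    ucode = ord(c) - 97  # ord("a") is 97
    if 0 <= ucode < 26:
        return ucode
    return None


# Show the code point and the resulting list index for a few characters.
for c in ["a", "e", "z", "A", " "]:
    print(repr(c), ord(c), letter_index(c))
```

Uppercase letters and spaces fall outside the range 0..25, so the counting loop silently ignores them. Lower-casing the encrypted text first is one way to count them as well.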
for i in range(26):
...
In this loop I brute-force the encrypted text with the help of frequency analysis. It is self-explanatory, so I will leave you to read through it on your own.
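The body of that loop is not reproduced here, so the following is not the original code. It is only one possible shape for such a loop, under the assumption that it tries the letters from most to least frequent and treats each one as the encrypted "e". The names shift and candidates_by_frequency are hypothetical.

```python
def shift(text, key):
    """Shift lowercase letters by `key` places; a negative key shifts left."""
    return "".join(
        chr((ord(c) - 97 + key) % 26 + 97) if "a" <= c <= "z" else c
        for c in text
    )


def candidates_by_frequency(cipher_text):
    """Yield (key, candidate) pairs, starting with the most plausible key."""
    counts = [0] * 26
    for c in cipher_text:
        if "a" <= c <= "z":
            counts[ord(c) - 97] += 1
    # Letter indices sorted from most to least frequent in the cipher text.
    ranked = sorted(range(26), key=lambda i: counts[i], reverse=True)
    for i in ranked:
        # Assume this cipher letter stands for "e" (index 4) and derive the key.
        key = (i - (ord("e") - 97)) % 26
        yield key, shift(cipher_text, -key)


cipher_text = shift("meet me here when the seven geese sleep", 5)
for n, (key, candidate) in enumerate(candidates_by_frequency(cipher_text)):
    print(key, candidate)
    if n == 2:  # the first three guesses are enough for a demo
        break
```

In the worst case this still tries every shift, exactly like a plain brute force. The only difference is the order, so on typical English text the readable result tends to show up among the first few guesses.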
Conclusion
I think it is not hard to see why this cipher is not useful, but it is a great example of frequency analysis in use. Keep in mind that frequency analysis only works well on a long encrypted text, because short texts are more likely to deviate from the typical letter distribution. Finally, I apologize for the poor layout of the code here, but together with the pastebin code it should not be too bad. If you want to try out this little decryption tool, you can use sites like this for encrypting your plain text.
17 Comments
Interesting post. Looking forward to part 2 (Vigenère cipher?).
You should cover ROT13. They use it at the NSA all the time. Although, I've heard ROT26 is twice as secure...
Oaktree:
Are you being facetious? NSA uses ROT13????
Yeah, our government spares no cost and integrated ROT26 as default encryption to confuse every hacker. No one will expect THAT.
I'm hoping you guys are trying to be funny. The NSA does NOT use ROT. No respectable organization uses ROT. It is a 2000 year old technology and takes less than a millisecond to crack.
Well I've heard it's an intraoffice thing for nonsensitive communication - just for fun.
We are just kidding^^ I thought I made that clear with the caesar decryption post above ;)
Ok. Be careful. Some people may not understand that you are being facetious.
Very funny , nice one !
Not bad! Here's a simple challenge for those who want to practice decrypting.
Brute forcing 25 times is probably a faster method of cryptanalysis.
I agree
I would also agree, but if you want to be elegant, in that you derive the key, this is how you'd do it.
It's too overkill and time consuming for such a simple task. Besides, what happens if the cipher text isn't large enough for a proper distribution of characters to apply a frequency analysis? What happens if the user intentionally avoids using the letter 'e'?
Very true. I have a brute-decrypt script for Caesar Ciphertext.
I wanted to use this simple cipher for explaining how frequency analysis works because we'll need it for other ciphers later. Explaining to somebody that he just has to try every key isn't that ingenious ;)
That's fine and all, but brute forcing really is a better technique in this scenario since one of the weaknesses is the lack of possible keys. You could then address the issues of brute forcing in real-world situations.
If you wanted to explain frequency analysis, just do it like you would here but in a future separate article or one with a more relevant cipher where it would make more sense to use it.
Just a bit of advice.