Developers creating login systems know better than to store passwords in plain text. Instead, they usually store a hash of each password, so that a stolen database doesn't directly expose the credentials. But not all hashes are created equal. Some are more vulnerable than others, and a little Python can be used to brute-force weak hashes and recover the passwords they were created from.
Hackers often steal entire databases of user login and password data, and because of this, hashes are the preferred way to store sensitive information like passwords.
Hashes are different from encryption because a hash does not contain the original data in any recoverable form. Instead, the number that makes up a hash is the result of a calculation run on whatever it is you're hashing, be it a password or an entire file. This is used to ensure that the file you are downloading matches the file you're intending to download or to confirm the password the user entered matches the password they signed up with.
Whatever the size of the file or password you're hashing, hash functions like SHA-1 or MD5 split the data into fixed-size blocks and run a complex calculation on it block by block until they reach a final value. This value is a very long number, designed so that two different inputs are extremely unlikely to produce the same one. That lets you verify that one file matches another by comparing the hash values. If the hash value is different, then something about the file has been changed.
This is great because if the user enters any password other than the one they chose, the hash value will be completely different. Because of this, the developer just needs to store the hash, because any time the user needs to log in, they can just enter the password to create a new hash to compare to the stored one.
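To make that login check concrete, here's a minimal sketch of the idea. The stored_hashes table, the check_login function, and the user "alice" are all hypothetical names I made up for illustration, and plain SHA-1 is used only to mirror the discussion, not as a recommendation.

```python
import hashlib

def sha1_hex(password: str) -> str:
    """Return the SHA-1 hex digest of a password string."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()

# Hypothetical user table: only the hash is stored, never the password.
stored_hashes = {"alice": sha1_hex("nullbyte")}

def check_login(username: str, attempt: str) -> bool:
    """Hash the login attempt and compare it with the stored hash."""
    stored = stored_hashes.get(username)
    return stored is not None and sha1_hex(attempt) == stored

print(check_login("alice", "nullbyte"))  # the password she signed up with
print(check_login("alice", "nullbyt3"))  # anything else hashes differently
```

The server never needs the original password after signup. It only needs to hash whatever the user types and compare the two values.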
As an example, I hashed nullbyte to the SHA-1 value 32c0ced56f1fe08583bdb079d85a35a81995018c. You can create your own SHA-1 hash at sha1-online.com to see for yourself what this looks like.
Unfortunately for developers, not all hashes are created equal for storing passwords. For hashes like SHA-1, there are a few problems that make saving passwords with SHA-1 a less-than-ideal solution.
To highlight one, every time you hash the same word with SHA-1, it generates the exact same hash. While this is by design, you can simply take a huge number of guesses and hash them all into SHA-1, and then compare the hashes rapidly to get the password the SHA-1 hash was derived from. Because SHA-1 is designed to be fast, this process takes a very short amount of time, which makes it even easier to brute-force.
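Here's a rough sketch of why that determinism hurts. Because the same word always produces the same SHA-1 hash, a list of guesses can be hashed once and turned into a lookup table. The four guesses and the build_lookup helper are placeholders of my own, and a real list would be far longer.

```python
import hashlib

def build_lookup(guesses):
    """Map each guess's SHA-1 hex digest back to the guess itself."""
    return {hashlib.sha1(g.encode("utf-8")).hexdigest(): g for g in guesses}

# Illustrative guesses only.
guesses = ["123456", "password", "nullbyte", "letmein"]
lookup = build_lookup(guesses)

# Pretend this hash came from a leaked database.
leaked = hashlib.sha1(b"nullbyte").hexdigest()

print(lookup.get(leaked))    # the matching guess, if it was in the list
print(lookup.get("0" * 40))  # None when no guess produces that hash
```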
There are some solutions to this, and one of the most popular is adding a salt. A salt is a string of text that you can add to the password before hashing it. An example would be to add the word salt to the password nullbyte. While we know the SHA-1 value of nullbyte from above, the hash of nullbytesalt or saltnullbyte would be totally different. This helps, but if the salt is not per user, then figuring out the salt is not too difficult and you're back to the same problem.
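A quick sketch of the salt example from this paragraph, using the word salt as a single shared salt. SHARED_SALT and the two "users" are names I picked for illustration.

```python
import hashlib

def sha1_hex(text: str) -> str:
    """Return the SHA-1 hex digest of a string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

SHARED_SALT = "salt"  # the same salt reused for every password

plain = sha1_hex("nullbyte")
salted_suffix = sha1_hex("nullbyte" + SHARED_SALT)  # nullbytesalt
salted_prefix = sha1_hex(SHARED_SALT + "nullbyte")  # saltnullbyte

# The three digests are all different from one another.
print(len({plain, salted_suffix, salted_prefix}) == 3)

# Two users with the same password still end up with the same hash,
# which is the weakness of a salt that is not per user.
user_a = sha1_hex("nullbyte" + SHARED_SALT)
user_b = sha1_hex("nullbyte" + SHARED_SALT)
print(user_a == user_b)
```

Once an attacker learns the shared salt, they can add it to every guess and the attack works exactly as before.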
A better solution is to add a random salt, and there is a hashing algorithm that was created for storing passwords with exactly this in mind.
Bcrypt is not only deliberately slow to foil brute-forcing, it also adds a random salt to each hash it generates. As a result, no two bcrypt hashes will be the same, even if they're made from the exact same password. To check a guess against a bcrypt hash, you instead have to use a bcrypt function that takes the password guess and the hash as arguments and returns whether or not they match.
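Bcrypt is a third-party package, so here's a sketch of the same two ideas (a fresh random salt per hash and a deliberately slow calculation) using PBKDF2 from Python's standard hashlib instead. To be clear, this is a different algorithm swapped in so nothing extra needs installing. It is not bcrypt, and the ITERATIONS value and the function names are my own assumptions for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, chosen only for illustration

def hash_password(password: str):
    """Return (salt, derived_key) using a fresh random salt each time."""
    salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key

def verify_password(password: str, salt: bytes, key: bytes) -> bool:
    """Re-derive the key with the stored salt and compare the results."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, key)

salt1, key1 = hash_password("nullbyte")
salt2, key2 = hash_password("nullbyte")

print(key1 != key2)                              # same password, different hashes
print(verify_password("nullbyte", salt1, key1))  # correct guess
print(verify_password("password", salt1, key1))  # wrong guess
```

Because the salt is random, you can't hash a guess once and compare it against every stored value. Each guess has to be re-derived against each user's own salt, and each derivation is slow on purpose.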
To show off how these different hashes work, I wrote some Python to turn any password into a SHA-1, MD5, and bcrypt hash.
import hashlib, bcrypt

# Demonstrates the difference between SHA-1, MD5, and bcrypt hashing
password = input("Input the password to hash\n>")

print("\nSHA1:\n")
for i in range(3):
    setpass = bytes(password, 'utf-8')
    hash_object = hashlib.sha1(setpass)
    guess_pw = hash_object.hexdigest()
    print(guess_pw)

print("\nMD5:\n")
for i in range(3):
    setpass = bytes(password, 'utf-8')
    hash_object = hashlib.md5(setpass)
    guess_pw = hash_object.hexdigest()
    print(guess_pw)

print("\nBCRYPT:\n")
for i in range(3):
    # gensalt() draws a new random salt on every call, so each hash differs
    hashed = bcrypt.hashpw(setpass, bcrypt.gensalt(10))
    print(hashed)
As you can see below, the MD5 and SHA-1 hashes are all identical, but the bcrypt hashes change each time they're generated. For developers, bcrypt is clearly the better choice. But if we happen upon a SHA-1 or MD5 hashed password database, how could we actually go about brute-forcing the hash?
/Users/skickar/venv/untitled10/bin/python /Users/skickar/Desktop/TestSHA1.py
Input the password to hash
>nullbyte

SHA1:

32c0ced56f1fe08583bdb079d85a35a81995018c
32c0ced56f1fe08583bdb079d85a35a81995018c
32c0ced56f1fe08583bdb079d85a35a81995018c

MD5:

5f804b61f8dcf70044ad8c1385e946a8
5f804b61f8dcf70044ad8c1385e946a8
5f804b61f8dcf70044ad8c1385e946a8

BCRYPT:

b'$2b$10$Z1WVDUi50fmqyrpw19rIyOLPIKVUFeh7HO0FfQi1MbKjyxyduG2WS'
b'$2b$10$F.vehMYSUh/6zmTR/VY2quTnPfzPDcIdHTfZpb8twqjRIIIEFcbUW'
b'$2b$10$pZyptPPDHrnIgpU7wTW2nu4cfGAUS65kcGZb6FMC7KmYwJmuwSoLO'
Part of growing up as a hacker is learning to write your own tools. At first, your tools will be simple and solve small problems, but as you gain experience, you'll be able to achieve more and more. When you're getting started, statically typed programming languages like C++ can be difficult to understand, but Python3 is a flexible and beginner-friendly language that lets us abstract ideas and build prototypes with ease.
The simple program we'll write today will help practice the way a hacker creates a tool to exploit a vulnerability. In this example, SHA-1 is vulnerable to brute-forcing because the same input always produces the same hash, so a hashed guess can be compared directly against the hash we want to crack. We'll write a program to do exactly that.
To write any program, you'll need to write out the steps that your program needs to follow in order to succeed. This list might seem a little long, but it can be condensed, and you should be as specific as you can about the way things need to work in order to get the output you want. I prefer to use whiteboards or online flow-chart makers like MindMup to draw the way these programs should flow from start to finish.
When you have your steps laid out, you can start jumping into pseudocode, which is where you write the steps down in order, in a way that is still readable but closer to how the code will actually be expressed. With this pseudocode written, you can start to fill in your code line by line, correcting mistakes as they happen, and watch the steps of your program begin to take shape and interact with each other.
To follow this guide, you'll need a computer with Python3 to work on. Python3 has a number of differences from the previous version of Python, so you should be sure to get the correct version. You can install Python3 in a number of ways. On a Debian-based Linux distribution, you can type the following to install Python3.
apt install python3
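If you're not sure which version you ended up with, a small check like this can tell you. The is_python3 helper is just a name I made up for the sketch.

```python
import sys

def is_python3(version_info=sys.version_info) -> bool:
    """Return True when the given interpreter version is Python 3 or newer."""
    return version_info[0] >= 3

# Print the running version and whether it is recent enough for this guide.
print(sys.version.split()[0], "OK" if is_python3() else "too old")
```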
You will need a Python3 IDE (integrated development environment). These are programs that will help you write, test, and experiment with your code. In particular, I recommend PyCharm from JetBrains. The professional edition is available free of charge to students, which is absolutely worth it if you happen to be eligible.
For everything to work properly, we'll need to import a couple of things. We'll be using the urlopen function from the urllib.request module to open files from a remote URL, and the hashlib library to hash password guesses into SHA-1. To include them, create a new Python3 file in your IDE and type the following at the top.
from urllib.request import urlopen
import hashlib
This imports what we need, ensuring the rest of the program has access to it. Both urllib and hashlib ship with Python, so there is nothing extra to install for this script. Third-party libraries, such as the bcrypt module used in the demonstration earlier, can generally be installed with pip install and then the name of the library you need.
To follow along, you can download the Python programs I wrote for this example. To do so, open a terminal window and type the following three commands to download the scripts, change into its directory, and list the files in it.
git clone https://gitlab.com/skickar/SHA1cracker
cd SHA1cracker
ls
For the first command, we'll need to get the hash we want to crack from the user. To do this, we can use the input function, which will display a prompt to the user and allow them to enter a response.
In Python, we can store this response in a variable without doing anything beforehand. This is because Python isn't like C++, which requires you to declare a variable and its type before you can use it. We can just create variables to hold data we want as we go.
We'll name our variable sha1hash because we will be storing a SHA-1 hash inside of it. We can just type that to create the variable, and then we'll need to assign the user's response to fill that variable. In Python, the equals (=) symbol does not mean it's comparing something to see if it is equal. That's actually done with two equals signs (==) instead. The equals symbol is more of a command in Python: the variable on the left is assigned the data on the right of the equals sign.
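Here's a tiny sketch of the difference between the two symbols. The hash is the test value used further down in this guide, and the names same and different are just for the demonstration.

```python
# Assignment: the name on the left now refers to the value on the right.
sha1hash = "cbfdac6008f9cab4083784cbd1874f76618d2a97"

# Comparison: == produces True or False and changes nothing.
same = (sha1hash == "cbfdac6008f9cab4083784cbd1874f76618d2a97")
different = (sha1hash == "not the same string")

print(same, different)
```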
We'll be assigning whatever the user types, so we'll call the input function, which also allows us to put the text that appears to the user inside two parentheses. To tell Python we want to print a string, or a collection of characters, we'll enclose whatever we're typing in quotation marks as well. The end result should look like this:
sha1hash = input("Please input the hash to crack.\n>")
When we run this, a prompt will appear that says "Please input the hash to crack." After this, we see a "new line" symbol, which is a backslash (\) and an n. This means to jump to a new line. Last, I put a > symbol just so the user can type their response on a new line. When we run the file, NBspecial.py, the result looks like this.
Dell:SHA1cracker skickar$ python3 NBspecial.py
Please input the hash to crack.
>_
Once the user inputs a hash, it is saved in the sha1hash variable for use later in the program.
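One thing worth keeping in mind is that hexdigest() returns lowercase hex, so a pasted hash with capital letters or stray whitespace would never match. Here's a hedged sketch of a cleanup step you could apply to the user's input. The normalize_sha1 function and SHA1_HEX pattern are my own additions and are not part of the script in this guide.

```python
import re

# A SHA-1 digest written in hex is exactly 40 characters from 0-9 and a-f.
SHA1_HEX = re.compile(r"[0-9a-f]{40}")

def normalize_sha1(raw: str) -> str:
    """Strip whitespace, lowercase, and check the shape of a SHA-1 hex digest."""
    cleaned = raw.strip().lower()
    if not SHA1_HEX.fullmatch(cleaned):
        raise ValueError("expected 40 hexadecimal characters")
    return cleaned

print(normalize_sha1("  CBFDAC6008F9CAB4083784CBD1874F76618D2A97\n"))
```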
Next, we want to open a list of many common passwords. We'll be using a list of the 10,000 most common passwords for our example, which is a plain text file hosted on GitHub. You can use other lists, such as leaked passwords online or ones made with the Mentalist or Crunch.
For the file we're using, we'll again be assigning it to a variable, this time called LIST_OF_COMMON_PASSWORDS. To open the file, we'll be using a function called urlopen, which allows us to easily fetch this text file. The call takes the form urlopen('<URL>').read().
This will open the URL enclosed in quotes and call the read method, which returns the contents of the file as raw bytes. To make sure the str() function knows what it's working with, we'll also add a comma and 'utf-8' after this call to tell the program the text is UTF-8 encoded.
We'll again be saving the data as a string, and to prevent any problems with doing so, we can make sure the data we're putting into the variable is a string by first "casting" it to a string. This means trying to change the data to another type, and it can be done to convert integers to strings, strings to bytes, and any other sort of data type you want. To do this, we'll type str() and then include the data we want to turn into a string inside the parentheses. The final result should look like below.
LIST_OF_COMMON_PASSWORDS = str(urlopen('https://raw.githubusercontent.com/danielmiessler/SecLists/master/Passwords/Common-Credentials/10-million-password-list-top-10000.txt').read(), 'utf-8')
In this line, we open the text file we selected from a remote URL, decode its bytes as UTF-8 text, and then save that data to a string called LIST_OF_COMMON_PASSWORDS.
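To see what the str(..., 'utf-8') step does without touching the network, here's a small sketch that uses a made-up bytes value in place of what urlopen(...).read() would return.

```python
# Stand-in for the bytes that urlopen(...).read() returns; no network needed.
raw = b"123456\npassword\n12345678\n"

as_text = str(raw, "utf-8")      # the form used in this guide
also_text = raw.decode("utf-8")  # an equivalent spelling you'll often see

print(as_text == also_text)
print(type(raw).__name__, "->", type(as_text).__name__)

# Without the encoding argument, str() gives the printable form of the
# bytes object (starting with b') rather than the decoded text.
print(str(raw).startswith("b'"))
```

That last line is why the 'utf-8' argument matters. Leave it out and every "password" in the list would be part of one long string that starts with b'.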
Now, we'll need to solve an interesting problem. While we know there are 10,000 passwords in the text file, the program has no idea how many to expect, so we will need to create some code to run once for every guess in the password file.
To do this, we'll use a structure called a for loop. A for loop is a very basic concept in programming and looks something like this:
for [an individual guess] in [the variable that guess is in]:
    [do this]
What this means is that for the number of guesses in the variable we created to hold all of the guesses in the last step (in this case 10,000), we'll do the action that follows. In practice, this means we'll grab a guess from the list of guesses, do whatever action, and then jump back up to grab the next guess until we run out of new guesses to try.
We can name the variable that holds each guess whatever we want, but for clarity, I named it guess. Naming it x instead would work just as well.
The final problem we'll need to solve is to tell the program how to break up the big long list of passwords into individual password guesses. The password list we're using separates passwords by a new line, so we can use the new line character to split LIST_OF_COMMON_PASSWORDS into individual guesses.
To put this in action, we can add .split() to the end of the LIST_OF_COMMON_PASSWORDS variable and put the code for a new line (which is '\n') into the parentheses. The end result looks like below.
for guess in LIST_OF_COMMON_PASSWORDS.split('\n'):
This code will grab a password, stopping at the end of the line, from the LIST_OF_COMMON_PASSWORDS variable we created earlier. It will run for as many times as there are passwords in the list, unless we tell it to behave differently in the next steps.
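Here's a short sketch of why the split matters, using a three-password string as a stand-in for the downloaded list.

```python
text = "123456\npassword\nqwerty\n"  # stand-in for the downloaded list

# Looping over the string itself yields single characters, not passwords.
first_three = [ch for ch in text][:3]

# split('\n') yields one item per line, plus an empty string for the
# newline at the very end of the file.
by_split = text.split("\n")

# splitlines() is an alternative that leaves out that trailing empty item.
by_splitlines = text.splitlines()

print(first_three)
print(by_split)
print(by_splitlines)
```

That empty final item from split('\n') is what the condensed three-line version later in this guide keys on to detect the end of the list.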
Here, we will need to create a new variable to hold a hashed version of the password guess we pulled from the list. When we do this, it should create an identical hash if we use the same password that was used to create the hash supplied by the user in the first step. If it matches in the next step, we'll know we found the password.
We'll name the variable to hold the hashed version of the guess hashedGuess. Next, we'll need to do some prep work before we're able to hash the guess we pulled from the password list: we need to cast the string variable we have called guess into a bytes object. This is necessary because the SHA-1 function only works on bytes objects, not strings.
Fortunately, it's easy to cast a string into bytes. We can do this the same general way we cast the downloaded password list into a string earlier. The formula is bytes(guess, 'utf-8'). In this case, we're casting guess into bytes, and the text encoding is UTF-8.
Now that we have the bytes version of guess, we can turn it into a SHA-1 hash with hashlib.sha1(bytes(guess, 'utf-8')).hexdigest().
So what is this doing? We're calling the sha1 function from the hashlib library and hashing the bytes object we put inside the parentheses. Because of the way SHA-1 works, we could keep adding data to it, but to get the current value of the SHA-1 hash as a hexadecimal string, we add .hexdigest() to the end.
In the final code, we'll assign the value of the hashed guess to the variable hashedGuess.
hashedGuess = hashlib.sha1(bytes(guess, 'utf-8')).hexdigest()
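If you want to poke at the pieces of that line one at a time, here's a sketch that does so. It uses password123 as a sample guess, and the incremental part shows what "keep adding data to it" means in practice.

```python
import hashlib

guess = "password123"

# hashlib refuses plain strings, which is why the cast to bytes is needed.
try:
    hashlib.sha1(guess)
except TypeError as err:
    print("str is rejected:", err)

# Two equivalent ways to turn a str into bytes.
as_bytes = bytes(guess, "utf-8")
assert as_bytes == guess.encode("utf-8")

# One-shot hashing, as in the line above.
hashedGuess = hashlib.sha1(as_bytes).hexdigest()

# Incremental hashing: feeding the same bytes in pieces gives the same digest.
h = hashlib.sha1()
h.update(b"password")
h.update(b"123")
incremental = h.hexdigest()

print(hashedGuess)
print(hashedGuess == incremental)
print(len(hashedGuess))  # number of hex characters in the digest
```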
Now that we have the password guess saved as a hash, we can compare this guess to the original SHA-1 hash to crack directly.
In this step, we'll need to tell the program what to do if the hash matches. To do this, we'll use a simple statement called an if statement.
An if statement works somewhat like a for statement, but checks a condition to see if it's true before executing the next part of code. If the condition is true, you can tell the program to take one action, and if it is false, to take another action instead.
The general formula for an if statement in Python is as follows.
if [some condition to check is true]:
    [do whatever this code says to do]
For our use case, we want to determine if the hashed guess matches the original hash the user gave us, so we can use the == sign to determine if they are equal. The statement we want to evaluate is whether hashedGuess equals sha1hash, the variable we're keeping the original hash in. In our code, that is a simple statement.
if hashedGuess == sha1hash:
Now that we've set up this comparison, we'll have to explain to the program what to do in three circumstances we're expecting: a match, no match, or no more passwords in the list to guess.
In this step, we'll explain what to do if the hashes match or don't match. Because the previous statement asks what to do if these two are equal, our first instruction will be for what to do if the hash of the password guess matches the original hash.
If this is the case, we have found the password, and the correct thing to do is print out the correct password and quit the program. If we don't quit, the loop will continue even though we've found the password that matches the SHA-1 hash. To do so, we can just type the following.
print("The password is ", str(guess))
This prints everything within the quotes, and then adds the string version of the current password that's been successfully guessed. It's important we're printing the guess variable and not the hashedGuess variable, since the hashedGuess version will just give us another SHA-1 hash. In this case, we also wrap that variable in str(). Since guess is already a string, this isn't strictly needed, but it is harmless. After this is printed, we simply include quit() to close the program, because we've got the password!
If the hashedGuess and sha1hash variable do not match, we will need to explain what to do. We can add this part of the statement with an elif statement. Elif, or "else if," tells the program what to do if a different condition is true. Our next statement to test in this case is as follows.
hashedGuess != sha1hash
This statement asks if the two variables are not equal, shown with the != symbol. If this is true, or in other words, if the two hashes are not equal and the password guess is wrong, we'll need to tell the user that the guess failed, and then go back to the top of the loop to grab a new password.
To do this, we'll do the same thing we did before and simply use the print() function to print out a message. In this message, we'll say: "Password guess", [guess], "does not match, trying next...". The end result should look like the code below.
if hashedGuess == sha1hash:
    print("The password is ", str(guess))
    quit()
elif hashedGuess != sha1hash:
    print("Password guess ",str(guess)," does not match, trying next...")
This code explains what to do if the guess is correct, and what to do if the guesses do not match, but what if we don't find a match at all? Rather than just quitting, we can give the user some more information if we determine we've exhausted our list of passwords and there are no more guesses to try.
If we go all the way through this loop and find no matches, the loop will end because there will be nothing further to grab from the list of password guesses. We'll let the user know we've not been successful rather than just exiting the program abruptly by placing a print statement just outside the loop. This way, if the password is found, the final print function will never execute because of the quit() function we added earlier to end the program when we get the right password.
So how do we put this statement outside the loop? In Python, whitespace matters, so we can put it on a new line and simply not indent it, as seen in the example below.
for guess in LIST_OF_COMMON_PASSWORDS.split('\n'):
    hashedGuess = hashlib.sha1(bytes(guess, 'utf-8')).hexdigest()
    if hashedGuess == sha1hash:
        print("The password is ", str(guess))
        quit()
    elif hashedGuess != sha1hash:
        print("Password guess ",str(guess)," does not match, trying next...")
print("Password not in database, we'll get them next time.")
Python first runs the for loop, evaluating the if and elif statements for each guess, and only executes the final print function once the loop has ended, because it sits outside the for loop.
This print function is simple and contains no variables, just a string to let the user know that we did not find a matching password in the list.
print("Password not in database, we'll get them next time.")
With this last line, we have a fully functional SHA-1 brute-forcing program, so let's run it! First, we get a prompt asking for the SHA-1 hash to crack. I'll give it the hash cbfdac6008f9cab4083784cbd1874f76618d2a97 to test it.
Dell:SHA1cracker skickar$ python3 NBspecial.py
Please input the hash to crack.
>cbfdac6008f9cab4083784cbd1874f76618d2a97
After pressing return, the script begins to work.
Password guess 171717 does not match, trying next...
Password guess panzer does not match, trying next...
Password guess lincoln does not match, trying next...
Password guess katana does not match, trying next...
Password guess firebird does not match, trying next...
Password guess blizzard does not match, trying next...
Password guess a1b2c3d4 does not match, trying next...
Password guess white does not match, trying next...
Password guess sterling does not match, trying next...
Password guess redhead does not match, trying next...
The password is password123
Dell:SHA1cracker skickar$ _
And just like that, we've found the password that was used to create a hash, allowing us to reverse the "one way" SHA-1 hash.
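If you'd like to experiment without downloading anything or typing into a prompt, the same loop can be sketched as a function that takes the hash and a list of guesses. This is a variation of my own rather than the script from this guide: crack_sha1 is a made-up name, the word list is a tiny stand-in, and the function returns the password instead of printing and quitting.

```python
import hashlib

def crack_sha1(target_hash, guesses):
    """Return the first guess whose SHA-1 hex digest equals target_hash, else None."""
    target = target_hash.strip().lower()
    for guess in guesses:
        if hashlib.sha1(guess.encode("utf-8")).hexdigest() == target:
            return guess
    return None  # ran out of guesses without a match

# Small in-memory word list standing in for the downloaded file.
wordlist = "171717\npanzer\nredhead\npassword123\n".split("\n")

# Build a target hash locally so the example needs no outside input.
target = hashlib.sha1(b"password123").hexdigest()

print(crack_sha1(target, wordlist))
print(crack_sha1("0" * 40, wordlist))
```

Returning a value makes the function easy to reuse and test, while the early return plays the same role as quit() in the original loop.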
With some basic Python3 knowledge, we were able to write a short script that finds the password a hash was derived from in only about a dozen lines. You can see the entire code without comments below. With some clever formatting of our Python, we can make this more compact (but much more difficult to read or understand) and execute all of this with only three lines of code.
from urllib.request import urlopen
import hashlib
sha1hash = input("Please input the hash to crack.\n>")
LIST_OF_COMMON_PASSWORDS = str(urlopen('https://raw.githubusercontent.com/danielmiessler/SecLists/master/Passwords/Common-Credentials/10-million-password-list-top-10000.txt').read(), 'utf-8')
for guess in LIST_OF_COMMON_PASSWORDS.split('\n'):
    hashedGuess = hashlib.sha1(bytes(guess, 'utf-8')).hexdigest()
    if hashedGuess == sha1hash:
        print("The password is ", str(guess))
        quit()
    elif hashedGuess != sha1hash:
        print("Password guess ",str(guess)," does not match, trying next...")
print("Password not in database, we'll get them next time.")
This is possible by getting the hash to crack on the same line we use to import libraries and by condensing the for and if statements into one line with something called a ternary operator. In general, the format for these is the following and can be added on to for as long as needed.
<expression1> if <condition1> else <expression2> if <condition2> else <expression3>
In our script, the format we will use is this:
<password match response> if <hashes match> else <password not in dictionary response> if <password is empty> else <password does not match response>
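A chained expression of this shape can be tried in isolation. In this sketch, the respond function and its three short messages are placeholders I chose to stand in for the three print calls.

```python
def respond(hashes_match: bool, password: str) -> str:
    """Pick one of three messages with a chained conditional expression."""
    return (
        "match"
        if hashes_match
        else "list exhausted"
        if password == ""
        else "no match"
    )

print(respond(True, "password123"))
print(respond(False, ""))
print(respond(False, "panzer"))
```

The conditions are checked from left to right, so the second check only happens when the first one is false.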
After applying these changes, we can condense our code like the example below.
from urllib.request import urlopen, hashlib; origin = input("Input SHA1 hash to crack\n>")
for password in str(urlopen('https://raw.githubusercontent.com/danielmiessler/SecLists/master/Passwords/Common-Credentials/10-million-password-list-top-10000.txt').read(), 'utf-8').split('\n'):
    [print("The password is ", str(password)), quit()] if (hashlib.sha1(bytes(password, 'utf-8')).hexdigest()) == origin else print("Password not in database, we'll get them next time.") if password == "" else print("Password guess ", str(password), " does not match, trying next...")
While this is horrible for someone new to Python to understand without comments, Python can be condensed from a rough idea to a few concise lines of code simply by working through the program and looking for shortcuts.
If you wanted to get this to one line, you could simply wrap it in an exec() function and add new line (\n) characters for each new line break. Why you would do this, I'm not sure, but it's useful to be able to condense programs when needed.
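Here's a minimal sketch of that exec() trick on a much smaller program than our cracker. The source string and the namespace dictionary are names I picked for the demonstration.

```python
# A tiny multi-line program stored as one string with \n line breaks.
source = "total = 0\nfor n in range(1, 5):\n    total += n\n"

# Run it in its own namespace so its variables do not leak into ours.
namespace = {}
exec(source, namespace)

print(namespace["total"])  # 1 + 2 + 3 + 4
```

Note that the indentation still has to be written out inside the string, which is part of why this gets unreadable quickly.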
I hope you enjoyed this beginner guide to writing your own SHA-1 brute-forcer in Python! If you have any questions about this tutorial or basic Python3 programming, feel free to leave a comment below or reach me on Twitter @KodyKinzie.