Microsoft Office files can be password-protected in order to prevent tampering and ensure data integrity. But password-protected documents from earlier versions of Office are susceptible to having their hashes extracted with a simple program called office2john. Those extracted hashes can then be cracked using John the Ripper and Hashcat.
Extracting the hash from a password-protected Microsoft Office file takes only a few seconds with the office2john tool. While the encryption standard across different Office products fluctuated throughout the years, none of them can stand up to office2john's hash-stealing abilities.
This tool is written in Python and can be run right from the terminal. As for Office compatibility, it's known to work on any password-protected file from Word, Excel, PowerPoint, OneNote, Project, Access, and Outlook that was created using Office 97, Office 2000, Office XP, Office 2003, Office 2007, Office 2010, or Office 2013, including the Office for Mac versions. It may not work on newer versions of Office, although the DOCX we saved in Office 2016 was labeled as Office 2013.
To get started, we'll need to download the tool from GitHub since office2john is not included in the standard version of John the Ripper (which should already be installed in your Kali system). This can easily be accomplished with wget.
wget https://raw.githubusercontent.com/magnumripper/JohnTheRipper/bleeding-jumbo/run/office2john.py
--2019-02-05 14:34:45-- https://raw.githubusercontent.com/magnumripper/JohnTheRipper/bleeding-jumbo/run/office2john.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 188.8.131.52
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|184.108.40.206|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 131690 (129K) [text/plain]
Saving to: ‘office2john.py’
office2john.py 100%[=======================================================================>] 128.60K --.-KB/s in 0.09s
2019-02-05 14:34:46 (1.45 MB/s) - ‘office2john.py’ saved [131690/131690]
To run office2john with Python, we need to change into the directory the script was downloaded to. For most of you, this will be Home by default (just enter cd), but feel free to create a separate directory.
Next, we need an appropriate file to test this on. I am using a simple DOCX file named "dummy.docx" that I created and password-protected with Word 2007. Download it to follow along. The password is "password123" as you'll find out. You can also download documents made with Word 2010 and Word 2016 (that shows up as 2013) to use for more examples. Passwords for those are also "password123."
The first thing we need to do is extract the hash of our password-protected Office file. Run the following command and pipe the output into "hash.txt" for later use.
python office2john.py dummy.docx > hash.txt
To verify that the hash was extracted successfully, print the file with the cat command (cat hash.txt). The version field in the output shows that the hash I saved corresponds to Microsoft Office 2007. Neat.
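If you'd rather check the version programmatically than eyeball it, here is a minimal sketch. It assumes the line in hash.txt looks like filename:$office$*version*..., which matches the $office$*2007*20*128*16*... prefix Hashcat reports later for this file. The function name parse_office2john_line is my own, and the hex values in the sample are placeholders, not a real hash.

```python
from typing import NamedTuple


class OfficeHash(NamedTuple):
    filename: str
    version: str
    fields: list


def parse_office2john_line(line: str) -> OfficeHash:
    """Split 'filename:$office$*2007*...' into its visible parts."""
    filename, sep, hash_part = line.strip().partition(":")
    if not sep or not hash_part.startswith("$office$*"):
        raise ValueError("not an $office$ line (assumed Office 2007 or later format)")
    # Fields are separated by '*'; the first one after the tag is the version.
    parts = hash_part.split("*")
    return OfficeHash(filename=filename, version=parts[1], fields=parts[2:])


# Placeholder hex values for illustration only, not a real hash.
sample = (
    "dummy.docx:$office$*2007*20*128*16*"
    + "aa" * 16 + "*" + "bb" * 16 + "*" + "cc" * 20
)
parsed = parse_office2john_line(sample)
print(parsed.filename, parsed.version)
```

To use it on the real file, read the first line of hash.txt and pass it to the function instead of the sample string.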
As mentioned, we'll be showing you two ways to crack the hash you just saved from the password-protected Microsoft Office file. Both methods work great, so it's really up to preference.
Set the --wordlist flag with the location of your favorite word list. The one that is included with Nmap will do for our purposes here, but for tougher passwords, you may want to go with a more extensive word list.
john --wordlist=/usr/share/wordlists/nmap.lst hash.txt
Using default input encoding: UTF-8
Loaded 1 password hash (Office, 2007/2010/2013 [SHA1 128/128 SSE2 4x / SHA512 128/128 SSE2 2x AES])
Cost 1 (MS Office version) is 2007 for all loaded hashes
Cost 2 (iteration count) is 50000 for all loaded hashes
Will run 4 OpenMP threads
Press 'q' or Ctrl-C to abort, almost any other key for status
John will start cracking and will finish when a match is found or the word list runs out; how long that takes depends on the password's complexity. Press almost any key to view the current status. When the hash is cracked, a message will be displayed on-screen with the document's password. Since our password was pretty simple, it only took seconds to crack it.
password123 (dummy.docx)
1g 0:00:00:03 DONE (2019-02-05 15:00) 0.2824g/s 415.8p/s 415.8c/s 415.8C/s lacoste..cooldude
Use the "--show" option to display all of the cracked passwords reliably
Session completed
We can also use the --show option to display it, like so:
john --show hash.txt
dummy.docx:password123

1 password hash cracked, 0 left
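To get a feel for what a word list attack like the one above is doing, here is a toy sketch in Python. To be clear, this is not Office's actual verification scheme (John's format line above shows the real one also involves AES), and stretch and dictionary_attack are names I made up. It only illustrates the loop: stretch each candidate with the same salt and iteration count, then compare against the target.

```python
import hashlib
import os


def stretch(password: str, salt: bytes, iterations: int) -> bytes:
    """Toy key stretching: repeated SHA-1, starting from salt + password."""
    digest = hashlib.sha1(salt + password.encode("utf-8")).digest()
    for _ in range(iterations):
        digest = hashlib.sha1(digest).digest()
    return digest


def dictionary_attack(target: bytes, salt: bytes, iterations: int, candidates):
    """Return the first candidate whose stretched digest equals target, else None."""
    for word in candidates:
        if stretch(word, salt, iterations) == target:
            return word
    return None


salt = os.urandom(16)
iterations = 50000  # same iteration count John reported for our hash
target = stretch("password123", salt, iterations)

# Hypothetical four-word list standing in for nmap.lst.
wordlist = ["123456", "letmein", "password123", "qwerty"]
print(dictionary_attack(target, salt, iterations, wordlist))
```

Every guess costs the full 50,000 rounds, which is why the guess rate is a few hundred per second rather than millions, and also why a password that sits in a common word list still falls in seconds.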
Now that we know one method of cracking a password-protected Microsoft Office file, let's look at one other way using the powerful tool Hashcat.
We can begin by displaying the help menu (--help) for Hashcat. This will provide us with a wealth of information including usage options, hash modes, and other features. There is a ton of information here, so I won't show the output, but you should dive into it if you really want to know Hashcat.
From the output, we're just interested in the MS Office hash modes. Near the bottom of the help menu, we will find the MS Office mode options and their corresponding numbers. We know from our hash that this is an Office 2007 file, so locate its number ID of 9400.
9700 | MS Office <= 2003 $0/$1, MD5 + RC4 | Documents
9710 | MS Office <= 2003 $0/$1, MD5 + RC4, collider #1 | Documents
9720 | MS Office <= 2003 $0/$1, MD5 + RC4, collider #2 | Documents
9800 | MS Office <= 2003 $3/$4, SHA1 + RC4 | Documents
9810 | MS Office <= 2003 $3, SHA1 + RC4, collider #1 | Documents
9820 | MS Office <= 2003 $3, SHA1 + RC4, collider #2 | Documents
9400 | MS Office 2007 | Documents
9500 | MS Office 2010 | Documents
9600 | MS Office 2013 | Documents
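If you crack these often, the lookup from Office version to mode number can be sketched in a few lines of Python. The mode numbers are copied from the help output above; the function name hashcat_mode is my own, and the sketch only covers the $office$ (2007 and later) format, not the older entries in the table.

```python
# Mode numbers copied from the hashcat help output listed above.
MODES_BY_VERSION = {"2007": 9400, "2010": 9500, "2013": 9600}


def hashcat_mode(hash_string: str) -> int:
    """Return the hashcat -m value for an '$office$*<version>*...' string."""
    if not hash_string.startswith("$office$*"):
        raise ValueError("expected a hash starting with $office$*")
    version = hash_string.split("*")[1]
    try:
        return MODES_BY_VERSION[version]
    except KeyError:
        raise ValueError(f"no mode listed for Office version {version!r}") from None


# The trailing '...' stands in for the rest of the hash fields.
print(hashcat_mode("$office$*2007*20*128*16*..."))
```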
Now we can set the rest of our options using the following command.
hashcat -a 0 -m 9400 --username -o cracked_pass.txt hash.txt /usr/share/wordlists/nmap.lst
- The -a flag sets the attack type as the default straight mode of 0.
- The -m flag specifies the mode we want to use, which we just found.
- The --username option ignores any usernames in the hash file.
- We can specify the output file as cracked_pass.txt with the -o flag.
- And finally, we can pass in hash.txt which contains the hash, and set a word list just like we did earlier.
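For anyone scripting this, the same options can be assembled as an argument list in Python. This is just a sketch using the flags and file names from the command above; build_hashcat_command is a hypothetical helper, and nothing is executed unless you uncomment the last line on a machine with Hashcat installed.

```python
import shlex


def build_hashcat_command(mode, hash_file, wordlist, out_file):
    """Return the argument list for the straight wordlist attack described above."""
    return [
        "hashcat",
        "-a", "0",        # straight (word list) attack
        "-m", str(mode),  # hash mode, e.g. 9400 for Office 2007
        "--username",     # ignore the "dummy.docx:" prefix in hash.txt
        "-o", out_file,   # file that receives the cracked result
        hash_file,
        wordlist,
    ]


cmd = build_hashcat_command(
    9400, "hash.txt", "/usr/share/wordlists/nmap.lst", "cracked_pass.txt"
)
print(shlex.join(cmd))
# To actually run it (requires hashcat on your PATH):
# import subprocess; subprocess.run(cmd, check=True)
```

Passing a list rather than one long string avoids shell quoting problems if a file name ever contains spaces.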
Hashcat will then begin cracking.
hashcat (v5.1.0) starting...

* Device #2: Not a native Intel OpenCL runtime. Expect massive speed loss.
  You can use --force to override, but do not report related errors.
OpenCL Platform #1: Intel(R) Corporation
========================================
* Device #1: Intel(R) Core(TM) i5 CPU M 480 @ 2.67GHz, 934/3736 MB allocatable, 4MCU

...
After some time has passed, the status will show as cracked, and we are ready to view the password.
Session..........: hashcat
Status...........: Cracked
Hash.Type........: MS Office 2007
Hash.Target......: $office$*2007*20*128*16*a7c7a4eadc2d90fb22c073c6324...2b6870
Time.Started.....: Tue Feb 5 15:08:00 2019 (4 secs)
Time.Estimated...: Tue Feb 5 15:08:04 2019 (0 secs)
Guess.Base.......: File (/usr/share/wordlists/nmap.lst)
Guess.Queue......: 1/1 (100.00%)
Speed.#1.........: 610 H/s (8.51ms) @ Accel:512 Loops:128 Thr:1 Vec:4
Recovered........: 1/1 (100.00%) Digests, 1/1 (100.00%) Salts
Progress.........: 2048/5084 (40.28%)
Rejected.........: 0/2048 (0.00%)
Restore.Point....: 0/5084 (0.00%)
Restore.Sub.#1...: Salt:0 Amplifier:0-1 Iteration:49920-50000
Candidates.#1....: #!comment: ***********************IMPORTANT NMAP LICENSE TERMS************************ -> Princess

Started: Tue Feb 5 15:07:50 2019
Stopped: Tue Feb 5 15:08:05 2019
Simply cat out the specified output file (cat cracked_pass.txt), and it will show the hash with the plaintext password tacked on the end.
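If you want just the password out of that file, a short Python sketch can split it off. It assumes each line is the hash, a colon, then the password, and that the hash itself contains no colon (the $office$ format uses * between fields). The name split_cracked_line is mine, and the sample uses placeholder hex rather than the real hash.

```python
def split_cracked_line(line: str):
    """Split a 'hash:password' line into (hash, password)."""
    # Assumption: the hash has no ':' in it, so the first colon is the
    # separator and any later colons belong to the password.
    hash_part, sep, password = line.rstrip("\n").partition(":")
    if not sep:
        raise ValueError("no ':' separator found")
    return hash_part, password


# Placeholder hash fields for illustration only, not real output.
example = (
    "$office$*2007*20*128*16*"
    + "aa" * 16 + "*" + "bb" * 16 + "*" + "cc" * 20
    + ":password123"
)
print(split_cracked_line(example)[1])
```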
Success! Now we know two methods of cracking the hash after extracting it from a password-protected Microsoft Office file with office2john.
When it comes to password cracking of any kind, the best defense is to follow password best practices. This means using unique passwords that are long and not easily guessable. It helps to use a combination of upper and lowercase letters, numbers, and symbols, although recent research has shown that simply using long phrases with high entropy is superior. Even better are long, randomly generated passwords, which make cracking them nearly impossible.
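To put rough numbers on that advice, here is a back-of-the-envelope sketch. It assumes the password is chosen uniformly at random (so it says nothing about "password123", which falls to a word list regardless of length) and uses the 610 H/s that Hashcat reported on my old laptop CPU; a modern GPU would be far faster. All function names are my own.

```python
import math
import secrets
import string


def entropy_bits(charset_size: int, length: int) -> float:
    """Bits of entropy for a password drawn uniformly at random."""
    return length * math.log2(charset_size)


def seconds_to_exhaust(charset_size: int, length: int, guesses_per_second: float) -> float:
    """Worst-case time to try every password of exactly this length."""
    return charset_size ** length / guesses_per_second


def random_password(length: int = 20) -> str:
    """Generate a random password from letters, digits, and punctuation."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))


RATE = 610  # H/s, the speed hashcat reported in the run above
YEAR = 365 * 24 * 3600

# (alphabet size, length): lowercase only, then full printable ASCII.
for charset, length in [(26, 8), (94, 12), (94, 20)]:
    years = seconds_to_exhaust(charset, length, RATE) / YEAR
    bits = entropy_bits(charset, length)
    print(f"{charset} symbols x {length} chars: {bits:.1f} bits, about {years:.3g} years")

print(random_password())
```

Each extra character multiplies the search space by the alphabet size, which is why length does more for you than swapping a letter for a symbol.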
With regard to this specific attack, it may not be effective against documents created in Microsoft Office 2016, 2019, or newer, since office2john is designed to work on earlier versions of Office. However, as you can see above, Office 2016 may very well spit out a 2013 document without the user even knowing, so it doesn't mean a "new" file can't be cracked. Plus, there are still plenty of older Microsoft Office documents floating around out there, and some organizations continue to use these older versions, making this attack still very feasible today.
Today, we learned that password-protected Microsoft Office files are not quite as secure as one would be led to believe. We used a tool called office2john to extract the hash of a DOCX file, and then cracked that hash using John the Ripper and Hashcat. These types of files are still commonly used today, so if you come across one that has a password on it, rest easy knowing that there is a way to crack it.