Hello Null-Byte. Alex here. I have been lurking on this forum for a while now and have decided to finally join the community to help share and spread knowledge. I should mention this is my first How-To so please criticize for improvement...
My main interest lies in OSINT, and even though it might seem useless at times, I think otherwise. This is especially true for large organizations that share PDFs, Excel sheets, and other documents... The sharing seems innocent, but the documents hold information that is only valuable to someone looking for it in particular. One such piece of information is, of course, email addresses. Emails are amazing and, from my experience thus far, one of the most overlooked areas security-wise. For example, the mail servers listed in a domain's MX records are relatively easy to find and connect to for testing. The reason is that SMTP was created with no thought to security, but that is a post for another time. ;)
Anyways, I am boring even myself writing this introduction... so let's get to the fun part.
Before I can even begin with throwing out some fancy tools and whatnot I need to go over some stuff... I guess (Idfk...). :/
Email is still a relatively basic and common way for organizations to communicate. Usually the servers are Outlook, Office 365, and Exchange, since those are marketed more towards businesses, colleges, etc... and because of this, it is relatively easy to gain access in one shape or form. Btw, the servers have lousy security and can be abused with tools such as ruler, and they can also be password sprayed using spray.sh or Burp Suite's Intruder, but the most successful attack vector is still phishing. Phishing is essentially manipulating and exploiting people. It can be used for credential harvesting and also to drop malware. APT28 (Fancy Bear) were so successful with this that they were able to penetrate the DNC, the White House, and other high-profile targets with just phishing (Btw, the Bears are pretty 1337).
A good way to figure out whether any of these attack vectors are possible is to do a bit of OSINT research. Personally, I use a mix of tools. One of them is Maltego CE (I am too broke for the license...), which I use to run a few scans and see what is visible from the outside, with a pretty graph of course. I am not going to go over Maltego and how to use it, since I am pretty confident there are a few tutorials out there already, plus it seriously requires the user to play around with it. Once a picture is somewhat painted there are other methods I use, but my favorite, and one of the most underrated, is Google Dorking.
What is Google Dorking, some of you may ask? I view Google Dorking as carefully crafted search terms that use operators most people do not know about, to get more accurate results for what I am searching for. Behind 'what I am searching for' is a concept I have come up with: you can find anything on anyone on the internet if you know where to look. It can be something as simple as this search term in Google:
filetype:pdf intext:@example.com "email"
(btw, how do I insert code blocks on here?)
You can set filetype to anything you want, but I have found Excel sheets and PDFs the best for finding personal public information. I have even found old email messages posted on a third-party website not directly associated with the original site. I am not going to go into much detail, especially since cheat sheets exist and I have no shame in using them until I can do it without thinking. Now you have found a bunch of files, so what now? Download and scrape, of course. :)
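Since the same dork gets reused with different file types, here is a minimal sketch of generating the variants instead of typing them out. The function name build_dorks is made up for illustration, and the output is just text to paste into Google; it does not query anything.

```shell
# Hypothetical helper: print one dork per file type for a target domain.
# Usage: build_dorks DOMAIN FILETYPE [FILETYPE...]
build_dorks() {
    local domain="$1"
    shift
    local ft
    for ft in "$@"; do
        # Same shape as the dork above, with the file type swapped in.
        printf 'filetype:%s intext:@%s "email"\n' "$ft" "$domain"
    done
}

build_dorks example.com pdf xlsx docx
# filetype:pdf intext:@example.com "email"
# filetype:xlsx intext:@example.com "email"
# filetype:docx intext:@example.com "email"
```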
Linux... Linux... Linux... I swear, if you use Windows 10 as your main OS, please assure me you have VirtualBox or VMware with several Linux VMs. Linux is the greatest and most amazing OS out there, but this is not a Windows or Mac (ew...) bashing fest. My point is that you should be pretty adept with the command line, especially at throwing a bunch of scripts together to automate a task. Not only that, but most tools are written for Linux, and for good reasons. Most OSINT tools are written in Python, and in conjunction with other commands you can make a stupid simple Bash script to do what you want without having to type a bunch of commands over and over again. In other words, if it is a repetitive task, make a script.
Back to the topic of this post... emails. So ok, you found a bunch of PDF files with possible emails in them, but it is tedious to go through every single file looking for particular emails with a certain domain. Thankfully, we don't live in the Stone Age of the Computer Revolution and there are tools out there we can use. I run a flavor of Debian, but the instructions should be the same or similar no matter what Linux distro you have. There are several steps I use in a Bash script to get the best results. I first use the 'find' command to grab all the files and write the list to an input file for later processing. I then set up a while loop to read that list line by line, running 'pdftotext' piped to 'grep' to grab what I want based on a keyword, and redirect the matches to a separate file. Since the first grep is not always perfect, I grep again for just the emails, using a pattern against the file that holds the matched lines, and write those to another file. Of course there will be duplicates, so this is where I use 'sort' to generate my final file. Clean up, of course, with 'rm -rf' on the temp files. Yep guys... This is what I call Linux Fu.
The script is here:
The arguments that are needed are a keyword surrounded by single quotes and an output file to save the final results. Stupid simple, but simple is how a lot of hacks get pulled off. You don't need anything fancy, just the ability to get creative and weaponize anything and everything.
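For anyone who wants to see the steps spelled out, here is a minimal sketch of the pipeline described above. It is not the original script. Assumptions: pdftotext (from poppler-utils on Debian) is installed, the function names and the email pattern are made up for illustration, the optional third argument (directory to search) is an addition, and it uses 'find -exec' and pipes instead of the while loop and temp files described above.

```shell
#!/usr/bin/env bash
# Sketch only, NOT the author's script.
# Usage: ./harvest.sh 'keyword' output.txt [directory]

# Pattern for things that look like email addresses (an assumption; tune it).
EMAIL_RE='[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}'

# Read text on stdin, keep lines containing the keyword,
# then print the unique, lowercased addresses found on those lines.
extract_emails() {
    local keyword="$1"
    grep -iF -- "$keyword" |
        grep -oE "$EMAIL_RE" |
        tr '[:upper:]' '[:lower:]' |
        sort -u
}

# Convert every PDF under a directory to text on stdout.
# "pdftotext FILE -" writes the extracted text to stdout.
dump_pdfs() {
    local dir="$1"
    find "$dir" -type f -iname '*.pdf' -exec pdftotext {} - \;
}

main() {
    local keyword="$1" outfile="$2" dir="${3:-.}"
    dump_pdfs "$dir" | extract_emails "$keyword" > "$outfile"
}

# Only run when both required arguments are supplied.
if [ "$#" -ge 2 ]; then
    main "$@"
fi
```

Saved as harvest.sh (a made-up name), it would be run like: ./harvest.sh '@example.com' emails.txt ~/Downloads/pdfs. Piping everything means there are no temp files to clean up, at the cost of not being able to inspect the intermediate matches.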
Yeah... I understand. A lame How-To, but this is my first time and, as I said in my introduction, please critique my work, English, grammar, and anything else, but please don't be too harsh. What I tried to show is how simple it can be to get info on anything and everything. From there, there are several ways to use the information. For example, run it through tools like h8mail and BreachCompilation to see if past breaches occurred and if the victim recycles credentials. You can also set up a phishing campaign to gather more information, harvest credentials, and get up to even more shenanigans. The limit is your own mind.
P.S. I might make a part 2 of this... maybe. I might go over using h8mail and other tools to get even more information on a target or a list of targets, but who knows. Depends on motivation. o.O