Google is an incredibly useful database of indexed websites, but querying Google doesn't search for what you type literally. The algorithms behind Google's searches can lead to a lot of irrelevant results. Still, with the right operators, we can be more exact while searching for information that's time-sensitive or difficult to find.
If you've ever searched for the answer to a programming question and found yourself buried in results that don't work, or tried to look up someone with the same name as someone famous, you may have experienced some of the shortcomings of Google. Because Google by default doesn't search for the literal words you type, it's common to end up with a ton of unrelated results.
There are a lot of things Google does well, but looking for certain types of answers exposes these flaws pretty clearly. If we're trying to figure out why we're getting a Python error, we can see immediately why the search results might not be useful.
Yikes! For a search about a library that is actively updated, an article from 2014 has a meager chance of still working as described in 2019.
Another problem comes up when we start trying to search for programming and technical terms that may have other, more universal meanings. If what we're looking for is the standard list feature of C++, the query would be "std :: list". Searching for this produces many unpleasant and unrelated results as Google ignores the "::" and returns a list of sexually transmitted diseases.
Staring at a screen full of confusing results can be daunting. Still, we can clean up these searches by understanding the way Google finds information and using operators to zero in on precisely what we want to find. Using these operators can dramatically reduce the time needed to find the result you're trying to dig up.
For this guide, you'll only need a browser connected to the internet and access to Google. We'll be locating information about targets that are difficult to search for, so this should work on any operating system provided you can run Google searches.
Below, you can see a list of the search operators we'll be using to dig through the data.
By using the right operator for the right problem, we can cut down the amount of time we have to spend looking at irrelevant results. You can find a list of Google's documented search operators here.
When you're searching for answers about software, one of the most important things to consider is time. The publication date matters so much when judging whether an answer is useful that it's often not worth including results that fall too far outside the helpful range.
Consider searching for answers about Python, a common programming language. Python is continuously changing and being updated, and there are multiple versions in use today. As a result, information about it that was published a decade ago would be extremely out of date and most likely inaccurate — especially if you're using a more recently released version.
The first thing to consider when searching for the answer to a technical question is how old an article can be before it stops being useful. By setting a filter to omit anything older than that, we can easily limit our searches to relevant results. Conversely, if we need an answer for an old library or an older version of the software we're working with, we can limit our search to results published before a specific date, such as when the newer version was released.
There are two ways of doing this. The first is to click "Tools," open the "Any time" menu, and select "Past year." The second is to specify a date and find results published before or after it. The format for this is before:date and after:date, where the date can be a year such as 2018 or a full date such as 2018-06-01. You can see an example of the "any time" and "before" options in use below.
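If you run these searches often, the date operators can be assembled in a few lines of Python. This is only a minimal sketch: the function names dated_query and search_url and the search terms are made up for illustration, and it assumes the usual google.com/search?q= URL form.

```python
from datetime import date
from typing import Optional
from urllib.parse import urlencode


def dated_query(terms: str,
                after: Optional[date] = None,
                before: Optional[date] = None) -> str:
    """Append after:/before: operators to a plain search string."""
    parts = [terms]
    if after is not None:
        # isoformat() yields YYYY-MM-DD, e.g. 2018-01-01
        parts.append(f"after:{after.isoformat()}")
    if before is not None:
        parts.append(f"before:{before.isoformat()}")
    return " ".join(parts)


def search_url(query: str) -> str:
    """Build a search URL; assumes the query goes in the 'q' parameter."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Hypothetical search terms, used only to show the output shape.
query = dated_query("python scapy sniff error", after=date(2018, 1, 1))
print(query)
print(search_url(query))
```

Passing both after and before gives a window, which is handy when you need answers that match one specific release of a library.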
The date of publication is only one part of the puzzle. We can also chain operators together to specify the source of our data. If we search for "Scapy_Exception", the first result is out of date, and others are from sources that may not be reputable.
Let's say we only want answers from high-quality sources, or at least sources we expect won't produce garbage. We can add as many sites as we want with the site: operator, using the OR operator to chain them together.
site:stackoverflow.com OR site:stackexchange.com OR site:github.com OR site:gitlab.com after:2018 "Scapy_Exception"
By adding after:2018 to the string, we'll only find results on these websites that were published after 2018.
Now, the results we see are from the sources we want and limited to dates that are useful.
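For a reusable list of trusted sources, the same query can be generated rather than typed out each time. The sketch below uses a hypothetical helper, trusted_sources_query, that joins the site: filters with OR and quotes the search phrase; it simply reproduces the string format used above.

```python
from typing import Iterable, Optional


def trusted_sources_query(phrase: str,
                          sites: Iterable[str],
                          after: Optional[str] = None) -> str:
    """Restrict an exact-phrase search to a list of sites."""
    parts = []
    # Chain the site: operators together with OR.
    site_filter = " OR ".join(f"site:{s}" for s in sites)
    if site_filter:
        parts.append(site_filter)
    if after:
        parts.append(f"after:{after}")
    # Quotes ask for the phrase exactly as written.
    parts.append(f'"{phrase}"')
    return " ".join(parts)


sites = ["stackoverflow.com", "stackexchange.com", "github.com", "gitlab.com"]
print(trusted_sources_query("Scapy_Exception", sites, after="2018"))
```

Keeping the site list in one place makes it easy to add or drop a source without rewriting the whole search.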
Let's say we need to remove unwanted search results, using our example of std :: list from before. The simplest way to do this is with the - operator, which we can use to eliminate results that contain key phrases that aren't in the result we want.
When we are dealing with acronyms, the most effective way of doing this is to remove results that contain words belonging to the wrong interpretation. For example, adding a simple -transmitted is enough to clean up the search from earlier.
We can also clean up these results by excluding the websites that produce most of the irrelevant hits, which we do by placing the - operator in front of a site: operator. Here, we can get similar results by removing the top three sites giving us irrelevant results.
Both methods are effective for removing results that are cluttering up your search.
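Both kinds of exclusion can be expressed with one small helper. This is a sketch under assumptions: the exclude function is hypothetical, and the example.com and example.org domains are placeholders, not the actual sites from the search above.

```python
from typing import Iterable


def exclude(query: str,
            words: Iterable[str] = (),
            sites: Iterable[str] = ()) -> str:
    """Add -word and -site: exclusions to an existing query."""
    parts = [query]
    for word in words:
        # Quote multi-word phrases so the minus covers the whole phrase.
        parts.append(f'-"{word}"' if " " in word else f"-{word}")
    for site in sites:
        parts.append(f"-site:{site}")
    return " ".join(parts)


# Drop results that mention a word from the wrong interpretation.
print(exclude("std :: list", words=["transmitted"]))

# Drop whole sites instead (placeholder domains).
print(exclude("std :: list", sites=["example.com", "example.org"]))
```

Excluding a word is usually the quicker fix, while excluding sites helps when the same few domains keep crowding the first page.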
We can search for files that might be interesting by combining the site: and filetype: operators, which may turn up files that weren't supposed to be made public. For gathering official documents, PDFs are a great format to try digging up.
Here, we search the domain spacex.com for any PDF files that mention the word "internal" to try to find documents that might give us clues into their internal procedures.
You can replace PDF with PPTX for PowerPoint, DOCX for Word files, and other formats you may find interesting. If you have a list of multiple domains to search, you can chain them together with the OR operator to search multiple websites for files.
While it isn't as easy as just throwing an operator into a standard search, you can always access these options in a graphical layout by navigating to Google's advanced search page.
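When there are several file formats or domains to check, it can help to generate one query per format. The following is a minimal sketch; file_search_queries is a hypothetical helper, and the exact query string is our guess at the search described above rather than a copy of it.

```python
from typing import Iterable, List


def file_search_queries(domains: Iterable[str],
                        filetypes: Iterable[str],
                        keyword: str) -> List[str]:
    """Return one site:/filetype: query per file format."""
    # Several domains are chained together with OR.
    site_part = " OR ".join(f"site:{d}" for d in domains)
    return [
        f"{site_part} filetype:{ft.lower()} {keyword}"
        for ft in filetypes
    ]


for q in file_search_queries(["spacex.com"], ["PDF", "PPTX", "DOCX"], "internal"):
    print(q)
```

Running one query per format keeps each search simple and avoids having to group several OR clauses in a single string.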
The advanced search page will allow you to use any combination of operators to create a structured search. It is mostly useful for reference, as you might only use a few of these operators for any specific search.
Two other useful options here are language and region. Depending on the research you are doing, they can narrow your results to a specific region or to documents written in a specific language.
By stringing together different search operators, it's possible to take a search full of irrelevant results and cut it down to the perfect answer. This skill is useful not only for hackers but for anyone needing to look up time-sensitive questions about technology or software. A more advanced version of this, Google Dorking, allows us to search for vulnerable systems by using these search operators to locate text strings on the exposed pages, which we'll explore in our next article on using Google for OSINT investigations.
I hope you enjoyed this guide to using Google search operators! If you have any questions about this tutorial on refining your online searches, leave a comment below, and feel free to reach me on Twitter @KodyKinzie.