Dark Side 102: Google Dorking
Geeking out over “Google Dorking”.

In this part, I’ll be going over what I’ve learned about Google Hacking (also called Dorking). Google is driven by data analytics so it’d be silly for them not to build in operators that can help us techies optimize our searches.
Research is one of the primary OSINT techniques, whether it’s done on Google, on another search engine, or within a specific site. Being able to home in on specifics and ignore the rest of the noise is important when gathering information on whatever target you’re researching.
In order to learn how to leverage Google Dorking for our OSINT scavenges, we first need to understand how search engines work. Here are some of the basics:
- Crawlers
- Robots.txt
- Sitemaps
- Search Engine Optimization
Search engines leverage crawlers to gather data. A crawler gathers things like keywords within a domain and then indexes that information for use in the search engine. For example, a website whose pages contain the words “Apple” and “Banana” would have those keywords sent to the search engine. This is the basis for the results someone gets when they key in one of those words.
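To make the indexing idea concrete, here is a minimal sketch of a keyword index in Python. It is a toy model rather than how Google actually works: the page URLs and text are made up, and `build_index` and `search` are hypothetical helper names.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase keyword to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        # Split the page text into simple alphanumeric keywords.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, keyword):
    """Return the sorted list of URLs indexed under a keyword."""
    return sorted(index.get(keyword.lower(), set()))


# Hypothetical pages a crawler might have visited (assumption for illustration).
pages = {
    "https://fruit.example/": "Apple Banana Carrot",
    "https://tech.example/": "Apple laptops and phones",
}

index = build_index(pages)
print(search(index, "Apple"))
print(search(index, "banana"))
```

Searching for “Apple” returns both pages while “Banana” returns only one, which is why a single broad keyword rarely narrows things down much.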

Obviously, this is a very simple example, as we all know simple searches of “Apple” or “Banana” would yield broad and useless results. This is exactly why companies leverage Search Engine Optimization (SEO) to ensure their site comes up on popular searches. Additionally, some of the operators I’ll go over in a bit can help researchers narrow the scope of their searches by leveraging keywords and things like file types or page titles.
On any given website, there may be files we don’t want search engines to index, like .ini or .conf files. To define what crawlers may and may not index on our website, we use the robots.txt file. Keep in mind that robots.txt is a request that well-behaved crawlers honor, not an access control.
This file uses the simple syntax of Allow: and Disallow: to specify what a given User-agent can crawl. For example, a file that gives Googlebot “Allow: /” and msnbot “Disallow: /” lets Googlebot index everything and gives msnbot no access at all. As a result, MSN would have little or nothing to show for this site, because its crawler isn’t permitted to index any of the information within it.
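The Googlebot and msnbot rules described above can be checked with Python’s standard-library `urllib.robotparser`. This is a small sketch: the robots.txt content is written out by hand to match the description, and `example.com` is a placeholder domain.

```python
from urllib.robotparser import RobotFileParser

# Hand-written robots.txt matching the description above (assumption:
# the original example used rules equivalent to these).
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: msnbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Ask whether each crawler is permitted to fetch a page on the site.
googlebot_ok = parser.can_fetch("Googlebot", "https://example.com/page.html")
msnbot_ok = parser.can_fetch("msnbot", "https://example.com/page.html")

print("Googlebot allowed:", googlebot_ok)
print("msnbot allowed:", msnbot_ok)
```

A polite crawler runs a check like this before requesting a page. Nothing forces a crawler to do so, which is why sensitive files should not rely on robots.txt alone.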

Sitemaps are exactly what they sound like: maps of websites. They aren’t as appealing to the eye as geographical maps, but they provide useful information, especially for search engines. A sitemap outlines a site’s main pages, their subpages, and the content in a hierarchical view, which makes it easier for search engines to index that information. For each page, a sitemap can contain the following:
- URL location
- Last modified date
- Change frequency
- Page priority
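In the sitemaps.org XML format, these four fields correspond to the `<loc>`, `<lastmod>`, `<changefreq>`, and `<priority>` elements. Below is a minimal sketch that reads them with Python’s standard library. The sitemap content, URLs, and dates are invented for illustration, and `parse_sitemap` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Made-up sitemap for a placeholder domain (assumption for illustration).
SITEMAP_XML = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-01-15</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://example.com/blog/</loc>
    <lastmod>2024-01-10</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
"""

# Sitemap elements live in this XML namespace.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_sitemap(xml_text):
    """Return one dict per <url> entry; optional fields are None if absent."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        entries.append({
            "loc": url.findtext("sm:loc", namespaces=NS),
            "lastmod": url.findtext("sm:lastmod", namespaces=NS),
            "changefreq": url.findtext("sm:changefreq", namespaces=NS),
            "priority": url.findtext("sm:priority", namespaces=NS),
        })
    return entries


entries = parse_sitemap(SITEMAP_XML)
for entry in entries:
    print(entry["loc"], entry["lastmod"], entry["changefreq"], entry["priority"])
```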

Lastly, SEO is difficult to define, but it is basically the practice of making your site stand out amongst the rest in search results. Optimizing a website is important to ensure it is properly indexed and therefore comes up in the search results when it should. There are companies that provide SEO as a service to help businesses optimize their websites. If you don’t want to pay, you can do it yourself with an SEO analyzer, which scans a given page and reports which SEO checks it passed or failed. Here are two you can try out:
Now that we know how search engines work, we can look at some of the operators that can be used to narrow our searches.
filetype:
Use this to specify the type of file you are looking for, for example “filetype:pdf” or “filetype:doc”.
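filetype can also be leveraged to uncover sensitive files that were indexed by accident, such as configuration or log files. Exposed directory listings, on the other hand, are usually found by searching for “index of” in the page title rather than by file type.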
cache:
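cache can be used to view the cached copy a search engine has stored for a specific page. Much of the time, this is used for troubleshooting a slow or unresponsive webpage, or for analyzing how a website is being indexed. Note that Google has since retired its cached-page feature, so this operator may no longer return results there.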
intitle:
This operator restricts results to pages whose title contains a given word. If I wanted to return articles about a current event, but only ones with a specific word in the title, I could do so with intitle. Note that there is no space between the operator’s colon and the search term.
- intitle:cyberattack
I could also add a keyword at the beginning to further narrow the scope:
- Fox News intitle:cyberattack
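If you find yourself combining operators often, a small helper can assemble the query for you. The sketch below is one possible approach, not an official Google API: `build_query` and `search_url` are hypothetical names, and the URL simply uses the public `https://www.google.com/search?q=` form you see in the browser address bar.

```python
from urllib.parse import urlencode


def build_query(keywords="", **operators):
    """Join free-text keywords with operator:value pairs, e.g. intitle:cyberattack."""
    parts = [keywords] if keywords else []
    for name, value in operators.items():
        # Quote multi-word values so the whole phrase stays attached to the operator.
        if " " in value:
            value = f'"{value}"'
        # No space after the colon, otherwise the operator is treated as plain text.
        parts.append(f"{name}:{value}")
    return " ".join(parts)


def search_url(query):
    """Build a browser-style Google search URL for the query (assumed URL form)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_query("Fox News", intitle="cyberattack")
print(query)
print(search_url(query))

combined = build_query("security", filetype="pdf", intitle="annual report")
print(combined)
```

Passing operators as keyword arguments keeps each operator and its value together, and the helper takes care of the no-space rule and of quoting phrases.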

Try these out for yourselves! Get creative in your everyday searching and see how you can combine some of these search operators to get more relevant results or specify the types of results you want to see. I hope this article was a helpful 10,000-foot view of Google Dorking that you can use to get started on optimized searching!