The idea of using Google as a hacking tool or platform certainly isn’t novel, and hackers have been leveraging this incredibly popular search engine for years. Google Dorks had their roots in 2002 when a man named Johnny Long started using custom queries to search for elements of certain websites that he could leverage in an attack. At its core, that’s what Google Dorks are – a way to use the search engine to pinpoint websites that have certain flaws, vulnerabilities, and sensitive information that can be taken advantage of. As a side note, some people refer to Google Dorks as Google Hacking (they’re more or less synonymous terms).
Google Dorking is a technique hackers use to find information that has been accidentally exposed to the internet, such as log files containing usernames and passwords, or unsecured cameras. It usually works toward a specific target gradually: we start by collecting as much data as we can with general queries, then narrow down with more complex ones.
A Google Dork is basically a search string that uses advanced search operators to find information that is not easily available on a website. It is also regarded as an illegal Google hacking activity when hackers use it for purposes such as cyber terrorism and cyber theft.
Can Google be used by Hackers to hack websites?
People often think of Google as just a search engine for finding text, images, videos, and news. In the infosec world, however, it plays a much larger role: Google can also be a very useful hacking tool.
You cannot hack websites directly using Google. But its tremendous web-crawling capabilities mean it indexes almost anything within a website, including sensitive information. This can range from usernames and passwords to general vulnerabilities you might not even know about.
Basically, with Google Dorking you can find vulnerabilities in web applications and servers using nothing more than the native Google search engine.
Special google search operators
Before starting with Google Dorks, you need a basic understanding of a few special Google search operators and how they function.
- intitle – Asks Google to show pages that have the term in their HTML title.
- inurl – Searches for the specified term in the URL. Example: inurl:register.php
- filetype – Searches for a certain file type. Example: filetype:pdf will search for PDF files.
- ext – Works similarly to filetype. Example: ext:pdf finds files with the pdf extension.
- intext – Searches the content of the page. This works somewhat like a plain Google search.
- site – Limits the search to a specific site only. Example: site:example.com will limit the search to example.com.
- cache – Shows the cached version of a website. Example: cache:aa.com
- * – Works like a wildcard. Example: "how to * sites" will show results such as "how to design/create/hack ... sites".
Basic Formula of Dork
“inurl” = input URL
“domain” = your desired domain ex. .gov
“dorks” = your dork of your choice
Fundamentals of Google Dorking
There are seven fundamentals of Google Dorking. These are simply ways of using Google with advanced techniques: seven main query types that make up the basic structure of Google Dorking. We will now see, one by one, how hackers (black/grey/white hat) use these queries to get information related to an organization or even an individual.
Google Dorking is not hacking itself. Google Dorking is a technique that comes in handy in one of the phases of hacking, i.e., Information Gathering, and this is the most important phase of hacking. There are five phases of hacking, i.e., reconnaissance, scanning, gaining access, maintaining access, and clearing tracks. Google Dorking is used in the starting phases where hackers try to get all the information linked to any specific organization or an individual. After getting all information then hackers pick out the information they need for the next phases.
Captcha Issue while using Google Dork
Because Google can be used to disclose other people’s information, and that information can be used for the wrong purposes, many black hat hackers have put bots online to crawl websites, find weaknesses in pages, and send the information back to their servers. To curb this, Google has introduced a captcha into the process. You will need to enter a captcha almost every time you use a dork. This way, Google stops bots from using it for illegal purposes.
Understanding Google Dorks Operators
Just like in simple math equations, programming code, and other types of algorithms, Google Dorks has several operators that aspiring white hat hackers need to understand. There are far too many to include in this guide, but we will go over some of the most common:
- intitle – This allows a hacker to search for pages with specific text in their HTML title. So intitle:"login page" will help a hacker scour the web for login pages.
- allintitle – Similar to the previous operator, but only returns results for pages that meet all of the keyword criteria.
- inurl – Allows a hacker to search for pages based on the text contained in the URL (i.e., “login.php”).
- allinurl – Similar to the previous operator, but only returns matches for URLs that meet all the matching criteria.
- filetype – Helps a hacker narrow down search results to specific files such as PHP, PDF, or TXT file types.
- ext – Very similar to filetype, but this looks for files based on their file extension.
- intext – This operator searches the entire content of a given page for keywords supplied by the hacker.
- allintext – Similar to the previous operator but requires a page to match all of the given keywords.
- site – Limits the scope of a query to a single website.
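To see how these operators fit together, here is a minimal Python sketch that assembles operator:value pairs into one query string. The function name build_dork and the example.com domain are made up for illustration; they are not part of Google or of any library.

```python
def build_dork(terms="", **operators):
    """Join operator:value pairs and free-text terms into one query string."""
    parts = []
    for name, value in operators.items():
        # Quote values that contain spaces so they stay a single phrase.
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{name}:{value}")
    if terms:
        parts.append(terms)
    return " ".join(parts)


# Hypothetical example: login pages on a domain you are allowed to test.
print(build_dork(intitle="login page", site="example.com"))
# intitle:"login page" site:example.com
```

Quoting multi-word values keeps them together as one phrase, which is how the intitle example in the list above is written.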
Google not only lists current versions of web pages, it also stores previous versions of websites in its cache, and those pages can sometimes give you a lot of information about the technology the developers are using. The cache can also disclose information that was initially used for testing purposes only and was removed in later versions but is still viewable in the version that Google has cached.
Its syntax is “cache:website address”. For example, let’s use the cache command for a random website and see the results. Results may vary from time to time as we see updates from google as well.
As for results, we have multiple responses that can gather further information related to that website.
We can also use this search query to highlight some keywords in our search results. Let us suppose that we want to highlight the “flex” word in our research, and then we will write the query as follows:
“cache:https://flexstudent.nu.edu.pk/Login flex.” It will highlight this keyword in the results.
intext & allintext Command
The intext command is used to find specific text within the search result from the webpage. Intext can be used in two ways. The first is to get a single keyword in the results and the second way is to get multiple keywords in the search. To accomplish the first task, the syntax for the command is
To accomplish the second task, we use allintext instead of intext. And we separate the keywords using a single space. If we use allintext, then google will add all those pages in the result with all the keywords in their text mentioned in the query. If a web page has some missing keywords, it will be discarded from the results, and the user will not see that web page. That is why these commands are used with proper keywords so that necessary information is not discarded.
Let’s say we want to find out some pages having information related to usernames and passwords, and then we will write the query as follows:
The results are as follows:
As you can see, all the returned pages contain both username and password. That is because of the query we used: it returns only those pages that have both keywords in them.
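The difference between the two commands can be sketched in a few lines of Python. This is only a simplified model under our own assumptions (case-insensitive substring matching on made-up page text), not how Google actually matches or ranks pages.

```python
def matches_intext(page_text, keyword):
    """intext-style check: the single keyword appears in the page text."""
    return keyword.lower() in page_text.lower()


def matches_allintext(page_text, keywords):
    """allintext-style check: every keyword must appear in the page text."""
    text = page_text.lower()
    return all(keyword.lower() in text for keyword in keywords)


# Made-up pages for illustration only.
pages = {
    "a.html": "Enter your username and password to continue",
    "b.html": "Forgot your password?",
}

hits = [name for name, text in pages.items()
        if matches_allintext(text, ["username", "password"])]
print(hits)  # ['a.html']
```

The second page is dropped because it is missing one keyword, which is the discarding behavior described above.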
Filetype is one of those seven famous fundamentals of google Dorking as it helps in filtering out a large number of files. It can filter pdf files for you. It can even filter log files for you. Log files are very useful for collecting information related to an organization as these are the files that keep track of all the events that happen in an organization. If we want to get access to simple log files, we can write this command: filetype:log, and it will give us all types of log files, but this cannot be of much help until and unless we try to narrow down our search with some filters.
Let us make it more specific by specifying that we want those files with usernames and passwords. For this purpose, we will modify our query like this:
It will display those results that have usernames and passwords mentioned in them. If these files belong to any server, one cannot imagine how much damage they can cause.
Opening a random file from the results of this query gives the following:
As you can see, it may not have any meaning for beginners, but it may play an important role in information gathering related to a company or a server. This information can be the key to many new adventures.
Looking at another file on the internet, we may end up having usernames and passwords as well.
You can use this technique to narrow down the results to some specific user.
First, you will get log files using this query, and then you can easily find the required username after searching through that document.
The intitle command is used when we want to filter documents based on the titles of HTML pages. HTML page titles usually contain the keywords that define the whole document; they summarize what the document describes. We can use this feature to get exactly what we want. Suppose we are looking for documents that contain information related to IP cameras; we then write a query that tells Google to filter all pages based on the provided argument.
The basic syntax to use this command is as follows:
We also have the option of using multiple keywords to get more precise results. To use multiple keywords, we write each one in its own quotation marks. Google gets all the pages first and then applies the filters to the results. Web pages that do not have the provided keywords in their title are discarded. The syntax for using this command is as follows:
allintitle:"ip camera" "dvr"
Below is the result of this query. You can see that it has shown us all those pages that have both these keywords in their title. We can use this technique to filter our results very effectively.
The inurl command works the same way as intitle. The difference is that inurl filters documents based on the text of the URL, since URLs often contain keywords that describe the page. Again, suppose we are looking for documents that contain information related to IP cameras. We write a query that tells Google to filter all pages based on the provided argument. We also have the option of using multiple keywords to get more precise results.
The basic syntax to use this command is as follows:
We have another command that is very useful when we want to search for a specific entity. At first, we make our search criteria broad and collect information that may or may not be related to our needs. After getting enough for a starting point, we start narrowing down our search using other commands. For example, suppose we want to buy a car, and we were searching about cars introduced later in 2020. After getting a list from the results, we studied the pages and found that Honda and Ford are reliable. Now our next step would be to gather information about these cars from authentic websites. So here comes the use of site command. Now, we will narrow down our search to some specific websites only.
It will give us results from this website only. Similarly, if we now want to search for Ford, we only need to change the website address to get our results.
Sometimes, we want to search for documents that are of a specific type. For example, we want to write an article about “phishing detection.” We cannot just start writing about it unless we first do our research on it. Research articles are mostly published in pdf formats. Now, if we want to read previous research that has been done on this topic, we would add another dork in our command, which is called ext. Ext is a command that is used to specify file extensions. It works like a filetype command. If we modify our previous search, which we did about ford cars, we may now want to look for only pdf files, and then we will write the query as follows:
From the results below, you can see that we now have only pdf files as our results.
More Sample Examples
Suppose we want to access an FTP server. We can mix queries to achieve what we want.
Finding FTP servers
The syntax is: intitle:"index of" inurl:ftp
It will find all the index pages related to an FTP server and show the directories.
After getting results, we can check different URLs for information.
Sometimes we can even see source code that should not be public. The image attached below cannot be considered confidential, but the procedure for this activity is the same. We recommend not downloading files from unsecured servers, as they may already have been compromised.
Accessing Online Cameras
Now, as we have read a lot about these dorks, we may come across something that should not be accessed because it may hurt someone’s privacy. The purpose of this activity is to spread the word that we need to take our privacy seriously. People nowadays are putting CCTV cameras in place to make them secure, but they are not making those cameras safe. They are even doing it worse by making them public. Below are some screenshots of cameras that are public, and anyone can see what is going on there.
You can see that these people are more vulnerable now because people can keep track of their activities easily.
Some more examples:
Even I cannot post more than that. People are even exposing their homes which is not ethical to see even if we access it.
How not to get Dorked!
For regular individuals, Google Dorks (or Google hacking) only scratch the surface. Indexing can uncover pictures, videos, ISO images or other file types, and even cached versions of a website. Vulnerable data is easily exposed through Google Dorking, which can in turn lead to hacking or penetrating the website itself. To prevent attackers from fingerprinting vulnerable applications this way, we need to understand how these attacks work in the wild.
Although it’s pretty easy to expose some data using Google Dorks, prevention is not that tough either. There are a couple of things we can do to stay safe from Google hacking.
IP-based Restriction: IP-based restrictions should be given priority. Wherever possible, use two-factor authentication, encryption, IP restrictions, and strong passwords to stay safe from Google Dorking.
Vulnerability scans: Vulnerability scans can be a website owner’s best friend. There are many loopholes to cover, and they can easily be overlooked. Some vulnerability scanners include checks aimed specifically at content exposed to Google Dorks, which is highly beneficial.
Google search console: The Google search console can help remove sensitive content such as payment pages, user information, and insider data. If those are required to run the platform, they can be indexed and, as a result, could be prone to such attacks. Google search console helps to remove them from the search query database.
Be the hacker to keep them away: A great way to find anything exposed is to run dork queries yourself. It is one of the best methods, and some website owners even pay others to run dork queries on their website. Because it can get complicated quickly, running queries first on the whole platform and then on specific sensitive parts can reveal whether it is safe or not. Measures can be taken afterward.
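As a rough sketch of such a self-check, the Python snippet below expands a few dork templates for a domain you own. The template list, the audit_queries name, and example.com are assumptions for illustration; the operators are the ones covered in this article.

```python
# Hypothetical starting set; extend it with the dorks that matter for your stack.
AUDIT_TEMPLATES = [
    'site:{domain} intitle:"index of"',
    "site:{domain} filetype:log",
    "site:{domain} filetype:env",
    "site:{domain} inurl:admin",
]


def audit_queries(domain):
    """Return the template dorks scoped to a single domain via site:."""
    return [template.format(domain=domain) for template in AUDIT_TEMPLATES]


# Print the queries so they can be pasted into the search box one by one.
for query in audit_queries("example.com"):
    print(query)
```

Scoping every query with site: keeps the check limited to your own platform rather than the whole web.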
Dorking types: Commonly used methods of Google Dorking include website URL Dorking and sequence-based Dorking.
robots.txt: Another way to keep sensitive information out of Google Dorking queries is the robots.txt file. Robots.txt tells compliant crawlers which directories not to crawl, which keeps private paths out of the index. However, it also signals that sensitive data lives at those paths, so it can turn out to be a double-edged sword.
To set up robots.txt and freely index internal content:
File access restriction:
Restrict access to dynamic URL with ‘?’ symbol:
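As an illustration, the sketch below uses Python’s standard urllib.robotparser module to check which URLs a made-up robots.txt would ask crawlers to skip. The /admin/ and /backup/ paths and example.com are hypothetical. Keep in mind that robots.txt is only a request to well-behaved crawlers and does not restrict access by itself, and that this parser does simple prefix matching, so wildcard rules are not shown here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /backup/
"""


def is_crawlable(url, robots_txt=ROBOTS_TXT, agent="Googlebot"):
    """Return True if the given robots.txt lets the agent crawl the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


print(is_crawlable("https://example.com/admin/login.php"))  # False
print(is_crawlable("https://example.com/index.html"))       # True
```

Anything that must stay private still needs real access control, since the file itself is public and lists the paths you would rather hide.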
Testing common keywords: One of the simplest and most common Google dork keywords is site. We can use it to narrow the findings down to a particular website. In our case, we used the site keyword to see Google’s cached pages from hackingloops.com. It is all fun and games, but to look further into the details and block the outcome of Dorking, we can use tools such as Gooscan, Wikto, or Sitedigger. However, we only recommend trying these on your own sites or on sites made specifically for testing.
Security camera dork prevention: We already talked about how Google Dorking can give access to vulnerable security cameras (live CCTV footage). An attacker may also gain control over camera functions such as movement, zooming in and out, resolution, and refresh rate. While it can be intriguing to observe this kind of thing anonymously, an exposed camera is highly likely to leak your own privacy and security information.
To view open CCTV, type in
inurl:top.htm inurl:currenttime
The command will list publicly reachable CCTV cameras. To check the results, we can open them and view the GUI. The suggested fix is to secure those GUI pages so that they are not indexed. Also, some users leave the default network monitoring password in place; the danger is that default passwords are much the same across manufacturers and are easy to guess. So, if the network camera matters to you, change the default password to a stronger one. Also, disable remote login to the security system, as this too is often enabled by default.
XSS prevention: XSS protection is one of the core fundamentals that must be in place before a web page is even deployed to the public. Server-side sanitization is not a hard problem to solve, but it depends on the filtering function that blocks malicious input. Though it is quite hard to write a filter that blocks every XSS attack, sanitizing data, keeping cases simple, and restricting rich-text formatting to harmless tags such as <b> or <i> can solve the issue.
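As a minimal sketch of server-side output escaping, Python’s standard html.escape converts characters such as < and > into entities before user input is placed in a page. The render_comment function is a hypothetical helper; a real application would normally rely on its template engine’s auto-escaping instead.

```python
import html


def render_comment(user_input):
    """Wrap user input in a paragraph, escaping HTML special characters first."""
    # html.escape replaces &, <, > and, by default, both quote characters.
    return f"<p>{html.escape(user_input)}</p>"


print(render_comment("<script>alert(1)</script>"))
# <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

Because the angle brackets are turned into entities, the browser shows the text instead of running it as a script.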
Personal data: Few things are as severe as losing personal data to someone you don’t know or who can cause harm. It’s as easy as using a simple command like
filetype:php inurl:list/admin/ intitle:"payment methods"
Simple queries like this can list personal and even bank information. To avoid that, encrypt and password-protect your data. Even if it’s not your data, there is a responsibility to protect customer information. A directory-level configuration such as .htaccess can protect your directory from Google crawlers.
GoogleDork: Because crawlers are essential for indexing data in the Google search engine, they will always look for new information. So data is prone to fall under Dorking unless you’re careful. Exploit-db.com maintains an updated list of dork commands, which is helpful for getting familiar with them.
Even if you’re only trying to stay safe from hackers, checking them will give you a rough idea of the latest tactics, so you know how to handle the situation beforehand. Although there are many more prevention techniques, adopting these will give solid protection against Google Dorking. For further prevention, expand upon the ideas mentioned throughout the article.
Google Dorks automation tools
As we mentioned, Google Dorking requires lots of keywords and patience to find a specific vulnerability. Searching for vulnerabilities on other people’s platforms is not recommended; bug-bounty programs exist specifically for that. Attend them! But for testing our skills, practicing in the wild is a must. Sending dork requests to Google can give fascinating results.
But if you think you can easily keep this up all day, you’re wrong. Google’s mechanism to stop Dorking is quite accurate. Sending many Dorking requests may get the IP blocked in rare cases, but in the common case users get a captcha after a number of searches. Attackers and hackers both use Google dorks, but to make things easier they use automation tools. There are many of them, such as Zeus, xgdork, Dorkme, Bingoo, GoogD0rker, gD0rk, M-dork, Gooscan, and many more. These tools are just a few clicks away and help automate Google Dorking.
Black hat, white hat, or red hat, everyone keeps some of these tools in their arsenal for finding vulnerabilities quickly and easily. Because Google strictly limits the number of queries sent per IP address, these tools typically rely on proxies. Python is the most popular scripting language for automation. A simple Selenium-based script can automate a browser to probe a website or collect its data. Python automation scripts combined with proxies and Google Dorking tools are therefore a dangerous mix.
The purpose of Google Dorking should be to use these tricks to make people and yourself secure. If you are reading this, it means you have at least some interest in cybersecurity. It is the responsibility of every individual to use this information for good, which should be the final goal.
To get more knowledge about complex commands, you can refer to Github. People have written complex commands by combining two or more dorks for accurate results. In the end, it is all about practice.
Custom Crafting Google Dork Queries
Now that we have a basic understanding of some of the operators and how Google Dorks can scour the web, it’s time to look at query syntax. The following is the high-level structure of Google Dorks that targets a specific domain:
“inurl: domain/” “additional dorks”
A hacker would plug in the desired parameters as follows:
- inurl = The URL of a site you want to query
- domain = The domain for the site
- dorks = The sub-fields and parameters that a hacker wants to scan
If a hacker wishes to search by a field other than the URL, the following can be effectively substituted:
These options will help a hacker uncover a lot of information about a site that isn’t readily apparent without a Google Dork. These options also offer ways to scan the web to locate hard-to-find content. The following is an example of a Google Dork:
1. Explore LOG Files For Login Credentials
This is a process for finding .LOG files accidentally exposed on the internet. Such a log file can contain clues about what the credentials to the system might be, or about the various user and admin accounts that exist in the system.
Search query to perform the action
allintext:password filetype:log after:2019
When you enter this command in the Google search box, you will find a list of applications with exposed log files.
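If you want to open such a query from a script rather than typing it, the query only needs to be URL-encoded into the q parameter of the search URL. The sketch below does that with Python’s standard library; search_url is a name we made up, and it only builds the link without sending any request.

```python
from urllib.parse import urlencode


def search_url(query):
    """Build a Google search link for the given query string."""
    # urlencode percent-encodes characters such as ':' and turns spaces into '+'.
    return "https://www.google.com/search?" + urlencode({"q": query})


print(search_url("allintext:password filetype:log after:2019"))
```

Encoding matters because operators contain colons and quotes that would otherwise be misread as part of the URL.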
Dork command using two google operators
You can also combine two Google operators, allintext and filetype.
The above command will show you all the results that include a username inside *.log files.
Website owners must properly configure a file named robots.txt. It is a must for keeping Google Dorks from reaching important data on your site through a Google search. It is also very important to keep plugins up to date.
2. Explore Configurations Using ENV files
.env files are used by various popular web development frameworks to declare general variables and configuration for local as well as dev environments.
DB_USERNAME filetype:env DB_PASSWORD filetype:env
Using this command, you can find a list of sites that expose their env file publicly on the internet. Many developers put their .env file in the website’s main public directory, which can cause great harm to their site if it gets into the hands of cyber criminals.
If you click into any of the exposed .env files, you will notice that unencrypted usernames, passwords, and IPs are directly exposed in the search results.
Move .env files to somewhere that is not publicly accessible.
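A quick way to check your own project is to look for .env files under the directory your web server publishes. The sketch below assumes a web root path that you supply and simply lists any file whose name starts with .env; the exposed_env_files name is hypothetical, and the demo builds a throwaway directory to stand in for a real site.

```python
import tempfile
from pathlib import Path


def exposed_env_files(web_root):
    """Return .env-style files found anywhere under the public web root."""
    root = Path(web_root)
    return sorted(p for p in root.rglob(".env*") if p.is_file())


# Demo on a temporary directory standing in for a public web root.
with tempfile.TemporaryDirectory() as tmp:
    public = Path(tmp) / "public"
    public.mkdir()
    (public / "index.html").write_text("<h1>hello</h1>")
    (public / ".env").write_text("DB_PASSWORD=example")
    for path in exposed_env_files(public):
        print("move out of web root:", path.name)
```

Any file this reports is one that a crawler could reach if the server is willing to serve it.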
3. Explore Live Cameras
This sounds a bit creepy but have you ever wondered if your private live camera could be watched by anyone on the internet?
Using Google hacking techniques, you can fetch live camera web pages that are not restricted by IP. If you are creative enough with Google Dorks, you can not only view the feed but also take control of the full admin panel remotely, and even reconfigure the cameras as you want.
Using “top.htm” in the URL together with the current time and date, you can find a list of live cams that are publicly exposed.
Another dork for cameras returns a list of common live-view pages hosted on routers.
4. To Explore Open FTP Servers
Missing access permissions on an FTP server can directly cause internal information to be published unintentionally. Even more dangerous, if the FTP server is writable, it risks being used as “storage” for computer viruses and illegally copied files.
With the following dork command, you can easily explore publicly exposed FTP servers, which can sometimes expose many things.
intitle:"index of" inurl:ftp
To search for a list of websites that use the HTTP protocol, you can simply type the following dork command.
intitle:"index of" inurl:http after:2018
You can also be more specific and search for online forums that use HTTP by simply changing the text in the search title.
intitle:"forum" inurl:http after:2018
5. Explore Specific websites with specific domains
Let’s say you want to explore the websites of a certain organization, or all sites under a certain domain. You can simply do that by entering the following code:
You can use the above example to explore the list of government sites. You can also replace inurl: with other Google search operators for interesting results.
6. View most recent Cache
This can show you the most recent cache of a specified webpage. This can be useful to identify when a page was last crawled.
How Can Google Dorks Help Cyber Security Enthusiasts?
Google indexes almost everything connected to the internet, including private information from misconfigured services. This can be useful and equally harmful at the same time. Make sure you do not log in to any of these services, even if the password is exposed, as this could get you into trouble because you don’t have permission.
However, if you have something hosted online, you can run some of the dork commands against your own domain to make sure you have not left anything exposed that a hacker could use against you.
Making Effective Use of Operators
It may seem a little cryptic at first, so let me provide a few examples that show how the different operators can locate content and website data. A user can make effective use of the intitle operator to find anything on a website. Perhaps they are scraping email addresses and want to scan sites for the “@” symbol, or maybe they are looking for an index of other files.
Furthermore, the intext operator can be used to scan individual pages for any text you want, such as a target’s email address, name, the name of a web page (like a login screen), or other personal information to collect data about them.
The more you practice, the further you’ll be able to hone your queries to pinpoint different types of websites, pages, and vulnerabilities. Again, I need to caution you not to use these queries to attack another website because that would be illegal and could get you into a lot of trouble. Still, Google Dorks are a great way to locate hidden information on the web, which is why hackers love to use them to find security flaws in websites.
If you want to dig into some more queries, there are some great Google Dork resources on the web.
In the second part of the tutorial, I will show you more complicated formulas as well as tips and tricks to find web vulnerabilities.