1. Very, very, very very short introduction to Google web indexing.

Google uses a process called crawling (or fetching) to index new or updated pages. The program that does the crawling is called Googlebot (also known as a robot, bot, or spider). Googlebot uses an algorithmic process: computer programs determine which sites to crawl, how often, and how many pages to fetch from each site. Googlebot uses two types of crawling:

  • Deep crawl – when Googlebot fetches a page, it culls all the links appearing on the page and adds them to a queue for subsequent crawling. By harvesting links from every page it encounters, Googlebot can quickly build a list of links that can cover broad reaches of the web.
    Because of their massive scale, deep crawls can reach almost every page on the web. Because the web is vast, this can take some time, so some pages may be crawled only once a month.
  • Fresh crawl – to keep the index current, Google continuously rescans popular and frequently changing web pages at a rate roughly proportional to how often the pages change. Newspaper pages are downloaded daily, while pages with stock quotes are downloaded much more frequently. Of course, fresh crawls return fewer pages than the deep crawl.
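
The link-harvesting loop behind a deep crawl can be sketched in a few lines of Python. This is only an illustration of the queue idea described above, not how Googlebot is actually implemented: the PAGES dictionary is a made-up, in-memory stand-in for the web, and the names LinkCollector and deep_crawl are hypothetical.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": URL -> HTML body (no network access).
PAGES = {
    "http://example.com/": '<a href="http://example.com/a">A</a> <a href="http://example.com/b">B</a>',
    "http://example.com/a": '<a href="http://example.com/b">B</a> <a href="http://example.com/c">C</a>',
    "http://example.com/b": '<a href="http://example.com/">home</a>',
    "http://example.com/c": "no links here",
}


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def deep_crawl(start_url, pages):
    """Return URLs in the order a breadth-first crawl would fetch them."""
    queue = deque([start_url])
    seen = {start_url}
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        collector = LinkCollector()
        collector.feed(pages.get(url, ""))  # "fetch" the page from the fake web
        for link in collector.links:
            if link not in seen:  # queue each link only once
                seen.add(link)
                queue.append(link)
    return order


crawl_order = deep_crawl("http://example.com/", PAGES)
```

Starting from a single page, the queue grows with every page fetched, which is why one seed URL can eventually cover a large part of a site.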

2. Google Hacking – What is it then?

Google Hacking is the technique of using Google’s search engine to find vulnerable or sensitive data. To help refine search results we can use Advanced Search Operators and Special Search Characters.
Advanced operators are written as operator:search_term and include the following:

Operator     Purpose                                  Mixes with other operators?   Can be used alone?
intitle      Search page title                        yes                           yes
allintitle   Search page title                        yes                           yes
inurl        Search URL                               yes                           yes
allinurl     Search URL                               no                            yes
filetype     Search specific files                    yes                           no
allintext    Search text of page only                 yes                           yes
site         Search specific site                     yes                           yes
link         Search for links to pages                no                            yes
inanchor     Search link anchor text                  yes                           yes
numrange     Search numbers within a desired range    yes                           yes
daterange    Search in date range                     yes                           no


Character   Meaning
+           forced inclusion of something common
-           exclude a search term
" "         use quotes around search phrases
.           a single wildcard
*           any word
|           Boolean 'OR'
( )         parentheses group queries, e.g. ("master card" | mastercard)
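
To show how the operators and special characters combine into one search string, here is a small Python sketch. The helper names quote and build_query are made up for this illustration and are not part of any Google tool or API; the sketch only concatenates text and does not send anything to Google. It assumes the minus sign as the exclusion character.

```python
def quote(term):
    """Wrap a term in double quotes if it contains whitespace."""
    return f'"{term}"' if " " in term else term


def build_query(operators=None, phrases=(), any_of=(), exclude=()):
    """Assemble a search string from operator:value pairs and special characters."""
    parts = []
    for name, value in (operators or {}).items():
        parts.append(f"{name}:{quote(value)}")  # e.g. intitle:index.of
    parts.extend(f'"{p}"' for p in phrases)  # exact phrases in quotes
    if any_of:
        # Boolean OR group in parentheses
        parts.append("(" + " | ".join(quote(t) for t in any_of) + ")")
    parts.extend(f"-{quote(t)}" for t in exclude)  # excluded terms (assumed "-" prefix)
    return " ".join(parts)


listing_query = build_query({"intitle": "index.of"}, phrases=["parent directory"])
card_query = build_query(any_of=["master card", "mastercard"])
site_query = build_query({"site": "example.com"}, exclude=["www"])
```

Here listing_query is the string intitle:index.of "parent directory", and card_query is the grouped OR expression from the table above.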


3. Examples.

What can we find in Google? Ok, let’s look at a few examples:

Directory Listings

Directory listings provide a list of files and directories in a browser window instead of the typical text-and-graphics mix generally associated with web pages. Directory listings are often placed on web servers purposely to allow visitors to browse and download files from a directory tree. Many times, however, directory listings are not intentional, and there’s a good chance that an attacker may find something interesting inside one.


– basic query that returns a large number of false-positive results
But these queries return more interesting results:

intitle:index.of "parent directory"

or:

intitle:index.of name size
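
The extra terms in these queries work because a real directory listing has a recognizable shape. As a rough sketch, the same heuristic can be applied to a page you have already downloaded from your own server, for example to check whether it exposes a listing. The function name and both sample pages below are hypothetical.

```python
import re

TITLE_RE = re.compile(r"<title>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def looks_like_directory_listing(html):
    """Heuristic: title starts with 'Index of' and the page has typical listing text."""
    match = TITLE_RE.search(html)
    if not match or not match.group(1).strip().lower().startswith("index of"):
        return False
    body = html.lower()
    has_parent = "parent directory" in body
    has_columns = "name" in body and "size" in body
    return has_parent or has_columns


# Made-up sample of a server-generated listing.
LISTING = """<html><head><title>Index of /backup</title></head>
<body><h1>Index of /backup</h1>
<pre>Name  Last modified  Size
<a href="/">Parent Directory</a>
<a href="db.sql">db.sql</a>  2015-03-06 12:14  1.2M
</pre></body></html>"""

# Made-up ordinary page whose title merely starts with "Index of".
ARTICLE = (
    "<html><head><title>Index of refraction explained</title></head>"
    "<body><p>Optics basics.</p></body></html>"
)

listing_detected = looks_like_directory_listing(LISTING)
article_detected = looks_like_directory_listing(ARTICLE)
```

The second sample is the kind of false positive that a title-only match would return: the title fits, but the page has none of the listing text.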


Web Server Detection

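Directory listings often show the web server’s name and version at the bottom of the page. A security tester can use this information to determine the version of the web server, or to search Google for vulnerable targets. In addition, an exposed version banner suggests that the web server is not well maintained.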

intitle:index.of server.at

– This query focuses on the term “index of” in the title and “server at” appearing at the bottom of the directory listing.


intitle:index.of "Apache/2.4.7 Server at"

– This query will find servers with directory listings enabled that are running Apache version 2.4.7.
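
Once a listing page is in hand, the banner at the bottom can be parsed mechanically. The following Python sketch assumes a footer of the form "Apache/2.4.7 (Ubuntu) Server at example.com Port 80"; the optional platform note and the "Port" part are assumptions about a typical footer, and parse_banner and version_tuple are hypothetical helper names.

```python
import re

# Matches e.g. "Apache/2.4.7 (Ubuntu) Server at example.com Port 80".
BANNER_RE = re.compile(
    r"(?P<product>[A-Za-z][\w-]*)/(?P<version>\d+(?:\.\d+)*)[^<\n]*?\s+Server at\s+"
    r"(?P<host>\S+)\s+Port\s+(?P<port>\d+)"
)


def parse_banner(html):
    """Return product, version, host and port from a listing footer, or None."""
    match = BANNER_RE.search(html)
    return match.groupdict() if match else None


def version_tuple(version):
    """Turn '2.4.7' into (2, 4, 7) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# Made-up footer for illustration.
FOOTER = "<address>Apache/2.4.7 (Ubuntu) Server at example.com Port 80</address>"
banner = parse_banner(FOOTER)
```

Comparing versions as tuples avoids the string-comparison trap where "2.4.10" would sort before "2.4.7".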

Files containing usernames and / or passwords

Yes, we can find files with logins and passwords which still work! Query:

xamppdirpasswd.txt filetype:txt

– returns password files for XAMPP servers.

Query:


– FTP login/password credentials on github.com




– Passwords for Java Management Extensions (JMX Remote) used by jconsole.

"# Dumping data for table" (user | username | pass | password)


Sensitive Directories


 inurl:8080 intitle:"Dashboard [Jenkins]"

– Access to Jenkins Dashboard. At the beginning it’s not much, but if you go deeper you may find more interesting stuff.


Sample screen of one of the latest builds.


".git" intitle:"Index of"

– shows access to publicly browsable .git directories.


Various Online Devices



– displays the public status page for a Konica Minolta printer.


Nothing unusual so far, but from this page we can go to the login screen and switch to the administrator account.

OK, but we still need a password. The previous page shows the specific printer model, so maybe the default password will work? Let’s ask Google. For that we don’t need a sophisticated query, and the result is:


Ok, let’s put it to the test.

Oh… look, it worked 😉


How to secure?

  • Disable directory browsing on the web server. Directory browsing should be enabled only for those web folders that you want anyone on the internet to be able to access.
  • Don’t put critical or sensitive information on servers that are directly accessible from the internet without a proper authentication system.
  • Install the latest available security patches for the applications and the operating system running on the servers.
  • Disable anonymous access through the internet to restricted system directories on the web server.
  • If you find any links to your restricted servers or sites in Google search results, request their removal. Visit the following link for more details: http://www.google.com/remove.html
  • Google has also taken some steps to monitor suspicious searches for vulnerable data 😉

[Screenshot: ipv4.google.com, 2015-02-18]
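
Several of the points in the list above come down to knowing what actually sits in your web root. As a rough sketch, a script like the following could walk a document root and flag items similar to the ones found in the examples earlier. The pattern lists and the name find_exposed are assumptions made for this illustration, and the demo runs against a temporary directory rather than a real server.

```python
import os
import tempfile

# Hypothetical patterns, chosen to mirror the examples in this article.
SENSITIVE_DIRS = {".git"}
SENSITIVE_SUFFIXES = ("passwd.txt", ".sql", ".bak")


def find_exposed(docroot):
    """Return sorted relative paths under docroot that match the patterns above."""
    hits = []
    for current, dirs, files in os.walk(docroot):
        for d in dirs:
            if d in SENSITIVE_DIRS:
                hits.append(os.path.relpath(os.path.join(current, d), docroot))
        # Do not descend into directories that were already flagged.
        dirs[:] = [d for d in dirs if d not in SENSITIVE_DIRS]
        for f in files:
            if f.lower().endswith(SENSITIVE_SUFFIXES):
                hits.append(os.path.relpath(os.path.join(current, f), docroot))
    return sorted(hits)


# Demo on a throwaway directory tree standing in for a document root.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, ".git"))
    os.makedirs(os.path.join(root, "backup"))
    for rel in (
        os.path.join(".git", "config"),
        "index.html",
        os.path.join("backup", "dump.sql"),
        "xamppdirpasswd.txt",
    ):
        open(os.path.join(root, rel), "w").close()
    exposed = find_exposed(root)
```

In the demo, the .git directory, the SQL dump and the password file are reported, while index.html is not. A real check would need a pattern list suited to the applications you run.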


Google hacking can be a very useful tool in penetration testing. Tools like Metasploit and Nmap now have automated scripts that search Google for useful information related to a particular site or organization. Google hacking also finds use in social engineering attacks and phishing campaigns. Although Google hacking is an old technique, it remains effective to this day. Why? Because misconfigured servers, online devices, and vulnerable websites keep appearing all over the internet every day, and Google indexes it all.

Want to know more?

Please check these articles and links: