Written by
Grzegorz Zawalnicki


Google Hacking – how to find vulnerable data using nothing but Google Search Engine.

1. Very, very, very very short introduction to Google web indexing.

Google uses a process called crawling (or fetching) to index new or updated pages. The program that does the crawling is called Googlebot (also known as a robot, bot, or spider). Googlebot uses an algorithmic process: computer programs determine which sites to crawl, how often, and how many pages to fetch from each site. Googlebot uses two types of crawling:

  • Deep crawl – when Googlebot fetches a page, it culls all the links appearing on the page and adds them to a queue for subsequent crawling. By harvesting links from every page it encounters, Googlebot can quickly build a list of links that can cover broad reaches of the web.
    Because of their massive scale, deep crawls can reach almost every page on the web. Because the web is vast, this can take some time, so some pages may be crawled only once a month.
  • Fresh crawl – to keep the index current, Google continuously rescans popular and frequently changing web pages at a rate roughly proportional to how often the pages change. Newspaper pages are downloaded daily, while pages with stock quotes are downloaded much more frequently. Of course, fresh crawls return fewer pages than the deep crawl.
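The link-harvesting idea behind a deep crawl can be sketched in a few lines of Python. This is only an illustration, not how Googlebot is actually implemented: the `PAGES` dictionary is a made-up, in-memory stand-in for the web, and a real crawler would fetch pages over HTTP.

```python
from collections import deque
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(pages, start):
    """Visit pages breadth-first, queueing every new link found on each page.

    `pages` is a hypothetical dict mapping a URL to its HTML.
    """
    queue = deque([start])
    seen = {start}
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        collector = LinkCollector()
        collector.feed(pages.get(url, ""))
        for link in collector.links:
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return order


# Hypothetical four-page site used only for this demonstration.
PAGES = {
    "/": '<a href="/a">A</a> <a href="/b">B</a>',
    "/a": '<a href="/c">C</a> <a href="/">home</a>',
    "/b": '<a href="/c">C</a>',
    "/c": "no links here",
}

print(crawl(PAGES, "/"))  # ['/', '/a', '/b', '/c']
```

Every page is reached from a single starting point because each fetched page contributes its links to the queue.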

2. Google Hacking – What is it then?

Google Hacking is the technique of using Google’s search engine to find vulnerable or sensitive data. To help refine search results we can use Advanced Search Operators and Special Search Characters.
Advanced operators use syntax as follows:

operator:search_term
Operator     Purpose                                  Mixes with other operators?   Can be used alone?
intitle      Search page title                        yes                           yes
allintitle   Search page title                        yes                           yes
inurl        Search URL                               yes                           yes
allinurl     Search URL                               no                            yes
filetype     Search specific files                    yes                           no
allintext    Search text of page only                 yes                           yes
site         Search specific site                     yes                           yes
link         Search for links to pages                no                            yes
inanchor     Search link anchor text                  yes                           yes
numrange     Search numbers within a desired range    yes                           yes
daterange    Search in date range                     yes                           no
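As a rough sketch of the `operator:search_term` syntax, the snippet below joins operator and term pairs into a query string and URL-encodes it. The helper names `build_query` and `search_url` are invented for this example, and `example.com` is a placeholder domain; the "cannot be used alone" rule is taken from the table above.

```python
from urllib.parse import urlencode

# Operators that, per the table above, cannot be used alone.
NEEDS_COMPANION = {"filetype", "daterange"}


def build_query(operators, extra_terms=()):
    """Join (operator, term) pairs and plain terms into one query string."""
    parts = [f"{op}:{term}" for op, term in operators]
    parts.extend(extra_terms)
    if len(parts) == 1 and operators and operators[0][0] in NEEDS_COMPANION:
        raise ValueError(f"{operators[0][0]} cannot be used alone")
    return " ".join(parts)


def search_url(query):
    """URL-encode the query as the q parameter of a Google search URL."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_query([("site", "example.com"), ("filetype", "txt")])
print(query)              # site:example.com filetype:txt
print(search_url(query))  # https://www.google.com/search?q=site%3Aexample.com+filetype%3Atxt
```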

 

Character   Purpose
+           forced inclusion of something common
-           exclude a search term
" "         use quotes around search phrases
.           a single wildcard
*           any word
|           Boolean 'OR'
( )         parentheses group queries, e.g. ("master card" | mastercard)
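The special characters can be combined programmatically as well. The following is a small sketch with made-up helper names (`phrase`, `any_of`, `exclude`); it only builds the query text and assumes the minus sign (-) is the exclusion character.

```python
def phrase(text):
    """Wrap a multi-word phrase in double quotes."""
    return f'"{text}"'


def any_of(*terms):
    """Group alternatives in parentheses, separated by the Boolean OR character."""
    return "(" + " | ".join(terms) + ")"


def exclude(term):
    """Prefix a term with a minus sign to exclude it from the results."""
    return f"-{term}"


query = " ".join([any_of(phrase("master card"), "mastercard"), exclude("visa")])
print(query)  # ("master card" | mastercard) -visa
```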

 

3. Examples.

What can we find in Google? Ok, let’s look at a few examples:

Directory Listings

Directory listings provide a list of files and directories in a browser window instead of the typical text-and-graphics mix generally associated with web pages. Directory listings are often placed on web servers purposely to allow visitors to browse and download files from a directory tree. Many times, however, directory listings are not intentional, and there's a good chance that an attacker may find something interesting inside one.
Query:

intitle:index.of

– a basic query that returns a large number of false positives.
The following queries return more interesting results:
Query:

intitle:index.of "parent directory"

or Query:

intitle:index.of name size
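The same reasoning can be turned around to check your own pages. The sketch below is a simple heuristic, not a complete detector: it assumes an Apache-style listing whose title starts with "Index of /" and which contains a "Parent Directory" link, and both sample pages are invented for the demonstration.

```python
import re

# Apache-style listings use a title such as "Index of /backup".
TITLE_RE = re.compile(r"<title>\s*index of\s+/", re.IGNORECASE)


def looks_like_directory_listing(html):
    """Return True when both the title and a parent-directory link match."""
    has_title = bool(TITLE_RE.search(html))
    has_parent = "parent directory" in html.lower()
    return has_title and has_parent


# Hypothetical sample pages.
APACHE_LISTING = (
    "<html><head><title>Index of /backup</title></head><body>"
    '<h1>Index of /backup</h1><a href="/">Parent Directory</a>'
    "</body></html>"
)
ORDINARY_PAGE = "<html><head><title>Index of leading indicators</title></head></html>"

print(looks_like_directory_listing(APACHE_LISTING))  # True
print(looks_like_directory_listing(ORDINARY_PAGE))   # False
```

Requiring the second signal is what filters out ordinary pages that merely have "index of" in their title, which is the same reason the refined queries add "parent directory" or "name size".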


Web Server Detection

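Directory listings often include a server signature line at the bottom of the page. A security tester can use this information to determine the version of the web server, or to search Google for vulnerable targets. In addition, an exposed signature suggests that the web server is not well maintained.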
Query:

intitle:index.of server.at

– This query focuses on the term “index of” in the title and “server at” appearing at the bottom of the directory listing.

Query:

intitle:index.of "Apache/2.4.7 Server at"

– This query will find servers with directory listings enabled that are running Apache version 2.4.7.
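To show what a tester can read from such a signature, here is a minimal parsing sketch. The regular expression and the sample line (including the host `example.com`) are assumptions made for this example; real signatures vary between servers and configurations.

```python
import re

# Matches signatures such as "Apache/2.4.7 (Ubuntu) Server at host Port 80".
SIGNATURE_RE = re.compile(
    r"(?P<product>[A-Za-z][\w-]*)/(?P<version>\d+(?:\.\d+)*)"
    r".*?Server at (?P<host>\S+) Port (?P<port>\d+)"
)


def parse_signature(text):
    """Extract product, version, host and port, or return None if absent."""
    match = SIGNATURE_RE.search(text)
    if match is None:
        return None
    info = match.groupdict()
    info["port"] = int(info["port"])
    return info


# Hypothetical signature line.
line = "Apache/2.4.7 (Ubuntu) Server at example.com Port 80"
print(parse_signature(line))
# {'product': 'Apache', 'version': '2.4.7', 'host': 'example.com', 'port': 80}
```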

Files containing usernames and / or passwords

Yes, we can find files with logins and passwords that still work!
Query:

xamppdirpasswd.txt filetype:txt

– returns password files for XAMPP Server.

Query:

site:github.com inurl:sftp-config.json

– FTP login/password credentials on github.com


Query:

filetype:password jmxremote

– Passwords for Java Management Extensions (JMX Remote) used by jconsole.
Query:

"# Dumping data for table" (user | username | pass | password)

– SQL dump files that mention user names or passwords.
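Before publishing a backup, you can look for the same marker yourself. The sketch below is a heuristic of my own, not part of the query: it lists tables named in "Dumping data for table" comments and flags those whose names contain one of the four words from the query. The sample dump and its table names are invented.

```python
import re

# Dump tools write a comment line before each table's data.
DUMP_RE = re.compile(r"^(?:#|--) Dumping data for table [`']?(\w+)", re.MULTILINE)
SENSITIVE_WORDS = ("user", "username", "pass", "password")


def sensitive_tables(dump_text):
    """Return dumped table names that contain a sensitive-looking word."""
    tables = DUMP_RE.findall(dump_text)
    return [t for t in tables if any(w in t.lower() for w in SENSITIVE_WORDS)]


# Hypothetical dump excerpt.
SAMPLE_DUMP = """\
# Dumping data for table `products`
INSERT INTO products VALUES (1, 'lamp');
# Dumping data for table `site_users`
INSERT INTO site_users VALUES (1, 'admin', 'secret');
"""

print(sensitive_tables(SAMPLE_DUMP))  # ['site_users']
```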


Sensitive Directories

Query:

 inurl:8080 intitle:"Dashboard [Jenkins]"

– Access to the Jenkins Dashboard. At first glance it's not much, but if you dig deeper you may find more interesting things.


[Screenshot: a sample screen from one of the latest builds]
Query:

 ".git" intitle:"Index of"

– shows access to publicly browsable .git directories.


Various Online Devices

Query:

inurl:system_device.xml

– displays the public status page of a Konica Minolta printer.


Nothing unusual so far, but from this page we can go to the login screen and switch to the administrator account.

OK, but we still need a password. The previous page shows the specific printer model, so maybe the default password will work? Let's ask Google. We don't need a sophisticated query for that, and the result is:

[Screenshot: KonicaMinolta3]

OK, let's put it to the test.

Oh… look, it worked 😉

How to secure?

  • Disable directory browsing on the web server. Directory browsing should be enabled only for those web folders that you want anyone on the internet to access.
  • Don't put critical or sensitive information on servers that are directly accessible from the internet without a proper authentication system.
  • Install the latest security patches for the applications as well as for the operating system running on the servers.
  • Disable anonymous access through the web server to restricted system directories.
  • If you find any links to your restricted servers or sites in Google search results, have them removed. Visit the following link for more details: http://www.google.com/remove.html
  • Google has also taken some steps to monitor suspicious searches for vulnerable data 😉
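Part of this checklist can be automated. The following sketch walks a web root and reports files or directories whose names match some of the items targeted by the queries earlier in this article. The `RISKY_NAMES` set is only a starting point chosen for this example, and the directory layout is a temporary, made-up one.

```python
import os
import tempfile

# Names targeted by queries shown earlier; extend this for your own setup.
RISKY_NAMES = {"sftp-config.json", "jmxremote.password", ".git"}


def find_risky_paths(web_root):
    """Return sorted paths, relative to web_root, whose name is in RISKY_NAMES."""
    hits = []
    for folder, dirnames, filenames in os.walk(web_root):
        for name in dirnames + filenames:
            if name in RISKY_NAMES:
                full_path = os.path.join(folder, name)
                hits.append(os.path.relpath(full_path, web_root))
    return sorted(hits)


# Build a throwaway directory tree to demonstrate the scan.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "app", ".git"))
    open(os.path.join(root, "app", "sftp-config.json"), "w").close()
    open(os.path.join(root, "index.html"), "w").close()
    for path in find_risky_paths(root):
        print(path)
```

Anything the scan reports should either be moved out of the web root or placed behind authentication.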


Conclusion

Google hacking can be a very useful tool in penetration testing. Tools like Metasploit and Nmap now have automated scripts that search Google for useful information related to a particular site or organization. Google hacking is also useful in social engineering attacks and phishing campaigns. Although Google hacking is an old technique, it remains effective to this day. Why? Because misconfigured servers, online devices, and vulnerable websites keep appearing all over the internet every day, and Google indexes it all.

