Google Suggest scraper (PHP & SimpleXML)

Today’s goal is a basic PHP Google Suggest scraper, because I wanted traffic data and keywords for free.

Before we start:

Google scraping is bad!

Good people use the Google AdWords API: 25 cents per 1000 units and 15 or more units per keyword suggestion, so they pay 4 or 5 dollars for 1000 keyword suggestions (if they can find a good programmer, which also costs a few dollars). Or they opt for SemRush (also my preference), KeywordSpy, Spyfu, or other services like 7Search PPC programs to get keyword and traffic data and data on their competitors, but these charge from about 80 dollars per month for a limited account up to a few hundred per month for SEO companies. Good people pay plenty.
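
The arithmetic behind that estimate, as a rough sketch: it assumes a flat number of units per suggestion (the real figure is "15 or more") and the 25 cents per 1000 units quoted above. The function name suggestion_cost() is made up for illustration, and the code assumes PHP 7 or later for the type declarations.

```php
<?php
// Rough cost estimate for keyword suggestions bought through a paid API.
// Assumptions: a flat number of units per suggestion (15 is the quoted minimum)
// and 0.25 dollars per 1000 units.
function suggestion_cost(int $suggestions, int $unitsPerSuggestion = 15, float $dollarsPer1000Units = 0.25): float
{
    $units = $suggestions * $unitsPerSuggestion;
    return $units / 1000 * $dollarsPer1000Units;
}

// Lower bound for 1000 suggestions, then the same at 20 units each.
echo suggestion_cost(1000), "\n";
echo suggestion_cost(1000, 20), "\n";
```

At the 15 unit minimum that comes to 3.75 dollars per 1000 suggestions, and at 20 units it is 5 dollars, which is where the "4 or 5 dollars" comes from.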

We tiny grey webmice of marketing, however, just want a few estimates at low or, better, no cost. Like this:

data                             num queries
google suggest                   57800000
google suggestion box            5390000
google suggest api               5030000
google suggestion tool           3670000
google suggest a site            72700000
google suggested users           57000000
google suggestions funny         37400000
google suggest scraper           62800
google suggestions not working   87100000
google suggested user list       254000000

Suggestion autocomplete is AJAX, and it outputs XML:

<?xml version="1.0"?>
   <toplevel>
     <CompleteSuggestion>
       <suggestion data="senior quotes"/>
       <num_queries int="30000000"/>
     </CompleteSuggestion>
     <CompleteSuggestion>
       <suggestion data="senior skip day lyrics"/>
       <num_queries int="441000"/>
     </CompleteSuggestion>
   </toplevel>

Using SimpleXML, the PHP routine is as simple as querying g00gle.c0m/complete/search?, grabbing the autocomplete XML, and extracting the attribute data:

 
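    <?php
    // The whole query string is the keyword, e.g. http://host/filename.php?query
    if ($_SERVER['QUERY_STRING'] == '') die('enter a query like http://host/filename.php?query');
    $kw = urldecode($_SERVER['QUERY_STRING']);

    // Fetch the autocomplete XML and parse it
    $contentstring = @file_get_contents("http://g00gle.c0m/complete/search?output=toolbar&q=" . urlencode($kw));
    if ($contentstring === false) die('could not fetch suggestions');
    $content = simplexml_load_string($contentstring);
    if ($content === false) die('could not parse the XML');

    foreach ($content->CompleteSuggestion as $c) {
        $term = (string) $c->suggestion->attributes()->data;
        // note: traffic data is sometimes missing
        $traffic = isset($c->num_queries) ? (string) $c->num_queries->attributes()->int : '';
        echo $term . " " . $traffic . "\n";
    }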
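
To try the parsing step without hitting the network, here is a minimal sketch that runs SimpleXML over a hard-coded sample in the format shown earlier. The helper name parse_suggestions() is made up for this example, the second suggestion deliberately has no num_queries element to show the missing-traffic case, and the code assumes PHP 7 or later (type declarations and the ?? operator).

```php
<?php
// Parse Google Suggest style XML into an array of term => estimated queries.
// parse_suggestions() is a hypothetical helper, not part of the original script.
function parse_suggestions(string $xml): array
{
    $doc = simplexml_load_string($xml);
    if ($doc === false) {
        return []; // unparseable input: no suggestions
    }
    $rows = [];
    foreach ($doc->CompleteSuggestion as $c) {
        $term = (string) $c->suggestion['data'];
        // num_queries is sometimes missing; store null in that case
        $rows[$term] = isset($c->num_queries) ? (int) $c->num_queries['int'] : null;
    }
    return $rows;
}

// Sample input: the second suggestion has no traffic estimate.
$sample = <<<XML
<?xml version="1.0"?>
<toplevel>
  <CompleteSuggestion>
    <suggestion data="senior quotes"/>
    <num_queries int="30000000"/>
  </CompleteSuggestion>
  <CompleteSuggestion>
    <suggestion data="senior skip day lyrics"/>
  </CompleteSuggestion>
</toplevel>
XML;

foreach (parse_suggestions($sample) as $term => $traffic) {
    echo $term, ' ', $traffic ?? 'n/a', "\n";
}
```

Returning null for a missing estimate keeps "no data" distinct from a real count of zero, so the caller decides how to display it.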

I made a quick PHP script that outputs the terms as a list of new queries, so you can walk through the suggestions.

The source is up for download as a text file over here (rename it to suggestit.php and it should run on any server with PHP 5.* and SimpleXML).
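
The idea of turning each term into a new query can be sketched roughly as below. This is not the downloadable source, just an illustration: suggestion_links() is a made-up helper, and it assumes the script takes the whole query string as the keyword and URL-decodes it. It needs PHP 7 or later for the type declarations.

```php
<?php
// Turn suggestion terms into links that re-run the script with that term.
// suggestion_links() is a hypothetical helper; the default script name is
// the file name suggested for the download.
function suggestion_links(array $terms, string $script = 'suggestit.php'): string
{
    $html = "<ul>\n";
    foreach ($terms as $term) {
        $term = (string) $term;
        // The keyword is passed as the entire query string, URL-encoded.
        $href = $script . '?' . rawurlencode($term);
        $html .= '<li><a href="' . htmlspecialchars($href) . '">'
               . htmlspecialchars($term) . "</a></li>\n";
    }
    return $html . "</ul>\n";
}

echo suggestion_links(['google suggest api', 'google suggest scraper']);
```

Escaping the term with htmlspecialchars() matters here because suggestions come from a remote response and end up inside HTML.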

Welcome to the Search Quality Rating Program!

1.0 Welcome to the Search Quality Rating Program!

As a Search Quality Rater, you will work on many different types of rating projects. These guidelines cover just one type of search quality rating – URL rating.
Please take the time to carefully read through these guidelines. The ideas presented here are important for other types of rating. When you can do URL rating, you will be well on your way to becoming a successful Search Quality Rater!

Cool stuff over at SeoKindle: the Google quality rating guidelines.

The document got leaked three weeks ago. It doesn’t contain anything spectacular, but next time you tell people not to do something stupid, you can say it’s cos Google said so.

peanuts

Since me and my girlfriend got together, you know what her fantasy has always been? Petting an elephant while giving it peanuts! When we where in the bed room going our thing she asked if I could act like an elephant and she could feed me peanuts. Now she bought an elephant costume for me to wear which is extremely hot inside and she wants me to be on fours while she pets and pretends to feed me peanuts. She won’t let me do anything else! What should I do?

A question on Yahoo Answers :)