juust ~ php oddities

zend php and google webmaster tools api

juust | 19/10/2008

update 2: Sandrine worked out a set of routines, using Zend 1.7 as far as I know; she lists the code here.

update: Google updated their API in October (almost at the time I wrote these posts) and this code fails, as it is still based on the V1 API. You can now access the whole WT: toolset namespace (including sitemaps and verification) through the V2 API, but you need to send a version id along with your request; the new Zend 1.7 download handles that.

The Problem

I can add 32,000 blogs on a standard WordPress MU install. How do I add 32,000 subdomains, verify them and add their sitemaps to Google Webmaster Tools, without having to go to the webmaster page about 96,000 times?

The solution

Integrating the Google Webmaster Tools API into my WordPress MU install.

What is it worth?

Registering a site, verifying it and adding a sitemap are three manual steps per domain, 96,000 steps for 32,000 sites. At 5 minutes per step that is 8,000 hours of work; at €12 per hour that makes €96,000 and about four labor years. Writing a script is worth €96,000 and saves me four years of mindless drone work, so that is well worth having a look at.

Software : Zend

Zend Gdata is the PHP framework component written to handle Google Data. Its ClientLogin routine isn’t very flexible and it doesn’t cover the GWT API yet, so I’ll have to hack some routines together.

After getting stonewalled by the Zend code a few times, I went searching and ended up on ngoprekweb, which has a nice post on ClientLogin authorization for the Blogger API. Eris Ristemena uses a modified Zend ClientLogin, very nice work. I installed the adapted classes, used them to get through the ClientLogin, and it paid off.

The good stuff : Gwt api access

I am not interested in the Blogger stuff though. I want access to GWT (Google Webmaster Tools), so I reworked Eris Ristemena’s Blogger routine a little.

    // Zend 0.1.x loader, with the modified ClientLogin classes installed
    set_include_path(dirname(__FILE__) . '/Zend_Gdata');
    require_once 'Zend.php';
    Zend::loadClass('Zend_Gdata_ClientLogin');
    Zend::loadClass('Zend_Gdata');
    Zend::loadClass('Zend_Feed');

    $username     = '';
    $password     = '';
    $service      = 'sitemaps';
    $source       = 'Zend_ZendFramework-0.1.1'; // companyName-applicationName-versionID
    // these two are only present after the CAPTCHA form below was submitted
    $logintoken   = isset($_POST['captchatoken'])  ? $_POST['captchatoken']  : null;
    $logincaptcha = isset($_POST['captchaanswer']) ? $_POST['captchaanswer'] : null;

    try {
      $resp = Zend_Gdata_ClientLogin::getClientLoginAuth($username, $password, $service, $source, $logintoken, $logincaptcha);

      if ( $resp['response']=='authorized' )
      {
        $client = Zend_Gdata_ClientLogin::getHttpClient($resp['auth']);
        $gdata = new Zend_Gdata($client);

        // list the sites registered in the account; the entry title is the site url
        $feed = $gdata->getFeed("https://www.google.com/webmasters/tools/feeds/sites/");
        foreach ($feed as $item) {
          $title = htmlspecialchars($item->title());
          echo '<h3><a href="' . $title . '" target="_blank">' . $title . '</a></h3>';
        }
      }
      elseif ( $resp['response']=='captcha' )
      {
        echo 'Google requires you to solve this CAPTCHA image';
        echo '<img src="https://www.google.com/accounts/'.$resp['captchaurl'].'" /><br />';
        echo '<form action="'.htmlspecialchars($_SERVER['PHP_SELF']).'" method="POST">';
        echo 'Answer : <input type="text" name="captchaanswer" size="10" />';
        echo '<input type="hidden" name="captchatoken" value="'.$resp['captchatoken'].'" />';
        echo '<input type="submit" />';
        echo '</form>';
        exit;
      }
      else
      {
        // should be unreachable: any other outcome throws an exception
      }

    } catch ( Exception $e )  {
      echo $e->getMessage();
    }

(I added https://www.google.com/accounts/ to the captcha image source, otherwise it keeps drawing blanks.)

Zend uses an “HttpClient” for the connection to Google, and a gData class (usually for the main ‘feed’: blogs, sites) that you use to do basic data manipulation. All feed entries are in Atom format with a custom namespace; the basic Atom format is defined at code.google.com: AD_Retrieving.
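To make the Atom structure a bit more concrete, here is a minimal sketch that reads a feed of that shape with PHP's SimpleXML instead of Zend_Feed. The sample XML and the site_titles() helper are made up for illustration; a real sites feed carries more elements than shown here.

```php
<?php
// Hypothetical sample: an Atom feed shaped like a sites feed, trimmed to the
// elements used below. A real response contains more fields.
$sample = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Sites</title>
  <entry>
    <id>http://www.example.com/</id>
    <title>http://www.example.com/</title>
    <content src="http://www.example.com/" />
  </entry>
  <entry>
    <id>http://blog.example.com/</id>
    <title>http://blog.example.com/</title>
    <content src="http://blog.example.com/" />
  </entry>
</feed>
XML;

// Illustrative helper (not part of Zend): return the title of every entry.
function site_titles($atomXml) {
    $feed = simplexml_load_string($atomXml);
    if ($feed === false) {
        return array(); // not parseable as XML
    }
    $titles = array();
    // Elements in the default Atom namespace are reachable by plain name.
    foreach ($feed->entry as $entry) {
        $titles[] = (string) $entry->title;
    }
    return $titles;
}

foreach (site_titles($sample) as $title) {
    echo $title, "\n";
}
```

In the listing above Zend_Feed does the same job through $item->title(); the sketch only shows what sits underneath.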

Now I am going to add a domain. In my add_site function I put an Atom XML entry together and post it (using the post() function of the gData class) to the sites feed url, and the Google API does the rest :

    function add_site($domain, $client) {
      // Atom entry whose content src is the site to register
      $xml='<entry xmlns="http://www.w3.org/2005/Atom">';
      $xml.='<content src="http://'.$domain.'/" />';
      $xml.='</entry>';
      // post the entry to the sites feed over the authorized client
      $fdata = new Zend_Gdata($client);
      $result=$fdata->post($xml,"https://www.google.com/webmasters/tools/feeds/sites/");
      return $result;
    }
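The add_site() function glues the domain straight into the XML string, which is fine for plain hostnames. If the input could ever contain characters like & or a quote, one option is to let DOMDocument do the escaping. This is only a sketch: build_site_entry() is a hypothetical helper name, not something from Zend or from the Google API.

```php
<?php
// Illustrative helper (hypothetical name): build the Atom entry for a domain
// with DOMDocument so special characters in the url are escaped properly.
function build_site_entry($domain) {
    $atom = 'http://www.w3.org/2005/Atom';
    $doc = new DOMDocument('1.0', 'UTF-8');

    $entry = $doc->createElementNS($atom, 'entry');
    $doc->appendChild($entry);

    $content = $doc->createElementNS($atom, 'content');
    $content->setAttribute('src', 'http://' . $domain . '/');
    $entry->appendChild($content);

    // saveXML($node) serializes just that element, without the XML declaration.
    return $doc->saveXML($entry);
}

echo build_site_entry('test.example.com'), "\n";
```

The returned string could be passed to post() in place of the hand-built $xml.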

In the main routine I pass the domain and the running httpclient to the add_site() function :

    if ( $resp['response']=='authorized' )
    {
      $client = Zend_Gdata_ClientLogin::getHttpClient($resp['auth']);
      echo add_site('test.blacknorati.com', $client);
    }

Cool. That saves me up to 32,000 site registrations. The rest of it is still Greek to me, but this part functions. Next week : more nonsense (verify the site, add a sitemap, and integrate it in the blog creation function of WordPress MU).

1) about the Blogger function : I tried to list the Blogger posts with the ngoprekweb php code, but it seems Blogger uses a different string these days to identify the blog in gData. The id is returned as “tag:blogger.com-blabla-(blogid)” and you want the last part to access the blog’s post Atom feed :

    // explode() instead of split(), which was removed in PHP 7
    $idText = explode('-', $item->id());
    $blogid = $idText[2];

(modified from the Zend 1.6.1 codebase)

    foreach ($feed as $item) {
      echo '<a href="'.$item->link("alternate").'">' . $item->title() . '</a>';

      // the blog id is the part of the entry id after the second hyphen
      $idText = explode('-', $item->id());
      $blogid = $idText[2];

      $feed1 = $gdata->getFeed("http://www.blogger.com/feeds/$blogid/posts/summary");
      //…
    }
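The id handling can also be pulled into a small function that does not depend on how many hyphens come before the blog id. A minimal sketch, assuming the blog id is always whatever follows the last hyphen; blogger_blog_id() is a hypothetical name and the sample id is made up.

```php
<?php
// Illustrative helper (hypothetical name): return the part after the last
// hyphen of a Blogger entry id, or null when the id contains no hyphen.
function blogger_blog_id($entryId) {
    $pos = strrpos($entryId, '-');
    if ($pos === false) {
        return null; // not an id in the expected shape
    }
    return substr($entryId, $pos + 1);
}

// Made-up id in the shape described above.
echo blogger_blog_id('tag:blogger.com,1999:user-123.blog-456'), "\n"; // prints 456
```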

Categories
php, tool, wordpress

5 Responses to “zend php and google webmaster tools api”

  1. zend php and google webmaster api II : wordpress mu auto-register | serp and pagerank tools says:
    22/10/2008 at 11:16 pm

    [...] Part Deux of automating the registration and verification of a wordpress blog. In the previous post I showed how to add a site to google webmaster tools. [...]

  2. sandrine says:
    07/01/2009 at 4:58 am

    You saved my life!!
    thank you

  3. sandrine says:
    09/01/2009 at 5:35 am

    Hello,
    I did exactly what you did and it worked for 30 websites.
    But once I wanted to add my 2102 websites I got an error message: Fatal error: Maximum execution time of 30 seconds exceeded in /www/hosts/googlewebmaster/html/library/Zend/Uri/Http.php on line 64
    Did you get the same error? If yes, do you know where I can change the time limit?
    Thanks

  4. juust says:
    09/01/2009 at 6:44 am

    A standard php install has a time limit of 30 seconds in php.ini (to prevent a function from running endlessly on the server, usually when it gets stuck on http connections that return no data). You can call set_time_limit(30); inside the loop to restart the 30 second count, provided your host doesn’t use safe mode. Some scripts use set_time_limit(0); at the beginning of a script, which switches the limit off, but that can backfire. It indicates a problem with the http connection: do you log in for every site you add, or log in once and use one client to add the whole site list in a loop?
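A rough sketch of such a loop: log in once, then reset the timer before every site. The $addSite argument is a hypothetical callable standing in for add_site($domain, $client) from the post, and add_all_sites() is an illustrative name; the demo at the bottom makes no http requests.

```php
<?php
// Sketch: process a long list with one shared client, restarting PHP's
// execution timer before every item so one slow request cannot use up
// the budget of the whole run.
function add_all_sites(array $domains, $addSite, $secondsPerItem = 30) {
    $results = array();
    foreach ($domains as $domain) {
        // Restart the timeout counter, if the host allows the call at all.
        if (function_exists('set_time_limit')) {
            @set_time_limit($secondsPerItem);
        }
        try {
            $results[$domain] = call_user_func($addSite, $domain);
        } catch (Exception $e) {
            // Record the failure and carry on with the next domain.
            $results[$domain] = 'error: ' . $e->getMessage();
        }
    }
    return $results;
}

// Stand-in callable for demonstration only.
$fake = function ($domain) { return 'added ' . $domain; };
print_r(add_all_sites(array('a.example.com', 'b.example.com'), $fake));
```

With the real thing the callable would wrap add_site() and the one $client created after login.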

  5. tumaji says:
    21/01/2010 at 12:05 am

    Nice information, Thank you.

