zend php and google webmaster api II : wordpress mu auto-register

Part Deux of automating the registration and verification of a WordPress blog. In the previous post I showed how to add a site to Google Webmaster Tools.

Which site, you ask? Oh dear… in the previous post I did not mention how to create a new blog in WPMU:

include_once('wp-config.php');
include_once('wp-includes/wp-db.php');
include_once('wp-includes/wpmu-functions.php');
// arguments: domain, path, blog title, admin user ID
$newblogid = wpmu_create_blog('tryout.blacknorati.com', '/', 'tryout', 1);

Very basic, assuming I am the admin user (with ID=1). After creating the blog, I post its URL to Google Webmaster Tools to start the registration. Then I want to

  • verify the site
  • add a sitemap
  • and blog on!

verifying a site

I can add any URL to Google Webmaster Tools, but I only get to use the tools once Google are sure I ‘own’ the domain or subdomain. Google verify this by checking for either a metatag in the head of the index page or a specific file on the server. Once Google spot it, they know I control the site and I can use the webmaster tools.

On a WordPress Mu install I do not, as a user, get to have my own template. I currently have 100 standard templates installed to choose from, some with options and widgets, and that should be enough. But individual users cannot edit the template itself, so I cannot verify sites with a header metatag.

The alternative is putting a file with a particular name on the server, but on a WordPress Mu install the subdomains are virtual: users don’t have a separate web root to upload files to, so that one won’t work either.

Eek! Well, no problem: Google also accept a post with the filename in the URL. Just publish a post with the google___.html filename as its title; WordPress automatically turns the title into the URL, and Google can use that post to verify the site is yours.

getting the verification filename

A Google Webmaster Tools account has its own standard verification code, and it’s valid for every site. Once a user has registered the site with GWT, I can retrieve that code from the sites data feed:

function get_verification_title($domain, $client) {
    $myfeed = get_site($domain, $client);
    foreach ($myfeed as $item) {
        // each entry carries two wt:verification-method elements
        $subjects = $item->{"wt:verification-method"};
        // need at least two items before reading index 1
        if (is_array($subjects) and count($subjects) > 1) {
            // index 0 is the metatag, index 1 is the html file name
            return $subjects[1];
        }
    }
    return null;
}

function get_site($domain, $client) {
    $fdata = new Zend_Gdata($client);
    // the site ID in the feed URL is the url-encoded site address
    $tgt = "https://www.google.com/webmasters/tools/feeds/sites/" . htmlentities(urlencode('http://' . $domain . '/'));
    $result = $fdata->getFeed($tgt);
    return $result;
}
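To make the feed address concrete, here is a small sketch of the URL that get_site ends up requesting. The helper name site_feed_url is hypothetical; the base URL and the encoding are taken from the function above. The htmlentities() call is left out because urlencode() has already replaced every character it would touch.

```php
<?php
// Hypothetical helper: builds the sites-feed URL the same way get_site() does.
function site_feed_url(string $domain): string
{
    $base = 'https://www.google.com/webmasters/tools/feeds/sites/';
    // The site ID is the full site address, url-encoded, trailing slash included.
    return $base . urlencode('http://' . $domain . '/');
}

echo site_feed_url('tryout.blacknorati.com'), "\n";
// https://www.google.com/webmasters/tools/feeds/sites/http%3A%2F%2Ftryout.blacknorati.com%2F
```

Printing the URL like this before handing it to getFeed is a cheap way to check that the domain went in without a stray http:// prefix.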

With the get_site function I retrieve the site’s Atom feed as a Zend feed. The feed contains two wt:verification-method tags, one for the metatag and one for the html file. The get_verification_title function loads both into the $subjects array and I pick $subjects[1] (it’s a 0-based array), the html file name. I need that one to post on the new blog. Here is a PHP routine taken from Snipplr.
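Picking the second item by position works as long as the feed keeps that order. A slightly more defensive variant, sketched here with PHP's DOM extension instead of Zend, selects the element by its type attribute. The helper name, the type="htmlpage" and in-use attributes, the namespace URI and the sample entry are all assumptions for illustration, not something this post confirms.

```php
<?php
// Namespace URI assumed for the wt: prefix in the Webmaster Tools feed.
const WT_NS = 'http://schemas.google.com/webmasters/tools/2007';

// Hypothetical helper: returns the html-file verification name from one
// Atom entry, or null when the entry has no such element.
function html_verification_name(string $entryXml): ?string
{
    if ($entryXml === '') {
        return null; // loadXML() rejects empty input
    }
    $doc = new DOMDocument();
    if (!@$doc->loadXML($entryXml)) {
        return null; // not well-formed XML
    }
    foreach ($doc->getElementsByTagNameNS(WT_NS, 'verification-method') as $node) {
        // "htmlpage" is the assumed type value for the html-file method
        if ($node->getAttribute('type') === 'htmlpage') {
            return trim($node->textContent);
        }
    }
    return null;
}

// Made-up sample entry, only for illustration.
$sample = '<entry xmlns="http://www.w3.org/2005/Atom" xmlns:wt="' . WT_NS . '">'
    . '<id>http://tryout.blacknorati.com/</id>'
    . '<wt:verification-method type="metatag" in-use="false">meta-tag-goes-here</wt:verification-method>'
    . '<wt:verification-method type="htmlpage" in-use="false">google12345.html</wt:verification-method>'
    . '</entry>';

var_dump(html_verification_name($sample)); // string(16) "google12345.html"
```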

function add_verify_post($domain, $verification, $logon, $pass) {
    $category = '';
    // url-encode every value so special characters survive the form post
    $req = 'title=' . urlencode($verification) . '&content=' . urlencode($verification)
         . '&category=' . urlencode($category) . '&logon=' . urlencode($logon) . '&pass=' . urlencode($pass);
    $header  = "POST /remote_post.php HTTP/1.0\r\n";
    $header .= "Host: " . $domain . "\r\n";
    $header .= "Content-Type: application/x-www-form-urlencoded\r\n";
    $header .= "Content-Length: " . strlen($req) . "\r\n";
    $header .= "Connection: Close\r\n\r\n";
    $fp = fsockopen($domain, 80, $errno, $errstr, 30);
    $SUCCESS = false;
    $last_line = '';

    if (!$fp) {
        // connection failed: report it and stop, there is no socket to close
        echo "FAILED: $errstr ($errno)";
        return false;
    }

    fputs($fp, $header . $req);
    while (!feof($fp) && $SUCCESS == false) {
        $res = fgets($fp, 1024);
        // fgets keeps the line ending, so trim before comparing
        if ($res !== false && strcmp(trim($res), "SUCCESS") == 0) {
            $SUCCESS = true;
        }
        if (!empty($res)) {
            $last_line = $res;
        }
    }
    fclose($fp);

    if ($SUCCESS == false) {
        // show the last line of the response to help with debugging
        echo $last_line;
    }
    return $SUCCESS;
}

The remote_post.php code on the receiving blog is the same as in the Snipplr snippet.
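Because the request is written to the socket by hand, the form body has to be url-encoded and the Content-Length has to match it exactly. As a sketch of that step on its own, the hypothetical helper below assembles the same request with http_build_query(); the field names come from add_verify_post, while the helper name and the sample password are made up.

```php
<?php
// Hypothetical helper: assembles the raw HTTP/1.0 request that
// add_verify_post() writes to the socket.
function build_remote_post_request(string $host, array $fields): string
{
    // http_build_query() url-encodes every name and value;
    // the separator is passed explicitly so php.ini cannot change it.
    $body = http_build_query($fields, '', '&');

    $head  = "POST /remote_post.php HTTP/1.0\r\n";
    $head .= "Host: {$host}\r\n";
    $head .= "Content-Type: application/x-www-form-urlencoded\r\n";
    $head .= 'Content-Length: ' . strlen($body) . "\r\n";
    $head .= "Connection: Close\r\n\r\n";

    return $head . $body;
}

echo build_remote_post_request('tryout.blacknorati.com', [
    'title'    => 'google12345.html',
    'content'  => 'google12345.html',
    'category' => '',
    'logon'    => 'MyLogin',
    'pass'     => 'p&ss=word', // made-up value with characters that need encoding
]);
```

Keeping the request builder separate from the socket code makes it possible to inspect exactly what is sent without touching the network.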

I am the owner of the blog so I can use the standard admin login and password in the function. For security purposes I’d use a different login and password for remote access though (this one does not use SSL).

With a simple call I send one new post to the new blog with the google verification file name as title.

add_verify_post('BlogSubdomain.blacknorati.com', 'google12345.html', 'MyLogin', 'MyPassword');

I had some doubts about Google accepting blog.blacknorati.com/year/month/google12345html, but they actually accept it, so I don't have to adapt the permalink settings.

Now I have to send Google a 'verify' XML message:

function verify_site($domain, $client) {
    // domain without http
    $xml = '
    http://' . $domain . '';
    $xml .= "";
    $xml .= '
    ';
    $fdata = new Zend_Gdata($client);
    $result = $fdata->post($xml, "https://www.google.com/webmasters/tools/feeds/sites/" . urlencode('http://' . $domain) . "/");
    return $result;
}
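The XML tags inside the $xml strings of verify_site did not survive on this page, so only the site URL is left. As a rough sketch of what such a verify payload might look like, the hypothetical helper below builds an Atom entry that asks for the html-page method. The element names, attributes and namespace URIs are assumptions based on the general shape of Google's Atom feeds and should be checked against the API reference. A reader comment further down mentions stripping CR/LF from the payload as a fix for request errors, so the sketch emits the entry on a single line.

```php
<?php
// Hypothetical helper: builds a single-line Atom entry asking Google to
// verify a site via the html-page method. Element names, attributes and
// namespace URIs are assumptions, not confirmed by the original post.
function build_verify_entry(string $domain): string
{
    // Escape the URL so it is safe as XML text content.
    $siteUrl = htmlspecialchars('http://' . $domain, ENT_QUOTES | ENT_XML1, 'UTF-8');

    return '<atom:entry xmlns:atom="http://www.w3.org/2005/Atom"'
        . ' xmlns:wt="http://schemas.google.com/webmasters/tools/2007">'
        . '<atom:id>' . $siteUrl . '</atom:id>'
        . '<atom:category scheme="http://schemas.google.com/g/2005#kind"'
        . ' term="http://schemas.google.com/webmasters/tools/2007#site-info"/>'
        . '<wt:verification-method type="htmlpage" in-use="true"/>'
        . '</atom:entry>';
}

$xml = build_verify_entry('tryout.blacknorati.com');

// Sanity check: the payload should parse as well-formed XML.
$doc = new DOMDocument();
var_dump($doc->loadXML($xml)); // bool(true)
```

The returned string would take the place of $xml in verify_site.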

Presto, now Google know I control the site, and I can use the webmaster tools. That means I can add the sitemap, which in turn means my sites are indexed a lot faster.

function add_webmap($domain, $sitemap, $client) {
    // domain without http
    $xml = '
    http://' . $sitemap . '';
    $xml .= "
     WEB
   ";

    $fdata = new Zend_Gdata($client);
    $myaddress = "https://www.google.com/webmasters/tools/feeds/" . htmlentities(urlencode('http://' . $domain . '/'), ENT_QUOTES) . "/sitemaps/";
    $result = $fdata->post($xml, $myaddress);
    return $result;
}
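The same goes for add_webmap: the tags around the sitemap URL and the WEB value are missing from the listing. The sketch below shows the sitemaps feed address that function builds and one possible shape for the entry. The helper names, the sitemap.xml path, the wt:sitemap-type element and the category term are assumptions for illustration; only the base URL, the /sitemaps/ suffix and the WEB value come from the code above.

```php
<?php
// Hypothetical helper: the sitemaps feed URL, built as in add_webmap().
function sitemap_feed_url(string $domain): string
{
    return 'https://www.google.com/webmasters/tools/feeds/'
        . urlencode('http://' . $domain . '/') . '/sitemaps/';
}

// Hypothetical helper: a single-line Atom entry for a regular web sitemap.
// Element names and the category term are assumptions.
function build_sitemap_entry(string $sitemap): string
{
    // Escape the URL so it is safe as XML text content.
    $loc = htmlspecialchars('http://' . $sitemap, ENT_QUOTES | ENT_XML1, 'UTF-8');

    return '<atom:entry xmlns:atom="http://www.w3.org/2005/Atom"'
        . ' xmlns:wt="http://schemas.google.com/webmasters/tools/2007">'
        . '<atom:id>' . $loc . '</atom:id>'
        . '<atom:category scheme="http://schemas.google.com/g/2005#kind"'
        . ' term="http://schemas.google.com/webmasters/tools/2007#sitemap-regular"/>'
        . '<wt:sitemap-type>WEB</wt:sitemap-type>'
        . '</atom:entry>';
}

echo sitemap_feed_url('tryout.blacknorati.com'), "\n";
// https://www.google.com/webmasters/tools/feeds/http%3A%2F%2Ftryout.blacknorati.com%2F/sitemaps/
echo build_sitemap_entry('tryout.blacknorati.com/sitemap.xml'), "\n";
```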

Happy now. The Google Webmaster Tools API was top of my wish-list. Now I can register and verify 32,000 sites with sitemaps automatically, which saves me at least 2,500 hours of work (roughly 4.7 minutes per site if done by hand). And it was actually easier than I thought, with the proper examples and snippets available online.

I am going to clean up the code a bit and stuff it in a class, and move on to developing large scale ‘grey’ ops :)

Posted in php, wordpress.

7 Comments

  1. Wow … thanks for the great article, I really, really loved it. Bookmarked your blog; keep the articles coming, because I honestly find them very interesting.

    P.S: I also subscribed.

  2. Thanks for the article, really need it.

    I know it’s quite old but I got a problem with website verification. After I put my meta in the website and changed your script from html verification to meta with

    I got an "Expected response code 200, got 400 Invalid request" error.

    I tried a lot of solutions, including curl direct PUT request but I always got a 400 error. Any Idea?

    Thanks again

  3. Hi, thanks for this code. Do you have a sample Zend Framework implementation for getting the ‘top search queries’ and the ‘Links in my site’ data from Google Webmaster Tools?

  4. Pingback: Domain sitemap wpmu | Hi Tech Stuff Reviews & Updates

  5. I also use Zend to add site or verify site in Google webmaster tools. Add site is no problem for me. But I always get 404 error when I use your posted function
    function verify_site($domain, $client).
    Same issue like Andrea posted.
    You mentioned stripping out CR/LF as the solution, but I don’t quite understand it.
    Can you tell me how I can change that verify_site function and get it work?

    Thank you so much
