bing

For completeness: Bing SERP scraping with PHP:

$query = 'serp';
$page = 1;
$start = ($page-1)*10;
$url = 'http://www.bing.com/search?q='.urlencode($query)."&first=".($start+1);

$curl_handle = curl_init();
curl_setopt($curl_handle,CURLOPT_URL, $url);
curl_setopt($curl_handle,CURLOPT_CONNECTTIMEOUT,2);
curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, 1);
$return = curl_exec($curl_handle);
curl_close($curl_handle);

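// NOTE: the HTML delimiter and part of the regex were stripped from the original post.
// The '<h3>' delimiter and the pattern below are assumed reconstructions; Bing's markup may differ.
$parts = explode('<h3>', $return); // split() was removed in PHP 7, explode() replaces it here
for ($j = 1; $j < count($parts); $j++) {
    $p = $parts[$j];
    // $urls[1] = href, $urls[2] = remaining attributes, $urls[3] = link text
    if (preg_match('#<a[^>]*href="([^"]*)"([^>]*)>((?:(?!</a>).)*)</a>#i', $p, $urls)) {
        echo "position: ".($start + $j)." url: ".$urls[1]." title: ".$urls[3]."\n";
    }
}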
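The page arithmetic above can be wrapped in a small function so that further result pages are easy to request. This is only a sketch: the name `bing_search_url` and the `$perPage` parameter are hypothetical, and it assumes 10 results per page, as the snippet does.

```php
<?php
// Hypothetical helper (not part of any library): build the Bing results URL
// for a 1-based page number, assuming $perPage results per page.
function bing_search_url($query, $page = 1, $perPage = 10) {
    // Bing's "first" parameter is the 1-based index of the first result
    $first = ($page - 1) * $perPage + 1;
    return 'http://www.bing.com/search?q=' . urlencode($query) . '&first=' . $first;
}

// Collect the URLs for the first three result pages
$urls = array();
for ($page = 1; $page <= 3; $page++) {
    $urls[$page] = bing_search_url('serp', $page);
}
```

Each URL can then be passed to the curl code above in place of `$url`.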

ga api sample : get pageviews

I had been meaning to put this online: how to get pageviews out of the Google Analytics API using SimpleXML and PHP. Google uses three namespaces in the output, which makes it less easily accessible, so here’s a quick sample of how to get your site’s pageviews out of it:

//ids           = site identifier (from the site data feed)
//metrics     = what i want to see
//start-date 
//end-date 

$feedUri = "https://www.google.com/analytics/feeds/data?ids=ga:10516419&metrics=ga:pageviews&start-date=2009-04-01&end-date=2009-05-01"; 			

	$curl = curl_init();
	curl_setopt($curl, CURLOPT_URL, $feedUri);
	curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 3);
	curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);

       $headers[] = "Authorization: GoogleLogin auth=".$Authtoken;

//for authtoken : see previous post
	curl_setopt($curl, CURLOPT_HTTPHEADER, $headers); 
	curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0);
	curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
	curl_setopt($curl, CURLOPT_VERBOSE, 1);

//get the string containing the xml file
	$gA = curl_exec($curl);
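The feed URI can also be assembled from its four parameters instead of being typed as one long string. A minimal sketch, assuming the same values as above; `http_build_query` percent-encodes the colon in `ga:`, and I am assuming the server accepts that like any URL-encoded query string.

```php
<?php
// The four query parameters described in the comments above
$params = array(
    'ids'        => 'ga:10516419',  // site identifier (from the site data feed)
    'metrics'    => 'ga:pageviews', // what I want to see
    'start-date' => '2009-04-01',
    'end-date'   => '2009-05-01',
);

// http_build_query() URL-encodes each value and joins the pairs with "&"
$feedUri = 'https://www.google.com/analytics/feeds/data?' . http_build_query($params);
```

Changing the date range or metric then means editing one array entry.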

The feed has three namespaces (Atom, OpenSearch and dxp/analytics). A simple approach is to access the entry tags (from the Atom namespace); each entry contains one dxp:metric line, and that line holds the answer to the question.

<dxp:metric confidenceInterval='0.0' name='ga:pageviews' type='integer' value='755'/>

//load the string into a simple xml object
$feed = simplexml_load_string($gA);

//take the atom namespace
$children = $feed->children('http://www.w3.org/2005/Atom');

//take the entry tags
$parts = $children->entry;
foreach ($parts as $entry) {

    //from the entry tag,
    //access the dxp namespace
    $dxp = (object) $entry->children('http://schemas.google.com/analytics/2009');

    //METRIC contains the answer to the question
    //grab from the tag METRIC the attribute VALUE
    echo (string) $dxp->metric->attributes()->value;

}

The (string) typecast is important: SimpleXML normally returns a SimpleXML object, and forcing a string type gives the actual value attribute of the ga:pageviews metric, which can then be used as a number.
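To try the namespace handling and the typecast without calling the API, the same steps can be run against a hand-written feed. This is a sketch under assumptions: `$sampleFeed` is a minimal stand-in built around the dxp:metric line shown earlier, and the real response contains many more elements.

```php
<?php
// Minimal stand-in for the data feed (assumption: only the parts used here)
$sampleFeed = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:dxp="http://schemas.google.com/analytics/2009">
  <entry>
    <dxp:metric confidenceInterval="0.0" name="ga:pageviews" type="integer" value="755"/>
  </entry>
</feed>
XML;

$feed = simplexml_load_string($sampleFeed);

$pageviews = array();
// entry tags live in the Atom namespace
foreach ($feed->children('http://www.w3.org/2005/Atom')->entry as $entry) {
    // the metric tag lives in the dxp namespace
    $dxp = $entry->children('http://schemas.google.com/analytics/2009');
    // without a cast this is still a SimpleXMLElement, not a plain value
    $raw = $dxp->metric->attributes()->value;
    // cast to int (or string) to get the plain value
    $pageviews[] = (int) $raw;
}
```

Casting to int rather than string is a small variation that is convenient when the counts are going to be summed.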

google analytics have an api !

[Note: over at ioncannon, Carson McDonald made a cool Google Analytics plugin for WordPress. I use it on this blog and it works fine.]

An actual Google Analytics API, and I missed out on it. This API is already a month old and I haven’t read anything on the blogs about it.

I found it half an hour ago. I haven’t checked it completely, but it looks promising. Here is the first bit: basic authentication with PHP and curl.

$USER_EMAIL=""; // #Insert your Google Account email here
$USER_PASS=""; //#Insert your password here

//array with some general data
$data = array(
  "Email" => $USER_EMAIL,
  "Passwd" => $USER_PASS, 
  "accountType" => "GOOGLE", 
  "source" => "curl-accountFeed-v1",
  "service" => "analytics"
);

$friends_url = 'https://www.google.com/accounts/ClientLogin';
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $friends_url);
curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 3);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);

//http-post that contains the array as data
curl_setopt($curl, CURLOPT_POST, true);
curl_setopt($curl, CURLOPT_POSTFIELDS, $data);

//go shove the https secure connection verification
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);

curl_setopt($curl, CURLOPT_VERBOSE, 1);
			

$googleAuth = curl_exec($curl);

//optional : some feedback

//check if we get an error code from cURL
//    echo curl_errno($curl)."\n";
//    echo curl_error($curl)."\n";
//print the body of the returned data
//    print_r($googleAuth);
//print all the headers
//    $info = curl_getinfo($curl);
//    print_r($info);

Somewhere in the garbled mess that curl returns is the authorization token; it starts with Auth=.

$start = strpos($googleAuth, "Auth=") + 5;
//trim the trailing newline, otherwise it ends up inside the Authorization header
$Authtoken = trim(substr($googleAuth, $start));

//echo $Authtoken;
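A slightly more defensive way to pull the token out is to split the response body into its key=value lines and look up Auth by name. This is a sketch: the function name `parse_client_login` is hypothetical, and the sample body uses placeholder values in the SID / LSID / Auth layout that ClientLogin returns on success.

```php
<?php
// Hypothetical helper: turn a "Key=value" per-line response body into an array
function parse_client_login($body) {
    $fields = array();
    foreach (preg_split('/\r\n|\r|\n/', trim($body)) as $line) {
        $pos = strpos($line, '=');
        if ($pos === false) {
            continue; // skip lines that are not key=value pairs
        }
        // split on the first "=" only, since the value itself may contain "="
        $fields[substr($line, 0, $pos)] = substr($line, $pos + 1);
    }
    return $fields;
}

// Placeholder response body; real tokens are long opaque strings
$sample = "SID=aaa\nLSID=bbb\nAuth=ccc\n";
$fields = parse_client_login($sample);

// null means the login failed or the body had an unexpected shape
$Authtoken = isset($fields['Auth']) ? $fields['Auth'] : null;
```

In the real script, `$googleAuth` would be passed in place of `$sample`. Checking for null before making the next call avoids sending a broken Authorization header when the login fails.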

I put that token in the header of the next calls and Google assumes I am kosher. Time to get the accounts feed:

//add the authorization token as extra header
$headers[] = "Authorization: GoogleLogin auth=".$Authtoken;


$friends_url = 'https://www.google.com/analytics/feeds/accounts/default';

	$curl = curl_init();
	curl_setopt($curl, CURLOPT_URL, $friends_url);
	curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 3);
	curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
	curl_setopt($curl, CURLOPT_HTTPHEADER, $headers); 
	curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0);
	curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
	curl_setopt($curl, CURLOPT_VERBOSE, 1);
	$googleAccounts = curl_exec($curl);

//check errors
echo curl_errno($curl);
echo curl_error($curl) ;
print_r($googleAccounts);

And there it is: a whole list with weird codes, my account list :) It seems easier than the other GData APIs.

Note: the Google Code curl example does not show the "Auth=" part of the token; they assume you use the entire line "Auth=…" as the token.

Once I have my spectacular visitor count in a sidebar widget I’ll blog another post on this one.