google trends crosstab

I was playing with Google Trends using another blog. Of the seven or eight posts I tried, five made it to the details page, which gave me some traffic. I am too lazy to follow 100 entries during the day, so I want a simple report on how trends develop the past day like this :

google trends watch

Now how do I get that done ?

First I make a cronjob that runs a routine to pull the Google trends xml-feed in every hour and store it in a database. I add a string for the period (year-month-day-hour).

That gives me a table like this one :

DATE SEARCHID POS
2008120101 baboon 78
2008120101 monkey 13
2008120102 baboon 98
2008120102 monkey 5
2008120103 monkey 3

I want an output

2008120101 2008120102 2008120103
baboon 78 98
monkey 13 5 3

The way to do that is a crosstable query, also called a pivot table .

MySql pivot table

I have to make a routine to turn that base table into a table with the dates as column, and per phrase the position in the correct column. I want a complete row, so I take the highest period from the table and the lowest, and from that intrapolate the other periods.

  1. function make_periods($begin, $end) {
  2. $BeginTime=mktime(substr($begin, 8, 2),0,0, substr($begin,4,2), substr($begin,6,2), substr($begin,0,4));
  3. $EndTime=mktime(substr($end,8,2),0,0, substr($end,4,2), substr($end,6,2), substr($end,0,4));
  4. //divide the difference by 60minutes*60seconds
  5. $periods = ($EndTime$BeginTime) / 3600;
  6. //make a row of hour-periods with "+N hour",
  7. for($i=0;$i< ($periods+1);$i++) $myperiods[] = strftime('%Y%m%d%H', strtotime("+$i hour", $BeginTime));
  8. //return an array with all periods
  9. //that are to be the column headers in the crosstable
  10.  return $myperiods;
  11. }

At some point I have to clean up the table, and I want the data to fit in one page,
so I will use 24 periods

  1. $plength=25;
  2. if(count($dates)>$plength) {
  3.  for($j=0;$j< ($plength+1);$j++) $usedates[] = $dates[count($dates) ($plength$j)];
  4. } else {
  5. //note : this one needs some work.
  6.      $usedates[]=$dates;
  7. }

… and delete all older records :

  1. //make the periods
  2. $dates=array();
  3. $dates = make_periods($begin, $end);
  4.  
  5. //use the $dates array, see if I have more than 24 periods
  6. if(count($dates) > 24) {
  7. //get period number-last-minus-24
  8. //delete all records before that period
  9.   mysql_query("DELETE FROM `serp_trends`  WHERE `date`< '".$dates[count($dates)-24] ."'", $link);
  10. }

…that keeps the table small enough for quick querying. I have 24 periods to make columns, how do I get the position as value in each column ? I read the ‘wizard’ article on mysql pivot tables and that sounded difficult.

I don’t get it, but I got it running :

  • group the records by phrase (“searchid”)
  • use the array of periods as columns
  • see if the phrase has a position in a specific period,
    • if so, take the position as value (it only has one per period)
    • otherwise take 0 as value
  • sum the values
  • name the column “T”+period
  1. $usedates=$dates;
  2. $CROSTAB = "SELECT `searchid`, AVG(`pos`) AS AP ";
  3.  for($j=0;$j<count ($usedates);$j++) {
  4.   $CROSTAB.=", SUM(IF(date='".$usedates[$j]."', pos, 0)) AS T".$usedates[$j];
  5.  }
  6. $CROSTAB .= " FROM `serp_trends` GROUP BY `searchid` ORDER BY AP ASC";

I take the average position from the grouped records, order it ascending and output that as table, first the two columns phrase (searchid) and average position, then the periods as column names.

  1. $link=connect_data();
  2.  
  3. $result=mysql_query($CROSTAB, $link) or die(mysql_error());
  4.  
  5. //add the zebra class marker for mootool
  6. $CTB.= '<table class="zebra" cellpadding="0" cellspacing="0"><tbody>';
  7.  
  8. //make the header row
  9. $CTB.='<tr><td>avg</td><td>search</td>';
  10.  
  11. for($j=0;$j<count ($usedates);$j++) {
  12.   $CTB.='<td >'.substr($usedates[$j], -2).'';
  13. }
  14. $CTB.='</count></tr>';
  15.  
  16. //output the crosstable query result
  17. //which has the same format
  18. //searchid, average position, period(01)-period(24).
  19.  
  20. while($row=mysql_fetch_assoc($result)) {
  21.  $CTB.='<tr><td>'.number_format($row['AP'], '0', '0', '1').'</td><td>'.$row['searchid'].'</td>';
  22.  for($j=0;$j<count ($usedates);$j++) {
  23.   $v=$row['T'.$usedates[$j]];
  24.   if($v=='0') {
  25.    $CTB.='<td>'; } else {
  26.    $CTB.='<td>'.$v.'</td>'; }
  27.  }
  28.  $CTB.='</count></tr>';
  29. }
  30. $CTB.='</tbody></table>';

When I echo $CTB, the crosstable, this comes out :

google trends watch

Seeing rows per phrase means I can check a lot easier which trends are running, rising, dropping, or even returning. And that was what I wanted.

I zipped the files for download.

Posted in google, php and tagged , , .

Leave a Reply

Your email address will not be published. Required fields are marked *