google trends II
juust | 22/12/2008
I wanted to reply to a question elsewhere on the site, but a ‘comment’ box isn’t a good fit for it, so I’ll put the reply here. The question was about creating ‘search engine friendly’ descriptive URLs based on keywords from the Google Trends Atom feed, with each page showing a graph of the trend.
I hacked a quick example together on a subdomain over at trends.trismegistos.net, just to be sure it works.
You can get a site to serve http://domain.com/trend_title.html type URLs by using mod_rewrite, an Apache module.
In the server directory of the application you can use an .htaccess file to set rules for file access in that folder and its subfolders. When the server gets a request from a browser or another server, it applies any rewriting rules you define in .htaccess to that request.
I tried this one :
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)\.html$ /trendinfo.php?title=$1
</IfModule>
RewriteEngine On
switches the rewrite mechanism on.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
tell the Apache server that the rule that follows only applies to requests that do not match an existing file (-f) or directory (-d). If the requested filename exists on the server, the server dishes out that file; otherwise it tries to apply the RewriteRule. Applying the rule generates a new internal request. If that returns anything, the server dishes that out; otherwise it returns an HTTP 404 ‘file not found’.
The actual URL rewrite rule is :
RewriteRule ^(.*)\.html$ /trendinfo.php?title=$1
which means :
- if any filename is requested that matches the pattern ^(.*)\.html$ (the dot is escaped so it only matches a literal ‘.’) then
- take everything before .html
- substitute that for $1 in trendinfo.php?title=$1
- see if it sticks
If the browser requests http://domain.com/bob+bowersox.html, the server will establish that it is not a file or directory on the server, and test the available rules. When it notices that the requested file ends with .html, it applies the rewrite rule and tries to access http://domain.com/trendinfo.php?title=bob+bowersox.
A browsing user does not notice a thing.
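To get a feel for which requests the rule catches, the pattern can be tried out in plain PHP. This is only a rough sketch: rewriteTarget() is a made-up helper, not part of mod_rewrite, and it ignores the RewriteCond file and directory checks. It assumes the .htaccess case, where the leading slash is stripped before matching, and PHP 7.1 or later. The pattern is a parameter, so a variant with an unescaped dot (^(.*).html) and an escaped, anchored one (^(.*)\.html$) can be compared.

```php
<?php
// Hypothetical helper: mimics how a RewriteRule pattern maps a request path
// to trendinfo.php. It is not part of Apache and skips the RewriteCond checks.
function rewriteTarget(string $path, string $pattern): ?string
{
    // In .htaccess context the leading slash is removed before matching.
    $path = ltrim($path, '/');
    // '#' is used as regex delimiter so slashes need no escaping.
    if (preg_match('#' . $pattern . '#', $path, $m) !== 1) {
        return null; // no match: the rule is skipped
    }
    // $1 in the RewriteRule corresponds to the first capture group.
    return '/trendinfo.php?title=' . $m[1];
}

$strict = '^(.*)\.html$'; // escaped dot, anchored at the end
$loose  = '^(.*).html';   // unescaped dot matches any character

var_dump(rewriteTarget('/bob+bowersox.html', $strict)); // /trendinfo.php?title=bob+bowersox
var_dump(rewriteTarget('/about-xhtml', $strict));       // NULL, no literal ".html"
var_dump(rewriteTarget('/about-xhtml', $loose));        // /trendinfo.php?title=about-
```

The last call shows why escaping the dot matters: without it, a path that merely ends in "xhtml" would be rerouted as well.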
In trendinfo.php I wrote some code to handle the ‘new’ request :
if (!isset($_REQUEST['title'])) {
    // if there is no $1, passed in as title, fake a 404 "file not found" message
    echo 'the emptiness…';
} else {
    // get the title from the request
    $mytitle = $_REQUEST['title'];
    // put the google trends graph url together
    $graphurl = 'http://www.google.com/trends/viz?hl=&q=';
    $graphurl .= urlencode($mytitle); // encode the raw title for the query string
    $graphurl .= '&date='; // leave date blank to get the current graph
    $graphurl .= '&graph=hot_img&sa=X';
    // escape the finished url for use inside the html attribute
    $graphurl = htmlentities($graphurl, ENT_QUOTES, "UTF-8");
    echo "<img class='hotGraph' width='280' height='190' src='$graphurl'/>";
}
…that outputs the Google trend graph on the url http://domain.com/bob+bowersox.html
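The same logic can also be wrapped in a small function, which makes it easier to test and to send a real 404 status when the title is missing. This is a minimal sketch, assuming PHP 7.1 or later; trendGraphUrl() is a hypothetical name, and the viz URL and its parameters are simply the ones used in trendinfo.php above.

```php
<?php
// Hypothetical helper: builds the Google Trends graph url for a title,
// or returns null when no usable title was passed in.
function trendGraphUrl(array $request): ?string
{
    $title = (isset($request['title']) && is_string($request['title']))
        ? trim($request['title'])
        : '';
    if ($title === '') {
        return null;
    }
    // http_build_query url-encodes each value; date stays blank
    // to get the current graph. '&' is passed explicitly so the
    // result does not depend on the arg_separator.output ini setting.
    return 'http://www.google.com/trends/viz?' . http_build_query([
        'hl'    => '',
        'q'     => $title,
        'date'  => '',
        'graph' => 'hot_img',
        'sa'    => 'X',
    ], '', '&');
}

$graphurl = trendGraphUrl($_REQUEST);
if ($graphurl === null) {
    http_response_code(404); // send a real 404 status (PHP 5.4+)
    echo 'the emptiness…';
} else {
    // escape the url before placing it in the html attribute
    $src = htmlspecialchars($graphurl, ENT_QUOTES, 'UTF-8');
    echo "<img class='hotGraph' width='280' height='190' src='$src'/>";
}
```

Keeping the URL building separate from the output means the title is URL-encoded once for the query string and HTML-escaped once for the page, instead of mixing the two.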
I zipped the trends.trismegistos.net program files, but that might be a bit over the top. The download contains a class that relies on a MySQL table being filled every hour with new trends: cron.php, run as an hourly cron job on the server, parses and stores the Atom feed of Google Trends, and index.php lists the result as a cross-table spanning the past 24 hours.
You can also put this in index.php :
// load the hourly hot trends atom feed
$feed = simplexml_load_file('http://www.google.com/trends/hottrends/atom/hourly');
$children = $feed->children('http://www.w3.org/2005/Atom');
$parts = $children->entry;
foreach ($parts as $entry) {
    $details = $entry->children('http://www.w3.org/2005/Atom');
    // the entry content is an html fragment holding the trend links
    $dom = new DOMDocument();
    $html = (string) $details->content;
    @$dom->loadHTML($html);
    $anchors = $dom->getElementsByTagName('a');
    foreach ($anchors as $anchor) {
        // the original google link (not used below)
        $url = $anchor->getAttribute('href');
        // the anchor text is the trend keyword
        $urltext = $anchor->nodeValue;
        // link to the descriptive url, escaping the keyword for html output
        echo '<a href="'.urlencode($urltext).'.html" target="_blank">'.htmlspecialchars($urltext).'</a> ';
    }
}
unset($dom);
unset($anchors);
unset($parts);
unset($feed);
That lists the current 100 Google trends, each with a link. If you use the .htaccess rewrite rules, the server reroutes all those descriptive URLs to trendinfo.php.
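To try the anchor extraction without hitting the live feed, it can be run on a canned snippet first. The sketch below assumes the entry content is an HTML fragment with one link per trend; trendLinks() and the sample markup are made up for illustration, and it needs PHP 7.1 or later.

```php
<?php
// Hypothetical helper: turns the html content of one feed entry into
// a list of [keyword, descriptive url] pairs.
function trendLinks(string $html): array
{
    $links = [];
    if (trim($html) === '') {
        return $links; // loadHTML rejects empty input
    }
    // Collect libxml warnings quietly instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('a') as $anchor) {
        $keyword = trim($anchor->nodeValue);
        if ($keyword !== '') {
            // urlencode turns spaces into '+', matching bob+bowersox.html
            $links[] = [$keyword, urlencode($keyword) . '.html'];
        }
    }
    return $links;
}

// Made-up sample in the shape the feed content is assumed to have.
$sample = '<ol><li><a href="http://www.google.com/trends/hottrends?q=bob+bowersox">bob bowersox</a></li>'
        . '<li><a href="http://www.google.com/trends/hottrends?q=winter+solstice">winter solstice</a></li></ol>';

foreach (trendLinks($sample) as [$keyword, $url]) {
    echo '<a href="' . $url . '">' . htmlspecialchars($keyword) . "</a>\n";
}
```

Returning the pairs instead of echoing inside the loop keeps the parsing separate from the page output, so the same function could feed either index.php or a cron script.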
I hope that helps.









