how to keep a traffic log with mysql and php
juust | November 16, 2008A rather obsolete post. I hacked a quick referrer tracking script together because I wanted to see how Googlebot indexed my serp tool pages. I can use Google analytics or phpMyVisites, but that is not as rewarding as keeping my own logs.
I start with a mysql table LOG to store the traffic data
-
CREATE TABLE IF NOT EXISTS `log` (
-
`id` bigint(11) NOT NULL auto_increment,
-
`ip` varchar(20) NOT NULL,
-
`page` varchar(120) NOT NULL,
-
`visitpage` varchar(120) NOT NULL,
-
`visitdate` varchar(25) NOT NULL,
-
`agent` varchar(200) NOT NULL,
-
`protocol` varchar(10) NOT NULL,
-
`url` varchar(120) NOT NULL,
-
`visitpath` varchar(80) NOT NULL,
-
PRIMARY KEY (`id`)
-
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;
on every page I track, I start with
-
session_start();
-
log_hit();
The function log_hit() relies on the php $_SERVER array that holds all data the web server collects:
-
function log_hit();
-
$ip = $_SERVER['REMOTE_ADDR'];
-
-
//if it is my own ip, exit the routine
-
if($ip =="my.own.ip.here") return;
-
-
//what page is my visitor at ?
-
$page = "http://".$_SERVER['HTTP_HOST'].$_SERVER['PHP_SELF'];
-
-
//what page did he come from ?
-
$visitpage = $_SERVER['HTTP_REFERER'];
-
-
$visitdate = date("YmdHi");
-
-
//what user-string is it (iow: is it googlebot and pals or an actual visitor ?)
-
$agent = $_SERVER["HTTP_USER_AGENT"];
-
-
//the rest i dont use but i'll throw them in just for fun :
-
$protocol = $_SERVER['SERVER_PROTOCOL'];
-
$method = $_SERVER['REQUEST_METHOD'];
-
$url = $_SERVER['REQUEST_URI'];
-
$visitpath = parse_url($page, PHP_URL_PATH);
-
-
if(!$ip) {} else {
-
//if I have an Ip, register the hit :
-
//connect to mysql table
-
$m=connect_data();
-
-
//and store it in the LOG table
-
$myqry="INSERT INTO `serp_log` (
-
`ip` ,`page` ,`visitpage` ,`visitdate` ,`agent` ,`protocol` ,`url` ,`visitpath` ) VALUES (
-
'".$ip."', '".$page."', '".$visitpage."', '".$visitdate."', '".$agent."', '".$protocol."', '".$url."', '".$visitpath."')";
-
mysql_query($myqry, $m) or die('store log :'.mysql_error());
-
mysql_close($m);
-
}
the connect_data() function is standard mysql connection
-
function connect_data() {
-
$DB_USER = "user";
-
$DB_PASSWORD = "password";
-
$DB_HOST = "server";
-
$DB_DATA = "database";
-
$link = mysql_connect($DB_HOST, $DB_USER, $DB_PASSWORD) or $error = mysql_error();
-
if (!$link) {
-
return $error;
-
}
-
mysql_select_db($DB_DATA, $link) or $error = mysql_error();
-
return $link;
-
}
It is far from perfect, but it works. It helps check the crawl-rate of pages. You can filter out the useragent ‘Googlebot’ and run a query grouping useragent by date and see what times Googlebot comes by.
-
//66.249.72.143 is GoogleBot
-
check_traffic_by_IP('66.249.72.143' );
-
-
function check_traffic_by_IP($thisIP) {
-
$m = connect_data();
-
$qry = "SELECT * from `log` WHERE `ip` = '".$thisIP."' GROUP BY `visitdate`";
-
$lst = mysql_query($qry, $m);
-
echo '<table><tbody>';
-
while($row=mysql_fetch_assoc($lst)) {
-
echo '<tr><td>'.$row['visitdate'].'</td></tr>';
-
}
-
echo '</tbody></table>';
-
}
…or exclude all records if the UserAgent indicates it was a Bot, and get a listing of organic beings visiting the pages :
-
function check_real_traffic() {
-
$m = connect_data();
-
-
//the bots
-
$bots = array('googlebot', 'anonymouse');
-
-
$qry = "SELECT * from `log`";
-
$lst = mysql_query($qry, $m);
-
-
echo '<table><tbody>';
-
-
while($row=mysql_fetch_assoc($lst)) {
-
-
//check if the entries in the bots-array
-
//occur in the stored useragent string
-
$traffic=-1;
-
for($i=0;$i<count ($bots);$i++) {
-
if(strpos(strtolower($row['agent']), $bots[$i])>0) $traffic+=1;
-
}
-
-
//if it's not a bot (then the $traffic var is still -1), output the info
-
if($traffic==-1) {
-
echo '<tr><td>'.$row['visitdate'].'</td><td>'.$row['ip'].'</td><td>'.$row['page'].'</td></tr>';
-
}
-
}
-
echo '</count></tbody></table>';
-
}
…that shows ‘organic’ visitors.
other resources
tek-tips.com/ Limit Requests per day by IP
phpMyVisites open source free referrer and traffic analysis






