how to keep a traffic log with MySQL and PHP

A rather obsolete post. I hacked together a quick referrer tracking script because I wanted to see how Googlebot indexed my serp tool pages. I could use Google Analytics or phpMyVisites, but that is not as rewarding as keeping my own logs.

I start with a MySQL table `log` to store the traffic data:

CREATE TABLE IF NOT EXISTS `log` (
  `id` bigint(11) NOT NULL auto_increment,
  `ip` varchar(20) NOT NULL,
  `page` varchar(120) NOT NULL,
  `visitpage` varchar(120) NOT NULL,
  `visitdate` varchar(25) NOT NULL,
  `agent` varchar(200) NOT NULL,
  `protocol` varchar(10) NOT NULL,
  `url` varchar(120) NOT NULL,
  `visitpath` varchar(80) NOT NULL,
  PRIMARY KEY  (`id`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;

On every page I track, I start with:

session_start();
log_hit();

The function log_hit() relies on the PHP $_SERVER array, which holds the data the web server collects about the request:

function log_hit() {
	$ip = $_SERVER['REMOTE_ADDR'];

//if it is my own IP, exit the routine
	if($ip == "my.own.ip.here") return;

//what page is my visitor at?
	$page = "http://".$_SERVER['HTTP_HOST'].$_SERVER['PHP_SELF'];

//what page did he come from? (not every request sends a referer)
	$visitpage = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';

	$visitdate = date("YmdHi");

//what user-agent string is it (iow: is it Googlebot and pals, or an actual visitor?)
	$agent = $_SERVER['HTTP_USER_AGENT'];

//the rest I don't use, but I'll throw them in just for fun:
	$protocol  = $_SERVER['SERVER_PROTOCOL'];
	$method    = $_SERVER['REQUEST_METHOD'];
	$url       = $_SERVER['REQUEST_URI'];
	$visitpath = parse_url($page, PHP_URL_PATH);

//if I have an IP, register the hit
	if($ip) {
//connect to mysql
		$m = connect_data();

//escape the values and store them in the log table
		$myqry = "INSERT INTO `log` (`ip`, `page`, `visitpage`, `visitdate`, `agent`, `protocol`, `url`, `visitpath`) VALUES ('"
			.mysql_real_escape_string($ip, $m)."', '"
			.mysql_real_escape_string($page, $m)."', '"
			.mysql_real_escape_string($visitpage, $m)."', '"
			.$visitdate."', '"
			.mysql_real_escape_string($agent, $m)."', '"
			.$protocol."', '"
			.mysql_real_escape_string($url, $m)."', '"
			.mysql_real_escape_string($visitpath, $m)."')";
		mysql_query($myqry, $m) or die('store log: '.mysql_error());
		mysql_close($m);
	}
}

The connect_data() function is a standard MySQL connection:

function connect_data() {
	$DB_USER     = "user";
	$DB_PASSWORD = "password";
	$DB_HOST     = "server";
	$DB_DATA     = "database";
	$link = mysql_connect($DB_HOST, $DB_USER, $DB_PASSWORD) or die('connect: '.mysql_error());
	mysql_select_db($DB_DATA, $link) or die('select db: '.mysql_error());
	return $link;
}

It is far from perfect, but it works, and it helps me check the crawl rate of pages. You can filter on the user agent 'Googlebot' and run a query grouping those hits by date to see what times Googlebot comes by.
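For example, here is a minimal sketch of that grouping query; the helper name crawl_times_by_agent() and the LIKE match are my additions, not part of the original script:

function crawl_times_by_agent($pattern) {
	$m = connect_data();
//visitdate is stored as YmdHi, so grouping on it gives hits per minute;
//LIKE is case-insensitive on the default latin1 collation
	$qry = "SELECT `visitdate`, COUNT(*) AS `hits` FROM `log`
		WHERE `agent` LIKE '%".mysql_real_escape_string($pattern, $m)."%'
		GROUP BY `visitdate` ORDER BY `visitdate`";
	$lst = mysql_query($qry, $m) or die(mysql_error());
	while($row = mysql_fetch_assoc($lst)) {
		echo $row['visitdate'].': '.$row['hits']."<br>";
	}
	mysql_close($m);
}

crawl_times_by_agent('googlebot');

You can also check by a known crawler IP, which is what the next function does: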

//66.249.72.143 is Googlebot
check_traffic_by_IP('66.249.72.143');

function check_traffic_by_IP($thisIP) {
	$m = connect_data();
	$qry = "SELECT * FROM `log` WHERE `ip` = '".mysql_real_escape_string($thisIP, $m)."' GROUP BY `visitdate`";
	$lst = mysql_query($qry, $m) or die(mysql_error());
	echo '<table>';
	while($row = mysql_fetch_assoc($lst)) {
		echo '<tr><td>'.$row['visitdate'].'</td></tr>';
	}
	echo '</table>';
	mysql_close($m);
}

…or exclude all records where the user agent indicates it was a bot, and get a listing of the organic beings visiting the pages:

function check_real_traffic() {
	$m = connect_data();

//the bots
	$bots = array('googlebot', 'anonymouse');

	$qry = "SELECT * FROM `log`";
	$lst = mysql_query($qry, $m) or die(mysql_error());

	echo '<table>';
	while($row = mysql_fetch_assoc($lst)) {

//check if the entries in the bots array
//occur in the stored useragent string
		$traffic = -1;
		for($i = 0; $i < count($bots); $i++) {
			if(strpos(strtolower($row['agent']), $bots[$i]) !== false) $traffic += 1;
		}

//if it's not a bot (then the $traffic var is still -1), output the info
		if($traffic == -1) {
			echo '<tr><td>'.$row['visitdate'].'</td><td>'.$row['ip'].'</td><td>'.$row['page'].'</td></tr>';
		}
	}
	echo '</table>';
	mysql_close($m);
}

…that shows ‘organic’ visitors.
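The same exclusion could also be pushed into the query itself; a sketch under the same assumptions, where the NOT LIKE clauses mirror the $bots array and the function name is mine:

function check_real_traffic_sql() {
	$m = connect_data();
//let MySQL filter the bots instead of looping over every row in PHP
	$qry = "SELECT `visitdate`, `ip`, `page` FROM `log`
		WHERE `agent` NOT LIKE '%googlebot%'
		AND `agent` NOT LIKE '%anonymouse%'";
	$lst = mysql_query($qry, $m) or die(mysql_error());
	echo '<table>';
	while($row = mysql_fetch_assoc($lst)) {
		echo '<tr><td>'.$row['visitdate'].'</td><td>'.$row['ip'].'</td><td>'.$row['page'].'</td></tr>';
	}
	echo '</table>';
	mysql_close($m);
}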

other resources
tek-tips.com/ Limit Requests per day by IP

phpMyVisites, an open source, free referrer and traffic analysis tool