2008 08 20
main blog entries :
wordpress serp widget serpent
28-7
the permutation serp creates too much of a load and the results are 90% general, so not very useful per page.
- replaced it with a straight top-50 serp on the msn, yahoo, google engines
- using post_tags
- added a link to the search engine result pages themselves
- added a cache and a timer of 6000 seconds (100 minutes): it requeries about every 1 hour 40 minutes when a page is opened and in the meantime dishes out cached results, to minimize queries and page load times
- added before_widget and after_widget
it needs
- specific options per engine (language etc.)
- a dig option up to 1000 results
- css hooks
- max three keywords constraint
- page keywords list (currently only posts’ tags are used)
- a mysql backend (I don’t want the nonsense in the wordpress db)
- a counter for used queries (otherwise you get ‘no result’)
- an archive report to see how the serp results develop
but it’s a start
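The cache-and-timer idea can be sketched roughly like this. It is a minimal file-based sketch, not the widget's actual code: the helper name serp_cached, the temp-dir storage and the closure standing in for the real engine queries are all assumptions.

```php
<?php
// Hypothetical helper: return the cached data for $key while it is younger
// than $ttl seconds, otherwise call $fetch(), store the result and return it.
function serp_cached($key, $ttl, callable $fetch, $dir = null) {
    $dir  = ($dir === null) ? sys_get_temp_dir() : $dir;
    $file = $dir . '/serp_' . md5($key) . '.cache';

    clearstatcache(true, $file); // make sure filemtime() is not a stale value

    // Cache hit: the file exists and has not expired yet.
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        $data = unserialize(file_get_contents($file));
        if ($data !== false) {
            return $data;
        }
    }

    // Cache miss or expired entry: requery and overwrite the cache file.
    $data = $fetch();
    file_put_contents($file, serialize($data), LOCK_EX);
    return $data;
}

// Usage: the closure is a stand-in for the real search engine queries.
$results = serp_cached('google:php serp', 6000, function () {
    return array('http://www.juust.org/');
});
```

Only the first page view after the timer runs out pays for the queries; every other view reads the file.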
the results are accurate, but i need a once-a-day routine to grab the keywords of the index page and other pages. (the ‘keywords’ section is often dynamic and changes every day: most sites use plugins that either grab all tags of the listed posts and put them in the meta section as page keywords, or use headspace to override it with a fixed set.)
querying the wordpress database, using get_meta_tags for all non-post pages, putting url and tags in a cached list and serping the lot once a day would solve that.
something like that.
once I’ve got that fixed i’ll ask some people to test it.
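A rough sketch of the get_meta_tags step, assuming a comma-separated keywords meta tag. The helper names page_keywords and build_keyword_list are made up for illustration, and the limit of three follows the 'max three keywords' item on the list above.

```php
<?php
// Hypothetical helper: read the meta keywords of a page and return at most
// $max of them. get_meta_tags() accepts a URL or a local file name.
function page_keywords($source, $max = 3) {
    $tags = @get_meta_tags($source);
    if ($tags === false || empty($tags['keywords'])) {
        return array();
    }
    // Split on commas, trim whitespace and drop empty entries.
    $words = array_filter(array_map('trim', explode(',', $tags['keywords'])), 'strlen');
    return array_slice(array_values($words), 0, $max);
}

// Hypothetical helper: build the url => keywords list for a daily serp run.
function build_keyword_list(array $urls, $max = 3) {
    $list = array();
    foreach ($urls as $url) {
        $list[$url] = page_keywords($url, $max);
    }
    return $list;
}
```

The resulting list could be stored with the same kind of cache as the serp results and refreshed once a day.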
trismegistos
If you hear a voice within you say “you cannot paint,”
then by all means paint, and that voice will be silenced.
Vincent Van Gogh
trismegistos dot net
On gethost I am programming a small ‘directories’ gimmick: 2300 free-submit directories with extended info, also available as rss feed for Pagerank categories and as xml feed for extended info (for now just pagerank and alexa, but I’ll be adding compete, zoom, indexed pages and, if the sites aren’t too big, a list of indexed urls with their pagerank). I am checking whois info.
Once I have it running for the directories, I’ll make a screen for ‘normal’ sites as well with technorati and blogsearch data, backlinks.
I think I am going to run that on ‘trismegistos.net’, a new domain.
tool : pagerank per url from a sitemap
I wired a google pagerank toolbar-query snippet to a simplexml sitemap readout, and put it on a page. You can fill in a sitemap url and get the google pageranks of all ‘mapped’ urls.
It works, I stripped it down and you can download it here or on the sample page.
I mainly wanted the snippet wired to a sitemap to compare the results of my pagerank spider tool with an actual google readout. Running a sitemap through a toolbar query snippet is the fastest way.
I already had a spider result of siteometrics (calc pr), so now I can compare it to google’s toolbar query on http://www.siteometrics.com/sitemap.xml :
| google pr | calc pr | URL |
| 2 | | http://www.siteometrics.com/ |
| 2 | 0.80 | /index.php |
| 0 | 0.32 | /advertise.html |
| - | 0.77 | /recommend.php |
| - | 0.75 | /search-engine-saturation.php |
| 0 | 0.75 | /link-popularity.php |
| 0 | 0.75 | /pagerank.php |
| 0 | 0.75 | /bulk-pagerank.php |
| 0 | 0.75 | /pagerank-mult-pages.php |
| 0 | 0.75 | /link-pop-pagerank.php |
| - | 0.75 | /link-search-pagerank.php |
| 0 | 0.75 | /alexa.php |
| 0 | 0.75 | /bulk-alexa.php |
| 0 | 0.75 | /serpcheck.php |
| 0 | 0.75 | /keyword-research.php |
| 0 | 0.67 | /visitor-info.php |
| - | 0.24 | /useful-links.html |
| 0 | 0.24 | /contact-us.html |
| - | 0.24 | /sitemap.html |
| - | 0.24 | /privacy-policy.html |
Weird result, the sitemap they issue is part old site, part new site. If you check the pageranks on the newer .php files it’s the same, though.
a quarter of the urls link into the archived site, that might cause the drop in pagerank (links to /feed and google.com on every page, see the other article on siteometrics).
for the freaks : here’s the php code (assume url is a valid sitemap-url).
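// read the sitemap url from the request and print "pr;url;" for every mapped url
$myurl = $_REQUEST['url'];

$xml = simplexml_load_file($myurl);
if ($xml === false) exit('could not read the sitemap');

foreach ($xml->url as $u) echo pagerank((string) $u->loc)."<br />";

exit;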
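// return "pr;url;" for one url
function pagerank($url) {
    // prepend a scheme when the url has none (the old pattern matched any url, so it never did)
    if (!preg_match('/^https?:\/\//i', $url)) { $url = 'http://'.$url; }
    $pr = curl_getpr($url);
    return $pr.';'.$url.';';
}

// checksum parameter ("ch") that the toolbar query expects
function getch($url) { return CheckHash(HashURL($url)); }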
// query the toolbar server, return the pagerank or '-' when none is available
function curl_getpr($url) {
    $googleua = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.6) Gecko/20060728 Firefox/1.5';

    $ch = getch($url);

    // url-encode the target so characters in it cannot break the query string
    $form = "http://toolbarqueries.google.com/search?client=navclient-auto&ch=$ch&features=Rank&q=info:".urlencode($url);
    $cr = curl_init($form);
    curl_setopt($cr, CURLOPT_FAILONERROR, true);
    curl_setopt($cr, CURLOPT_HEADER, 0);
    curl_setopt($cr, CURLOPT_USERAGENT, $googleua); // Spoof the user-agent
    curl_setopt($cr, CURLOPT_RETURNTRANSFER, true);
    $data = curl_exec($cr);
    curl_close($cr);

    // no response, or no "Rank_" marker in it: no pagerank available
    if (!$data) return '-';
    $pos = strpos($data, "Rank_");
    if ($pos === false) return '-';

    // the value starts 9 characters after the start of the marker
    $pr = substr($data, $pos + 9);
    $pr = trim($pr);
    $pr = str_replace("\n", '', $pr);
    return $pr;
}
//PageRank Lookup v1.1 by HM2K (update: 31/01/07)
//based on an algorithm found at: http://pagerank.gamesaga.net/
//live demo: http://www.highrankforum.com/pagerank.php

//convert a string to a 32-bit integer
function StrToNum($Str, $Check, $Magic) {
    $Int32Unit = 4294967296; // 2^32

    $length = strlen($Str);
    for ($i = 0; $i < $length; $i++) {
        $Check *= $Magic;
        //wrap around once the value no longer fits in 32 bits
        if ($Check >= $Int32Unit) {
            $Check = ($Check - $Int32Unit * (int) ($Check / $Int32Unit));
            //if the check less than -2^31
            $Check = ($Check < -2147483648) ? ($Check + $Int32Unit) : $Check;
        }
        $Check += ord($Str[$i]);
    }
    return $Check;
}
//generate a hash for a url
function HashURL($String) {
    $Check1 = StrToNum($String, 0x1505, 0x21);
    $Check2 = StrToNum($String, 0, 0x1003F);

    $Check1 >>= 2;
    $Check1 = (($Check1 >> 4) & 0x3FFFFC0 ) | ($Check1 & 0x3F);
    $Check1 = (($Check1 >> 4) & 0x3FFC00 ) | ($Check1 & 0x3FF);
    $Check1 = (($Check1 >> 4) & 0x3C000 ) | ($Check1 & 0x3FFF);

    $T1 = (((($Check1 & 0x3C0) << 4) | ($Check1 & 0x3C)) << 2 ) | ($Check2 & 0xF0F );
    $T2 = (((($Check1 & 0xFFFFC000) << 4) | ($Check1 & 0x3C00)) << 0xA) | ($Check2 & 0xF0F0000 );

    return ($T1 | $T2);
}
//generate a checksum for the hash string
function CheckHash($Hashnum) {
    $CheckByte = 0;
    $Flag = 0;

    $HashStr = sprintf('%u', $Hashnum);
    $length = strlen($HashStr);

    //walk the digits from right to left, doubling every second one
    for ($i = $length - 1; $i >= 0; $i--) {
        $Re = (int) $HashStr[$i];
        if (1 === ($Flag % 2)) {
            $Re += $Re;
            $Re = (int)($Re / 10) + ($Re % 10);
        }
        $CheckByte += $Re;
        $Flag++;
    }

    $CheckByte %= 10;
    if (0 !== $CheckByte) {
        $CheckByte = 10 - $CheckByte;
        if (1 === ($Flag % 2)) {
            if (1 === ($CheckByte % 2)) {
                $CheckByte += 9;
            }
            $CheckByte >>= 1;
        }
    }

    return '7'.$CheckByte.$HashStr;
}
social bookmarking to get your site indexed
Yesterday I put one link through twitter on twemes.com and four links on del.icio.us to the links.trismegistos.net php Link Directory.
Today I googled ‘trismegistos links’ to see what the effect (if any) would be and the link I put on twemes actually shows up first in google (and top-10 frontpage, spot 6 of 40.000 results).
I also issued a 700 URL sitemap to google webmaster first, and only added the bookmarks after the sitemap was downloaded.
I was just curious which method would yield the best result, and twitter/twemes is the winner.
why bother ?
Because a test I did shows that on most directory sites the subcategory pages (where most links are) have no assigned pagerank, and if i want to run an effective directory I have to get a grip on that problem and find a way to fix it.
I did two tests on directory sites, where I downloaded the urls indexed in Yahoo SiteExplorer and retrieved the pagerank per url.
| sites | pages/site | total | ranked | percentage |
| 16 | 1000 | 16.000 | 120 | 0.7% |
| 150 | 50 | 7500 | 200 | 2.5% |
Roughly interpreted, per site most pages are indexed, but only about 2 per 100 pages have a pagerank value assigned.
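Counting the ranked share can be sketched from the "pr;url;" lines that the pagerank() function in the sitemap tool prints. This is only an illustration: ranked_share is a made-up name, the sample lines use placeholder example.com urls rather than data from the tests, and treating '-' as 'no value assigned' is an assumption.

```php
<?php
// Count how many "pr;url;" lines carry an assigned pagerank.
// Assumption: '-' (or an empty field) means no value was returned.
function ranked_share(array $lines) {
    $total = 0;
    $ranked = 0;
    foreach ($lines as $line) {
        $fields = explode(';', trim($line));
        if (count($fields) < 2 || $fields[1] === '') {
            continue; // skip malformed lines
        }
        $total++;
        if ($fields[0] !== '-' && $fields[0] !== '') {
            $ranked++;
        }
    }
    $pct = ($total > 0) ? round(100 * $ranked / $total, 1) : 0.0;
    return array('total' => $total, 'ranked' => $ranked, 'percentage' => $pct);
}

// Hypothetical sample, not measured data.
$sample = array(
    '3;http://example.com/;',
    '-;http://example.com/cat/a.html;',
    '-;http://example.com/cat/b.html;',
    '-;http://example.com/cat/c.html;',
);
$stats = ranked_share($sample);
echo $stats['ranked'].' of '.$stats['total'].' ranked ('.$stats['percentage']."%)\n";
```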
The others have no value assigned and don’t pass any value through the links on them. In all cases (except dmoz, which ranks on most branches) it was the index page and main category pages that were ranked, and the pages with links were all N/A (not available).
So testing the effect of social bookmarking on pages that would hold links is interesting.
A submission now costs 3 to 15 cts, which shows the value of links in a directory is low, and only featured links (which usually appear on the index and main category pages which do rank) are sold for $3,-/year to $40,-/permanent.
An estimate for a link on a PR3 page for a year in a directory page is $13,-/year.
If i can get 700 pages to rank PR1 and sell links for $3,-/year, 700 pages with 20 links times 3 makes $42.000,- a year. Compare that to $10/year for 20 links on a PR3 category page: 10 pages is $2.000/year.
And if 700 social bookmarks can make sure that after a year my whole directory is ranked and indexed, brings in $42.000,- and delivers the goods (a ranking link for entrants at $3,-/year), then a month of linkspamming is well worth the trouble.
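The sum is simply pages times links per page times price per link per year. A tiny sketch to multiply out the two scenarios (yearly_revenue is a made-up name, and it assumes every link slot is sold):

```php
<?php
// Yearly revenue = pages x links per page x price per link per year.
// Assumption: every link slot on every page is actually sold.
function yearly_revenue($pages, $links_per_page, $price_per_link) {
    return $pages * $links_per_page * $price_per_link;
}

$pr1_directory = yearly_revenue(700, 20, 3);  // 700 PR1 pages, $3 per link
$pr3_pages     = yearly_revenue(10, 20, 10);  // 10 PR3 category pages, $10 per link

echo "PR1 directory: $pr1_directory, PR3 pages: $pr3_pages\n";
```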
Another option is reciprocals on the category page itself (from an indexed page, some link-pages are conveniently not indexed ;)
serp pagerank php seo tool
So how did I do with a few days of search engine optimisation ? not bad at all: on php+serp nicely in second spot behind shoemoney.com, on serp+php also front page, and on the 5 keys overall 13th. not bad for a PR0 domain.
5-key serp : serp pagerank php seo tool
| count | hits | domain |
| 7.8 | 8 | www.juust.org |

| position | keys | url | points |
| 16 | serp pagerank | http://www.juust.org/ | 0.2 |
| 8 | serp php | http://www.juust.org/ | 1 |
| 9 | serp php | http://www.juust.org/index.php/php-serp-scripts/2008/07/ | 1 |
| 99 | serp tool | http://www.juust.org/ | 0.2 |
| 42 | pagerank serp | http://www.juust.org/ | 0.2 |
| 3 | php serp | http://www.juust.org/ | 3 |
| 4 | php serp | http://www.juust.org/index.php/php-serp-scripts/2008/07/ | 2 |
| 70 | seo serp | http://www.juust.org/ | 0.2 |
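The 7.8 for www.juust.org appears to be the sum of the last column over its 8 results. A minimal sketch of that per-domain tally, with a made-up function name (score_by_domain); how the points per position are assigned is not shown here, the sketch only adds up the values already listed.

```php
<?php
// Sum points and count hits per domain from rows of
// (position, keys, url, points), highest total first.
function score_by_domain(array $rows) {
    $scores = array();
    foreach ($rows as $row) {
        list($position, $keys, $url, $points) = $row;
        $host = parse_url($url, PHP_URL_HOST);
        if (!$host) {
            continue; // skip rows without a usable url
        }
        if (!isset($scores[$host])) {
            $scores[$host] = array('points' => 0.0, 'hits' => 0);
        }
        $scores[$host]['points'] += $points;
        $scores[$host]['hits']++;
    }
    foreach ($scores as $host => $s) {
        $scores[$host]['points'] = round($s['points'], 1); // tidy float sums
    }
    uasort($scores, function ($a, $b) { return $b['points'] <=> $a['points']; });
    return $scores;
}

// The eight juust.org rows from the listing above.
$rows = array(
    array(16, 'serp pagerank', 'http://www.juust.org/', 0.2),
    array(8,  'serp php',      'http://www.juust.org/', 1),
    array(9,  'serp php',      'http://www.juust.org/index.php/php-serp-scripts/2008/07/', 1),
    array(99, 'serp tool',     'http://www.juust.org/', 0.2),
    array(42, 'pagerank serp', 'http://www.juust.org/', 0.2),
    array(3,  'php serp',      'http://www.juust.org/', 3),
    array(4,  'php serp',      'http://www.juust.org/index.php/php-serp-scripts/2008/07/', 2),
    array(70, 'seo serp',      'http://www.juust.org/', 0.2),
);
$scores = score_by_domain($rows);
foreach ($scores as $host => $s) {
    echo $host.': '.$s['points'].' points, '.$s['hits']." hits\n";
}
```

Feeding it the rows of all domains would give a ranking like the competition list.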
and who are the competition :
| points | results | domain |
| 20.2 | 13 | www.seochat.com |
| 16.6 | 21 | www.seocompany.ca |
| 15.6 | 13 | sitening.com |
| 14.6 | 10 | www.webconfs.com |
| 13.4 | 21 | forums.digitalpoint.com |
| 11.4 | 11 | www.prchecker.info |
| 11 | 4 | en.wikipedia.org |
| 10 | 19 | www.webmasterworld.com |
| 9 | 4 | www.cristiandarie.ro |
| 8.8 | 8 | www.seroundtable.com |
| 8.4 | 5 | www.shoemoney.com |
| 8.4 | 5 | www.google.com |
| 7.8 | 8 | www.juust.org |
| 7.6 | 6 | tools.seobook.com |
| 7.2 | 9 | www.jumptags.com |
| 7.2 | 14 | www.seomoz.org |
| 7 | 7 | www.seoserp.com |
| 6.4 | 14 | forums.seochat.com |
| 6.4 | 10 | link.ezer.com |
| 6.4 | 4 | www.ljfind.com |
| 5.8 | 7 | www.iwebtool.com |
Now, for the seo+tool niche… an online trackback spider…