<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>juust ~ php oddities &#187; seo tips and tricks</title>
	<atom:link href="http://www.juust.org/index.php/tag/seo-tips-and-tricks/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.juust.org</link>
	<description>Unordered list of one element</description>
	<lastBuildDate>Thu, 02 Sep 2010 16:58:24 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>curl trackbacks</title>
		<link>http://www.juust.org/index.php/curl-trackbacks/2009/03/</link>
		<comments>http://www.juust.org/index.php/curl-trackbacks/2009/03/#comments</comments>
		<pubDate>Wed, 25 Mar 2009 09:53:13 +0000</pubDate>
		<dc:creator>juust</dc:creator>
				<category><![CDATA[links]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[seo tips and tricks]]></category>
		<category><![CDATA[trackback]]></category>

		<guid isPermaLink="false">http://www.juust.org/?p=375</guid>
		<description><![CDATA[I figured I&#8217;d write a post on trackback link building. A trackback is &#8230; (post a few and you&#8217;ll get it). The trackback protocol itself isn&#8217;t that interesting, but the way blog platforms and CMSes implement it makes it an excellent means for network development, because it uses a simple HTTP POST, and cURL makes that easy.
To post a [...]]]></description>
			<content:encoded><![CDATA[<p>I figured I&#8217;d write a post on trackback link building. A trackback is &#8230; (post a few and you&#8217;ll get it). The trackback protocol itself isn&#8217;t that interesting, but the way blog platforms and CMSes implement it makes it an excellent means for network development, because it uses a simple HTTP POST, and cURL makes that easy.</p>
<p>To post a successful link proposal I need some basic data:</p>
<p>about my page </p>
<ul>
<li>url (must exist)</li>
<li>blog owner (free)</li>
<li>blog name (free)</li>
</ul>
<p>about the other page</p>
<ul>
<li>url (must exist)</li>
<li>excerpt (should be proper normal text)</li>
</ul>
<p><em>my page :</em> this is preferably a php routine that stitches some text, pictures and videos, PLR or articles together, with a url rewrite. I prefer xml text files instead of a database; they are faster to set up.</p>
<p><em>other page :</em> don&#8217;t use &#8220;I liked your article so much&#8230;&#8221;; use text that matches the text on the target pages. Preferably get proper excerpts from xml feeds like blogsearch, msn and yahoo: the excerpts contain the keywords I searched for, which works better as anchor text for search engine visibility and link value.</p>
<p>Let&#8217;s get some stuff from the MSN rss feed :</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="co1">//a generic query = 5% success</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//add &quot;(powered by) wordpress&quot; </span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; <span class="re1">$query</span><span class="sy0">=</span><span class="kw3">urlencode</span><span class="br0">&#40;</span><span class="st0">&#39;keywords+wordpress+trackback&#39;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; <span class="re1">$xml</span> <span class="sy0">=</span> <span class="sy0">@</span>simplexml_load_file<span class="br0">&#40;</span><span class="st0">&quot;http://search.live.com/results.aspx?q=$query&amp;count=50&amp;first=1&amp;format=rss&quot;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; <span class="re1">$count</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; <span class="kw1">foreach</span><span class="br0">&#40;</span><span class="re1">$xml</span><span class="sy0">-&gt;</span><span class="me1">channel</span><span class="sy0">-&gt;</span><span class="me1">item</span> <span class="kw1">as</span> <span class="re1">$i</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="re1">$count</span><span class="sy0">++;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//the data from msn</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="re1">$target</span><span class="br0">&#91;</span><span class="st0">&#39;link&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="br0">&#40;</span>string<span class="br0">&#41;</span> <span class="re1">$i</span><span class="sy0">-&gt;</span><span class="me1">link</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="re1">$target</span><span class="br0">&#91;</span><span class="st0">&#39;title&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="br0">&#40;</span>string<span class="br0">&#41;</span> <span class="re1">$i</span><span class="sy0">-&gt;</span><span class="me1">title</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="re1">$target</span><span class="br0">&#91;</span><span class="st0">&#39;excerpt&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="br0">&#40;</span>string<span class="br0">&#41;</span> <span class="re1">$i</span><span class="sy0">-&gt;</span><span class="me1">description</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//some variables I&#39;ll need later on</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="re1">$target</span><span class="br0">&#91;</span><span class="st0">&#39;id&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="re1">$count</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="re1">$target</span><span class="br0">&#91;</span><span class="st0">&#39;trackback&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="st0">&#39;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="re1">$target</span><span class="br0">&#91;</span><span class="st0">&#39;trackback_success&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="nu0">0</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="re1">$trackbacks</span><span class="br0">&#91;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="re1">$target</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
</ol>
</div>
<p>25% of the cms sites at the top of the search engines run Wordpress, and Wordpress always uses /trackback/ in the rdf-url. I fetch the source of each url in the search feed and grab all the link urls in it; if any contains /t<strong>rackbac</strong>k/, I post a trackback to that url and see if it sticks.</p>
<p>(I could also spider all links and check whether there is an rdf segment in the target&#8217;s source (*1), but that takes a lot of time. I could also build a curl array and use multicurl, but for my purposes this works fast enough.)</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$t</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$t</span><span class="sy0">&lt;</span>count <span class="br0">&#40;</span><span class="re1">$trackbacks</span><span class="br0">&#41;</span><span class="sy0">;</span><span class="re1">$t</span><span class="sy0">++</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//I could use curl </span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//but 95% of the urls offered are kosher and respond fast</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;<span class="re1">$content</span> <span class="sy0">=</span> <span class="sy0">@</span><span class="kw3">file_get_contents</span><span class="br0">&#40;</span><span class="re1">$trackbacks</span><span class="br0">&#91;</span><span class="re1">$t</span><span class="br0">&#93;</span><span class="br0">&#91;</span><span class="st0">&#39;link&#39;</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;<span class="kw3">preg_match_all</span> <span class="br0">&#40;</span><span class="st0">&quot;/a[<span class="es0">\s</span>]+[^&gt;]*?href[<span class="es0">\s</span>]?=[<span class="es0">\s</span><span class="es0">\&quot;</span><span class="es0">\&#39;</span>]+&quot;</span><span class="sy0">.</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="st0">&quot;(.*?)[<span class="es0">\&quot;</span><span class="es0">\&#39;</span>]+.*?&gt;&quot;</span><span class="sy0">.</span><span class="st0">&quot;([^&lt; ]+|.*?)?&lt;<span class="es0">\/</span>a&gt;/&quot;</span><span class="sy0">,</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$content</span><span class="sy0">,</span> <span class="re1">$matches</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$uri_array</span> <span class="sy0">=</span> <span class="re1">$matches</span><span class="br0">&#91;</span><span class="nu0">1</span><span class="br0">&#93;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">foreach</span><span class="br0">&#40;</span><span class="re1">$uri_array</span> <span class="kw1">as</span> <span class="re1">$key</span> <span class="sy0">=&gt;</span> <span class="re1">$link</span><span class="br0">&#41;</span> <span class="br0">&#123;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="kw3">strpos</span><span class="br0">&#40;</span><span class="re1">$link</span><span class="sy0">,</span> <span class="st0">&#39;rackbac&#39;</span><span class="br0">&#41;</span><span class="sy0">&gt;</span><span class="nu0">0</span><span class="br0">&#41;</span> <span class="br0">&#123;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$trackbacks</span><span class="br0">&#91;</span><span class="re1">$t</span><span class="br0">&#93;</span><span class="br0">&#91;</span><span class="st0">&#39;trackback&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="re1">$link</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">break</span><span class="sy0">;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
</ol>
</div>
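<p>The rdf check mentioned above can be sketched as follows. This is a minimal sketch, not what my script does: the helper name <strong>find_rdf_trackback_url</strong> is made up for illustration, and it assumes the target page embeds the usual trackback auto-discovery block, where the ping url sits in a trackback:ping attribute.</p>

```php
// Hypothetical helper (not part of the original script): look for the
// trackback:ping attribute of the RDF auto-discovery block that blog
// platforms embed in an HTML comment.
function find_rdf_trackback_url($html)
{
    // The attribute value is the trackback endpoint of the page.
    if (preg_match('/trackback:ping\s*=\s*"([^"]+)"/i', $html, $m)) {
        // Undo entity encoding such as &amp; inside the url.
        return html_entity_decode($m[1], ENT_QUOTES, 'UTF-8');
    }
    // Empty string, same convention as $target['trackback'] above.
    return '';
}
```

<p>It returns an empty string when nothing is found, so the result can be dropped straight into the trackback field of a candidate.</p>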
<p>When I fire a trackback, the other script will try to verify that my page has a link and matching text. I have to make sure my page shows the excerpts and links, so I store all candidates in a cached xml file.</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="kw2">function</span> cache_xml_store<span class="br0">&#40;</span><span class="re1">$trackbacks</span><span class="sy0">,</span> <span class="re1">$pagetitle</span><span class="br0">&#41;</span> </div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$xml</span> <span class="sy0">=</span> <span class="st0">&#39;&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="st0"> &lt;trackbacks&gt;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$a</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$a</span><span class="sy0">&lt;</span>count <span class="br0">&#40;</span><span class="re1">$trackbacks</span><span class="br0">&#41;</span><span class="sy0">;</span><span class="re1">$a</span><span class="sy0">++</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$arr</span> <span class="sy0">=</span> <span class="re1">$trackbacks</span><span class="br0">&#91;</span><span class="re1">$a</span><span class="br0">&#93;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$xml</span> <span class="sy0">.=</span> <span class="st0">&#39;&lt;entry&gt;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$xml</span> <span class="sy0">.=</span> <span class="st0">&#39;&lt;id&gt;&#39;</span><span class="sy0">.</span><span class="re1">$arr</span><span class="br0">&#91;</span><span class="st0">&#39;id&#39;</span><span class="br0">&#93;</span><span class="sy0">.</span><span class="st0">&#39;&lt;/id&gt;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$xml</span> <span class="sy0">.=</span> <span class="st0">&#39;&lt;excerpt&gt;&#39;</span><span class="sy0">.</span><span class="re1">$arr</span><span class="br0">&#91;</span><span class="st0">&#39;excerpt&#39;</span><span class="br0">&#93;</span><span class="sy0">.</span><span class="st0">&#39;&lt;/excerpt&gt;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$xml</span> <span class="sy0">.=</span> <span class="st0">&#39;&lt;link&gt;&#39;</span><span class="sy0">.</span><span class="re1">$arr</span><span class="br0">&#91;</span><span class="st0">&#39;link&#39;</span><span class="br0">&#93;</span><span class="sy0">.</span><span class="st0">&#39;&lt;/link&gt;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$xml</span> <span class="sy0">.=</span> <span class="st0">&#39;&lt;title&gt;&#39;</span><span class="sy0">.</span><span class="re1">$arr</span><span class="br0">&#91;</span><span class="st0">&#39;title&#39;</span><span class="br0">&#93;</span><span class="sy0">.</span><span class="st0">&#39;&lt;/title&gt;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$xml</span> <span class="sy0">.=</span> <span class="st0">&#39;&lt;/entry&gt;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$xml</span> <span class="sy0">.=</span> <span class="st0">&#39;&lt;/trackbacks&gt;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$fname</span> <span class="sy0">=</span> <span class="st0">&#39;cache/trackback&#39;</span><span class="sy0">.</span><span class="kw3">urlencode</span><span class="br0">&#40;</span><span class="re1">$pagetitle</span><span class="br0">&#41;</span><span class="sy0">.</span><span class="st0">&#39;.xml&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="kw3">file_exists</span><span class="br0">&#40;</span><span class="re1">$fname</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="kw3">unlink</span><span class="br0">&#40;</span><span class="re1">$fname</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$fhandle</span> <span class="sy0">=</span> <span class="kw3">fopen</span><span class="br0">&#40;</span><span class="re1">$fname</span><span class="sy0">,</span> <span class="st0">&#39;w&#39;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw3">fwrite</span><span class="br0">&#40;</span><span class="re1">$fhandle</span><span class="sy0">,</span> <span class="re1">$xml</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw3">fclose</span><span class="br0">&#40;</span><span class="re1">$fhandle</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">return</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
</ol>
</div>
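<p>Excerpts and urls from a search feed often contain ampersands or angle brackets, and pasting them into the xml string unescaped gives a cache file that simplexml refuses to load. Here is a minimal sketch of an escaping variant, assuming the XMLWriter extension is available; the function name <strong>build_trackback_xml</strong> is hypothetical and it only builds the string, writing the file stays as above.</p>

```php
// Hypothetical variant of the string building in cache_xml_store():
// XMLWriter escapes the special characters in the element text.
function build_trackback_xml(array $trackbacks)
{
    $w = new XMLWriter();
    $w->openMemory();
    $w->startDocument('1.0', 'UTF-8');
    $w->startElement('trackbacks');
    foreach ($trackbacks as $arr) {
        $w->startElement('entry');
        // Same four fields as the hand-built version.
        foreach (array('id', 'excerpt', 'link', 'title') as $field) {
            $w->writeElement($field, (string) $arr[$field]);
        }
        $w->endElement(); // entry
    }
    $w->endElement(); // trackbacks
    $w->endDocument();
    // Return the finished document as a string.
    return $w->outputMemory();
}
```

<p>The output has the same layout (a trackbacks root with one entry per candidate), so the reading side does not have to change.</p>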
<p>I use simplexml to read that cached file and show the excerpts and links once the page is requested.</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="co1">// retrieve the cached xml and return it as array.</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw2">function</span> cache_xml_retrieve<span class="br0">&#40;</span><span class="re1">$pagetitle</span><span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$fname</span> <span class="sy0">=</span> <span class="st0">&#39;cache/trackback&#39;</span><span class="sy0">.</span><span class="kw3">urlencode</span><span class="br0">&#40;</span><span class="re1">$pagetitle</span><span class="br0">&#41;</span><span class="sy0">.</span><span class="st0">&#39;.xml&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="kw3">file_exists</span><span class="br0">&#40;</span><span class="re1">$fname</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$xml</span><span class="sy0">=@</span>simplexml_load_file<span class="br0">&#40;</span><span class="re1">$fname</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">if</span><span class="br0">&#40;</span><span class="sy0">!</span><span class="re1">$xml</span><span class="br0">&#41;</span> <span class="kw1">return</span> <span class="kw2">false</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">foreach</span><span class="br0">&#40;</span><span class="re1">$xml</span><span class="sy0">-&gt;</span><span class="me1">entry</span> <span class="kw1">as</span> <span class="re1">$e</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$trackback</span><span class="br0">&#91;</span><span class="st0">&#39;id&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span><span class="br0">&#40;</span>string<span class="br0">&#41;</span> <span class="re1">$e</span><span class="sy0">-&gt;</span><span class="me1">id</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$trackback</span><span class="br0">&#91;</span><span class="st0">&#39;link&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span> &nbsp;rid<span class="br0">&#40;</span><span class="br0">&#40;</span>string<span class="br0">&#41;</span> <span class="re1">$e</span><span class="sy0">-&gt;</span><span class="me1">link</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$trackback</span><span class="br0">&#91;</span><span class="st0">&#39;title&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span> &nbsp;<span class="br0">&#40;</span>string<span class="br0">&#41;</span> <span class="re1">$e</span><span class="sy0">-&gt;</span><span class="me1">title</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$trackback</span><span class="br0">&#91;</span><span class="st0">&#39;excerpt&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span> &nbsp;<span class="br0">&#40;</span>string<span class="br0">&#41;</span> <span class="re1">$e</span><span class="sy0">-&gt;</span><span class="me1">excerpt</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$trackbacks</span><span class="br0">&#91;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="re1">$trackback</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">return</span> <span class="re1">$trackbacks</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">return</span> <span class="kw2">false</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
</ol>
</div>
<p>(this setup requires a subdirectory <strong>cache</strong> set to read/write with chmod 777)</p>
<p>I use http://www.domain.com/financial+trends.html and extract the pagetitle as &#8220;financial trends&#8221;, which has an xml-file http://www.domain.com/cache/financial+trends.xml. (In my own script I use sef urls with mod_rewrite; you can also use the $_SERVER array.)</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="re1">$pagetitle</span><span class="sy0">=</span><span class="kw3">preg_replace</span><span class="br0">&#40;</span><span class="st0">&#39;/<span class="es0">\+</span>/&#39;</span><span class="sy0">,</span> <span class="st0">&#39; &#39;</span><span class="sy0">,</span> <span class="kw3">htmlentities</span><span class="br0">&#40;</span><span class="re1">$_REQUEST</span><span class="br0">&#91;</span><span class="st0">&#39;title&#39;</span><span class="br0">&#93;</span><span class="sy0">,</span> ENT_QUOTES<span class="sy0">,</span> <span class="st0">&quot;UTF-8&quot;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$cached_excerpts</span> <span class="sy0">=</span> cache_xml_retrieve<span class="br0">&#40;</span><span class="re1">$pagetitle</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//do some stuff with, make it look nice &nbsp;:</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$s</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$s</span><span class="sy0">&lt;</span>count <span class="br0">&#40;</span><span class="re1">$cached_excerpts</span><span class="br0">&#41;</span><span class="sy0">;</span><span class="re1">$s</span><span class="sy0">++</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//this lists the trackback (candidates)</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw3">echo</span> <span class="re1">$cached_excerpts</span><span class="br0">&#91;</span><span class="re1">$s</span><span class="br0">&#93;</span><span class="br0">&#91;</span><span class="st0">&#39;excerpt&#39;</span><span class="br0">&#93;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw3">echo</span> <span class="st0">&#39;&lt;a href=&quot;&#39;</span><span class="sy0">.</span><span class="re1">$cached_excerpts</span><span class="br0">&#91;</span><span class="re1">$s</span><span class="br0">&#93;</span><span class="br0">&#91;</span><span class="st0">&#39;link&#39;</span><span class="br0">&#93;</span><span class="sy0">.</span><span class="st0">&#39;&quot;&gt;&#39;</span><span class="sy0">.</span><span class="re1">$cached_excerpts</span><span class="br0">&#91;</span><span class="re1">$s</span><span class="br0">&#93;</span><span class="br0">&#91;</span><span class="st0">&#39;title&#39;</span><span class="br0">&#93;</span><span class="sy0">.</span><span class="st0">&#39;&lt;/a&gt;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
</ol>
</div>
<p>Now I prepare the data for the trackback post :</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$t</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$t</span><span class="sy0">&lt;</span>count <span class="br0">&#40;</span><span class="re1">$trackbacks</span><span class="br0">&#41;</span><span class="sy0">;</span><span class="re1">$t</span><span class="sy0">++</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$trackback_url</span> <span class="sy0">=</span> <span class="re1">$trackbacks</span><span class="br0">&#91;</span><span class="re1">$t</span><span class="br0">&#93;</span><span class="br0">&#91;</span><span class="st0">&#39;trackback&#39;</span><span class="br0">&#93;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//does it have a trackback target url ? then prepare data :</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$trackback_url</span> <span class="sy0">!=</span><span class="st0">&#39;&#39;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$trackback_data</span> <span class="sy0">=</span> <span class="kw3">array</span><span class="br0">&#40;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="st0">&quot;url&quot;</span> <span class="sy0">=&gt;</span> <span class="st0">&quot;url of my page with the link to the target&quot;</span><span class="sy0">,</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="st0">&quot;title&quot;</span> <span class="sy0">=&gt;</span> <span class="st0">&quot;title of my page&quot;</span><span class="sy0">,</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="st0">&quot;blog_name&quot;</span> <span class="sy0">=&gt;</span> <span class="st0">&quot;name of my blog&quot;</span><span class="sy0">,</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="st0">&quot;excerpt&quot;</span> <span class="sy0">=&gt;</span> <span class="st0">&#39;[...]&#39;</span><span class="sy0">.</span><span class="kw3">trim</span><span class="br0">&#40;</span><span class="kw3">substr</span><span class="br0">&#40;</span><span class="re1">$trackbacks</span><span class="br0">&#91;</span><span class="re1">$t</span><span class="br0">&#93;</span><span class="br0">&#91;</span><span class="st0">&#39;excerpt&#39;</span><span class="br0">&#93;</span><span class="sy0">,</span> <span class="nu0">0</span><span class="sy0">,</span> <span class="nu0">150</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">.</span><span class="st0">&#39;[...]&#39;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="co1">//&#8230;and try the trackback</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$trackbacks</span><span class="br0">&#91;</span><span class="re1">$t</span><span class="br0">&#93;</span><span class="br0">&#91;</span><span class="st0">&#39;trackback_success&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span> trackback_ping<span class="br0">&#40;</span><span class="re1">$trackback_url</span><span class="sy0">,</span> <span class="re1">$trackback_data</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
</ol>
</div>
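<p>To fill the trackback_success field, the answer of the other blog has to be interpreted. Under the TrackBack specification the server replies with a small xml document whose error element is 0 when the ping was accepted. A minimal sketch of that check, assuming a hypothetical helper named <strong>trackback_response_ok</strong> that gets the raw response body as a string (it is not necessarily how my own trackback_ping decides):</p>

```php
// Hypothetical helper: returns 1 when the response body is a TrackBack
// reply with error 0, and 0 for refusals, HTML error pages or empty bodies.
function trackback_response_ok($body)
{
    if (!is_string($body) || trim($body) === '') {
        return 0;
    }
    // Suppress parser warnings: many hosts answer with an HTML error page.
    $xml = @simplexml_load_string($body);
    if ($xml === false || !isset($xml->error)) {
        return 0;
    }
    // error 0 means accepted; anything else is a refusal.
    return (trim((string) $xml->error) === '0') ? 1 : 0;
}
```

<p>Anything that is not a well-formed reply counts as a failure, which keeps dead hosts and 404 pages out of the success count.</p>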
<p>This is the actual trackback post using cUrl. cUrl has a convenient timeout setting; I use three seconds. If a host does not respond in half a second it&#8217;s probably dead, so three seconds is generous.</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="kw2">function</span> trackback_ping<span class="br0">&#40;</span><span class="re1">$trackback_url</span><span class="sy0">,</span> <span class="re1">$trackback</span><span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//make a string of the data array to post</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">foreach</span><span class="br0">&#40;</span><span class="re1">$trackback</span> <span class="kw1">as</span> <span class="re1">$key</span><span class="sy0">=&gt;</span><span class="re1">$value</span><span class="br0">&#41;</span> <span class="re1">$strout</span><span class="br0">&#91;</span><span class="br0">&#93;</span><span class="sy0">=</span><span class="re1">$key</span><span class="sy0">.</span><span class="st0">&quot;=&quot;</span><span class="sy0">.</span><span class="kw3">rawurlencode</span><span class="br0">&#40;</span><span class="re1">$value</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$postfields</span><span class="sy0">=</span> <span class="kw3">implode</span><span class="br0">&#40;</span><span class="st0">&#39;&amp;&#39;</span><span class="sy0">,</span> <span class="re1">$strout</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; </div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//create a curl instance</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$ch</span> <span class="sy0">=</span> curl_init<span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;curl_setopt<span class="br0">&#40;</span><span class="re1">$ch</span><span class="sy0">,</span> CURLOPT_URL<span class="sy0">,</span> <span class="re1">$trackback_url</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;curl_setopt<span class="br0">&#40;</span><span class="re1">$ch</span><span class="sy0">,</span> CURLOPT_TIMEOUT<span class="sy0">,</span> <span class="nu0">3</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;curl_setopt<span class="br0">&#40;</span><span class="re1">$ch</span><span class="sy0">,</span> CURLOPT_USERAGENT<span class="sy0">,</span> <span class="st0">&quot;Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)&quot;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;curl_setopt<span class="br0">&#40;</span><span class="re1">$ch</span><span class="sy0">,</span> CURLOPT_RETURNTRANSFER<span class="sy0">,</span> <span class="kw2">true</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//set a custom form header</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;curl_setopt<span class="br0">&#40;</span><span class="re1">$ch</span><span class="sy0">,</span> CURLOPT_HTTPHEADER<span class="sy0">,</span> <span class="kw3">array</span><span class="br0">&#40;</span><span class="st0">&#39;Content-type: application/x-www-form-urlencoded&#39;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;curl_setopt<span class="br0">&#40;</span><span class="re1">$ch</span><span class="sy0">,</span> CURLOPT_NOBODY<span class="sy0">,</span> <span class="kw2">false</span><span class="br0">&#41;</span><span class="sy0">;</span> <span class="co1">//keep the response body, it is checked below</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; curl_setopt<span class="br0">&#40;</span><span class="re1">$ch</span><span class="sy0">,</span> CURLOPT_POST<span class="sy0">,</span> <span class="kw2">true</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;curl_setopt<span class="br0">&#40;</span><span class="re1">$ch</span><span class="sy0">,</span> CURLOPT_POSTFIELDS<span class="sy0">,</span> <span class="re1">$postfields</span><span class="br0">&#41;</span><span class="sy0">;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$content</span> <span class="sy0">=</span> curl_exec<span class="br0">&#40;</span><span class="re1">$ch</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//an error tag with value 0 in the response means the ping was accepted</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$success</span> <span class="sy0">=</span> <span class="nu0">0</span><span class="sy0">;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="kw3">strpos</span><span class="br0">&#40;</span><span class="re1">$content</span><span class="sy0">,</span> <span class="st0">&#39;&lt;error&gt;0&lt;/error&gt;&#39;</span><span class="br0">&#41;</span> <span class="sy0">!==</span> <span class="kw2">false</span><span class="br0">&#41;</span> <span class="re1">$success</span> <span class="sy0">=</span> <span class="nu0">1</span><span class="sy0">;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;curl_close <span class="br0">&#40;</span><span class="re1">$ch</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw3">unset</span><span class="br0">&#40;</span><span class="re1">$ch</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">return</span> <span class="re1">$success</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
</ol>
</div>
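<p>The response check can also be done by parsing the reply instead of searching it. The trackback specification describes the reply as a small XML document with an error tag holding 0 on success and 1 on failure. A minimal sketch, assuming the SimpleXML extension is loaded; trackback_response_ok() is a name I made up for illustration, it is not part of the script above:</p>
<div class="geshi no php">
<ol>
<li class="li1"><div class="de1">function trackback_response_ok($content)</div></li>
<li class="li1"><div class="de1">{</div></li>
<li class="li1"><div class="de1">&nbsp; &nbsp; //curl_exec returns false on failure, so reject anything that is not a non-empty string</div></li>
<li class="li1"><div class="de1">&nbsp; &nbsp; if (!is_string($content) || trim($content) === &#39;&#39;) return false;</div></li>
<li class="li1"><div class="de1">&nbsp; &nbsp; //silence parser warnings : plenty of hosts answer with an html page instead of xml</div></li>
<li class="li1"><div class="de1">&nbsp; &nbsp; $xml = @simplexml_load_string(trim($content));</div></li>
<li class="li1"><div class="de1">&nbsp; &nbsp; if ($xml === false) return false;</div></li>
<li class="li1"><div class="de1">&nbsp; &nbsp; //success only if the error tag exists and holds 0</div></li>
<li class="li1"><div class="de1">&nbsp; &nbsp; return isset($xml-&gt;error) &amp;&amp; trim((string)$xml-&gt;error) === &#39;0&#39;;</div></li>
<li class="li1"><div class="de1">}</div></li>
</ol>
</div>
<p>Inside trackback_ping() the result could then be set with $success = trackback_response_ok($content) ? 1 : 0; which also treats a timeout or an html error page as a failed ping.</p>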
<p>Now the last routine : rewrite the cached xml file with only the successful trackbacks (seo stuff) :</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$t</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$t</span><span class="sy0">&lt;</span>count <span class="br0">&#40;</span><span class="re1">$trackbacks</span><span class="br0">&#41;</span><span class="sy0">;</span><span class="re1">$t</span><span class="sy0">++</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$trackbacks</span><span class="br0">&#91;</span><span class="re1">$t</span><span class="br0">&#93;</span><span class="br0">&#91;</span><span class="st0">&#39;trackback_success&#39;</span><span class="br0">&#93;</span><span class="sy0">&gt;</span><span class="nu0">0</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$store_trackbacks</span><span class="br0">&#91;</span><span class="br0">&#93;</span><span class="sy0">=</span><span class="re1">$trackbacks</span><span class="br0">&#91;</span><span class="re1">$t</span><span class="br0">&#93;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">cache_xml_store<span class="br0">&#40;</span><span class="re1">$store_trackbacks</span><span class="sy0">,</span> <span class="re1">$pagetitle</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
</ol>
</div>
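<p>The same filter can be written with array_filter, which hands back an empty array (instead of an undefined variable) when no ping succeeded. A sketch, assuming the entries use the trackback_success key from the loop above; successful_trackbacks() is a hypothetical helper name:</p>
<div class="geshi no php">
<ol>
<li class="li1"><div class="de1">function successful_trackbacks(array $trackbacks)</div></li>
<li class="li1"><div class="de1">{</div></li>
<li class="li1"><div class="de1">&nbsp; &nbsp; //keep only the entries that were flagged as successful</div></li>
<li class="li1"><div class="de1">&nbsp; &nbsp; $kept = array_filter($trackbacks, function ($trackback) {</div></li>
<li class="li1"><div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; return isset($trackback[&#39;trackback_success&#39;]) &amp;&amp; $trackback[&#39;trackback_success&#39;] &gt; 0;</div></li>
<li class="li1"><div class="de1">&nbsp; &nbsp; });</div></li>
<li class="li1"><div class="de1">&nbsp; &nbsp; //array_filter keeps the old keys, array_values renumbers them from 0</div></li>
<li class="li1"><div class="de1">&nbsp; &nbsp; return array_values($kept);</div></li>
<li class="li1"><div class="de1">}</div></li>
</ol>
</div>
<p>The store call would then read cache_xml_store(successful_trackbacks($trackbacks), $pagetitle);</p>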
<p>voila : a page with only successful trackbacks. </p>
<p>Google (the backrub engine) doesn&#8217;t like sites that use automated link-building methods; other engines (Baidu, MSN, Yahoo) use a more conventional link-popularity and keyword-matching algorithm. Trackback linking helps you get a clear engine profile at relatively low cost. </p>
<p>0) For brevity and clarity, the code above was rewritten from a trackback script I am developing on another site, so it may contain some typos.</p>
<p>*1) If you want to spider links for rdf-segments : <a href="https://svn.typo3.org/TYPO3v4/Extensions/yablog/trunk/class.tx_yablog_ping.php" rel="nofollow">TYPO3v4</a> has some code for easy retrieval of trackback URIs :</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="coMULTI">/**</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">&nbsp; * Fetches ping url from the given url</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">&nbsp; *</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">&nbsp; * @param string $url URL to probe for RDF</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">&nbsp; * @return string Ping URL</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">&nbsp; */</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;protected <span class="kw2">function</span> getPingURL<span class="br0">&#40;</span><span class="re1">$url</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$pingUrl</span> <span class="sy0">=</span> <span class="st0">&#39;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="co1">// Get URL content</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$urlContent</span> <span class="sy0">=</span> t3lib_div<span class="sy0">::</span><span class="me2">getURL</span><span class="br0">&#40;</span><span class="re1">$url</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span><span class="re1">$urlContent</span> <span class="sy0">&amp;&amp;</span> <span class="br0">&#40;</span><span class="re1">$rdfPos</span> <span class="sy0">=</span> <span class="kw3">strpos</span><span class="br0">&#40;</span><span class="re1">$urlContent</span><span class="sy0">,</span> <span class="st0">&#39;&lt;rdf:RDF&#39;</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="sy0">!==</span> <span class="kw2">false</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="co1">// RDF exists in this content. Get it and parse</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$urlContent</span> <span class="sy0">=</span> <span class="kw3">substr</span><span class="br0">&#40;</span><span class="re1">$urlContent</span><span class="sy0">,</span> <span class="re1">$rdfPos</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="kw1">if</span> <span class="br0">&#40;</span><span class="br0">&#40;</span><span class="re1">$endPos</span> <span class="sy0">=</span> <span class="kw3">strpos</span><span class="br0">&#40;</span><span class="re1">$urlContent</span><span class="sy0">,</span> <span class="st0">&#39;&lt;/rdf:RDF&gt;&#39;</span><span class="sy0">,</span> <span class="re1">$rdfPos</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="sy0">!==</span> <span class="kw2">false</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="co1">// We will use quick regular expression to find ping URL</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$rdfContent</span> <span class="sy0">=</span> <span class="kw3">substr</span><span class="br0">&#40;</span><span class="re1">$urlContent</span><span class="sy0">,</span> <span class="re1">$rdfPos</span><span class="sy0">,</span> <span class="re1">$endPos</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$pingUrl</span> <span class="sy0">=</span> <span class="kw3">preg_replace</span><span class="br0">&#40;</span><span class="st0">&#39;/trackback:ping=&quot;([^&quot;]+)&quot;/&#39;</span><span class="sy0">,</span> <span class="st0">&#39;<span class="es0">\1</span>&#39;</span><span class="sy0">,</span> <span class="re1">$rdfContent</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">return</span> <span class="re1">$pingUrl</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
</ol>
</div>
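<p>If you do not run TYPO3, the same idea fits in a few lines of plain PHP. This is a sketch under assumptions, not the TYPO3 code: find_trackback_ping() is a hypothetical name, the page source has already been fetched (with cUrl for instance), and the trackback:ping attribute is double-quoted as in the regular expression above. It uses preg_match so that only the URL comes back:</p>
<div class="geshi no php">
<ol>
<li class="li1"><div class="de1">function find_trackback_ping($html)</div></li>
<li class="li1"><div class="de1">{</div></li>
<li class="li1"><div class="de1">&nbsp; &nbsp; //locate the first rdf block, if there is one</div></li>
<li class="li1"><div class="de1">&nbsp; &nbsp; $start = strpos($html, &#39;&lt;rdf:RDF&#39;);</div></li>
<li class="li1"><div class="de1">&nbsp; &nbsp; if ($start === false) return &#39;&#39;;</div></li>
<li class="li1"><div class="de1">&nbsp; &nbsp; $end = strpos($html, &#39;&lt;/rdf:RDF&gt;&#39;, $start);</div></li>
<li class="li1"><div class="de1">&nbsp; &nbsp; if ($end === false) return &#39;&#39;;</div></li>
<li class="li1"><div class="de1">&nbsp; &nbsp; $rdf = substr($html, $start, $end - $start);</div></li>
<li class="li1"><div class="de1">&nbsp; &nbsp; //capture the value of the trackback:ping attribute</div></li>
<li class="li1"><div class="de1">&nbsp; &nbsp; if (preg_match(&#39;/trackback:ping=&quot;([^&quot;]+)&quot;/&#39;, $rdf, $match)) return $match[1];</div></li>
<li class="li1"><div class="de1">&nbsp; &nbsp; return &#39;&#39;;</div></li>
<li class="li1"><div class="de1">}</div></li>
</ol>
</div>
<p>An empty string means the page does not advertise a trackback address, so there is nothing to ping.</p>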
]]></content:encoded>
			<wfw:commentRss>http://www.juust.org/index.php/curl-trackbacks/2009/03/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>proxies !</title>
		<link>http://www.juust.org/index.php/icanhazproxies/2009/02/</link>
		<comments>http://www.juust.org/index.php/icanhazproxies/2009/02/#comments</comments>
		<pubDate>Sat, 21 Feb 2009 03:41:16 +0000</pubDate>
		<dc:creator>juust</dc:creator>
				<category><![CDATA[php]]></category>
		<category><![CDATA[seo tips and tricks]]></category>
		<category><![CDATA[scrape]]></category>

		<guid isPermaLink="false">http://www.juust.org/?p=336</guid>
		<description><![CDATA[I got a site banned at Google so I got pissed and took a script from the blackbox @ digerati marketing to scrape proxy addresses, wired a database and curl into it, so now it scrapes proxies, random picks a proxy, prunes dead proxies and returns data. 
Basic, it uses anonymous (level 2) proxies, but [...]]]></description>
			<content:encoded><![CDATA[<p>I got a site banned at Google so I got pissed and took a script from the blackbox <a href="http://www.digeratimarketing.co.uk/2008/06/12/blackhat-seo-tools-scripts-the-digerati-blackbox/" rel="nofollow">@ digerati marketing</a> to scrape proxy addresses, wired a database and curl into it, so now it scrapes proxies, picks a random proxy, prunes dead proxies and returns data. </p>
<p>It&#8217;s basic and only uses anonymous (level 2) proxies, but it works. You can check the source <a href="http://serp.trismegistos.net/proxyscript.txt" rel="nofollow">here</a>.</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">/* (mysql table)</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">CREATE TABLE IF NOT EXISTS `serp_proxies` (</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">&nbsp; `id` int(11) NOT NULL auto_increment,</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">&nbsp; `ip` text NOT NULL,</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">&nbsp; `port` text NOT NULL,</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">&nbsp; PRIMARY KEY &nbsp;(`id`)</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">) ENGINE=MyISAM &nbsp;DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">*/</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//initialize database class, replace with own code</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">include</span><span class="br0">&#40;</span><span class="st0">&#39;init.php&#39;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//main class</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$p</span><span class="sy0">=</span><span class="kw2">new</span> MyProxies<span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//do I have proxies in the database ?</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//if not, get some and store them</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">GetCount</span><span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="sy0">&lt;</span> <span class="nu0">1</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">GetSomeAir</span><span class="br0">&#40;</span><span class="nu0">1</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">store2database</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//pick one</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">RandomProxy</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//get the page</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">ThisProxy</span><span class="sy0">-&gt;</span><span class="me1">DoRequest</span><span class="br0">&#40;</span><span class="st0">&#39;http://www.domain.com/robots.txt&#39;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//error handling</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">ThisProxy</span><span class="sy0">-&gt;</span><span class="me1">ProxyError</span> <span class="sy0">&gt;</span> <span class="nu0">0</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//7 &nbsp; no connect</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//28 &nbsp; timed out</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//52 &nbsp; empty reply</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//if it is dead, doesn&#39;t allow connections : prune it</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">ThisProxy</span><span class="sy0">-&gt;</span><span class="me1">ProxyError</span><span class="sy0">==</span><span class="nu0">7</span><span class="br0">&#41;</span> <span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">DeleteProxy</span><span class="br0">&#40;</span><span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">ThisProxy</span><span class="sy0">-&gt;</span><span class="me1">proxy_ip</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">ThisProxy</span><span class="sy0">-&gt;</span><span class="me1">ProxyError</span><span class="sy0">==</span><span class="nu0">52</span><span class="br0">&#41;</span> <span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">DeleteProxy</span><span class="br0">&#40;</span><span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">ThisProxy</span><span class="sy0">-&gt;</span><span class="me1">proxy_ip</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//you could loop back until you get a 0-error proxy, but that ain&#39;t the point</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//give me the content</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw3">echo</span> <span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">ThisProxy</span><span class="sy0">-&gt;</span><span class="me1">Content</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="kw2">Class</span> MyProxies <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$Proxies</span> <span class="sy0">=</span> <span class="kw3">array</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$ThisProxy</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$MyCount</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//picks a random proxy from the database</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">function</span> RandomProxy<span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw3">global</span> <span class="re1">$serpdb</span><span class="sy0">;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$offset_result</span> <span class="sy0">=</span> &nbsp;<span class="re1">$serpdb</span><span class="sy0">-&gt;</span><span class="me1">query</span><span class="br0">&#40;</span><span class="st0">&quot;SELECT FLOOR(RAND() * COUNT(*)) AS `offset` FROM `serp_proxies`&quot;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$offset_row</span> <span class="sy0">=</span> <span class="kw3">mysql_fetch_object</span><span class="br0">&#40;</span><span class="re1">$offset_result</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$offset</span> <span class="sy0">=</span> <span class="re1">$offset_row</span><span class="sy0">-&gt;</span><span class="me1">offset</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$result</span> <span class="sy0">=</span> <span class="re1">$serpdb</span><span class="sy0">-&gt;</span><span class="me1">query</span><span class="br0">&#40;</span><span class="st0">&quot;SELECT * FROM `serp_proxies` LIMIT $offset, 1&quot;</span> <span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">while</span><span class="br0">&#40;</span><span class="re1">$row</span><span class="sy0">=</span><span class="kw3">mysql_fetch_assoc</span><span class="br0">&#40;</span><span class="re1">$result</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//make instance of Proxy, with proxy_host ip and port</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ThisProxy</span> <span class="sy0">=</span> <span class="kw2">new</span> Proxy<span class="br0">&#40;</span><span class="re1">$row</span><span class="br0">&#91;</span><span class="st0">&#39;ip&#39;</span><span class="br0">&#93;</span><span class="sy0">.</span><span class="st0">&#39;:&#39;</span><span class="sy0">.</span><span class="re1">$row</span><span class="br0">&#91;</span><span class="st0">&#39;port&#39;</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ThisProxy</span><span class="sy0">-&gt;</span><span class="me1">proxy_ip</span> <span class="sy0">=</span> <span class="re1">$row</span><span class="br0">&#91;</span><span class="st0">&#39;ip&#39;</span><span class="br0">&#93;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ThisProxy</span><span class="sy0">-&gt;</span><span class="me1">proxy_port</span> <span class="sy0">=</span> <span class="re1">$row</span><span class="br0">&#91;</span><span class="st0">&#39;port&#39;</span><span class="br0">&#93;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="kw1">break</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//visit the famous russian site </span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">function</span> GetSomeAir<span class="br0">&#40;</span><span class="re1">$pages</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$index</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span> <span class="re1">$index</span><span class="sy0">&lt;</span> <span class="re1">$pages</span><span class="sy0">;</span> <span class="re1">$index</span><span class="sy0">++</span><span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$pageno</span> <span class="sy0">=</span> <span class="kw3">sprintf</span><span class="br0">&#40;</span><span class="st0">&quot;%02d&quot;</span><span class="sy0">,</span><span class="re1">$index</span><span class="nu0">+1</span><span class="br0">&#41;</span><span class="sy0">;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$page_url</span> <span class="sy0">=</span> <span class="st0">&quot;http://www.samair.ru/proxy/proxy-&quot;</span> <span class="sy0">.</span> <span class="re1">$pageno</span> <span class="sy0">.</span> <span class="st0">&quot;.htm&quot;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$page_html</span> <span class="sy0">=</span> <span class="sy0">@</span><span class="kw3">file_get_contents</span><span class="br0">&#40;</span><span class="re1">$page_url</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//get rid of the crap and extract the proxies</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw3">preg_match</span><span class="br0">&#40;</span><span class="st0">&quot;/&lt;tr&gt;&lt;td&gt;(.*)&lt;<span class="es0">\/</span>td&gt;&lt;<span class="es0">\/</span>tr&gt;/&quot;</span><span class="sy0">,</span> <span class="re1">$page_html</span><span class="sy0">,</span> <span class="re1">$matches</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$txt</span> <span class="sy0">=</span> <span class="re1">$matches</span><span class="br0">&#91;</span><span class="nu0">1</span><span class="br0">&#93;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$main</span> <span class="sy0">=</span> <span class="kw3">explode</span><span class="br0">&#40;</span><span class="st0">&#39;&lt;/td&gt;&lt;tr&gt;&lt;td&gt;&#39;</span><span class="sy0">,</span> <span class="re1">$txt</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$x</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$x</span><span class="sy0">&lt;</span>count <span class="br0">&#40;</span><span class="re1">$main</span><span class="br0">&#41;</span><span class="sy0">;</span><span class="re1">$x</span><span class="sy0">++</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;<span class="re1">$arr</span> <span class="sy0">=</span> <span class="kw3">explode</span><span class="br0">&#40;</span><span class="st0">&#39;&lt;/td&gt;&lt;td&gt;&#39;</span><span class="sy0">,</span> <span class="re1">$main</span><span class="br0">&#91;</span><span class="re1">$x</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;<span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">Proxies</span><span class="br0">&#91;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="kw3">explode</span><span class="br0">&#40;</span><span class="st0">&#39;:&#39;</span><span class="sy0">,</span> <span class="re1">$arr</span><span class="br0">&#91;</span><span class="nu0">0</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//store the retrieved proxies (stored in this-&gt;Proxies) in the database</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">function</span> store2database<span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw3">global</span> <span class="re1">$serpdb</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">foreach</span><span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">Proxies</span> <span class="kw1">as</span> <span class="re1">$p</span><span class="br0">&#41;</span> <span class="br0">&#123;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$result</span> <span class="sy0">=</span> <span class="re1">$serpdb</span><span class="sy0">-&gt;</span><span class="me1">query</span><span class="br0">&#40;</span><span class="st0">&quot;SELECT * FROM serp_proxies WHERE ip=&#39;&quot;</span><span class="sy0">.</span><span class="re1">$p</span><span class="br0">&#91;</span><span class="nu0">0</span><span class="br0">&#93;</span><span class="sy0">.</span><span class="st0">&quot;&#39;&quot;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="kw3">mysql_num_rows</span><span class="br0">&#40;</span><span class="re1">$result</span><span class="br0">&#41;</span><span class="sy0">&lt;</span><span class="nu0">1</span><span class="br0">&#41;</span> <span class="re1">$serpdb</span><span class="sy0">-&gt;</span><span class="me1">query</span><span class="br0">&#40;</span><span class="st0">&quot;INSERT INTO serp_proxies (`ip`, `port`) VALUES (&#39;&quot;</span><span class="sy0">.</span><span class="re1">$p</span><span class="br0">&#91;</span><span class="nu0">0</span><span class="br0">&#93;</span><span class="sy0">.</span><span class="st0">&quot;&#39;, &#39;&quot;</span><span class="sy0">.</span><span class="re1">$p</span><span class="br0">&#91;</span><span class="nu0">1</span><span class="br0">&#93;</span><span class="sy0">.</span><span class="st0">&quot;&#39;)&quot;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$serpdb</span><span class="sy0">-&gt;</span><span class="me1">query</span><span class="br0">&#40;</span><span class="st0">&quot;DELETE FROM serp_proxies WHERE `ip`=&#39;&#39;&quot;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">function</span> DeleteProxy<span class="br0">&#40;</span><span class="re1">$ip</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw3">global</span> <span class="re1">$serpdb</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$serpdb</span><span class="sy0">-&gt;</span><span class="me1">query</span><span class="br0">&#40;</span><span class="st0">&quot;DELETE FROM serp_proxies WHERE `ip`=&#39;&quot;</span><span class="sy0">.</span><span class="re1">$ip</span><span class="sy0">.</span><span class="st0">&quot;&#39;&quot;</span><span class="br0">&#41;</span><span class="sy0">;</span> &nbsp; </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">function</span> GetCount<span class="br0">&#40;</span><span class="br0">&#41;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//use this to check how many proxies there are in the database</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw3">global</span> <span class="re1">$serpdb</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">MyCount</span> <span class="sy0">=</span> <span class="kw3">mysql_num_rows</span><span class="br0">&#40;</span><span class="re1">$serpdb</span><span class="sy0">-&gt;</span><span class="me1">query</span><span class="br0">&#40;</span><span class="st0">&quot;SELECT * FROM `serp_proxies`&quot;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">return</span> <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">MyCount</span><span class="sy0">;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="kw2">Class</span> Proxy <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$proxy_ip</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$proxy_port</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$proxy_host</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$proxy_auth</span><span class="sy0">;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$ch</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$Content</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$USERAGENT</span> <span class="sy0">=</span> <span class="st0">&quot;Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)&quot;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$ProxyError</span> <span class="sy0">=</span> <span class="nu0">0</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$ProxyErrorMsg</span> <span class="sy0">=</span> <span class="st0">&#39;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$TimeOut</span><span class="sy0">=</span><span class="nu0">3</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$IncludeHeaders</span> <span class="sy0">=</span> <span class="nu0">0</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">function</span> Proxy<span class="br0">&#40;</span><span class="re1">$host</span><span class="sy0">,</span> <span class="re1">$username</span><span class="sy0">=</span><span class="st0">&#39;&#39;</span><span class="sy0">,</span> <span class="re1">$pwd</span><span class="sy0">=</span><span class="st0">&#39;&#39;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//initialize class, set host </span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">proxy_host</span> <span class="sy0">=</span> <span class="re1">$host</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="kw1">if</span> <span class="br0">&#40;</span><span class="kw3">strlen</span><span class="br0">&#40;</span><span class="re1">$username</span><span class="br0">&#41;</span> <span class="sy0">&gt;</span> <span class="nu0">0</span> <span class="sy0">||</span> <span class="kw3">strlen</span><span class="br0">&#40;</span><span class="re1">$pwd</span><span class="br0">&#41;</span> <span class="sy0">&gt;</span> <span class="nu0">0</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">proxy_auth</span> <span class="sy0">=</span> <span class="re1">$username</span><span class="sy0">.</span><span class="st0">&quot;:&quot;</span><span class="sy0">.</span><span class="re1">$pwd</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">function</span> CURL_PROXY<span class="br0">&#40;</span><span class="re1">$cc</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="kw1">if</span> <span class="br0">&#40;</span><span class="kw3">strlen</span><span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">proxy_host</span><span class="br0">&#41;</span> <span class="sy0">&gt;</span> <span class="nu0">0</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; curl_setopt<span class="br0">&#40;</span><span class="re1">$cc</span><span class="sy0">,</span> CURLOPT_PROXY<span class="sy0">,</span> <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">proxy_host</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span><span class="kw3">strlen</span><span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">proxy_auth</span><span class="br0">&#41;</span> <span class="sy0">&gt;</span> <span class="nu0">0</span><span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;curl_setopt<span class="br0">&#40;</span><span class="re1">$cc</span><span class="sy0">,</span> CURLOPT_PROXYUSERPWD<span class="sy0">,</span> <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">proxy_auth</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">function</span> DoRequest<span class="br0">&#40;</span><span class="re1">$url</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ch</span> <span class="sy0">=</span> curl_init<span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; curl_setopt<span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ch</span><span class="sy0">,</span> CURLOPT_URL<span class="sy0">,</span><span class="re1">$url</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">CURL_PROXY</span><span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ch</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; curl_setopt<span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ch</span><span class="sy0">,</span> CURLOPT_HEADER<span class="sy0">,</span> <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">IncludeHeaders</span><span class="br0">&#41;</span><span class="sy0">;</span> <span class="co1">// read the headers</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; </div>
</li>
<li class="li1">
<div class="de1">&nbsp; curl_setopt<span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ch</span><span class="sy0">,</span> CURLOPT_USERAGENT<span class="sy0">,</span> <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">USERAGENT</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; curl_setopt<span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ch</span><span class="sy0">,</span> CURLOPT_RETURNTRANSFER<span class="sy0">,</span> <span class="nu0">1</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; curl_setopt<span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ch</span><span class="sy0">,</span> CURLOPT_TIMEOUT<span class="sy0">,</span> <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">TimeOut</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;<span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">Content</span> <span class="sy0">=</span> curl_exec<span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ch</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//if an error occurs, store the number and message</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span>curl_errno<span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ch</span><span class="br0">&#41;</span><span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="br0">&#123;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ProxyError</span> <span class="sy0">=</span> &nbsp;curl_errno<span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ch</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ProxyErrorMsg</span> <span class="sy0">=</span> &nbsp;curl_error<span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ch</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
</ol>
</div>
<p>There is not much more to say about it, it is just a rough outline. I would prefer elite (level 1) proxies, but for now this will have to do.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.juust.org/index.php/icanhazproxies/2009/02/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>synonymizer with api</title>
		<link>http://www.juust.org/index.php/synonymizer-with-api/2008/12/</link>
		<comments>http://www.juust.org/index.php/synonymizer-with-api/2008/12/#comments</comments>
		<pubDate>Sun, 28 Dec 2008 12:09:46 +0000</pubDate>
		<dc:creator>juust</dc:creator>
				<category><![CDATA[optimization]]></category>
		<category><![CDATA[seo tips and tricks]]></category>
		<category><![CDATA[optimisation]]></category>

		<guid isPermaLink="false">http://www.juust.org/?p=172</guid>
		<description><![CDATA[If you want to put some old content on the net and have it indexed as fresh unique content, this works wonders for seo-friendly backlinks : the automated synonymizer. I want one that makes my content unique without having to type one character.
Lucky for me, mister John Watson&#8217;s synonym database comes with a free 10.000 [...]]]></description>
			<content:encoded><![CDATA[<p>If you want to put some old content on the net and have it indexed as fresh unique content, this works wonders for seo-friendly backlinks : the automated synonymizer. I want one that makes my content unique without having to type one character.</p>
<p>Lucky for me, mister <a href="http://words.bighugelabs.com/" rel="nofollow">John Watson&#8217;s synonym database</a> comes with a free API good for 10,000 requests a day, and boy is it sweet! </p>
<p>API Requests are straightforward :<br />
http://words.bighugelabs.com/api/2/[<a href="http://words.bighugelabs.com/api.php" rel="nofollow">apikey</a>]/[keyword]/xml</p>
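<p>A minimal sketch of building that request URL. The helper name <code>build_synonym_url</code> is my own invention, not part of the API, and I am assuming the service accepts a percent-encoded keyword in the path:</p>

```php
<?php
// Hypothetical helper (not part of the API): builds the request URL for one keyword.
function build_synonym_url($apikey, $keyword, $format = 'xml') {
    // rawurlencode() keeps spaces and slashes in the keyword from breaking the path.
    // Assumption: the service accepts percent-encoded keywords.
    return 'http://words.bighugelabs.com/api/2/'
        . rawurlencode($apikey) . '/'
        . rawurlencode($keyword) . '/'
        . $format;
}

echo build_synonym_url('YOUR_API_KEY', 'pitch black'), "\n";
// prints http://words.bighugelabs.com/api/2/YOUR_API_KEY/pitch%20black/xml
```

<p>Encoding the keyword matters mostly for multi-word terms, single lowercase words pass through unchanged.</p>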
<p>A number of return formats are supported, but XML is the easiest, whether you parse it with SimpleXML or with regular-expression pattern matching.</p>
<p>A request for <strong>black</strong> returns an XML file like this (slightly shortened):<br />
&lt;words&gt;<br />
&lt;w p=&quot;adjective&quot; r=&quot;syn&quot;&gt;bleak&lt;/w&gt;<br />
&lt;w p=&quot;adjective&quot; r=&quot;syn&quot;&gt;sinister&lt;/w&gt;<br />
&lt;w p=&quot;adjective&quot; r=&quot;sim&quot;&gt;dark&lt;/w&gt;<br />
&lt;w p=&quot;adjective&quot; r=&quot;sim&quot;&gt;angry&lt;/w&gt;<br />
&lt;w p=&quot;noun&quot; r=&quot;syn&quot;&gt;blackness&lt;/w&gt;<br />
&lt;w p=&quot;noun&quot; r=&quot;syn&quot;&gt;inkiness&lt;/w&gt;<br />
&lt;w p=&quot;verb&quot; r=&quot;syn&quot;&gt;blacken&lt;/w&gt;<br />
&lt;w p=&quot;verb&quot; r=&quot;syn&quot;&gt;melanize&lt;/w&gt;<br />
&lt;/words&gt;</p>
<p>&#8230;which is most easily handled with preg_match_all:</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="kw2">function</span> getsynonyms<span class="br0">&#40;</span><span class="re1">$keyword</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$pick</span> <span class="sy0">=</span> <span class="kw3">array</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$apikey</span> <span class="sy0">=</span> <span class="st0">&#39;get your own key&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$xml</span><span class="sy0">=</span><span class="kw3">file_get_contents</span><span class="br0">&#40;</span><span class="st0">&#39;http://words.bighugelabs.com/api/2/&#39;</span><span class="sy0">.</span><span class="re1">$apikey</span><span class="sy0">.</span><span class="st0">&#39;/&#39;</span><span class="sy0">.</span><span class="re1">$keyword</span><span class="sy0">.</span><span class="st0">&#39;/xml&#39;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="sy0">!</span><span class="re1">$xml</span><span class="br0">&#41;</span> <span class="kw1">return</span> <span class="re1">$pick</span><span class="sy0">;</span> <span class="co1">//return empty array</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw3">preg_match_all</span><span class="br0">&#40;</span><span class="st0">&#39;/&lt;w p=&quot;adjective&quot; r=&quot;syn&quot;&gt;(.*?)&lt;<span class="es0">\/</span>w&gt;/&#39;</span><span class="sy0">,</span> <span class="re1">$xml</span><span class="sy0">,</span> <span class="re1">$adj_syns</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//preg_match_all(&#39;/&lt;w p=&quot;adjective&quot; r=&quot;sim&quot;&gt;(.*?)&lt;\/w&gt;/&#39;, $xml, $adj_sims);</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//preg_match_all(&#39;/&lt;w p=&quot;noun&quot; r=&quot;syn&quot;&gt;(.*?)&lt;\/w&gt;/&#39;, $xml, $noun_syns);</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//preg_match_all(&#39;/&lt;w p=&quot;verb&quot; r=&quot;syn&quot;&gt;(.*?)&lt;\/w&gt;/&#39;, $xml, $verb_syns);</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">foreach</span><span class="br0">&#40;</span><span class="re1">$adj_syns</span><span class="br0">&#91;</span><span class="nu0">1</span><span class="br0">&#93;</span> <span class="kw1">as</span> <span class="re1">$adj_syn</span><span class="br0">&#41;</span> <span class="re1">$pick</span><span class="br0">&#91;</span><span class="br0">&#93;</span><span class="sy0">=</span><span class="re1">$adj_syn</span><span class="sy0">;</span> <span class="co1">//index 1 holds the captured words, index 0 the full tags</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="co1">//same for verb/noun synonyms, I just want adjectives</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">return</span> <span class="re1">$pick</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
</ol>
</div>
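<p>The same extraction can also be done with SimpleXML instead of a regex. This is only a sketch: the function name <code>extract_synonyms</code> and its parameters are made up for illustration, and it assumes the response looks like the sample above (a <code>words</code> root with <code>w</code> elements carrying <code>p</code> and <code>r</code> attributes):</p>

```php
<?php
// Hypothetical alternative to the regex version: parse the response with SimpleXML.
function extract_synonyms($xml, $part = 'adjective', $relation = 'syn') {
    $pick = array();
    // Suppress parser warnings: a malformed response simply yields an empty array.
    $doc = @simplexml_load_string($xml);
    if ($doc === false) return $pick;

    foreach ($doc->w as $w) {
        // p = part of speech, r = relation (syn, sim, ...), as in the sample response
        if ((string)$w['p'] === $part && (string)$w['r'] === $relation) {
            $pick[] = (string)$w;
        }
    }
    return $pick;
}

// Shortened sample response, same shape as the one shown earlier.
$sample = '<words>'
    . '<w p="adjective" r="syn">bleak</w>'
    . '<w p="adjective" r="sim">dark</w>'
    . '<w p="noun" r="syn">blackness</w>'
    . '</words>';

print_r(extract_synonyms($sample)); // only "bleak" matches adjective/syn
```

<p>The upside is that switching to nouns or verbs is a matter of passing another argument, and attribute order in the response no longer matters.</p>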
<p>To apply it in practice, I take a slab of stale old content and&#8230;</p>
<ul>
<li>strip tags</li>
<li>do a regular-expression match on all alphanumeric sequences, dropping everything else</li>
<li>trim the resulting array elements</li>
<li>(merge all blog tags, categories, and a list of common words)</li>
<li>exclude those common terms from the array of text elements</li>
<li>exclude words shorter than N characters</li>
<li>set a percentage of words to be synonymized</li>
<li>attempt to retrieve synonyms for the remaining terms</li>
<li>replace these words in the original text, keep count</li>
<li>when I reach the target replacement percentage, abort</li>
<li>return (hopefully) a revived text</li>
</ul>
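<p>The last few steps (replace, keep count, abort at the target) can be sketched in isolation like this. The helper <code>replace_until_target</code> is hypothetical and not the code from my listing; it takes a ready-made word-to-synonym map and a target count, which you would derive from the percentage:</p>

```php
<?php
// Hypothetical helper: $synonyms maps a word to its replacement,
// $target is the number of replacements to aim for.
function replace_until_target($text, $synonyms, $target) {
    $done = 0;
    foreach ($synonyms as $word => $replacement) {
        if ($done >= $target) break; // target reached, abort
        // \b keeps "black" from matching inside "blackness"; /i ignores case
        $pattern = '/\b' . preg_quote($word, '/') . '\b/i';
        // the limit argument caps replacements at what is still needed
        $text = preg_replace($pattern, $replacement, $text, $target - $done, $count);
        $done += $count;
    }
    return array($text, $done);
}

list($new, $n) = replace_until_target(
    'A black cat in a bleak black night.',
    array('black' => 'dark', 'bleak' => 'grim'),
    2
);
echo $new, ' (', $n, " replaced)\n";
// prints: A dark cat in a bleak dark night. (2 replaced)
```

<p>Matching on word boundaries is a design choice here: a plain string replace would also rewrite parts of longer words.</p>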
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="kw2">function</span> synonymize<span class="br0">&#40;</span><span class="re1">$origtext</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//make a copy of the original text to dissect</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$content</span><span class="sy0">=</span><span class="re1">$origtext</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//content = $this-&gt;body;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$perc</span><span class="sy0">=</span><span class="nu0">3</span><span class="sy0">;</span> &nbsp; <span class="co1">//target percentage of terms to change</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$minlength</span><span class="sy0">=</span><span class="nu0">4</span><span class="sy0">;</span> &nbsp;<span class="co1">//candidates must be longer than this</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$maxrequests</span><span class="sy0">=</span><span class="nu0">80</span><span class="sy0">;</span> <span class="co1">//max number of api-requests to use</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//dump tags </span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$content</span> <span class="sy0">=</span> &nbsp;<span class="kw3">strip_tags</span><span class="br0">&#40;</span><span class="re1">$content</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//dump non-alphanumeric string characters</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$content</span> <span class="sy0">=</span> <span class="kw3">preg_replace</span><span class="br0">&#40;</span><span class="st0">&#39;/[^A-Za-z0-9<span class="es0">\-</span>]/&#39;</span><span class="sy0">,</span> <span class="st0">&#39; &#39;</span><span class="sy0">,</span> <span class="re1">$content</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//explode on blank space</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$wrds</span> <span class="sy0">=</span> <span class="kw3">explode</span><span class="br0">&#40;</span><span class="st0">&#39; &#39;</span><span class="sy0">,</span> <span class="kw3">strtolower</span><span class="br0">&#40;</span><span class="re1">$content</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//trim off blank spaces just in case</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$w</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$w</span><span class="sy0">&lt;</span>count <span class="br0">&#40;</span><span class="re1">$wrds</span><span class="br0">&#41;</span><span class="sy0">;</span><span class="re1">$w</span><span class="sy0">++</span><span class="br0">&#41;</span> <span class="re1">$words</span><span class="br0">&#91;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="kw3">trim</span><span class="br0">&#40;</span><span class="re1">$wrds</span><span class="br0">&#91;</span><span class="re1">$w</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//this should be all words</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$wordcount</span> <span class="sy0">=</span> <span class="kw3">count</span><span class="br0">&#40;</span><span class="re1">$words</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//how many words do I want changed ?</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$toswitch</span> <span class="sy0">=</span> <span class="kw3">round</span><span class="br0">&#40;</span><span class="re1">$wordcount</span><span class="sy0">*</span><span class="re1">$perc</span><span class="sy0">/</span><span class="nu0">100</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//only use uniques</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$words_unique</span><span class="sy0">=</span><span class="kw3">array_unique</span><span class="br0">&#40;</span><span class="re1">$words</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//sort alphabetically; this also reindexes the array after array_unique </span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw3">sort</span><span class="br0">&#40;</span><span class="re1">$words_unique</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//merge common with tags, categories, linked_tags</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$common</span> <span class="sy0">=</span> <span class="kw3">array</span><span class="br0">&#40;</span><span class="st0">&quot;never&quot;</span><span class="sy0">,</span> <span class="st0">&quot;about&quot;</span><span class="sy0">,</span> <span class="st0">&quot;price&quot;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//note : setting the minlength to 4 excludes lots of common terms</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$i</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$i</span><span class="sy0">&lt;</span>count<span class="br0">&#40;</span><span class="re1">$words_unique</span><span class="br0">&#41;</span><span class="sy0">;</span><span class="re1">$i</span><span class="sy0">++</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//if in common array, not selectable for synonymizing</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">if</span><span class="br0">&#40;</span><span class="kw3">in_array</span><span class="br0">&#40;</span><span class="re1">$words_unique</span><span class="br0">&#91;</span><span class="re1">$i</span><span class="br0">&#93;</span><span class="sy0">,</span> <span class="re1">$common</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span><span class="br0">&#125;</span> <span class="kw1">else</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="co1">//only terms bigger than minlength</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="kw3">strlen</span><span class="br0">&#40;</span><span class="re1">$words_unique</span><span class="br0">&#91;</span><span class="re1">$i</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="sy0">&gt;</span><span class="re1">$minlength</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="co1">//words_select contains candidates for synonyms</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$words_select</span><span class="br0">&#91;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="kw3">trim</span><span class="br0">&#40;</span><span class="re1">$words_unique</span><span class="br0">&#91;</span><span class="re1">$i</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//terms that can be changed</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$max</span> <span class="sy0">=</span> <span class="kw3">count</span><span class="br0">&#40;</span><span class="re1">$words_select</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//no more requests than max</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$max</span><span class="sy0">&gt;</span><span class="re1">$maxrequests</span><span class="br0">&#41;</span> <span class="re1">$max</span><span class="sy0">=</span><span class="re1">$maxrequests</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$i</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$i</span><span class="sy0">&lt;</span> <span class="re1">$max</span><span class="sy0">;</span><span class="re1">$i</span><span class="sy0">++</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//get synonyms, give server some time</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw3">usleep</span><span class="br0">&#40;</span><span class="nu0">100000</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="co1">//retrieve synonyms etc.</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$these_words</span> <span class="sy0">=</span> getsynonyms<span class="br0">&#40;</span><span class="re1">$words_select</span><span class="br0">&#91;</span><span class="re1">$i</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$jmax</span><span class="sy0">=</span><span class="kw3">count</span><span class="br0">&#40;</span><span class="re1">$these_words</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$jmax</span><span class="sy0">&amp;</span>lt<span class="sy0">;</span><span class="nu0">1</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="co1">//no results</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="br0">&#125;</span> <span class="kw1">else</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$count</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$j</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//the replacements are done in the original text</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$origtext</span><span class="sy0">=</span> <span class="kw3">preg_replace</span><span class="br0">&#40;</span><span class="st0">&#39;/&#39;</span><span class="sy0">.</span><span class="re1">$words_select</span><span class="br0">&#91;</span><span class="re1">$i</span><span class="br0">&#93;</span><span class="sy0">.</span><span class="st0">&#39;/i&#39;</span><span class="sy0">,</span> <span class="re1">$these_words</span><span class="br0">&#91;</span><span class="re1">$j</span><span class="br0">&#93;</span><span class="sy0">,</span> <span class="re1">$origtext</span><span class="sy0">,</span> <span class="nu0">-1</span><span class="sy0">,</span> <span class="re1">$count</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$total_switched</span><span class="sy0">+=</span><span class="re1">$count</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="br0">&#125;</span> <span class="co1">//have we reached the percentage ? </span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$total_switched</span><span class="sy0">&gt;=</span><span class="re1">$toswitch</span><span class="br0">&#41;</span> <span class="kw1">break</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//okay!</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">return</span> <span class="re1">$origtext</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="kw2">function</span> getsynonyms<span class="br0">&#40;</span><span class="re1">$keyword</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$pick</span><span class="sy0">=</span><span class="kw3">array</span> <span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$apikey</span> <span class="sy0">=</span> <span class="st0">&#39;get your own key at bighugelabs.com&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$xml</span><span class="sy0">=@</span><span class="kw3">file_get_contents</span><span class="br0">&#40;</span><span class="st0">&#39;http://words.bighugelabs.com/api/2/&#39;</span><span class="sy0">.</span><span class="re1">$apikey</span><span class="sy0">.</span><span class="st0">&#39;/&#39;</span><span class="sy0">.</span><span class="kw3">urlencode</span><span class="br0">&#40;</span><span class="re1">$keyword</span><span class="br0">&#41;</span><span class="sy0">.</span><span class="st0">&#39;/xml&#39;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="sy0">!</span><span class="re1">$xml</span><span class="br0">&#41;</span> <span class="kw1">return</span> <span class="re1">$pick</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw3">preg_match_all</span><span class="br0">&#40;</span><span class="st0">&#39;/&lt;w p=&quot;adjective&quot; r=&quot;syn&quot;&gt;(.*?)&lt; <span class="es0">\/</span>w&gt;/&#39;</span><span class="sy0">,</span> <span class="re1">$xml</span><span class="sy0">,</span> <span class="re1">$adj_syns</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">foreach</span><span class="br0">&#40;</span><span class="re1">$adj_syns</span><span class="br0">&#91;</span><span class="nu0">0</span><span class="br0">&#93;</span> <span class="kw1">as</span> <span class="re1">$adj_syn</span><span class="br0">&#41;</span> <span class="re1">$pick</span><span class="br0">&#91;</span><span class="br0">&#93;</span><span class="sy0">=</span><span class="re1">$adj_syn</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">return</span> <span class="re1">$pick</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
</ol>
</div>
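<p>The replacement step above builds its pattern straight from the word, so a short word also matches inside longer words, and any regex character in it would break the pattern. A minimal sketch of a stricter variant, assuming whole-word matching is what you want; <code>replace_word</code> is a hypothetical helper, not part of the script above:</p>

```php
<?php
// Hypothetical helper: replace whole-word, case-insensitive occurrences
// of $word with $synonym and report how many replacements were made.
function replace_word(string $text, string $word, string $synonym, ?int &$count = null): string
{
    // preg_quote() escapes regex metacharacters in the word,
    // \b restricts the match to whole words.
    $pattern = '/\b' . preg_quote($word, '/') . '\b/iu';

    // A callback avoids "$1" or "\1" in the synonym being read as a backreference.
    $result = preg_replace_callback(
        $pattern,
        static function () use ($synonym) {
            return $synonym;
        },
        $text,
        -1,
        $count
    );

    // preg_replace_callback() returns null on error (for example invalid UTF-8).
    return $result ?? $text;
}
```

<p>With this, replacing "big" leaves "bigger" alone, and the count can be added to the running total just like <code>$count</code> in the loop above.</p>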
<p>Nothing fancy, a straightforward search-replace routine. A 1200 word text has about 150 candidates, and for 3% synonyms I need to replace 36 words; it can do that. If I were to use it for real I would build a table of non-returning terms (words the api has no synonyms for) and store often used terms. That would speed up the synonymizing, allow the use of preferences and take load off the api.</p>
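<p>A rough sketch of that idea, assuming a JSON file as the "table" and any callable as the lookup (the post's <code>getsynonyms()</code> would fit); <code>cached_synonyms</code> and the cache file name are made up for illustration:</p>

```php
<?php
// Hypothetical file-backed cache around a synonym lookup.
// $lookup stands in for getsynonyms(); $cacheFile is an assumed path.
function cached_synonyms(string $keyword, callable $lookup, string $cacheFile): array
{
    $cache = [];
    if (is_file($cacheFile)) {
        $decoded = json_decode((string) file_get_contents($cacheFile), true);
        if (is_array($decoded)) {
            $cache = $decoded;
        }
    }

    $key = strtolower(trim($keyword));
    if (!array_key_exists($key, $cache)) {
        // Empty results are stored too, so "non-returning" terms
        // are not requested from the api a second time.
        $cache[$key] = array_values($lookup($key));
        file_put_contents($cacheFile, json_encode($cache), LOCK_EX);
    }

    return $cache[$key];
}
```

<p>Something like <code>cached_synonyms($word, 'getsynonyms', 'synonyms.json')</code> would then only hit the api for words it has not seen before. Preferences could be layered on by editing the stored lists by hand.</p>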
]]></content:encoded>
			<wfw:commentRss>http://www.juust.org/index.php/synonymizer-with-api/2008/12/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>RedHat Seo : scraper auto-blogging</title>
		<link>http://www.juust.org/index.php/redhat-seo-christmas-edition/2008/12/</link>
		<comments>http://www.juust.org/index.php/redhat-seo-christmas-edition/2008/12/#comments</comments>
		<pubDate>Fri, 26 Dec 2008 18:07:01 +0000</pubDate>
		<dc:creator>juust</dc:creator>
				<category><![CDATA[google]]></category>
		<category><![CDATA[seo]]></category>
		<category><![CDATA[seo tips and tricks]]></category>
		<category><![CDATA[tool]]></category>
		<category><![CDATA[wordpress]]></category>
		<category><![CDATA[xml-rpc]]></category>
		<category><![CDATA[scrape]]></category>

		<guid isPermaLink="false">http://www.juust.org/?p=270</guid>
		<description><![CDATA[Just give us your endpoint and we&#8217;ll take it from there, sparky!
I was going to make one of these tools to scrape google and conjure a full blog out of nowhere, as a Christmas special: RedHat Seo. The rough sketch has arrived, far from perfect, but it does produce a blog and doesn&#8217;t even look [...]]]></description>
			<content:encoded><![CDATA[<blockquote><p>Just give us your endpoint and we&#8217;ll take it from there, sparky!</p></blockquote>
<p>I was going to make one of these tools to scrape google and conjure a full blog out of nowhere, as a Christmas special: RedHat Seo. The rough sketch has arrived, far from perfect, but it does produce a blog and doesn&#8217;t even look too shabby. I scraped a <a href="" rel="nofollow" target="_blank">small batch</a> of posts off of blogs, keeping the links intact and adding a tribute link. I hope they will pardon me for it. </p>
<h3>structure</h3>
<p>I use three main classes, </p>
<table>
<tbody>
<tr>
<td>BlogMaker    </td>
<td>     the application</td>
</tr>
<tr>
<td>Target         </td>
<td>     the blogs you aim for</td>
</tr>
<tr>
<td>WPContent   </td>
<td>     the scraped goodies</td>
</tr>
</tbody>
</table>
<p>&#8230;and two support classes</p>
<table>
<tbody>
<tr>
<td>SerpResult    </td>
<td>    scraped urls</td>
</tr>
<tr>
<td>Custom_RPC   </td>
<td>    a simple rpc-poster</td>
</tr>
</tbody>
</table>
<p>Each target blog has three text files: </p>
<table>
<tbody>
<tr>
<td>file</td>
<td>contents</td>
<td>maintenance</td>
</tr>
<tr>
<td>blog categories</td>
<td>category you post under</td>
<td>manual</td>
</tr>
<tr>
<td>blog tags</td>
<td> tags you list on the blog</td>
<td>manual</td>
</tr>
<tr>
<td>blog urls</td>
<td> urls already used for the blog</td>
<td>system</td>
</tr>
</tbody>
</table>
<h3>routine</h3>
<p>The BlogMaker class grabs a result list (up to 1000 urls per phrase) from Google, extracts the urls and stores them in SerpResult, scrapes each url and extracts the <strong>entry</strong> divs, stores those divs in the WPContent class (which has some basic functions to sanitize the text), and uses the BlogTarget-definitions to post them to the blogs with xml-rpc.</p>
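<p>The extraction step is the fiddly part. The full source has its own routines for it; the following is only a sketch of one way to pull out divs by class, assuming PHP's DOM extension is available. <code>extract_divs_by_class</code> and <code>extract_content</code> are hypothetical names:</p>

```php
<?php
// Hypothetical helper: return the HTML of every div whose class
// attribute contains the given class name.
function extract_divs_by_class(string $html, string $class): array
{
    if (trim($html) === '') {
        return [];
    }

    $doc = new DOMDocument();
    // Scraped HTML is rarely valid; collect libxml warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match the class as a whole token, so "entry" does not match "entry-meta".
    $query = sprintf(
        '//div[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );

    $fragments = [];
    foreach ($xpath->query($query) as $div) {
        $fragments[] = $doc->saveHTML($div);
    }
    return $fragments;
}

// Try "entry" first and fall back to "post", as the routine describes.
function extract_content(string $html): array
{
    return extract_divs_by_class($html, 'entry') ?: extract_divs_by_class($html, 'post');
}
```

<p>Nested divs that both carry the class come back twice, so a real scraper would want to filter those.</p>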
<h3>usage</h3>
<p>My highlighter tends to mess up text with div markers in it, so copying off the blog may not work;<br />
the full text source (about 500 lines) is <a href="http://serp.trismegistos.net/fastblog.txt" target="_blank" rel="nofollow">over here</a>. Underneath I&#8217;ll list the main program loop :</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//make main instance</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$Blog</span> <span class="sy0">=</span> <span class="kw2">new</span> BlogMaker<span class="br0">&#40;</span><span class="st0">&quot;keyword&quot;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//define a target blog, you can define multiple blogs and refer with code</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//then add rpc-url, password and user</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//and for every target blog three text-files </span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$T</span><span class="sy0">=</span><span class="re1">$Blog</span><span class="sy0">-&gt;</span><span class="me1">AddTarget</span><span class="br0">&#40;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="st0">&#39;blogcode&#39;</span><span class="sy0">,</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="st0">&#39;http://my.blog.com/xmlrpc.php&#39;</span><span class="sy0">,</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="st0">&#39;password&#39;</span><span class="sy0">,</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="st0">&#39;user&#39;</span><span class="sy0">,</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="st0">&#39;keyword.categories.txt&#39;</span><span class="sy0">,</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="st0">&#39;keyword.tags.txt&#39;</span><span class="sy0">,</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="st0">&#39;keyword.urls.txt&#39;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//read the tags, cats and url text files stored on the server </span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//all retrieved urls are tested, if the target blog already has that</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//scraped url, it is discarded.</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$T</span><span class="sy0">-&gt;</span><span class="me1">CSV_GetTags</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$T</span><span class="sy0">-&gt;</span><span class="me1">List_GetCats</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$T</span><span class="sy0">-&gt;</span><span class="me1">ReadURL</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//grab the google result list</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//use params (pages, keywords) to specify search</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$Blog</span><span class="sy0">-&gt;</span><span class="me1">GoogleResults</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$a</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">foreach</span><span class="br0">&#40;</span><span class="re1">$Blog</span><span class="sy0">-&gt;</span><span class="me1">Results</span> <span class="kw1">as</span> <span class="re1">$BlogUrl</span><span class="br0">&#41;</span> <span class="br0">&#123;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$a</span><span class="sy0">++;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw3">echo</span> <span class="re1">$BlogUrl</span><span class="sy0">-&gt;</span><span class="me1">url</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//see if the url isnt used yet</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$T</span><span class="sy0">-&gt;</span><span class="me1">checkURL</span><span class="br0">&#40;</span><span class="kw3">trim</span><span class="br0">&#40;</span><span class="re1">$BlogUrl</span><span class="sy0">-&gt;</span><span class="me1">url</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">!=</span><span class="kw2">true</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="kw3">echo</span> <span class="st0">&#39;&#8230;checking &#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="kw3">flush</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//if not used, get the source</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$BlogUrl</span><span class="sy0">-&gt;</span><span class="me1">scrape</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//check for divs marked &quot;entry&quot;, if they arent there, check &quot;post&quot;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//some blogs use other indications for the content</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//but entry and post cover 40%</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$entries</span> <span class="sy0">=</span> <span class="re1">$BlogUrl</span><span class="sy0">-&gt;</span><span class="me1">get_entries</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="kw3">count</span><span class="br0">&#40;</span><span class="re1">$entries</span><span class="br0">&#41;</span><span class="sy0">&amp;</span>lt<span class="sy0">;</span><span class="nu0">1</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw3">echo</span> <span class="st0">&#39;no entries&#8230;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw3">flush</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$entries</span> <span class="sy0">=</span> <span class="re1">$BlogUrl</span><span class="sy0">-&gt;</span><span class="me1">get_posts</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="kw3">count</span><span class="br0">&#40;</span><span class="re1">$entries</span><span class="br0">&#41;</span><span class="sy0">&amp;</span>lt<span class="sy0">;</span><span class="nu0">1</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; <span class="kw3">echo</span> <span class="st0">&#39;no posts either&#8230;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//if no entry-post div, mark url as done</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; <span class="re1">$T</span><span class="sy0">-&gt;</span><span class="me1">RegisterURL</span><span class="br0">&#40;</span><span class="re1">$BlogUrl</span><span class="sy0">-&gt;</span><span class="me1">url</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$ct</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="kw1">foreach</span><span class="br0">&#40;</span><span class="re1">$BlogUrl</span><span class="sy0">-&gt;</span><span class="me1">WpContentPieces</span> <span class="kw1">as</span> <span class="re1">$WpContent</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//in the get_entries/get_post function the fragments are stored</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//as wpcontent</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$ct</span><span class="sy0">++;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$WpContent</span><span class="sy0">-&gt;</span><span class="me1">judge</span><span class="br0">&#40;</span><span class="nu0">2000</span><span class="sy0">,</span> <span class="nu0">200</span><span class="sy0">,</span> <span class="nu0">5</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;<span class="re1">$WpContent</span><span class="sy0">-&gt;</span><span class="me1">tribute</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span> &nbsp;<span class="co1">//add tribute link</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;<span class="re1">$T</span><span class="sy0">-&gt;</span><span class="me1">settags</span><span class="br0">&#40;</span><span class="re1">$WpContent</span><span class="sy0">-&gt;</span><span class="me1">divcontent</span><span class="br0">&#41;</span><span class="sy0">;</span> <span class="co1">//add tags</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;<span class="re1">$T</span><span class="sy0">-&gt;</span><span class="me1">postCustomRPC</span><span class="br0">&#40;</span><span class="re1">$WpContent</span><span class="sy0">-&gt;</span><span class="me1">title</span><span class="sy0">,</span> <span class="re1">$WpContent</span><span class="sy0">-&gt;</span><span class="me1">divcontent</span><span class="sy0">,</span> <span class="nu0">1</span><span class="br0">&#41;</span><span class="sy0">;</span> <span class="co1">//1=publish, 0=draft</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;<span class="re1">$T</span><span class="sy0">-&gt;</span><span class="me1">RegisterURL</span><span class="br0">&#40;</span><span class="re1">$WpContent</span><span class="sy0">-&gt;</span><span class="me1">url</span><span class="br0">&#41;</span><span class="sy0">;</span> &nbsp;<span class="co1">//register use of url</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw3">usleep</span><span class="br0">&#40;</span><span class="nu0">20000000</span><span class="br0">&#41;</span><span class="sy0">;</span> &nbsp;<span class="co1">//20 seconds break, for sitemapping</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
</ol>
</div>
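<p>The <code>postCustomRPC</code> call in the loop is not listed here. As an illustration of what such a call might send, here is a sketch that assumes the MetaWeblog API's <code>metaWeblog.newPost</code> method, which WordPress exposes on <code>xmlrpc.php</code>. The helper names are hypothetical and this is not the Custom_RPC class from the source:</p>

```php
<?php
// Hypothetical helper: builds the XML body of a metaWeblog.newPost call
// (blog id, user, password, post struct, publish flag).
function build_new_post_request(
    string $user,
    string $password,
    string $title,
    string $body,
    array $categories,
    bool $publish
): string {
    // Escape text so titles and scraped HTML survive inside the XML payload.
    $esc = static function (string $s): string {
        return htmlspecialchars($s, ENT_QUOTES | ENT_XML1, 'UTF-8');
    };
    $str = static function (string $s) use ($esc): string {
        return '<value><string>' . $esc($s) . '</string></value>';
    };
    $member = static function (string $name, string $value): string {
        return '<member><name>' . $name . '</name>' . $value . '</member>';
    };

    $cats = '';
    foreach ($categories as $category) {
        $cats .= $str((string) $category);
    }
    $struct = '<value><struct>'
        . $member('title', $str($title))
        . $member('description', $str($body))
        . $member('categories', '<value><array><data>' . $cats . '</data></array></value>')
        . '</struct></value>';

    return '<?xml version="1.0" encoding="UTF-8"?>'
        . '<methodCall><methodName>metaWeblog.newPost</methodName><params>'
        . '<param>' . $str('1') . '</param>' // blog id, assuming a single blog
        . '<param>' . $str($user) . '</param>'
        . '<param>' . $str($password) . '</param>'
        . '<param>' . $struct . '</param>'
        . '<param><value><boolean>' . ($publish ? '1' : '0') . '</boolean></value></param>'
        . '</params></methodCall>';
}

// POST the payload to the blog's xmlrpc.php; returns the raw response or false.
function send_xmlrpc(string $endpoint, string $xml)
{
    $context = stream_context_create(['http' => [
        'method'  => 'POST',
        'header'  => "Content-Type: text/xml\r\n",
        'content' => $xml,
        'timeout' => 30,
    ]]);
    return @file_get_contents($endpoint, false, $context);
}
```

<p>The categories in the struct have to exist on the blog already, which is why the category file is maintained by hand.</p>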
<h3>notes</h3>
<ul>
<li>xml-rpc needs to be activated explicitly on the wordpress dashboard under settings/writing.</li>
<li>categories must be present in the blog</li>
<li>url file must be writeable by the server (777)</li>
</ul>
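<p>The url file can be as plain as one used url per line. A minimal sketch of such a registry, assuming that layout; the class and method names are made up and only mirror what <code>checkURL</code> and <code>RegisterURL</code> do in the loop above:</p>

```php
<?php
// Hypothetical stand-in for the target's url file: one used url per line.
final class UrlRegistry
{
    /** @var string path of the url file */
    private $file;
    /** @var array<string, bool> urls already used, as array keys */
    private $seen = [];

    public function __construct(string $file)
    {
        $this->file = $file;
        if (is_file($file)) {
            $lines = file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
            foreach ($lines ?: [] as $line) {
                $this->seen[trim($line)] = true;
            }
        }
    }

    // True if the url was registered before.
    public function has(string $url): bool
    {
        return isset($this->seen[trim($url)]);
    }

    // Append the url to the file, once; the file must be writable by the server.
    public function register(string $url): void
    {
        $url = trim($url);
        if ($url === '' || $this->has($url)) {
            return;
        }
        file_put_contents($this->file, $url . PHP_EOL, FILE_APPEND | LOCK_EX);
        $this->seen[$url] = true;
    }
}
```

<p>Appending with a lock keeps the file intact when two runs overlap, and trimming on both sides avoids duplicates that differ only in whitespace.</p>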
<p>It seems wordpress builds the sitemap as a background process: the standard google xml sitemap plugin will attempt to build it in the cache (which takes anywhere between 2 and 10 seconds), and apart from building a sitemap the posts also get pinged around. Giving the install 10 to 20 seconds between posts allows all the hooked-in functions to complete.</p>
<h3>period</h3>
<p>That&#8217;s about all,<br />
consider it gpl, I added some comments in the source but I will not develop this any further. A mysql backed blogfarm tool (euphemistically called &#8216;publishing tool&#8217;) is more interesting, besides, I am off to the wharves to do some painting.</p>
<p>if you use it, send some feedback,<br />
merry christmas dogheads</p>
]]></content:encoded>
			<wfw:commentRss>http://www.juust.org/index.php/redhat-seo-christmas-edition/2008/12/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>How to grab keywords from 7search</title>
		<link>http://www.juust.org/index.php/how-to-grab-keywords-from-7search/2008/10/</link>
		<comments>http://www.juust.org/index.php/how-to-grab-keywords-from-7search/2008/10/#comments</comments>
		<pubDate>Fri, 24 Oct 2008 09:00:57 +0000</pubDate>
		<dc:creator>juust</dc:creator>
				<category><![CDATA[seo tips and tricks]]></category>
		<category><![CDATA[wordpress]]></category>

		<guid isPermaLink="false">http://www.juust.org/?p=198</guid>
		<description><![CDATA[&#8220;Seo tips and tricks&#8221; was not due until November, but this one just popped up today. I was looking for a tool to build a rapid keyword set for a blog, without doing extensive keyword research. The blackhat &#8216;scraper&#8217; scripts I found come up with  &#8216;michigan seo&#8217; far too many times :) so I [...]]]></description>
			<content:encoded><![CDATA[<p>&#8220;Seo tips and tricks&#8221; was not due until November, but this one just popped up today. I was looking for a tool to build a rapid keyword set for a blog, without doing extensive keyword research. The blackhat &#8216;scraper&#8217; scripts I found come up with  &#8216;michigan seo&#8217; far too many times :) so I built me a <a href="http://www.blacknorati.com/blacklist.txt" rel="nofollow" target="_blank">quick alternative.</a>  </p>
<h4>How to grab keyword sets from 7Search</h4>
<p>I want a set of keywords to use as blog categories, so the blog covers the most popular keywords across the whole set of active search patterns. A nice tool for that is <a href="http://conversion.7search.com/scripts/advertisertools/keywordsuggestion.aspx" rel="nofollow">7Search</a>&#8217;s keyword tool. </p>
<p>It has captcha protection: you answer it once and can then query as much as you like. It shows last month&#8217;s top 100 search patterns containing that keyword, with their search volumes : </p>
<table cellspacing="3" cellpadding="2" rules="all" border="0" width="100%">
<tr align="Center" bgcolor="#F0F0F0">
<td><font face="Arial" size="2"><b>seo</b></font></td>
<td>1,991,112</td>
<td>$0.34</td>
<td>$0.33</td>
<td>$0.21</td>
<td>$0.09</td>
<td>	$0.08</td>
</tr>
<tr align="Center" bgcolor="#F0F0F0">
<td><font face="Arial" size="2"><b>seo web design</b></font></td>
<td>8,085</td>
<td>$0.07</td>
<td>$0.02</td>
<td>$0.01</td>
<td></td>
</tr>
<tr align="Center" bgcolor="#F0F0F0">
<td><font face="Arial" size="2"><b>seo tool</b></font></td>
<td>2,647</td>
<td>$0.05</td>
<td>$0.02</td>
<td>$0.02</td>
<td>$0.01</td>
</tr>
</table>
<p>As I am extremely lazy and hate typing data, I&#8217;ll make a quick script to cut and paste that list and have it magically transformed into a wordpress blog category list.</p>
<p>It turned out to be a simple one-page program : <a href="http://www.blacknorati.com/blacklist.txt" rel="nofollow">source text file</a>. It is cut-and-paste work: I query a keyword, select the result table on the 7Search page (with the mouse :) and paste it as text into my own form&#8217;s textarea, add the main keyword, and post it. </p>
<p>From the $_POST array, I take the textarea input and explode it on linebreaks. To get the keywords, I check for the first occurrence of 0-9, take the part that comes before it, and have the keywords.</p>
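<p>For comparison, the same "everything before the first digit" idea can be sketched with <code>strcspn()</code>, which returns the length of the leading part of a string that contains none of the listed characters. This is an alternative sketch, not the script from the source file; <code>keywords_from_paste</code> is a hypothetical name:</p>

```php
<?php
// Hypothetical helper: split pasted text into lines and keep the part
// of each line that comes before the first digit.
function keywords_from_paste(string $pasted): array
{
    $keywords = [];
    // \R matches any line break, so \r\n and \n pastes both work.
    foreach (preg_split('/\R/', $pasted) as $line) {
        // Length of the leading run without digits = position of the first digit.
        $digitAt = strcspn($line, '0123456789');
        if ($digitAt === strlen($line)) {
            continue; // no digit at all: not a data row
        }
        $keyword = trim(substr($line, 0, $digitAt));
        if ($keyword !== '') {
            $keywords[] = $keyword;
        }
    }
    return $keywords;
}
```

<p>Like the loop approach, this cuts a keyword short if the keyword itself contains a digit, so it only suits lists where the first number on a line is the search volume.</p>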
<p>The loop has to test every digit 0-9. Had I started at 0 and stopped at the first digit that occurs in the line, I would leave the loop as soon as the line contains a 0 (or a 1, 2, 3&#8230;), even when another digit comes before it : </p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="re1">$pos</span> <span class="sy0">=</span> <span class="kw3">strpos</span><span class="br0">&#40;</span><span class="re1">$linesarr</span><span class="br0">&#91;</span><span class="re1">$x</span><span class="br0">&#93;</span><span class="sy0">,</span> <span class="br0">&#40;</span>string<span class="br0">&#41;</span> <span class="re1">$i</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$pos</span> <span class="sy0">&gt;</span><span class="nu0">0</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$pos</span><span class="sy0">&lt;</span> <span class="re1">$minpos</span><span class="br0">&#41;</span> <span class="br0">&#123;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$mykeys</span> <span class="sy0">=</span> <span class="kw3">substr</span><span class="br0">&#40;</span><span class="re1">$linesarr</span><span class="br0">&#91;</span><span class="re1">$x</span><span class="br0">&#93;</span><span class="sy0">,</span><span class="nu0">0</span><span class="sy0">,</span><span class="re1">$minpos</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw3">echo</span> <span class="re1">$mykeys</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">break</span><span class="sy0">;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span><span class="br0">&#125;</span><span class="br0">&#125;</span></div>
</li>
</ol>
</div>
<p>So I first set minpos to the length of the line, then look up the first 0, the first 1, 2, 3&#8230; in turn. Every time a digit turns up before the current minpos, minpos is lowered to that position, so it ends up at the first digit in the line.</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="re1">$lines</span><span class="sy0">=</span><span class="re1">$_POST</span><span class="br0">&#91;</span><span class="st0">&#39;textarea&#39;</span><span class="br0">&#93;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$linesarr</span> <span class="sy0">=</span> <span class="kw3">explode</span><span class="br0">&#40;</span><span class="st0">&quot;<span class="es0">\r</span><span class="es0">\n</span>&quot;</span><span class="sy0">,</span> <span class="re1">$lines</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">for</span> <span class="br0">&#40;</span><span class="re1">$x</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$x</span><span class="sy0">&lt;</span>count <span class="br0">&#40;</span><span class="re1">$linesarr</span><span class="br0">&#41;</span><span class="sy0">;</span><span class="re1">$x</span><span class="sy0">++</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//set minpos to the length of the line</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$minpos</span><span class="sy0">=</span><span class="kw3">strlen</span><span class="br0">&#40;</span><span class="re1">$linesarr</span><span class="br0">&#91;</span><span class="re1">$x</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//check numbers 0-9</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$i</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$i</span><span class="sy0">&amp;</span>lt<span class="sy0">;</span><span class="nu0">10</span><span class="sy0">;</span><span class="re1">$i</span><span class="sy0">++</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//get position of the first digit $i in the line (strval : before php 8 an integer needle is read as a character code)</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$pos</span> <span class="sy0">=</span> <span class="kw3">strpos</span><span class="br0">&#40;</span><span class="re1">$linesarr</span><span class="br0">&#91;</span><span class="re1">$x</span><span class="br0">&#93;</span><span class="sy0">,</span> <span class="kw3">strval</span><span class="br0">&#40;</span><span class="re1">$i</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$pos</span> <span class="sy0">&gt;</span><span class="nu0">0</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$pos</span><span class="sy0">&lt;</span> <span class="re1">$minpos</span><span class="br0">&#41;</span> <span class="re1">$minpos</span><span class="sy0">=</span><span class="re1">$pos</span><span class="sy0">;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="br0">&#125;</span> &nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//is minpos smaller than the length of the line ? then it's valid data</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$minpos</span><span class="sy0">&lt;</span>strlen<span class="br0">&#40;</span><span class="re1">$linesarr</span><span class="br0">&#91;</span><span class="re1">$x</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$mykeys</span> <span class="sy0">=</span> <span class="kw3">substr</span><span class="br0">&#40;</span><span class="re1">$linesarr</span><span class="br0">&#91;</span><span class="re1">$x</span><span class="br0">&#93;</span><span class="sy0">,</span><span class="nu0">0</span><span class="sy0">,</span><span class="re1">$minpos</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="kw3">echo</span> <span class="re1">$mykeys</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
</ol>
</div>
<p>That way I always find the first digit in the line, and everything before it is the keyword text.</p>
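<p>A shorter route to the same minpos, as a sketch only : php's strcspn() returns the length of the leading part of a string that contains none of the given characters, so with the ten digits as mask it stops at the first digit. The helper name extract_keyword and the sample line are made up for illustration, they are not part of the original script.</p>

```php
<?php
// Sketch: split a line at its first digit using strcspn().
// extract_keyword() is a hypothetical helper name, not from the original script.
function extract_keyword($line) {
    // strcspn() gives the length of the leading segment without any digit,
    // which is the offset of the first digit (or strlen($line) if there is none).
    $minpos = strcspn($line, '0123456789');
    if ($minpos === 0 || $minpos === strlen($line)) {
        return null; // no keyword text before the digit, or no digit at all
    }
    // trim() drops the blanks between the keyword and the volume
    return trim(substr($line, 0, $minpos));
}

$sample = 'car insurance 9,111,222 $0.35'; // made-up sample line
echo extract_keyword($sample), "\n";       // prints "car insurance"
```

<p>Lines without a digit, and lines that start with one, come back as null, which mirrors the "is minpos smaller than the length of the line" check in the loop above.</p>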
<p>I also want the search volume, which is the first string after the keywords, up to the first dollar sign. The minpos counter is already at the first digit of the volume, so I can take the rest of the line from there, find the position of the first dollar sign, cut the string at that point and trim off the blanks.</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="co1">//the volume is at the start of the string after minpos</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$volstr</span> <span class="sy0">=</span> <span class="kw3">trim</span><span class="br0">&#40;</span><span class="kw3">substr</span><span class="br0">&#40;</span><span class="re1">$linesarr</span><span class="br0">&#91;</span><span class="re1">$x</span><span class="br0">&#93;</span><span class="sy0">,</span> <span class="re1">$minpos</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//and before the first dollar sign</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$volcut</span> <span class="sy0">=</span> <span class="kw3">strpos</span><span class="br0">&#40;</span><span class="re1">$volstr</span><span class="sy0">,</span> <span class="st0">&quot;$&quot;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//it contains &quot;,&quot; : 9,111,222 so filter out the nonsense for mysql :</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$vol</span> <span class="sy0">=</span> <span class="kw3">preg_replace</span><span class="br0">&#40;</span><span class="st0">&#39;/,/&#39;</span><span class="sy0">,</span> <span class="st0">&#39;&#39;</span><span class="sy0">,</span> <span class="kw3">trim</span><span class="br0">&#40;</span><span class="kw3">substr</span><span class="br0">&#40;</span><span class="re1">$volstr</span><span class="sy0">,</span> <span class="nu0">0</span><span class="sy0">,</span> <span class="re1">$volcut</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$minpos</span><span class="sy0">&lt;</span>strlen <span class="br0">&#40;</span><span class="re1">$linesarr</span><span class="br0">&#91;</span><span class="re1">$x</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$mykeys</span> <span class="sy0">=</span> <span class="kw3">substr</span><span class="br0">&#40;</span><span class="re1">$linesarr</span><span class="br0">&#91;</span><span class="re1">$x</span><span class="br0">&#93;</span><span class="sy0">,</span><span class="nu0">0</span><span class="sy0">,</span><span class="re1">$minpos</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw3">echo</span> <span class="re1">$mykeys</span><span class="sy0">.</span><span class="st0">&quot;_&quot;</span><span class="sy0">.</span><span class="re1">$vol</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
</ol>
</div>
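<p>The same volume step as one function, again only a sketch. It assumes the "keyword, volume, $price" line layout described above. strpos() returns false when the line has no dollar sign, so the sketch checks for that explicitly before cutting. extract_volume is a hypothetical helper name.</p>

```php
<?php
// Sketch: pull the search volume out of one line.
// Assumes the "keyword  9,111,222  $0.35" layout; extract_volume() is a hypothetical name.
function extract_volume($line) {
    $minpos = strcspn($line, '0123456789');    // offset of the first digit
    if ($minpos === strlen($line)) {
        return null;                           // no digit at all, not a data line
    }
    $volstr = substr($line, $minpos);          // volume plus everything after it
    $volcut = strpos($volstr, '$');
    if ($volcut !== false) {
        $volstr = substr($volstr, 0, $volcut); // keep only the part before the first "$"
    }
    // str_replace() is enough for a literal comma; the cast leaves a plain integer for mysql
    return (int) str_replace(',', '', trim($volstr));
}

echo extract_volume('car insurance 9,111,222 $0.35'), "\n"; // prints 9111222
```

<p>For a fixed character like the comma, str_replace() does the same job as the preg_replace() call above without a regular expression.</p>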
<p><small>[After this I stuff the data in a mysql table `sevencats`.]</small></p>
<h4>How to add a keyword list to wordpress as categories</h4>
<p>Let's add the keywords to a wordpress blog as categories. Wordpress has a very simple function for it, wp_insert_term, in the taxonomy.php file.</p>
<p>In wpmu you first have to pick the target blog, because every blog works on its own table set (wp1_, wp2_, etcetera). When you start it up, the admin user's main blog is the active table set, so if you want to add data like categories to another blog's taxonomy tables you have to switch to that table set first.</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="kw2">function</span> connect_data<span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$DB_USER</span> <span class="sy0">=</span> &nbsp;<span class="st0">&quot;&quot;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$DB_PASSWORD</span> <span class="sy0">=</span> <span class="st0">&quot;&quot;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$DB_HOST</span> <span class="sy0">=</span> <span class="st0">&quot;&quot;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$DB_DATA</span> <span class="sy0">=</span> <span class="st0">&quot;&quot;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$link</span> <span class="sy0">=</span> &nbsp;<span class="kw3">mysql_connect</span><span class="br0">&#40;</span><span class="re1">$DB_HOST</span><span class="sy0">,</span> <span class="re1">$DB_USER</span><span class="sy0">,</span> <span class="re1">$DB_PASSWORD</span><span class="br0">&#41;</span> or <span class="re1">$error</span> <span class="sy0">=</span> <span class="kw3">mysql_error</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span><span class="sy0">!</span><span class="re1">$link</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; <span class="kw1">return</span> <span class="re1">$error</span><span class="sy0">;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="br0">&#125;</span> &nbsp; </div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw3">mysql_select_db</span><span class="br0">&#40;</span><span class="re1">$DB_DATA</span><span class="sy0">,</span> <span class="re1">$link</span><span class="br0">&#41;</span> or <span class="re1">$error</span> <span class="sy0">=</span> <span class="kw3">mysql_error</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">return</span> <span class="re1">$link</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//link</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$cats</span><span class="sy0">=</span>connect_data<span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//get array with categories </span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$categories</span><span class="sy0">=</span><span class="kw3">array</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$qry</span><span class="sy0">=</span><span class="st0">&quot;SELECT cat FROM `sevencats`&quot;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$lst</span><span class="sy0">=</span><span class="kw3">mysql_query</span><span class="br0">&#40;</span><span class="re1">$qry</span><span class="sy0">,</span> <span class="re1">$cats</span><span class="br0">&#41;</span> or <span class="kw3">die</span><span class="br0">&#40;</span><span class="st0">&#39;list error &#39;</span><span class="sy0">.</span><span class="kw3">mysql_error</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">while</span><span class="br0">&#40;</span><span class="re1">$row</span><span class="sy0">=</span><span class="kw3">mysql_fetch_assoc</span><span class="br0">&#40;</span><span class="re1">$lst</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$categories</span><span class="br0">&#91;</span><span class="br0">&#93;</span><span class="sy0">=</span><span class="re1">$row</span><span class="br0">&#91;</span><span class="st0">&#39;cat&#39;</span><span class="br0">&#93;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//close db connection</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw3">mysql_close</span><span class="br0">&#40;</span><span class="re1">$cats</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//open wordpress connection</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">include_once</span><span class="br0">&#40;</span><span class="st0">&#39;wp-config.php&#39;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">include_once</span><span class="br0">&#40;</span><span class="st0">&#39;wp-includes/wp-db.php&#39;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">include_once</span><span class="br0">&#40;</span><span class="st0">&#39;wp-includes/taxonomy.php&#39;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//select target blog by id</span></div>
</li>
<li class="li1">
<div class="de1">switch_to_blog<span class="br0">&#40;</span><span class="nu0">3</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//insert categories</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">for</span> <span class="br0">&#40;</span><span class="re1">$i</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$i</span><span class="sy0">&lt;</span>count <span class="br0">&#40;</span><span class="re1">$categories</span><span class="br0">&#41;</span><span class="sy0">;</span><span class="re1">$i</span><span class="sy0">++</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; wp_insert_term<span class="br0">&#40;</span><span class="re1">$categories</span><span class="br0">&#91;</span><span class="re1">$i</span><span class="br0">&#93;</span><span class="sy0">,</span> <span class="st0">&#39;category&#39;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//switch back to users main blog</span></div>
</li>
<li class="li1">
<div class="de1">restore_current_blog<span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
</ol>
</div>
<p>For a normal wordpress install you don't have to switch blogs : </p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="co1">//open wordpress connection</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">include_once</span><span class="br0">&#40;</span><span class="st0">&#39;wp-config.php&#39;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">include_once</span><span class="br0">&#40;</span><span class="st0">&#39;wp-includes/wp-db.php&#39;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">include_once</span><span class="br0">&#40;</span><span class="st0">&#39;wp-includes/taxonomy.php&#39;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//insert categories</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">for</span> <span class="br0">&#40;</span><span class="re1">$i</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$i</span><span class="sy0">&lt;</span>count <span class="br0">&#40;</span><span class="re1">$categories</span><span class="br0">&#41;</span><span class="sy0">;</span><span class="re1">$i</span><span class="sy0">++</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; wp_insert_term<span class="br0">&#40;</span><span class="re1">$categories</span><span class="br0">&#91;</span><span class="re1">$i</span><span class="br0">&#93;</span><span class="sy0">,</span> <span class="st0">&#39;category&#39;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
</ol>
</div>
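<p>Before the list goes into wp_insert_term it may be worth tidying it up, since cutting at the first digit leaves trailing blanks on the keywords. A minimal sketch in plain php, assuming $categories is the array filled from `sevencats` above; clean_categories is a hypothetical helper, not a wordpress function.</p>

```php
<?php
// Sketch: tidy the keyword list before it is handed to wp_insert_term().
// clean_categories() is a hypothetical helper, not part of WordPress.
function clean_categories(array $categories) {
    $clean = array();
    foreach ($categories as $cat) {
        $cat = trim($cat);           // cutting at the first digit leaves trailing blanks
        if ($cat === '') {
            continue;                // skip empty rows
        }
        $key = strtolower($cat);     // compare case-insensitively
        if (!isset($clean[$key])) {
            $clean[$key] = $cat;     // first spelling wins, later duplicates are dropped
        }
    }
    return array_values($clean);
}

print_r(clean_categories(array('cars ', 'Cars', '', ' loans')));
```

<p>The result can be looped over exactly like $categories in the listings above.</p>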
<p>That gets me the top 100 searches of last month as categories for my new blog. You can fiddle with it a bit and only pick searches with a volume above 2000 monthly searches (just in case you want to go scraping and only want material that gets you into the serps for the high volume search terms).</p>
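<p>The volume filter could look roughly like this. It is a sketch under assumptions : each row is taken to carry the keyword under 'cat' (the column used in the query above) and the monthly volume under 'vol', and that second name is my own guess, the original never shows the volume column. The sample rows are made up.</p>

```php
<?php
// Sketch: keep only the keywords above a minimum monthly search volume.
// The 'vol' key is an assumption; the original only shows the `cat` column.
function high_volume_categories(array $rows, $minVolume = 2000) {
    $categories = array();
    foreach ($rows as $row) {
        if ((int) $row['vol'] > $minVolume) {
            $categories[] = $row['cat'];
        }
    }
    return $categories;
}

$rows = array( // made-up sample rows
    array('cat' => 'car insurance', 'vol' => 9111222),
    array('cat' => 'blue widgets',  'vol' => 1500),
);
print_r(high_volume_categories($rows)); // only 'car insurance' is left
```

<p>If the volume sits in the same table, putting the condition in the WHERE clause of the SELECT would do the same filtering on the database side.</p>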
<p>Next edition : Red Hat Seo (with jingle bells) the Christmas Special :)</count></pre>
<p></count></pre>
<p></strlen></pre>
<p></count></pre>
]]></content:encoded>
			<wfw:commentRss>http://www.juust.org/index.php/how-to-grab-keywords-from-7search/2008/10/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>seo tricks : the magpie incident</title>
		<link>http://www.juust.org/index.php/seo-tricks-the-magpie-incident/2008/10/</link>
		<comments>http://www.juust.org/index.php/seo-tricks-the-magpie-incident/2008/10/#comments</comments>
		<pubDate>Wed, 01 Oct 2008 22:05:07 +0000</pubDate>
		<dc:creator>juust</dc:creator>
				<category><![CDATA[links]]></category>
		<category><![CDATA[seo]]></category>
		<category><![CDATA[seo tips and tricks]]></category>

		<guid isPermaLink="false">http://www.juust.org/?p=166</guid>
		<description><![CDATA[Some universities like Southern California, Harvard and Michigan State have their web-guru&#8217;s explain to us how rss feeds work with the elegant Magpie parser demo : 
Some example on how to use Magpie:
* magpie_simple.php *
  Simple example of fetching and parsing an RSS file. Expects to be
  called with a query param &#8216;rss_url=http://(some [...]]]></description>
			<content:encoded><![CDATA[<p>Some universities like Southern California, Harvard and Michigan State have their web-guru&#8217;s explain to us how rss feeds work with the elegant Magpie parser demo : </p>
<blockquote><p>Some example on how to use Magpie:</p>
<p>* magpie_simple.php *<br />
  Simple example of fetching and parsing an RSS file. Expects to be<br />
  called with a query param &#8216;rss_url=http://(some rss file)&#8217;<br />
&#8230;.</p>
<p>* magpie_debug.php *<br />
  Displays all the information available from a parsed feed.</p></blockquote>
<p>Note : magpie_debug.php is the one to watch for. You can do a google search on :<br />
      <center><strong>site:.edu magpie_debug.php</strong></center><br />
and you get a number of educational facilities that kindly demonstrate the use of the magpie rss parser.</p>
<p>These demo pages have a textbox where you can enter an rss feed url, the magpie demo parses your feed and outputs it as an html-page. </p>
<p>You have to be careful with these programs, though : I actually found one domain (www.scripps.edu) with this remark under the &#8216;parse rss&#8217; button :</p>
<blockquote><p>Security Note:<br />
This is a simple example script. If this was a real script we probably wouldn&#8217;t allow strangers to submit random URLs, and we certainly wouldn&#8217;t simply echo anything passed in the URL. Additionally its a bad idea to leave this example script lying around.
</p></blockquote>
<p>Thank you, you are surely wise like the buddha, I shall try to remember your insight !</p>
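<p>For what it's worth, the two precautions from that security note could be sketched like this : check the submitted url before fetching it, and escape whatever gets echoed back. Both helper names are hypothetical and have nothing to do with Magpie itself. The check only covers syntax and scheme, it does not stop someone from pointing the script at an internal host.</p>

```php
<?php
// Sketch of the two precautions the security note mentions.
// safe_feed_url() and h() are hypothetical helpers, not part of Magpie.
function safe_feed_url($input) {
    $url = filter_var(trim($input), FILTER_VALIDATE_URL);
    if ($url === false) {
        return null;                 // not a syntactically valid url
    }
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    if ($scheme !== 'http' && $scheme !== 'https') {
        return null;                 // refuse file://, ftp:// and friends
    }
    return $url;
}

function h($text) {
    // escape before echoing anything that came in with the request
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

var_dump(safe_feed_url('file:///etc/passwd')); // NULL
echo h('<script>alert(1)</script>'), "\n";
```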
<p>&#8230;.<br />
note: after a while I decided I had had enough fun with magpies and took the blog off-line.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.juust.org/index.php/seo-tricks-the-magpie-incident/2008/10/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>seo tricks : old wine in new bags&#8230;</title>
		<link>http://www.juust.org/index.php/seo-tricks-old-wine-in-new-bags/2008/09/</link>
		<comments>http://www.juust.org/index.php/seo-tricks-old-wine-in-new-bags/2008/09/#comments</comments>
		<pubDate>Fri, 26 Sep 2008 05:19:05 +0000</pubDate>
		<dc:creator>juust</dc:creator>
				<category><![CDATA[links]]></category>
		<category><![CDATA[pagerank]]></category>
		<category><![CDATA[seo]]></category>
		<category><![CDATA[seo tips and tricks]]></category>

		<guid isPermaLink="false">http://www.juust.org/?p=160</guid>
		<description><![CDATA[Get some pagerank : this trick would require tedious, boring link checking, but with SeoLinx (an extension of SeoQuake) that has become a lot easier. SeoLinx shows the stats of a link&#8217;s target url so you don&#8217;t have to go to every page to retrieve the stats. Cool plugin. Let&#8217;s put it to some practical [...]]]></description>
			<content:encoded><![CDATA[<p>Get some pagerank : this trick would require tedious, boring link checking, but with SeoLinx (an extension of <a href="http://www.seoquake.com/" rel="nofollow">SeoQuake</a>) that has become a lot easier. SeoLinx shows the stats of a link&#8217;s target url so you don&#8217;t have to go to every page to retrieve the stats. Cool plugin. Let&#8217;s put it to some practical use.</p>
<h4>the trick : comment on old forum threads</h4>
<p>Once you have SeoLinx installed, find an &#8216;old&#8217; forum, register if you haven&#8217;t already and make sure you get a signature link. Sometimes you first have to be a member for a week or write ten posts, but once you have a sig-link you get backlinks off the forum.</p>
<p>Then go comment on really <strong>old forum threads</strong>. </p>
<p>With SeoLinx you can easily spot the juicy old threads. Old threads on, for instance, <a href="http://forums.digitalpoint.com/showthread.php?t=179" rel="nofollow">DigitalPoint</a> or Webmasterworld are sometimes pagerank 3. The <a href="http://forums.digitalpoint.com/showthread.php?t=179" rel="nofollow" title="go see for yourself, seo-heathen">DP post</a>, for one, was PR2 with 8 posts at the time of writing. </p>
<p>Pick a forum, and browse to the last page of the thread list. Hover over the thread anchor and SeoLinx shows you the pagerank of the thread page. As long as the number of posts is below the per-page limit (10 or 16, depending on the forum settings), your comment will appear on the first page of that thread, the page that has that nice pagerank and juice. </p>
<p>Old wine in new bags can be a sweet thing.</p>
<h4>the benefit</h4>
<p>A pagerank 3 &#8216;targeted&#8217; anchor is worth about $9,- a month, roughly $100,- per year. It can take an hour to find a juicy one, but hey, $100,- value for an hour&#8217;s work is well worth the trouble. </p>
<hr />
I might make this a blog feature, <strong>seo tips and tricks of the month</strong>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.juust.org/index.php/seo-tricks-old-wine-in-new-bags/2008/09/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
