<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>juust ~ php oddities &#187; seo tips and tricks</title>
	<atom:link href="http://www.juust.org/index.php/category/seo-tips-and-tricks/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.juust.org</link>
	<description>Unordered list of one element</description>
	<lastBuildDate>Wed, 28 Jul 2010 14:26:26 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Pagerank sculpting session</title>
		<link>http://www.juust.org/index.php/pagerank-sculpting-session/2010/07/</link>
		<comments>http://www.juust.org/index.php/pagerank-sculpting-session/2010/07/#comments</comments>
		<pubDate>Wed, 28 Jul 2010 14:24:00 +0000</pubDate>
		<dc:creator>juust</dc:creator>
				<category><![CDATA[seo tips and tricks]]></category>

		<guid isPermaLink="false">http://www.juust.org/?p=1307</guid>
		<description><![CDATA[In the series &#8216;how to manipulate google&#8216; : pagerank sculpting 101.
If I build a site about &#8220;LCD television&#8221; and want to promote three  specific brands/offers, I want Google to index the product/offer-pages  as most important in the site, and not the index page.
How do I achieve that ?
Some basic theory : I have [...]]]></description>
			<content:encoded><![CDATA[<p>In the series &#8216;<em>how to manipulate Google</em>&#8217;: PageRank sculpting 101.</p>
<p>If I build a site about &#8220;LCD television&#8221; and want to promote three specific brands/offers, I want Google to treat the product/offer pages as the most important pages in the site, not the index page.</p>
<p>How do I achieve that?</p>
<p>Some basic theory: I have two rings of pages, one with four pages (B) and one with three (A), linked through one page (AB).</p>
<p><img class="alignnone size-medium wp-image-1310" title="pagerank sculpting 000" src="http://www.juust.org/wp-content/uploads/2010/07/pagerank-sculpting-000-300x232.jpg" alt="pagerank sculpting 000" width="300" height="232" /></p>
<p>&#8230;after running that through a pagerank simulation, I get these results :</p>
<table border="0">
<tbody>
<tr>
<td>item</td>
<td>importance</td>
</tr>
<tr>
<td>b</td>
<td>0.97</td>
</tr>
<tr>
<td>ab</td>
<td>1.60</td>
</tr>
<tr>
<td>a</td>
<td>0.73</td>
</tr>
</tbody>
</table>
<p>&#8230;the linking page (AB) splits its juice between the two rings, 3/5 to the larger and 2/5 to the smaller, and by doing so drains the smaller ring. Being the only page that gets links and juice from all other pages, the AB page itself scores the highest &#8216;importance&#8217; in the website.</p>
<p>Conclusion : adding some subpages in a smaller ring to a page makes it relatively more important in the website.</p>
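<p>The simulation itself is not shown in this post. As a minimal sketch, assuming every page links to all other pages in its own ring and a damping factor of 0.85 (both are assumptions, as are the page names b1..b3, a1, a2 and the function name pagerank), a plain power iteration in PHP could look like this:</p>

```php
<?php
// Minimal PageRank power iteration (a sketch, not the simulator used for the tables).
// $links maps each page to the list of pages it links to.
function pagerank(array $links, float $damping = 0.85, int $iterations = 100): array
{
    $pages = array_keys($links);
    // Start every page at 1.0 so the scores average to 1.
    $rank = array_fill_keys($pages, 1.0);
    for ($n = 0; $n < $iterations; $n++) {
        $next = array_fill_keys($pages, 1.0 - $damping);
        foreach ($links as $page => $targets) {
            // Each page splits its damped score evenly over its outgoing links.
            $share = $damping * $rank[$page] / count($targets);
            foreach ($targets as $target) {
                $next[$target] += $share;
            }
        }
        $rank = $next;
    }
    return $rank;
}

// Assumed graph: ring B = {b1, b2, b3, ab}, ring A = {a1, a2, ab},
// every page linking to all other pages in its own ring.
$links = [
    'b1' => ['b2', 'b3', 'ab'],
    'b2' => ['b1', 'b3', 'ab'],
    'b3' => ['b1', 'b2', 'ab'],
    'ab' => ['b1', 'b2', 'b3', 'a1', 'a2'],
    'a1' => ['a2', 'ab'],
    'a2' => ['a1', 'ab'],
];

$rank = pagerank($links);
foreach ($rank as $page => $score) {
    printf("%-3s %.3f\n", $page, $score);
}
```

<p>Under those assumptions the scores settle at roughly 0.975 for the B pages, 1.604 for AB and 0.735 for the A pages, which is close to the table above.</p>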
<p><img class="alignnone size-medium wp-image-1313" title="pagerank sculpting 001" src="http://www.juust.org/wp-content/uploads/2010/07/pagerank-sculpting-001-300x223.jpg" alt="pagerank sculpting 001" width="300" height="223" /></p>
<table border="0">
<tbody>
<tr>
<td>item</td>
<td>importance</td>
</tr>
<tr>
<td>home, prod1, prod3</td>
<td>0.97</td>
</tr>
<tr>
<td>prod2</td>
<td>1.60</td>
</tr>
<tr>
<td>sub</td>
<td>0.73</td>
</tr>
</tbody>
</table>
<p>Let&#8217;s add some more subrings :</p>
<p><img class="alignnone size-medium wp-image-1314" title="pagerank sculpting 002" src="http://www.juust.org/wp-content/uploads/2010/07/pagerank-sculpting-002-276x300.jpg" alt="pagerank sculpting 002" width="276" height="300" /></p>
<table border="0">
<tbody>
<tr>
<td>item</td>
<td>importance</td>
</tr>
<tr>
<td>I, Prod1</td>
<td>0.96</td>
</tr>
<tr>
<td>Prod2, 3</td>
<td>1.58</td>
</tr>
<tr>
<td>sub</td>
<td>0.72</td>
</tr>
</tbody>
</table>
<p><img class="alignnone size-medium wp-image-1311" title="pagerank sculpting 003" src="http://www.juust.org/wp-content/uploads/2010/07/pagerank-sculpting-003-300x264.jpg" alt="pagerank sculpting 003" width="300" height="264" /></p>
<table border="0">
<tbody>
<tr>
<td>item</td>
<td>importance</td>
</tr>
<tr>
<td>home</td>
<td>0.94</td>
</tr>
<tr>
<td>Prod 1,2,3</td>
<td>1.56</td>
</tr>
<tr>
<td>sub</td>
<td>0.72</td>
</tr>
</tbody>
</table>
<p>The importance of the product pages seems to drop, but the prod-n/home ratio improves (1.56/0.94 &#8776; 1.66 with three subrings versus 1.60/0.97 &#8776; 1.65 with one), so it works out better.</p>
<p><strong>Home-links</strong></p>
<p>By not linking back to the index page from the subpages, the product pages end up with a higher rank within the site.</p>
<p>If I do link from each subpage (S) to the index page (home), as WordPress themes generally do,</p>
<p><img class="alignnone size-medium wp-image-1312" title="pagerank sculpting 004" src="http://www.juust.org/wp-content/uploads/2010/07/pagerank-sculpting-004-300x268.jpg" alt="pagerank sculpting 004" width="300" height="268" /></p>
<p>&#8230;I get these results&#8230;</p>
<table border="0">
<tbody>
<tr>
<td>item</td>
<td>importance</td>
</tr>
<tr>
<td>home</td>
<td>1.91</td>
</tr>
<tr>
<td>Prod 1,2,3</td>
<td>1.54</td>
</tr>
<tr>
<td>Sub</td>
<td>0.57</td>
</tr>
</tbody>
</table>
<p>&#8230;the home page is indicated as most important in the site, which isn&#8217;t what I wanted, so I omit the home-link on subpages.</p>
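<p>The effect of the home links can be checked with a small PHP sketch. It assumes a damping factor of 0.85, fully interlinked rings, and a site made of home plus three product pages with two subpages each; the function names pagerank and build_site and the page names are made up for the illustration:</p>

```php
<?php
// Plain PageRank power iteration (a sketch, not the simulator used for the tables).
function pagerank(array $links, float $damping = 0.85, int $iterations = 100): array
{
    $rank = array_fill_keys(array_keys($links), 1.0);
    for ($n = 0; $n < $iterations; $n++) {
        $next = array_fill_keys(array_keys($links), 1.0 - $damping);
        foreach ($links as $page => $targets) {
            // Each page splits its damped score evenly over its outgoing links.
            $share = $damping * $rank[$page] / count($targets);
            foreach ($targets as $target) {
                $next[$target] += $share;
            }
        }
        $rank = $next;
    }
    return $rank;
}

// Assumed site: home and three product pages form one ring; each product
// page has its own ring with two subpages. Rings are fully interlinked.
function build_site(bool $subsLinkHome): array
{
    $products = ['prod1', 'prod2', 'prod3'];
    $links = ['home' => $products];
    foreach ($products as $n => $prod) {
        $subs = ["sub{$n}a", "sub{$n}b"];
        $others = array_values(array_diff($products, [$prod]));
        $links[$prod] = array_merge(['home'], $others, $subs);
        $links[$subs[0]] = [$prod, $subs[1]];
        $links[$subs[1]] = [$prod, $subs[0]];
        if ($subsLinkHome) {
            // The theme-style link from every subpage back to the index page.
            $links[$subs[0]][] = 'home';
            $links[$subs[1]][] = 'home';
        }
    }
    return $links;
}

$without = pagerank(build_site(false));
$with = pagerank(build_site(true));
printf("no home links:   home %.3f  prod %.3f  sub %.3f\n", $without['home'], $without['prod1'], $without['sub0a']);
printf("with home links: home %.3f  prod %.3f  sub %.3f\n", $with['home'], $with['prod1'], $with['sub0a']);
```

<p>Under those assumptions the sketch gives about 0.950 / 1.568 / 0.724 (home / product / subpage) without home links and about 1.916 / 1.544 / 0.576 with them, which is in line with the two tables: the home links move the top spot from the product pages to the index page.</p>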
<p>End of the sculpting session.</p>
<p>(note: this is, of course, all theoretical)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.juust.org/index.php/pagerank-sculpting-session/2010/07/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>interesting : seo panel</title>
		<link>http://www.juust.org/index.php/interesting-seo-panel/2010/07/</link>
		<comments>http://www.juust.org/index.php/interesting-seo-panel/2010/07/#comments</comments>
		<pubDate>Fri, 02 Jul 2010 01:19:24 +0000</pubDate>
		<dc:creator>juust</dc:creator>
				<category><![CDATA[optimization]]></category>
		<category><![CDATA[seo]]></category>
		<category><![CDATA[seo tips and tricks]]></category>
		<category><![CDATA[tool]]></category>

		<guid isPermaLink="false">http://www.juust.org/?p=1278</guid>
		<description><![CDATA[That seems fun, an open source seo toolkit. It is a five second install multi-user package offering simple stats, but more interesting, a semi automated website directory submitter in a clean interface, and could be a valuable service offer if you run an seo community site. 
It is PHP MVC and, what sparked my interest, [...]]]></description>
			<content:encoded><![CDATA[<p>That seems fun: an <a href="http://www.seopanel.in/download/">open source SEO toolkit</a>. It is a five-second-install, multi-user package offering simple stats and, more interesting, a semi-automated website directory submitter in a clean interface. It could be a valuable service to offer if you run an SEO community site.</p>
<p>It is PHP MVC and, what sparked my interest, it has a plugin interface.<br />
I love that. It could be going somewhere over the next few years.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.juust.org/index.php/interesting-seo-panel/2010/07/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>tweeting pipes</title>
		<link>http://www.juust.org/index.php/tweeting-pipes/2009/06/</link>
		<comments>http://www.juust.org/index.php/tweeting-pipes/2009/06/#comments</comments>
		<pubDate>Tue, 30 Jun 2009 15:15:17 +0000</pubDate>
		<dc:creator>juust</dc:creator>
				<category><![CDATA[seo tips and tricks]]></category>
		<category><![CDATA[twitter]]></category>

		<guid isPermaLink="false">http://www.juust.org/?p=650</guid>
		<description><![CDATA[&#8230;but serious, channels on Twitter are a hot item. 
Twitter  seem to want branded channels for commerce by using verified accounts to prevent spoofing celebrities, and the same goes for brandnames. There is already a growing trade in twitter accounts like @nike-shoes, @skyeurope. 
To build an attractive channel I need credibility, provide regular good [...]]]></description>
			<content:encoded><![CDATA[<p>&#8230;but seriously, channels on Twitter are a hot item.</p>
<p>Twitter seems to want branded channels for commerce: it uses verified accounts to prevent spoofing of celebrities, and the same goes for brand names. There is already a growing <a href="http://www.assetize.com/">trade in twitter</a> accounts like @nike-shoes and @skyeurope.</p>
<p>To build an attractive channel I need credibility, which means providing regular, good-quality, fresh content. So where do I get that?</p>
<h3>Yahoo pipes</h3>
<p>I am very lazy, and Yahoo has a nice example online: the <a href="http://pipes.yahoo.com/pipes/pipe.info?_id=fELaGmGz2xGtBTC3qe5lkA">news aggregator</a> with 14 sources like blogsearch, icerocket and technorati, which you can clone and use out of the box. So I cloned it, replaced the technorati API key and ran the pipe with &#8216;banking&#8217; as keyword (you can use that pipe with any keyword). I grab the RSS feed URL and read it with SimpleXML.</p>
<p>Then I take a <a href="http://sourceforge.net/projects/phptwitterclass/">twitter php api class</a> by Simon <a href="http://wippich.org">Wippich</a> from SourceForge (it only reads the account, it doesn&#8217;t have the post routines), wire in the RSS feed and start posting content.</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="kw1">require_once</span><span class="br0">&#40;</span><span class="st0">&#39;twitter.class.php&#39;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$Twitter</span> <span class="sy0">=</span> Twitter<span class="sy0">::</span><span class="me2">getInstance</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$Twitter</span><span class="sy0">-&gt;</span><span class="me1">setUser</span><span class="br0">&#40;</span><span class="st0">&#39;Account&#39;</span><span class="sy0">,</span><span class="st0">&#39;SomePassword&#39;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$rss</span><span class="sy0">=</span> simplexml_load_file<span class="br0">&#40;</span><span class="st0">&quot;http://pipes.yahoo.com/pipes/pipe.run?_id=1234567890&amp;_render=rss&amp;textinput1=banking&quot;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$rss</span><span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">foreach</span><span class="br0">&#40;</span><span class="re1">$rss</span><span class="sy0">-&gt;</span><span class="me1">channel</span><span class="sy0">-&gt;</span><span class="me1">item</span> <span class="kw1">as</span> <span class="re1">$e</span><span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$shrunk</span> <span class="sy0">=</span> <span class="kw3">file_get_contents</span><span class="br0">&#40;</span><span class="st0">&#39;http://bit.ly/api?url=&#39;</span><span class="sy0">.</span><span class="kw3">urlencode</span><span class="br0">&#40;</span><span class="re1">$e</span><span class="sy0">-&gt;</span><span class="me1">link</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$msg</span> <span class="sy0">=</span> <span class="kw3">trim</span><span class="br0">&#40;</span><span class="kw3">substr</span><span class="br0">&#40;</span><span class="re1">$e</span><span class="sy0">-&gt;</span><span class="me1">title</span><span class="sy0">,</span> <span class="nu0">0</span><span class="sy0">,</span> <span class="br0">&#40;</span><span class="nu0">137</span><span class="sy0">-</span><span class="kw3">strlen</span><span class="br0">&#40;</span><span class="re1">$shrunk</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">.</span><span class="st0">&#39; &#39;</span><span class="sy0">.</span><span class="re1">$shrunk</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$output</span> <span class="sy0">=</span> <span class="re1">$Twitter</span><span class="sy0">-&gt;</span><span class="me1">post</span><span class="br0">&#40;</span><span class="re1">$msg</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
</ol>
</div>
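<p>The length arithmetic in the loop above (137 minus the length of the short link) can be pulled into a small helper. This is only a sketch: it assumes a 140-character limit and plain single-byte text, and compose_status is a hypothetical name that is not part of the Twitter class:</p>

```php
<?php
// compose_status() is a hypothetical helper, not part of the Twitter class.
// It returns "<title> <link>" in at most $limit bytes, shortening the title if needed.
// Lengths are counted in bytes; UTF-8 titles would need the mb_* functions instead.
function compose_status(string $title, string $shortUrl, int $limit = 140): string
{
    // Room left for the title once the link and the separating space are reserved.
    $room = $limit - strlen($shortUrl) - 1;
    if ($room < 4) {
        // No sensible room for a title: send the link only.
        return substr($shortUrl, 0, $limit);
    }
    $title = trim($title);
    if (strlen($title) > $room) {
        // Cut the title and mark the cut with three dots.
        $title = rtrim(substr($title, 0, $room - 3)) . '...';
    }
    return $title . ' ' . $shortUrl;
}

echo compose_status('  Short title ', 'http://bit.ly/abc'), "\n";
echo strlen(compose_status(str_repeat('a', 200), 'http://bit.ly/abc')), "\n";
```

<p>The link is kept whole and only the title is shortened, so a long headline never cuts off the URL.</p>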
<p><img src="http://www.juust.org/wp-content/uploads/2009/06/neofinance.png" alt="neofinance" title="neofinance" width="500" height="110" class="alignnone size-full wp-image-651" /></p>
<p>Now I can post proper stuff.  </p>
<p>The second part of a channel is the audience. </p>
<p>Where to get my audience ? </p>
<h3>Google Search</h3>
<p>Google SERP scrapers are always good for 1000 targeted results on any keyword. I use<br />
<strong>allinanchor:twitter.com/ site:twitter.com banking</strong><br />
as the search phrase, which gets me 95% valid accounts with my keyword &#8216;banking&#8217; in the description.</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="re1">$key</span> <span class="sy0">=</span> <span class="st0">&#39;banking&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//scrape urls</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$urls</span> <span class="sy0">=</span> twt_Google<span class="br0">&#40;</span><span class="st0">&#39;allinanchor:twitter.com/ site:twitter.com &#39;</span><span class="sy0">.</span><span class="re1">$key</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//get the account names</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$accounts</span> <span class="sy0">=</span> twt_Google_getaccounts<span class="br0">&#40;</span><span class="re1">$urls</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="kw2">function</span> twt_Google<span class="br0">&#40;</span><span class="re1">$keywords</span><span class="sy0">,</span> <span class="re1">$pages</span><span class="sy0">=</span><span class="nu0">1</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//scrape results off of google serp &nbsp; &nbsp;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$lang</span><span class="sy0">=</span><span class="st0">&#39;en&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$results</span><span class="sy0">=</span><span class="nu0">100</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$i</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$i</span><span class="sy0">&lt;</span> <span class="re1">$pages</span><span class="sy0">;</span><span class="re1">$i</span><span class="sy0">++</span><span class="br0">&#41;</span><span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$start</span> <span class="sy0">=</span> <span class="re1">$i</span><span class="sy0">*</span><span class="nu0">100</span><span class="nu0">+1</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$vargoogleresultpage</span> <span class="sy0">=</span> <span class="st0">&quot;http://www.google.com/search?as_q=&quot;</span><span class="sy0">.</span><span class="kw3">urlencode</span><span class="br0">&#40;</span><span class="kw3">trim</span><span class="br0">&#40;</span><span class="re1">$keywords</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">.</span><span class="st0">&quot;&amp;num=&quot;</span><span class="sy0">.</span><span class="re1">$results</span><span class="sy0">.</span><span class="st0">&quot;&amp;start=&quot;</span><span class="sy0">.</span><span class="re1">$start</span><span class="sy0">.</span><span class="st0">&quot;&amp;hl=en&amp;lr=lang_en&quot;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$googleresponse</span> <span class="sy0">=</span> <span class="kw3">join</span><span class="br0">&#40;</span><span class="st0">&quot;&quot;</span><span class="sy0">,</span><span class="kw3">file</span><span class="br0">&#40;</span><span class="re1">$vargoogleresultpage</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$googlehits</span> <span class="sy0">=</span> <span class="kw3">preg_split</span><span class="br0">&#40;</span><span class="st0">&#39;/class=r&gt;&lt;a /&#39;</span><span class="sy0">,</span> <span class="re1">$googleresponse</span><span class="sy0">,</span> <span class="nu0">-1</span><span class="sy0">,</span> PREG_SPLIT_OFFSET_CAPTURE<span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">foreach</span><span class="br0">&#40;</span><span class="re1">$googlehits</span> <span class="kw1">as</span> <span class="re1">$googlehit</span><span class="br0">&#41;</span><span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="kw3">preg_match</span><span class="br0">&#40;</span><span class="st0">&quot;/href=<span class="es0">\&quot;</span>(.*?)<span class="es0">\&quot;</span>/&quot;</span><span class="sy0">,</span> <span class="re1">$googlehit</span><span class="br0">&#91;</span><span class="nu0">0</span><span class="br0">&#93;</span><span class="sy0">,</span> <span class="re1">$t</span><span class="sy0">,</span> PREG_OFFSET_CAPTURE<span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$the_urls</span><span class="br0">&#91;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="re1">$t</span><span class="br0">&#91;</span><span class="nu0">1</span><span class="br0">&#93;</span><span class="br0">&#91;</span><span class="nu0">0</span><span class="br0">&#93;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="br0">&#125;</span> &nbsp; &nbsp; &nbsp; &nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="co1">//return a set with twitter urls http://www.twitter.com/account</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">return</span> <span class="re1">$the_urls</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="kw2">function</span> twt_Google_getaccounts<span class="br0">&#40;</span><span class="re1">$arr</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//get the account name from the twitter-url</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$i</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$i</span><span class="sy0">&lt;</span>count<span class="br0">&#40;</span><span class="re1">$arr</span><span class="br0">&#41;</span><span class="sy0">;</span><span class="re1">$i</span><span class="sy0">++</span><span class="br0">&#41;</span><span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$parts</span> <span class="sy0">=</span> <span class="kw3">explode</span><span class="br0">&#40;</span><span class="st0">&#39;/&#39;</span><span class="sy0">,</span> <span class="re1">$arr</span><span class="br0">&#91;</span><span class="re1">$i</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="co1">//account is 3 : http: // &#8230; / account</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$myaccounts</span><span class="br0">&#91;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="re1">$parts</span><span class="br0">&#91;</span><span class="nu0">3</span><span class="br0">&#93;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">return</span> <span class="re1">$myaccounts</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
</ol>
</div>
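<p>Taking the fourth piece of the exploded URL works as long as every result is a plain profile link. A slightly more defensive sketch, assuming the usual rule that account names are at most 15 letters, digits or underscores, could use parse_url instead; twitter_account_from_url and twitter_accounts are hypothetical names:</p>

```php
<?php
// Hypothetical helper: returns the account name from a Twitter profile URL,
// or null when the URL does not point at twitter.com or has no usable name.
function twitter_account_from_url(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'], $parts['path'])) {
        return null;
    }
    // Accept twitter.com and www.twitter.com only.
    if (!preg_match('/^(www\.)?twitter\.com$/i', $parts['host'])) {
        return null;
    }
    // The first path segment is the account name.
    $segments = explode('/', trim($parts['path'], '/'));
    $name = $segments[0];
    if (!preg_match('/^[A-Za-z0-9_]{1,15}$/', $name)) {
        return null;
    }
    return $name;
}

// Hypothetical helper: maps a list of URLs to unique account names,
// comparing names case-insensitively and keeping the first spelling seen.
function twitter_accounts(array $urls): array
{
    $seen = [];
    foreach ($urls as $url) {
        $name = twitter_account_from_url((string) $url);
        if ($name !== null && !isset($seen[strtolower($name)])) {
            $seen[strtolower($name)] = $name;
        }
    }
    return array_values($seen);
}

print_r(twitter_accounts([
    'http://twitter.com/alice',
    'http://www.twitter.com/Alice/statuses/1',
    'http://example.com/bob',
]));
```

<p>This drops results that are not on twitter.com and avoids following the same account twice.</p>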
<p>There is my audience, let&#8217;s make some friends :</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$i</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$i</span><span class="sy0">&lt;</span>count <span class="br0">&#40;</span><span class="re1">$accounts</span><span class="br0">&#41;</span><span class="sy0">;</span><span class="re1">$i</span><span class="sy0">++</span><span class="br0">&#41;</span><span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;followthisone<span class="br0">&#40;</span><span class="re1">$accounts</span><span class="br0">&#91;</span><span class="re1">$i</span><span class="br0">&#93;</span><span class="sy0">,</span> <span class="st0">&#39;Account&#39;</span><span class="sy0">,</span><span class="st0">&#39;SomePassword&#39;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="kw2">function</span> followthisone<span class="br0">&#40;</span><span class="re1">$accountname</span><span class="sy0">,</span> <span class="re1">$name</span><span class="sy0">,</span> <span class="re1">$pass</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$url</span> <span class="sy0">=</span> <span class="st0">&quot;http://twitter.com/friendships/create/&quot;</span><span class="sy0">.</span><span class="re1">$accountname</span><span class="sy0">.</span><span class="st0">&quot;.xml&quot;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$ch</span> <span class="sy0">=</span> curl_init<span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; curl_setopt<span class="br0">&#40;</span><span class="re1">$ch</span><span class="sy0">,</span> CURLOPT_URL<span class="sy0">,</span><span class="re1">$url</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; curl_setopt<span class="br0">&#40;</span><span class="re1">$ch</span><span class="sy0">,</span> CURLOPT_POST<span class="sy0">,</span> <span class="nu0">1</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; curl_setopt<span class="br0">&#40;</span><span class="re1">$ch</span><span class="sy0">,</span> CURLOPT_USERPWD<span class="sy0">,</span> <span class="re1">$name</span><span class="sy0">.</span><span class="st0">&quot;:&quot;</span><span class="sy0">.</span><span class="re1">$pass</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$result</span><span class="sy0">=</span> curl_exec <span class="br0">&#40;</span><span class="re1">$ch</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; curl_close <span class="br0">&#40;</span><span class="re1">$ch</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
</ol>
</div>
<p>hello friends! </p>
<p>Anyway, those are the basic ingredients of a marketing channel: proper content and an audience.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.juust.org/index.php/tweeting-pipes/2009/06/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>curl trackbacks</title>
		<link>http://www.juust.org/index.php/curl-trackbacks/2009/03/</link>
		<comments>http://www.juust.org/index.php/curl-trackbacks/2009/03/#comments</comments>
		<pubDate>Wed, 25 Mar 2009 09:53:13 +0000</pubDate>
		<dc:creator>juust</dc:creator>
				<category><![CDATA[links]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[seo tips and tricks]]></category>
		<category><![CDATA[trackback]]></category>

		<guid isPermaLink="false">http://www.juust.org/?p=375</guid>
		<description><![CDATA[I figured I&#8217;d blog a post on trackback link building. A trackback is &#8230; (post a few and you&#8217;ll get it). The trackback protocol isn&#8217;t that interesting, but its implementation by blog platforms and CMSes makes it an excellent means for network development, because it uses a simple HTTP POST (cURL makes that easy).
To post a [...]]]></description>
			<content:encoded><![CDATA[<p>I figured I&#8217;d blog a post on trackback link building. A trackback is &#8230; (post a few and you&#8217;ll get it). The trackback protocol isn&#8217;t that interesting, but its implementation by blog platforms and CMSes makes it an excellent means for network development, because it uses a simple HTTP POST (cURL makes that easy).</p>
<p>To post a successful link proposal I need some basic data :</p>
<p>about my page </p>
<ul>
<li>url (must exist)</li>
<li>blog owner (free)</li>
<li>blog name (free)</li>
</ul>
<p>about the other page</p>
<ul>
<li>url (must exist)</li>
<li>excerpt (should be proper normal text)</li>
</ul>
<p><em>my page :</em> this is preferably a PHP routine that hacks some text, pictures and videos, PLR or articles together, with a URL rewrite. I prefer using XML text files instead of a database; that works faster when you set stuff up.</p>
<p><em>other page :</em> don&#8217;t use &#8220;I liked your article so much&#8230;&#8221;; use text that matches text on the target pages, preferably proper excerpts from XML feeds like blogsearch, MSN and Yahoo. These excerpts contain the keywords I searched for, and as anchor text that works better for search engine visibility and link value.</p>
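<p>Put together, that data becomes the body of one HTTP POST. The sketch below assumes the field names from the TrackBack specification (url, title, excerpt, blog_name) and its small XML reply; the function names trackback_body and trackback_accepted are made up for the illustration:</p>

```php
<?php
// Hypothetical helper: builds the form-encoded body of a trackback ping.
// Field names follow the TrackBack specification.
function trackback_body(string $myUrl, string $title, string $excerpt, string $blogName): string
{
    return http_build_query([
        'url'       => $myUrl,     // my page, must exist
        'title'     => $title,
        'excerpt'   => $excerpt,   // proper normal text
        'blog_name' => $blogName,
    ]);
}

// Hypothetical helper: the specified reply is a small XML document in which
// an error element holding 0 means the ping was accepted.
function trackback_accepted(string $responseXml): bool
{
    $xml = @simplexml_load_string($responseXml);
    return $xml !== false && isset($xml->error) && (string) $xml->error === '0';
}

$body = trackback_body('http://example.org/post', 'My post', 'some text', 'my blog');
echo $body, "\n";
var_dump(trackback_accepted('<response><error>0</error></response>'));
```

<p>The resulting string is what would go into CURLOPT_POSTFIELDS when the ping is sent with cURL.</p>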
<p>Let&#8217;s get some stuff from the MSN rss feed :</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="co1">//a generic query = 5% success</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//add &quot;(powered by) wordpress&quot; </span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; <span class="re1">$query</span><span class="sy0">=</span><span class="kw3">urlencode</span><span class="br0">&#40;</span><span class="st0">&#39;keywords wordpress trackback&#39;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; <span class="re1">$xml</span> <span class="sy0">=</span> <span class="sy0">@</span>simplexml_load_file<span class="br0">&#40;</span><span class="st0">&quot;http://search.live.com/results.aspx?q=$query&amp;count=50&amp;first=1&amp;format=rss&quot;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; <span class="re1">$count</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; <span class="kw1">foreach</span><span class="br0">&#40;</span><span class="re1">$xml</span><span class="sy0">-&gt;</span><span class="me1">channel</span><span class="sy0">-&gt;</span><span class="me1">item</span> <span class="kw1">as</span> <span class="re1">$i</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="re1">$count</span><span class="sy0">++;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//the data from msn</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="re1">$target</span><span class="br0">&#91;</span><span class="st0">&#39;link&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="br0">&#40;</span>string<span class="br0">&#41;</span> <span class="re1">$i</span><span class="sy0">-&gt;</span><span class="me1">link</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="re1">$target</span><span class="br0">&#91;</span><span class="st0">&#39;title&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="br0">&#40;</span>string<span class="br0">&#41;</span> <span class="re1">$i</span><span class="sy0">-&gt;</span><span class="me1">title</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="re1">$target</span><span class="br0">&#91;</span><span class="st0">&#39;excerpt&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="br0">&#40;</span>string<span class="br0">&#41;</span> <span class="re1">$i</span><span class="sy0">-&gt;</span><span class="me1">description</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//some variables I&#39;ll need later on</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="re1">$target</span><span class="br0">&#91;</span><span class="st0">&#39;id&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="re1">$count</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="re1">$target</span><span class="br0">&#91;</span><span class="st0">&#39;trackback&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="st0">&#39;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="re1">$target</span><span class="br0">&#91;</span><span class="st0">&#39;trackback_success&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="nu0">0</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="re1">$trackbacks</span><span class="br0">&#91;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="re1">$target</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
</ol>
</div>
<p>About 25% of the CMS sites at the top of the search engines run Wordpress, and Wordpress always uses /trackback/ in the rdf-url. I fetch the source of each url in the search feed and grab all link urls in it; if any contains /t<strong>rackbac</strong>k/, I post a trackback to that url and see if it sticks. </p>
<p>(I could also spider all links and check whether the target&#8217;s source contains an rdf-segment (*1), but that takes a lot of time. I could also build an array of curl handles and use multicurl, but for my purposes this works fast enough.)</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$t</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$t</span><span class="sy0">&lt;</span>count <span class="br0">&#40;</span><span class="re1">$trackbacks</span><span class="br0">&#41;</span><span class="sy0">;</span><span class="re1">$t</span><span class="sy0">++</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//I could use curl </span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//but 95% of the urls offered are kosher and respond fast</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;<span class="re1">$content</span> <span class="sy0">=</span> <span class="sy0">@</span><span class="kw3">file_get_contents</span><span class="br0">&#40;</span><span class="re1">$trackbacks</span><span class="br0">&#91;</span><span class="re1">$t</span><span class="br0">&#93;</span><span class="br0">&#91;</span><span class="st0">&#39;link&#39;</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;<span class="kw3">preg_match_all</span> <span class="br0">&#40;</span><span class="st0">&quot;/a[<span class="es0">\s</span>]+[^&gt;]*?href[<span class="es0">\s</span>]?=[<span class="es0">\s</span><span class="es0">\&quot;</span><span class="es0">\&#39;</span>]+&quot;</span><span class="sy0">.</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="st0">&quot;(.*?)[<span class="es0">\&quot;</span><span class="es0">\&#39;</span>]+.*?&gt;&quot;</span><span class="sy0">.</span><span class="st0">&quot;([^&lt; ]+|.*?)?&lt;<span class="es0">\/</span>a&gt;/&quot;</span><span class="sy0">,</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$content</span><span class="sy0">,</span> <span class="re1">$matches</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$uri_array</span> <span class="sy0">=</span> <span class="re1">$matches</span><span class="br0">&#91;</span><span class="nu0">1</span><span class="br0">&#93;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">foreach</span><span class="br0">&#40;</span><span class="re1">$uri_array</span> <span class="kw1">as</span> <span class="re1">$key</span> <span class="sy0">=&gt;</span> <span class="re1">$link</span><span class="br0">&#41;</span> <span class="br0">&#123;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="kw3">strpos</span><span class="br0">&#40;</span><span class="re1">$link</span><span class="sy0">,</span> <span class="st0">&#39;rackbac&#39;</span><span class="br0">&#41;</span> <span class="sy0">!==</span> <span class="kw2">false</span><span class="br0">&#41;</span> <span class="br0">&#123;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$trackbacks</span><span class="br0">&#91;</span><span class="re1">$t</span><span class="br0">&#93;</span><span class="br0">&#91;</span><span class="st0">&#39;trackback&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="re1">$link</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">break</span><span class="sy0">;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
</ol>
</div>
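<p>The regex above is easy to trip up on unusual markup. As an alternative, here is a minimal sketch that pulls the first trackback link out of a page with DOMDocument. The function name <code>find_trackback_link</code> and the example urls are made up for illustration; they are not part of my script.</p>

```php
<?php
// Hypothetical helper: return the first href containing '/trackback', or ''.
function find_trackback_link($html)
{
    if (!is_string($html) || trim($html) === '') {
        return '';   // the fetch failed or returned nothing
    }
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML, so keep parser warnings quiet.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = $anchor->getAttribute('href');
        // stripos() returns false when nothing matches, so compare strictly.
        if (stripos($href, '/trackback') !== false) {
            return $href;
        }
    }
    return '';
}

$html = '<p><a href="http://example.com/about/">About</a> '
      . '<a href="http://example.com/post/trackback/">Trackback</a></p>';
echo find_trackback_link($html), "\n";
```

<p>It is slower than a regex on large pages, but it does not care about attribute order or quoting style.</p>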
<p>When I fire a trackback, the receiving script will try to verify that my page has a link and matching text. I have to make sure my page shows the excerpts and links, so I stuff all candidates in a cached xml file.  </p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="kw2">function</span> cache_xml_store<span class="br0">&#40;</span><span class="re1">$trackbacks</span><span class="sy0">,</span> <span class="re1">$pagetitle</span><span class="br0">&#41;</span> </div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$xml</span> <span class="sy0">=</span> <span class="st0">&#39;&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="st0"> &lt;trackbacks&gt;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$a</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$a</span><span class="sy0">&lt;</span>count <span class="br0">&#40;</span><span class="re1">$trackbacks</span><span class="br0">&#41;</span><span class="sy0">;</span><span class="re1">$a</span><span class="sy0">++</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$arr</span> <span class="sy0">=</span> <span class="re1">$trackbacks</span><span class="br0">&#91;</span><span class="re1">$a</span><span class="br0">&#93;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$xml</span> <span class="sy0">.=</span> <span class="st0">&#39;&lt;entry&gt;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$xml</span> <span class="sy0">.=</span> <span class="st0">&#39;&lt;id&gt;&#39;</span><span class="sy0">.</span><span class="re1">$arr</span><span class="br0">&#91;</span><span class="st0">&#39;id&#39;</span><span class="br0">&#93;</span><span class="sy0">.</span><span class="st0">&#39;&lt;/id&gt;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$xml</span> <span class="sy0">.=</span> <span class="st0">&#39;&lt;excerpt&gt;&#39;</span><span class="sy0">.</span><span class="re1">$arr</span><span class="br0">&#91;</span><span class="st0">&#39;excerpt&#39;</span><span class="br0">&#93;</span><span class="sy0">.</span><span class="st0">&#39;&lt;/excerpt&gt;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$xml</span> <span class="sy0">.=</span> <span class="st0">&#39;&lt;link&gt;&#39;</span><span class="sy0">.</span><span class="re1">$arr</span><span class="br0">&#91;</span><span class="st0">&#39;link&#39;</span><span class="br0">&#93;</span><span class="sy0">.</span><span class="st0">&#39;&lt;/link&gt;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$xml</span> <span class="sy0">.=</span> <span class="st0">&#39;&lt;title&gt;&#39;</span><span class="sy0">.</span><span class="re1">$arr</span><span class="br0">&#91;</span><span class="st0">&#39;title&#39;</span><span class="br0">&#93;</span><span class="sy0">.</span><span class="st0">&#39;&lt;/title&gt;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$xml</span> <span class="sy0">.=</span> <span class="st0">&#39;&lt;/entry&gt;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$xml</span> <span class="sy0">.=</span> <span class="st0">&#39;&lt;/trackbacks&gt;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$fname</span> <span class="sy0">=</span> <span class="st0">&#39;cache/trackback&#39;</span><span class="sy0">.</span><span class="kw3">urlencode</span><span class="br0">&#40;</span><span class="re1">$pagetitle</span><span class="br0">&#41;</span><span class="sy0">.</span><span class="st0">&#39;.xml&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="kw3">file_exists</span><span class="br0">&#40;</span><span class="re1">$fname</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="kw3">unlink</span><span class="br0">&#40;</span><span class="re1">$fname</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$fhandle</span> <span class="sy0">=</span> <span class="kw3">fopen</span><span class="br0">&#40;</span><span class="re1">$fname</span><span class="sy0">,</span> <span class="st0">&#39;w&#39;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw3">fwrite</span><span class="br0">&#40;</span><span class="re1">$fhandle</span><span class="sy0">,</span> <span class="re1">$xml</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw3">fclose</span><span class="br0">&#40;</span><span class="re1">$fhandle</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">return</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
</ol>
</div>
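<p>One caveat with string concatenation: feed text often contains characters such as &amp; or &lt;, and an unescaped one makes the cached file unreadable for simplexml. A minimal sketch of the same writer with escaping, returning the xml as a string so it can be checked before it is written. The name <code>trackbacks_to_xml</code> and the sample data are assumptions for illustration.</p>

```php
<?php
// Hypothetical variant of the cache writer: build the XML and return it.
function trackbacks_to_xml(array $trackbacks)
{
    $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n" . '<trackbacks>';
    foreach ($trackbacks as $arr) {
        $xml .= '<entry>';
        foreach (array('id', 'excerpt', 'link', 'title') as $field) {
            $value = isset($arr[$field]) ? (string) $arr[$field] : '';
            // Escape &, <, > and quotes so the result stays well-formed.
            $xml .= '<' . $field . '>'
                  . htmlspecialchars($value, ENT_QUOTES, 'UTF-8')
                  . '</' . $field . '>';
        }
        $xml .= '</entry>';
    }
    return $xml . '</trackbacks>';
}

$sample = array(array(
    'id'      => 1,
    'excerpt' => 'Rates & trends <2010>',
    'link'    => 'http://example.com/?a=1&b=2',
    'title'   => 'Financial trends',
));
$xml = trackbacks_to_xml($sample);

// Round trip: SimpleXML decodes the entities again when reading.
$parsed = simplexml_load_string($xml);
echo (string) $parsed->entry[0]->link, "\n";
```

<p>The string can then be written with fopen/fwrite as above, or with file_put_contents().</p>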
<p>I use simplexml to read that cached file and show the excerpts and links once the page is requested. </p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="co1">// retrieve the cached xml and return it as array.</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw2">function</span> cache_xml_retrieve<span class="br0">&#40;</span><span class="re1">$pagetitle</span><span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$fname</span> <span class="sy0">=</span> <span class="st0">&#39;cache/trackback&#39;</span><span class="sy0">.</span><span class="kw3">urlencode</span><span class="br0">&#40;</span><span class="re1">$pagetitle</span><span class="br0">&#41;</span><span class="sy0">.</span><span class="st0">&#39;.xml&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="kw3">file_exists</span><span class="br0">&#40;</span><span class="re1">$fname</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$xml</span><span class="sy0">=@</span>simplexml_load_file<span class="br0">&#40;</span><span class="re1">$fname</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">if</span><span class="br0">&#40;</span><span class="sy0">!</span><span class="re1">$xml</span><span class="br0">&#41;</span> <span class="kw1">return</span> <span class="kw2">false</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">foreach</span><span class="br0">&#40;</span><span class="re1">$xml</span><span class="sy0">-&gt;</span><span class="me1">entry</span> <span class="kw1">as</span> <span class="re1">$e</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$trackback</span><span class="br0">&#91;</span><span class="st0">&#39;id&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span><span class="br0">&#40;</span>string<span class="br0">&#41;</span> <span class="re1">$e</span><span class="sy0">-&gt;</span><span class="me1">id</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$trackback</span><span class="br0">&#91;</span><span class="st0">&#39;link&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span> &nbsp;rid<span class="br0">&#40;</span><span class="br0">&#40;</span>string<span class="br0">&#41;</span> <span class="re1">$e</span><span class="sy0">-&gt;</span><span class="me1">link</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$trackback</span><span class="br0">&#91;</span><span class="st0">&#39;title&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span> &nbsp;<span class="br0">&#40;</span>string<span class="br0">&#41;</span> <span class="re1">$e</span><span class="sy0">-&gt;</span><span class="me1">title</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$trackback</span><span class="br0">&#91;</span><span class="st0">&#39;excerpt&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span> &nbsp;<span class="br0">&#40;</span>string<span class="br0">&#41;</span> <span class="re1">$e</span><span class="sy0">-&gt;</span><span class="me1">excerpt</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$trackbacks</span><span class="br0">&#91;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="re1">$trackback</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">return</span> <span class="re1">$trackbacks</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">return</span> <span class="kw2">false</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
</ol>
</div>
<p>(this setup requires a subdirectory <strong>cache</strong> set to read/write with chmod 777)</p>
<p>I use http://www.domain.com/financial+trends.html and extract the pagetitle as &#8220;financial trends&#8221;, which maps to the xml-file http://www.domain.com/cache/trackbackfinancial+trends.xml. (In my own script I use sef urls with mod_rewrite; you can also use the $_SERVER array.)</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="re1">$pagetitle</span><span class="sy0">=</span><span class="kw3">preg_replace</span><span class="br0">&#40;</span><span class="st0">&#39;/<span class="es0">\+</span>/&#39;</span><span class="sy0">,</span> <span class="st0">&#39; &#39;</span><span class="sy0">,</span> <span class="kw3">htmlentities</span><span class="br0">&#40;</span><span class="re1">$_REQUEST</span><span class="br0">&#91;</span><span class="st0">&#39;title&#39;</span><span class="br0">&#93;</span><span class="sy0">,</span> ENT_QUOTES<span class="sy0">,</span> <span class="st0">&quot;UTF-8&quot;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$cached_excerpts</span> <span class="sy0">=</span> cache_xml_retrieve<span class="br0">&#40;</span><span class="re1">$pagetitle</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//do some stuff with, make it look nice &nbsp;:</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$s</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$s</span><span class="sy0">&lt;</span>count <span class="br0">&#40;</span><span class="re1">$cached_excerpts</span><span class="br0">&#41;</span><span class="sy0">;</span><span class="re1">$s</span><span class="sy0">++</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//this lists the trackback (candidates)</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw3">echo</span> <span class="re1">$cached_excerpts</span><span class="br0">&#91;</span><span class="re1">$s</span><span class="br0">&#93;</span><span class="br0">&#91;</span><span class="st0">&#39;excerpt&#39;</span><span class="br0">&#93;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw3">echo</span> <span class="st0">&#39;&lt;a href=&quot;&#39;</span><span class="sy0">.</span><span class="re1">$cached_excerpts</span><span class="br0">&#91;</span><span class="re1">$s</span><span class="br0">&#93;</span><span class="br0">&#91;</span><span class="st0">&#39;link&#39;</span><span class="br0">&#93;</span><span class="sy0">.</span><span class="st0">&#39;&quot;&gt;&#39;</span><span class="sy0">.</span><span class="re1">$cached_excerpts</span><span class="br0">&#91;</span><span class="re1">$s</span><span class="br0">&#93;</span><span class="br0">&#91;</span><span class="st0">&#39;title&#39;</span><span class="br0">&#93;</span><span class="sy0">.</span><span class="st0">&#39;&lt;/a&gt;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
</ol>
</div>
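<p>Note that cache_xml_retrieve() returns false when there is no cache file, and that the excerpts come from other people&#8217;s pages. A minimal sketch of a display routine that handles both, assuming the entries carry the keys excerpt, link and title; <code>render_excerpts</code> is a hypothetical name.</p>

```php
<?php
// Hypothetical helper: turn the cached entries into an HTML fragment.
function render_excerpts($entries)
{
    if (!is_array($entries)) {
        return '';   // no cache file yet, or it could not be parsed
    }
    $html = '';
    foreach ($entries as $entry) {
        // Escape everything: the text originates from third-party pages.
        $html .= '<p>' . htmlspecialchars($entry['excerpt'], ENT_QUOTES, 'UTF-8') . ' '
               . '<a href="' . htmlspecialchars($entry['link'], ENT_QUOTES, 'UTF-8') . '">'
               . htmlspecialchars($entry['title'], ENT_QUOTES, 'UTF-8')
               . '</a></p>' . "\n";
    }
    return $html;
}

$entries = array(array(
    'excerpt' => 'Rates & trends',
    'link'    => 'http://example.com/?a=1&b=2',
    'title'   => 'Financial trends',
));
echo render_excerpts($entries);
```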
<p>Now I prepare the data for the trackback post :</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$t</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$t</span><span class="sy0">&lt;</span>count <span class="br0">&#40;</span><span class="re1">$trackbacks</span><span class="br0">&#41;</span><span class="sy0">;</span><span class="re1">$t</span><span class="sy0">++</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$trackback_url</span> <span class="sy0">=</span> <span class="re1">$trackbacks</span><span class="br0">&#91;</span><span class="re1">$t</span><span class="br0">&#93;</span><span class="br0">&#91;</span><span class="st0">&#39;trackback&#39;</span><span class="br0">&#93;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//does it have a trackback target url ? then prepare data :</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$trackback_url</span> <span class="sy0">!=</span><span class="st0">&#39;&#39;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$trackback_data</span> <span class="sy0">=</span> <span class="kw3">array</span><span class="br0">&#40;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="st0">&quot;url&quot;</span> <span class="sy0">=&gt;</span> <span class="st0">&quot;url of my page with the link to the target&quot;</span><span class="sy0">,</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="st0">&quot;title&quot;</span> <span class="sy0">=&gt;</span> <span class="st0">&quot;title of my page&quot;</span><span class="sy0">,</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="st0">&quot;blog_name&quot;</span> <span class="sy0">=&gt;</span> <span class="st0">&quot;name of my blog&quot;</span><span class="sy0">,</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="st0">&quot;excerpt&quot;</span> <span class="sy0">=&gt;</span> <span class="st0">&#39;[...]&#39;</span><span class="sy0">.</span><span class="kw3">trim</span><span class="br0">&#40;</span><span class="kw3">substr</span><span class="br0">&#40;</span><span class="re1">$trackbacks</span><span class="br0">&#91;</span><span class="re1">$t</span><span class="br0">&#93;</span><span class="br0">&#91;</span><span class="st0">&#39;excerpt&#39;</span><span class="br0">&#93;</span><span class="sy0">,</span> <span class="nu0">0</span><span class="sy0">,</span> <span class="nu0">150</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">.</span><span class="st0">&#39;[...]&#39;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="co1">//&#8230;and try the trackback</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$trackbacks</span><span class="br0">&#91;</span><span class="re1">$t</span><span class="br0">&#93;</span><span class="br0">&#91;</span><span class="st0">&#39;trackback_success&#39;</span><span class="br0">&#93;</span> <span class="sy0">=</span> trackback_ping<span class="br0">&#40;</span><span class="re1">$trackback_url</span><span class="sy0">,</span> <span class="re1">$trackback_data</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
</ol>
</div>
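<p>The same preparation can be wrapped in a small function, and PHP can do the url-encoding of the post body itself. A sketch, assuming PHP 5.4 or later for the PHP_QUERY_RFC3986 flag; the function name, the urls and the blog name are placeholders, not values from my script.</p>

```php
<?php
// Hypothetical helper: build the four trackback fields for one target.
function build_trackback_data(array $target, $my_url, $my_title, $my_blog)
{
    return array(
        'url'       => $my_url,
        'title'     => $my_title,
        'blog_name' => $my_blog,
        // At most 150 characters of the excerpt, wrapped in [...] markers.
        'excerpt'   => '[...]' . trim(substr($target['excerpt'], 0, 150)) . '[...]',
    );
}

$target = array('excerpt' => str_repeat('x', 200));
$data = build_trackback_data(
    $target,
    'http://example.com/financial+trends.html',
    'financial trends',
    'example blog'
);

// RFC 3986 encoding gives the same result as rawurlencode() per value.
$postfields = http_build_query($data, '', '&', PHP_QUERY_RFC3986);
echo $postfields, "\n";
```

<p>substr() counts bytes, so with multibyte text mb_substr() would be the safer choice where the mbstring extension is available.</p>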
<p>This is the actual trackback post using cUrl. cUrl has a convenient timeout setting; I use three seconds. If a host does not respond in half a second it&#8217;s probably dead, so three seconds is generous.</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="kw2">function</span> trackback_ping<span class="br0">&#40;</span><span class="re1">$trackback_url</span><span class="sy0">,</span> <span class="re1">$trackback</span><span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//make a string of the data array to post</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">foreach</span><span class="br0">&#40;</span><span class="re1">$trackback</span> <span class="kw1">as</span> <span class="re1">$key</span><span class="sy0">=&gt;</span><span class="re1">$value</span><span class="br0">&#41;</span> <span class="re1">$strout</span><span class="br0">&#91;</span><span class="br0">&#93;</span><span class="sy0">=</span><span class="re1">$key</span><span class="sy0">.</span><span class="st0">&quot;=&quot;</span><span class="sy0">.</span><span class="kw3">rawurlencode</span><span class="br0">&#40;</span><span class="re1">$value</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$postfields</span><span class="sy0">=</span> <span class="kw3">implode</span><span class="br0">&#40;</span><span class="st0">&#39;&amp;&#39;</span><span class="sy0">,</span> <span class="re1">$strout</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; </div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//create a curl instance</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$ch</span> <span class="sy0">=</span> curl_init<span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;curl_setopt<span class="br0">&#40;</span><span class="re1">$ch</span><span class="sy0">,</span> CURLOPT_URL<span class="sy0">,</span> <span class="re1">$trackback_url</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;curl_setopt<span class="br0">&#40;</span><span class="re1">$ch</span><span class="sy0">,</span> CURLOPT_TIMEOUT<span class="sy0">,</span> <span class="nu0">3</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;curl_setopt<span class="br0">&#40;</span><span class="re1">$ch</span><span class="sy0">,</span> CURLOPT_USERAGENT<span class="sy0">,</span> <span class="st0">&quot;Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)&quot;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;curl_setopt<span class="br0">&#40;</span><span class="re1">$ch</span><span class="sy0">,</span> CURLOPT_RETURNTRANSFER<span class="sy0">,</span> <span class="kw2">true</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//set a custom form header</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;curl_setopt<span class="br0">&#40;</span><span class="re1">$ch</span><span class="sy0">,</span> CURLOPT_HTTPHEADER<span class="sy0">,</span> <span class="kw3">array</span><span class="br0">&#40;</span><span class="st0">&#39;Content-type: application/x-www-form-urlencoded&#39;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//CURLOPT_NOBODY is left off : the response body is needed for the check below</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; curl_setopt<span class="br0">&#40;</span><span class="re1">$ch</span><span class="sy0">,</span> CURLOPT_POST<span class="sy0">,</span> <span class="kw2">true</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;curl_setopt<span class="br0">&#40;</span><span class="re1">$ch</span><span class="sy0">,</span> CURLOPT_POSTFIELDS<span class="sy0">,</span> <span class="re1">$postfields</span><span class="br0">&#41;</span><span class="sy0">;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$content</span> <span class="sy0">=</span> curl_exec<span class="br0">&#40;</span><span class="re1">$ch</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//if the return has a tag &#39;error&#39; with as value 0 it went flawless</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$success</span> <span class="sy0">=</span> <span class="nu0">0</span><span class="sy0">;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="kw3">strpos</span><span class="br0">&#40;</span><span class="re1">$content</span><span class="sy0">,</span> <span class="st0">&#39;&gt;0&#39;</span><span class="br0">&#41;</span><span class="sy0">&gt;</span><span class="nu0">0</span><span class="br0">&#41;</span> <span class="re1">$success</span> <span class="sy0">=</span> <span class="nu0">1</span><span class="sy0">;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;curl_close <span class="br0">&#40;</span><span class="re1">$ch</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw3">unset</span><span class="br0">&#40;</span><span class="re1">$ch</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">return</span> <span class="re1">$success</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
</ol>
</div>
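<p>The strpos check on &#8216;&gt;0&#8217; is quick, but it also fires on any page that happens to contain that text. A trackback response is a small xml document with an error element (0 means success), so it can be parsed instead. A minimal sketch; <code>trackback_succeeded</code> is a hypothetical helper, not part of the routine above.</p>

```php
<?php
// Hypothetical helper: true only when the response says <error>0</error>.
function trackback_succeeded($content)
{
    if (!is_string($content) || trim($content) === '') {
        return false;   // timeout, dead host, or empty reply
    }
    // Many hosts answer with an HTML error page; silence the parser.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string(trim($content));
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($xml === false || !isset($xml->error)) {
        return false;
    }
    return trim((string) $xml->error) === '0';
}

$ok   = '<?xml version="1.0" encoding="utf-8"?>'
      . '<response><error>0</error></response>';
$fail = '<?xml version="1.0" encoding="utf-8"?>'
      . '<response><error>1</error><message>Duplicate ping</message></response>';

var_dump(trackback_succeeded($ok));
var_dump(trackback_succeeded($fail));
```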
<p>Now the last routine : rewrite the cached xml file with only the successful trackbacks (seo stuff) :</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$t</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$t</span><span class="sy0">&lt;</span>count <span class="br0">&#40;</span><span class="re1">$trackbacks</span><span class="br0">&#41;</span><span class="sy0">;</span><span class="re1">$t</span><span class="sy0">++</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$trackbacks</span><span class="br0">&#91;</span><span class="re1">$t</span><span class="br0">&#93;</span><span class="br0">&#91;</span><span class="st0">&#39;trackback_success&#39;</span><span class="br0">&#93;</span><span class="sy0">&gt;</span><span class="nu0">0</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$store_trackbacks</span><span class="br0">&#91;</span><span class="br0">&#93;</span><span class="sy0">=</span><span class="re1">$trackbacks</span><span class="br0">&#91;</span><span class="re1">$t</span><span class="br0">&#93;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">cache_xml_store<span class="br0">&#40;</span><span class="re1">$store_trackbacks</span><span class="sy0">,</span> <span class="re1">$pagetitle</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
</ol>
</div>
<p>Voilà : a page with only successful trackbacks.</p>
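<p>If you prefer a function over the inline loop, the filter can be sketched like this. It is a minimal sketch, not the script I run : the sample data and the function name successful_trackbacks() are made up for the example, only the 'trackback_success' key comes from the code above, and the call to cache_xml_store() is left out :</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1">&lt;?php</div>
</li>
<li class="li1">
<div class="de1">// hypothetical sample data, only the 'trackback_success' key is taken from the loop above</div>
</li>
<li class="li1">
<div class="de1">$trackbacks = array(</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; array('url' =&gt; 'http://example.com/a', 'trackback_success' =&gt; 1),</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; array('url' =&gt; 'http://example.com/b', 'trackback_success' =&gt; 0),</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; array('url' =&gt; 'http://example.com/c', 'trackback_success' =&gt; 1),</div>
</li>
<li class="li1">
<div class="de1">);</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">// keep only the entries whose trackback was accepted</div>
</li>
<li class="li1">
<div class="de1">function successful_trackbacks($trackbacks) {</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; $kept = array();</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; foreach ($trackbacks as $trackback) {</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; if (isset($trackback['trackback_success']) &amp;&amp; $trackback['trackback_success'] &gt; 0) {</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; $kept[] = $trackback;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; }</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; }</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; return $kept;</div>
</li>
<li class="li1">
<div class="de1">}</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">$store_trackbacks = successful_trackbacks($trackbacks);</div>
</li>
<li class="li1">
<div class="de1">echo count($store_trackbacks) . " successful trackbacks\n";</div>
</li>
</ol>
</div>
<p>With the sample data it keeps two of the three entries. Because the function always returns an array, an empty result is still safe to hand to the store routine.</p>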
<p>Google (the backrub engine) doesn&#8217;t like sites that use automated link-building methods; other engines (Baidu, MSN, Yahoo) use a more conventional algorithm that matches link popularity to keywords. Trackback linking helps you get a clear profile with those engines at relatively low cost. </p>
<p>0) For brevity and clarity the code above is rewritten (it is taken from a trackback script I am developing on another site), so it may contain some typos.</p>
<p>*1) If you want to spider links for RDF segments : <a href="https://svn.typo3.org/TYPO3v4/Extensions/yablog/trunk/class.tx_yablog_ping.php" rel="nofollow">TYPO3v4</a> has some code for easy retrieval of trackback URIs :</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="coMULTI">/**</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">&nbsp; * Fetches ping url from the given url</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">&nbsp; *</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">&nbsp; * @param string $url URL to probe for RDF</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">&nbsp; * @return string Ping URL</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">&nbsp; */</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;protected <span class="kw2">function</span> getPingURL<span class="br0">&#40;</span><span class="re1">$url</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$pingUrl</span> <span class="sy0">=</span> <span class="st0">&#39;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="co1">// Get URL content</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$urlContent</span> <span class="sy0">=</span> t3lib_div<span class="sy0">::</span><span class="me2">getURL</span><span class="br0">&#40;</span><span class="re1">$url</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span><span class="re1">$urlContent</span> <span class="sy0">&amp;&amp;</span> <span class="br0">&#40;</span><span class="re1">$rdfPos</span> <span class="sy0">=</span> <span class="kw3">strpos</span><span class="br0">&#40;</span><span class="re1">$urlContent</span><span class="sy0">,</span> <span class="st0">&#39;&lt;rdf:RDF&#39;</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="sy0">!==</span> <span class="kw2">false</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="co1">// RDF exists in this content. Get it and parse</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$urlContent</span> <span class="sy0">=</span> <span class="kw3">substr</span><span class="br0">&#40;</span><span class="re1">$urlContent</span><span class="sy0">,</span> <span class="re1">$rdfPos</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="kw1">if</span> <span class="br0">&#40;</span><span class="br0">&#40;</span><span class="re1">$endPos</span> <span class="sy0">=</span> <span class="kw3">strpos</span><span class="br0">&#40;</span><span class="re1">$urlContent</span><span class="sy0">,</span> <span class="st0">&#39;&lt;/rdf:RDF&gt;&#39;</span><span class="sy0">,</span> <span class="re1">$rdfPos</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="sy0">!==</span> <span class="kw2">false</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="co1">// We will use quick regular expression to find ping URL</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$rdfContent</span> <span class="sy0">=</span> <span class="kw3">substr</span><span class="br0">&#40;</span><span class="re1">$urlContent</span><span class="sy0">,</span> <span class="re1">$rdfPos</span><span class="sy0">,</span> <span class="re1">$endPos</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$pingUrl</span> <span class="sy0">=</span> <span class="kw3">preg_replace</span><span class="br0">&#40;</span><span class="st0">&#39;/trackback:ping=&quot;([^&quot;]+)&quot;/&#39;</span><span class="sy0">,</span> <span class="st0">&#39;<span class="es0">\1</span>&#39;</span><span class="sy0">,</span> <span class="re1">$rdfContent</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">return</span> <span class="re1">$pingUrl</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
</ol>
</div>
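<p>A note on that last step : preg_replace() returns the whole subject with the matched part substituted, so if you only want the ping URL itself, preg_match() with a capture group is one way to get it. Below is a minimal sketch on a hard-coded fragment. The function name find_trackback_ping() and the sample HTML are assumptions for the example, it does no HTTP request and it is not the TYPO3 code :</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1">&lt;?php</div>
</li>
<li class="li1">
<div class="de1">// returns the trackback:ping URL from the first rdf:RDF block, or '' if there is none</div>
</li>
<li class="li1">
<div class="de1">function find_trackback_ping($html) {</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; $start = strpos($html, '&lt;rdf:RDF');</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; if ($start === false) {</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; return '';</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; }</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; // look for the closing tag after the opening tag</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; $end = strpos($html, '&lt;/rdf:RDF&gt;', $start);</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; if ($end === false) {</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; return '';</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; }</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; // substr() takes a length, not an end position</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; $rdf = substr($html, $start, $end - $start);</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; if (preg_match('/trackback:ping="([^"]+)"/', $rdf, $match)) {</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; return $match[1];</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; }</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; return '';</div>
</li>
<li class="li1">
<div class="de1">}</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">// hypothetical page fragment with an embedded RDF block</div>
</li>
<li class="li1">
<div class="de1">$html = '&lt;p&gt;post&lt;/p&gt;&lt;!-- &lt;rdf:RDF&gt;'</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; . '&lt;rdf:Description trackback:ping="http://example.com/trackback/42" /&gt;'</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; . '&lt;/rdf:RDF&gt; --&gt;';</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">echo find_trackback_ping($html) . "\n";</div>
</li>
</ol>
</div>
<p>For the sample fragment this prints http://example.com/trackback/42. A page without an RDF block gives an empty string, so the caller can simply skip it.</p>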
]]></content:encoded>
			<wfw:commentRss>http://www.juust.org/index.php/curl-trackbacks/2009/03/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>proxies !</title>
		<link>http://www.juust.org/index.php/icanhazproxies/2009/02/</link>
		<comments>http://www.juust.org/index.php/icanhazproxies/2009/02/#comments</comments>
		<pubDate>Sat, 21 Feb 2009 03:41:16 +0000</pubDate>
		<dc:creator>juust</dc:creator>
				<category><![CDATA[php]]></category>
		<category><![CDATA[seo tips and tricks]]></category>
		<category><![CDATA[scrape]]></category>

		<guid isPermaLink="false">http://www.juust.org/?p=336</guid>
		<description><![CDATA[I got a site banned at Google, so I got pissed and took a script from the blackbox @ digerati marketing to scrape proxy addresses and wired a database and curl into it. It now scrapes proxies, picks one at random, prunes dead proxies and returns data. 
It is basic, it uses anonymous (level 2) proxies, but [...]]]></description>
			<content:encoded><![CDATA[<p>I got a site banned at Google, so I got pissed and took a script from the blackbox <a href="http://www.digeratimarketing.co.uk/2008/06/12/blackhat-seo-tools-scripts-the-digerati-blackbox/" rel="nofollow">@ digerati marketing</a> to scrape proxy addresses and wired a database and curl into it. It now scrapes proxies, picks one at random, prunes dead proxies and returns data. </p>
<p>It is basic, it uses anonymous (level 2) proxies, but it works. You can check the source <a href="http://serp.trismegistos.net/proxyscript.txt" rel="nofollow">here</a>.</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">/* (mysql table)</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">CREATE TABLE IF NOT EXISTS `serp_proxies` (</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">&nbsp; `id` int(11) NOT NULL auto_increment,</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">&nbsp; `ip` text NOT NULL,</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">&nbsp; `port` text NOT NULL,</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">&nbsp; PRIMARY KEY &nbsp;(`id`)</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">) ENGINE=MyISAM &nbsp;DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="coMULTI">*/</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//initialize database class, replace with own code</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">include</span><span class="br0">&#40;</span><span class="st0">&#39;init.php&#39;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//main class</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$p</span><span class="sy0">=</span><span class="kw2">new</span> MyProxies<span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//do I have proxies in the database ?</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//if not, get some and store them</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">GetCount</span><span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="sy0">&lt;</span> <span class="nu0">1</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">GetSomeAir</span><span class="br0">&#40;</span><span class="nu0">1</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">store2database</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//pick one</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">RandomProxy</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//get the page</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">ThisProxy</span><span class="sy0">-&gt;</span><span class="me1">DoRequest</span><span class="br0">&#40;</span><span class="st0">&#39;http://www.domain.com/robots.txt&#39;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//error handling</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">ThisProxy</span><span class="sy0">-&gt;</span><span class="me1">ProxyError</span> <span class="sy0">&gt;</span> <span class="nu0">0</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//7 &nbsp; no connect</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//28 &nbsp; timed out</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//52 &nbsp; empty reply</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//if it is dead, doesn&#39;t allow connections : prune it</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">ThisProxy</span><span class="sy0">-&gt;</span><span class="me1">ProxyError</span><span class="sy0">==</span><span class="nu0">7</span><span class="br0">&#41;</span> <span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">DeleteProxy</span><span class="br0">&#40;</span><span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">ThisProxy</span><span class="sy0">-&gt;</span><span class="me1">proxy_ip</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">ThisProxy</span><span class="sy0">-&gt;</span><span class="me1">ProxyError</span><span class="sy0">==</span><span class="nu0">52</span><span class="br0">&#41;</span> <span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">DeleteProxy</span><span class="br0">&#40;</span><span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">ThisProxy</span><span class="sy0">-&gt;</span><span class="me1">proxy_ip</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//you could loop back until you get a 0-error proxy, but that ain&#39;t the point</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//give me the content</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw3">echo</span> <span class="re1">$p</span><span class="sy0">-&gt;</span><span class="me1">ThisProxy</span><span class="sy0">-&gt;</span><span class="me1">Content</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="kw2">Class</span> MyProxies <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$Proxies</span> <span class="sy0">=</span> <span class="kw3">array</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$ThisProxy</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$MyCount</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//picks a random proxy from the database</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">function</span> RandomProxy<span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw3">global</span> <span class="re1">$serpdb</span><span class="sy0">;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$offset_result</span> <span class="sy0">=</span> &nbsp;<span class="re1">$serpdb</span><span class="sy0">-&gt;</span><span class="me1">query</span><span class="br0">&#40;</span><span class="st0">&quot;SELECT FLOOR(RAND() * COUNT(*)) AS `offset` FROM `serp_proxies`&quot;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$offset_row</span> <span class="sy0">=</span> <span class="kw3">mysql_fetch_object</span><span class="br0">&#40;</span><span class="re1">$offset_result</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$offset</span> <span class="sy0">=</span> <span class="re1">$offset_row</span><span class="sy0">-&gt;</span><span class="me1">offset</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$result</span> <span class="sy0">=</span> <span class="re1">$serpdb</span><span class="sy0">-&gt;</span><span class="me1">query</span><span class="br0">&#40;</span><span class="st0">&quot;SELECT * FROM `serp_proxies` LIMIT $offset, 1&quot;</span> <span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">while</span><span class="br0">&#40;</span><span class="re1">$row</span><span class="sy0">=</span><span class="kw3">mysql_fetch_assoc</span><span class="br0">&#40;</span><span class="re1">$result</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//make instance of Proxy, with proxy_host ip and port</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ThisProxy</span> <span class="sy0">=</span> <span class="kw2">new</span> Proxy<span class="br0">&#40;</span><span class="re1">$row</span><span class="br0">&#91;</span><span class="st0">&#39;ip&#39;</span><span class="br0">&#93;</span><span class="sy0">.</span><span class="st0">&#39;:&#39;</span><span class="sy0">.</span><span class="re1">$row</span><span class="br0">&#91;</span><span class="st0">&#39;port&#39;</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ThisProxy</span><span class="sy0">-&gt;</span><span class="me1">proxy_ip</span> <span class="sy0">=</span> <span class="re1">$row</span><span class="br0">&#91;</span><span class="st0">&#39;ip&#39;</span><span class="br0">&#93;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ThisProxy</span><span class="sy0">-&gt;</span><span class="me1">proxy_port</span> <span class="sy0">=</span> <span class="re1">$row</span><span class="br0">&#91;</span><span class="st0">&#39;port&#39;</span><span class="br0">&#93;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="kw1">break</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//visit the famous russian site </span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">function</span> GetSomeAir<span class="br0">&#40;</span><span class="re1">$pages</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$index</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span> <span class="re1">$index</span><span class="sy0">&lt;</span> <span class="re1">$pages</span><span class="sy0">;</span> <span class="re1">$index</span><span class="sy0">++</span><span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$pageno</span> <span class="sy0">=</span> <span class="kw3">sprintf</span><span class="br0">&#40;</span><span class="st0">&quot;%02d&quot;</span><span class="sy0">,</span><span class="re1">$index</span><span class="nu0">+1</span><span class="br0">&#41;</span><span class="sy0">;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$page_url</span> <span class="sy0">=</span> <span class="st0">&quot;http://www.samair.ru/proxy/proxy-&quot;</span> <span class="sy0">.</span> <span class="re1">$pageno</span> <span class="sy0">.</span> <span class="st0">&quot;.htm&quot;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$page_html</span> <span class="sy0">=</span> <span class="sy0">@</span><span class="kw3">file_get_contents</span><span class="br0">&#40;</span><span class="re1">$page_url</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//get rid of the crap and extract the proxies</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw3">preg_match</span><span class="br0">&#40;</span><span class="st0">&quot;/&lt;tr&gt;&lt;td&gt;(.*)&lt;<span class="es0">\/</span>td&gt;&lt;<span class="es0">\/</span>tr&gt;/&quot;</span><span class="sy0">,</span> <span class="re1">$page_html</span><span class="sy0">,</span> <span class="re1">$matches</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$txt</span> <span class="sy0">=</span> <span class="re1">$matches</span><span class="br0">&#91;</span><span class="nu0">1</span><span class="br0">&#93;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$main</span> <span class="sy0">=</span> <span class="kw3">explode</span><span class="br0">&#40;</span><span class="st0">&#39;&lt;/td&gt;&lt;tr&gt;&lt;td&gt;&#39;</span><span class="sy0">,</span> <span class="re1">$txt</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$x</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$x</span><span class="sy0">&lt;</span>count <span class="br0">&#40;</span><span class="re1">$main</span><span class="br0">&#41;</span><span class="sy0">;</span><span class="re1">$x</span><span class="sy0">++</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;<span class="re1">$arr</span> <span class="sy0">=</span> <span class="kw3">explode</span><span class="br0">&#40;</span><span class="st0">&#39;&lt;/td&gt;&lt;td&gt;&#39;</span><span class="sy0">,</span> <span class="re1">$main</span><span class="br0">&#91;</span><span class="re1">$x</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;<span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">Proxies</span><span class="br0">&#91;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="kw3">explode</span><span class="br0">&#40;</span><span class="st0">&#39;:&#39;</span><span class="sy0">,</span> <span class="re1">$arr</span><span class="br0">&#91;</span><span class="nu0">0</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//store the retrieved proxies (stored in this-&gt;Proxies) in the database</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">function</span> store2database<span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw3">global</span> <span class="re1">$serpdb</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">foreach</span><span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">Proxies</span> <span class="kw1">as</span> <span class="re1">$p</span><span class="br0">&#41;</span> <span class="br0">&#123;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$result</span> <span class="sy0">=</span> <span class="re1">$serpdb</span><span class="sy0">-&gt;</span><span class="me1">query</span><span class="br0">&#40;</span><span class="st0">&quot;SELECT * FROM serp_proxies WHERE ip=&#39;&quot;</span><span class="sy0">.</span><span class="re1">$p</span><span class="br0">&#91;</span><span class="nu0">0</span><span class="br0">&#93;</span><span class="sy0">.</span><span class="st0">&quot;&#39;&quot;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="kw3">mysql_num_rows</span><span class="br0">&#40;</span><span class="re1">$result</span><span class="br0">&#41;</span><span class="sy0">&lt;</span><span class="nu0">1</span><span class="br0">&#41;</span> <span class="re1">$serpdb</span><span class="sy0">-&gt;</span><span class="me1">query</span><span class="br0">&#40;</span><span class="st0">&quot;INSERT INTO serp_proxies (`ip`, `port`) VALUES (&#39;&quot;</span><span class="sy0">.</span><span class="re1">$p</span><span class="br0">&#91;</span><span class="nu0">0</span><span class="br0">&#93;</span><span class="sy0">.</span><span class="st0">&quot;&#39;, &#39;&quot;</span><span class="sy0">.</span><span class="re1">$p</span><span class="br0">&#91;</span><span class="nu0">1</span><span class="br0">&#93;</span><span class="sy0">.</span><span class="st0">&quot;&#39;)&quot;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$serpdb</span><span class="sy0">-&gt;</span><span class="me1">query</span><span class="br0">&#40;</span><span class="st0">&quot;DELETE FROM serp_proxies WHERE `ip`=&#39;&#39;&quot;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">function</span> DeleteProxy<span class="br0">&#40;</span><span class="re1">$ip</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw3">global</span> <span class="re1">$serpdb</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$serpdb</span><span class="sy0">-&gt;</span><span class="me1">query</span><span class="br0">&#40;</span><span class="st0">&quot;DELETE FROM serp_proxies WHERE `ip`=&#39;&quot;</span><span class="sy0">.</span><span class="re1">$ip</span><span class="sy0">.</span><span class="st0">&quot;&#39;&quot;</span><span class="br0">&#41;</span><span class="sy0">;</span> &nbsp; </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">function</span> GetCount<span class="br0">&#40;</span><span class="br0">&#41;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//use this to check how many proxies there are in the database</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw3">global</span> <span class="re1">$serpdb</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">MyCount</span> <span class="sy0">=</span> <span class="kw3">mysql_num_rows</span><span class="br0">&#40;</span><span class="re1">$serpdb</span><span class="sy0">-&gt;</span><span class="me1">query</span><span class="br0">&#40;</span><span class="st0">&quot;SELECT * FROM `serp_proxies`&quot;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">return</span> <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">MyCount</span><span class="sy0">;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="kw2">Class</span> Proxy <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$proxy_ip</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$proxy_port</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$proxy_host</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$proxy_auth</span><span class="sy0">;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$ch</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$Content</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$USERAGENT</span> <span class="sy0">=</span> <span class="st0">&quot;Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)&quot;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$ProxyError</span> <span class="sy0">=</span> <span class="nu0">0</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$ProxyErrorMsg</span> <span class="sy0">=</span> <span class="st0">&#39;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$TimeOut</span><span class="sy0">=</span><span class="nu0">3</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">var</span> <span class="re1">$IncludeHeaders</span> <span class="sy0">=</span> <span class="nu0">0</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">function</span> Proxy<span class="br0">&#40;</span><span class="re1">$host</span><span class="sy0">,</span> <span class="re1">$username</span><span class="sy0">=</span><span class="st0">&#39;&#39;</span><span class="sy0">,</span> <span class="re1">$pwd</span><span class="sy0">=</span><span class="st0">&#39;&#39;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//initialize class, set host </span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">proxy_host</span> <span class="sy0">=</span> <span class="re1">$host</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="kw1">if</span> <span class="br0">&#40;</span><span class="kw3">strlen</span><span class="br0">&#40;</span><span class="re1">$username</span><span class="br0">&#41;</span> <span class="sy0">&gt;</span> <span class="nu0">0</span> <span class="sy0">||</span> <span class="kw3">strlen</span><span class="br0">&#40;</span><span class="re1">$pwd</span><span class="br0">&#41;</span> <span class="sy0">&gt;</span> <span class="nu0">0</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">proxy_auth</span> <span class="sy0">=</span> <span class="re1">$username</span><span class="sy0">.</span><span class="st0">&quot;:&quot;</span><span class="sy0">.</span><span class="re1">$pwd</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">function</span> CURL_PROXY<span class="br0">&#40;</span><span class="re1">$cc</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="kw1">if</span> <span class="br0">&#40;</span><span class="kw3">strlen</span><span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">proxy_host</span><span class="br0">&#41;</span> <span class="sy0">&gt;</span> <span class="nu0">0</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; curl_setopt<span class="br0">&#40;</span><span class="re1">$cc</span><span class="sy0">,</span> CURLOPT_PROXY<span class="sy0">,</span> <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">proxy_host</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span><span class="kw3">strlen</span><span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">proxy_auth</span><span class="br0">&#41;</span> <span class="sy0">&gt;</span> <span class="nu0">0</span><span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;curl_setopt<span class="br0">&#40;</span><span class="re1">$cc</span><span class="sy0">,</span> CURLOPT_PROXYUSERPWD<span class="sy0">,</span> <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">proxy_auth</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw2">function</span> DoRequest<span class="br0">&#40;</span><span class="re1">$url</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ch</span> <span class="sy0">=</span> curl_init<span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; curl_setopt<span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ch</span><span class="sy0">,</span> CURLOPT_URL<span class="sy0">,</span><span class="re1">$url</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">CURL_PROXY</span><span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ch</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; curl_setopt<span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ch</span><span class="sy0">,</span> CURLOPT_HEADER<span class="sy0">,</span> <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">IncludeHeaders</span><span class="br0">&#41;</span><span class="sy0">;</span> <span class="co1">// baca header</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; </div>
</li>
<li class="li1">
<div class="de1">&nbsp; curl_setopt<span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ch</span><span class="sy0">,</span> CURLOPT_USERAGENT<span class="sy0">,</span> <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">USERAGENT</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; curl_setopt<span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ch</span><span class="sy0">,</span> CURLOPT_RETURNTRANSFER<span class="sy0">,</span> <span class="nu0">1</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; curl_setopt<span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ch</span><span class="sy0">,</span> CURLOPT_TIMEOUT<span class="sy0">,</span> <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">TimeOut</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;<span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">Content</span> <span class="sy0">=</span> curl_exec<span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ch</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//if an error occurs, store the number and message</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span>curl_errno<span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ch</span><span class="br0">&#41;</span><span class="br0">&#41;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="br0">&#123;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ProxyError</span> <span class="sy0">=</span> &nbsp;curl_errno<span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ch</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ProxyErrorMsg</span> <span class="sy0">=</span> &nbsp;curl_error<span class="br0">&#40;</span><span class="re1">$this</span><span class="sy0">-&gt;</span><span class="me1">ch</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
</ol>
</div>
<p>There is not much to say about it; it is just a rough outline. I would prefer elite (level 1) proxies, but for now this will have to do.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.juust.org/index.php/icanhazproxies/2009/02/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>synonymizer with api</title>
		<link>http://www.juust.org/index.php/synonymizer-with-api/2008/12/</link>
		<comments>http://www.juust.org/index.php/synonymizer-with-api/2008/12/#comments</comments>
		<pubDate>Sun, 28 Dec 2008 12:09:46 +0000</pubDate>
		<dc:creator>juust</dc:creator>
				<category><![CDATA[optimization]]></category>
		<category><![CDATA[seo tips and tricks]]></category>
		<category><![CDATA[optimisation]]></category>

		<guid isPermaLink="false">http://www.juust.org/?p=172</guid>
		<description><![CDATA[If you want to put some old content on the net and have it indexed as fresh unique content, this works wonders for seo-friendly backlinks : the automated synonymizer. I want one that makes my content unique without having to type one character.
Lucky for me, mister John Watson&#8217;s synonym database comes with a free 10.000 [...]]]></description>
			<content:encoded><![CDATA[<p>If you want to put some old content on the net and have it indexed as fresh unique content, this works wonders for seo-friendly backlinks : the automated synonymizer. I want one that makes my content unique without having to type one character.</p>
<p>Lucky for me, mister <a href="http://words.bighugelabs.com/" rel="nofollow">John Watson&#8217;s synonym database</a> comes with a free API, good for 10,000 requests a day, and boy is it sweet! </p>
<p>API Requests are straightforward :<br />
http://words.bighugelabs.com/api/2/[<a href="http://words.bighugelabs.com/api.php" rel="nofollow">apikey</a>]/[keyword]/xml</p>
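<p>As a minimal sketch, the request URL can be assembled like this. The helper name <em>synonym_api_url</em> and the placeholder key are my own assumptions for illustration, not part of the API; rawurlencode is simply my choice for keeping a keyword with spaces inside one path segment.</p>

```php
<?php
// Hypothetical helper (not part of the API): builds a request URL in the
// format shown above. 'YOUR_KEY' below is a placeholder, not a real key.
function synonym_api_url($apikey, $keyword, $format = 'xml') {
    // rawurlencode() escapes spaces and slashes, so the keyword
    // always stays a single path segment
    return 'http://words.bighugelabs.com/api/2/'
        . rawurlencode($apikey) . '/'
        . rawurlencode($keyword) . '/'
        . $format;
}

echo synonym_api_url('YOUR_KEY', 'pitch black'), "\n";
```

<p>For single alphanumeric words, which is all the synonymizer sends, the encoding step changes nothing; it only matters for keywords with spaces or punctuation.</p>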
<p>Several return formats are supported, but XML is the easiest to work with, whether you parse it with SimpleXML or with regular expressions.</p>
<p>A request for <strong>black</strong> returns an XML file like this (slightly shortened) :<br />
&lt;words&gt;<br />
&lt;w p=&quot;adjective&quot; r=&quot;syn&quot;&gt;bleak&lt;/w&gt;<br />
&lt;w p=&quot;adjective&quot; r=&quot;syn&quot;&gt;sinister&lt;/w&gt;<br />
&lt;w p=&quot;adjective&quot; r=&quot;sim&quot;&gt;dark&lt;/w&gt;<br />
&lt;w p=&quot;adjective&quot; r=&quot;sim&quot;&gt;angry&lt;/w&gt;<br />
&lt;w p=&quot;noun&quot; r=&quot;syn&quot;&gt;blackness&lt;/w&gt;<br />
&lt;w p=&quot;noun&quot; r=&quot;syn&quot;&gt;inkiness&lt;/w&gt;<br />
&lt;w p=&quot;verb&quot; r=&quot;syn&quot;&gt;blacken&lt;/w&gt;<br />
&lt;w p=&quot;verb&quot; r=&quot;syn&quot;&gt;melanize&lt;/w&gt;<br />
&lt;/words&gt;</p>
<p>&#8230;which is most easily handled with preg_match_all :</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="kw2">function</span> getsynonyms<span class="br0">&#40;</span><span class="re1">$keyword</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="re1">$pick</span> <span class="sy0">=</span> <span class="kw3">array</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$apikey</span> <span class="sy0">=</span> <span class="st0">&#39;get your own key&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$xml</span><span class="sy0">=</span><span class="kw3">file_get_contents</span><span class="br0">&#40;</span><span class="st0">&#39;http://words.bighugelabs.com/api/2/&#39;</span><span class="sy0">.</span><span class="re1">$apikey</span><span class="sy0">.</span><span class="st0">&#39;/&#39;</span><span class="sy0">.</span><span class="re1">$keyword</span><span class="sy0">.</span><span class="st0">&#39;/xml&#39;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="sy0">!</span><span class="re1">$xml</span><span class="br0">&#41;</span> <span class="kw1">return</span> <span class="re1">$pick</span><span class="sy0">;</span> <span class="co1">//return empty array</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw3">preg_match_all</span><span class="br0">&#40;</span><span class="st0">&#39;/&lt;w p=&quot;adjective&quot; r=&quot;syn&quot;&gt;(.*?)&lt; <span class="es0">\/</span>w&gt;/&#39;</span><span class="sy0">,</span> <span class="re1">$xml</span><span class="sy0">,</span> <span class="re1">$adj_syns</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//preg_match_all(&#39;/&lt;/w&gt;&lt;w p=&quot;adjective&quot; r=&quot;sim&quot;&gt;(.*?)&lt; \/w&gt;/&#39;, $xml, $adj_sims);</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//preg_match_all(&#39;/&lt;/w&gt;&lt;w p=&quot;noun&quot; r=&quot;syn&quot;&gt;(.*?)&lt; \/w&gt;/&#39;, $xml, $noun_syns);</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//preg_match_all(&#39;/&lt;/w&gt;&lt;w p=&quot;verb&quot; r=&quot;syn&quot;&gt;(.*?)&lt; \/w&gt;/&#39;, $xml, $verb_syns);</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">foreach</span><span class="br0">&#40;</span><span class="re1">$adj_syns</span><span class="br0">&#91;</span><span class="nu0">0</span><span class="br0">&#93;</span> <span class="kw1">as</span> <span class="re1">$adj_syn</span><span class="br0">&#41;</span> <span class="re1">$pick</span><span class="br0">&#91;</span><span class="br0">&#93;</span><span class="sy0">=</span><span class="re1">$adj_syn</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; &nbsp; <span class="co1">//same for verb/noun synonyms, I just want adjectives</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">return</span> <span class="re1">$pick</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
</ol>
</div>
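<p>For comparison, here is a rough sketch of the SimpleXML route. It assumes the response has the shape of the sample shown earlier and runs on a hard-coded string, not on live API output; the function name <em>parse_synonyms</em> is made up for this example.</p>

```php
<?php
// Sketch: group the words of a response by part of speech and relation.
// parse_synonyms() is a hypothetical name, not something the API provides.
function parse_synonyms($xml) {
    $result = array();
    $doc = @simplexml_load_string($xml);
    if ($doc === false) return $result;   // malformed or empty response
    foreach ($doc->w as $w) {
        $pos = (string) $w['p'];          // adjective, noun, verb
        $rel = (string) $w['r'];          // syn, sim
        $result[$pos][$rel][] = (string) $w;
    }
    return $result;
}

// Hard-coded sample in the shape of the response shown earlier.
$xml = '<words>'
     . '<w p="adjective" r="syn">bleak</w>'
     . '<w p="adjective" r="syn">sinister</w>'
     . '<w p="adjective" r="sim">dark</w>'
     . '<w p="noun" r="syn">blackness</w>'
     . '</words>';

$words = parse_synonyms($xml);
print_r($words['adjective']['syn']);
```

<p>The nested array makes it easy to pick only adjective synonyms now and add nouns or verbs later, without writing one pattern per combination.</p>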
<p>Practically applying it,<br />
I take a slab of stale old content and&#8230;</p>
<ul>
<li>strip the tags</li>
<li>do a regular expression match on all alphanumeric sequences, dropping everything else</li>
<li>trim the resulting array elements</li>
<li>(merge all blog tags, categories, and a list of common words)</li>
<li>exclude those common terms from the array of words</li>
<li>exclude words shorter than N characters</li>
<li>set the percentage of words to be synonymized</li>
<li>attempt to retrieve synonyms for the remaining terms</li>
<li>replace these words in the original text, keeping count</li>
<li>abort when I reach the target replacement percentage</li>
<li>return (hopefully) a revived text</li>
</ul>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1"><span class="kw2">function</span> synonymize<span class="br0">&#40;</span><span class="re1">$origtext</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//make a copy of the original text to dissect</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$content</span><span class="sy0">=</span><span class="re1">$origtext</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//content = $this-&gt;body;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$perc</span><span class="sy0">=</span><span class="nu0">3</span><span class="sy0">;</span> &nbsp; <span class="co1">//target percentage changed terms</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$minlength</span><span class="sy0">=</span><span class="nu0">4</span><span class="sy0">;</span> &nbsp;<span class="co1">//minimum length candidates</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$maxrequests</span><span class="sy0">=</span><span class="nu0">80</span><span class="sy0">;</span> <span class="co1">//max use of api-requests</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//dump tags </span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$content</span> <span class="sy0">=</span> &nbsp;<span class="kw3">strip_tags</span><span class="br0">&#40;</span><span class="re1">$content</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//dump non-alphanumeric string characters</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$content</span> <span class="sy0">=</span> <span class="kw3">preg_replace</span><span class="br0">&#40;</span><span class="st0">&#39;/[^A-Za-z0-9<span class="es0">\-</span>]/&#39;</span><span class="sy0">,</span> <span class="st0">&#39; &#39;</span><span class="sy0">,</span> <span class="re1">$content</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//explode on blank space</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$wrds</span> <span class="sy0">=</span> <span class="kw3">explode</span><span class="br0">&#40;</span><span class="st0">&#39; &#39;</span><span class="sy0">,</span> <span class="kw3">strtolower</span><span class="br0">&#40;</span><span class="re1">$content</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//trim off blank spaces just in case</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$w</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$w</span><span class="sy0">&lt;</span>count <span class="br0">&#40;</span><span class="re1">$wrds</span><span class="br0">&#41;</span><span class="sy0">;</span><span class="re1">$w</span><span class="sy0">++</span><span class="br0">&#41;</span> <span class="re1">$words</span><span class="br0">&#91;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="kw3">trim</span><span class="br0">&#40;</span><span class="re1">$wrds</span><span class="br0">&#91;</span><span class="re1">$w</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//this should be all words</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$wordcount</span> <span class="sy0">=</span> <span class="kw3">count</span><span class="br0">&#40;</span><span class="re1">$words</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//how many words do I want changed ?</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$toswitch</span> <span class="sy0">=</span> <span class="kw3">round</span><span class="br0">&#40;</span><span class="re1">$wordcount</span><span class="sy0">*</span><span class="re1">$perc</span><span class="sy0">/</span><span class="nu0">100</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//only use uniques</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$words_unique</span><span class="sy0">=</span><span class="kw3">array_unique</span><span class="br0">&#40;</span><span class="re1">$words</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//sort, start with words at the end of the text </span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw3">sort</span><span class="br0">&#40;</span><span class="re1">$words_unique</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//merge common with tags, categories, linked_tags</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$common</span> <span class="sy0">=</span> <span class="kw3">array</span><span class="br0">&#40;</span><span class="st0">&quot;never&quot;</span><span class="sy0">,</span> <span class="st0">&quot;about&quot;</span><span class="sy0">,</span> <span class="st0">&quot;price&quot;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//note : setting the minlength to 4 excludes lots of common terms</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$i</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$i</span><span class="sy0">&lt;</span>count<span class="br0">&#40;</span><span class="re1">$words_unique</span><span class="br0">&#41;</span><span class="sy0">;</span><span class="re1">$i</span><span class="sy0">++</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//if in common array, not selectable for synonymizing</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">if</span><span class="br0">&#40;</span><span class="kw3">in_array</span><span class="br0">&#40;</span><span class="re1">$words_unique</span><span class="br0">&#91;</span><span class="re1">$i</span><span class="br0">&#93;</span><span class="sy0">,</span> <span class="re1">$common</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span><span class="br0">&#125;</span> <span class="kw1">else</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="co1">//only terms bigger than minlength</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="kw3">strlen</span><span class="br0">&#40;</span><span class="re1">$words_unique</span><span class="br0">&#91;</span><span class="re1">$i</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="sy0">&gt;</span><span class="re1">$minlength</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="co1">//words_select contains candidates for synonyms</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$words_select</span><span class="br0">&#91;</span><span class="br0">&#93;</span> <span class="sy0">=</span> <span class="kw3">trim</span><span class="br0">&#40;</span><span class="re1">$words_unique</span><span class="br0">&#91;</span><span class="re1">$i</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//terms that can be changed</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$max</span> <span class="sy0">=</span> <span class="kw3">count</span><span class="br0">&#40;</span><span class="re1">$words_select</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//no more requests than max</span></div>
</li>
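<li class="li1">
<div class="de1">&nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$max</span><span class="sy0">&gt;</span><span class="re1">$maxrequests</span><span class="br0">&#41;</span> <span class="re1">$max</span><span class="sy0">=</span><span class="re1">$maxrequests</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//running count of replaced words</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$total_switched</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span></div>
</li>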
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">for</span><span class="br0">&#40;</span><span class="re1">$i</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span><span class="re1">$i</span><span class="sy0">&lt;</span> <span class="re1">$max</span><span class="sy0">;</span><span class="re1">$i</span><span class="sy0">++</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//get synonyms, give server some time</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw3">usleep</span><span class="br0">&#40;</span><span class="nu0">100000</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="co1">//retrieve synonyms etc.</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$these_words</span> <span class="sy0">=</span> getsynonyms<span class="br0">&#40;</span><span class="re1">$words_select</span><span class="br0">&#91;</span><span class="re1">$i</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$jmax</span><span class="sy0">=</span><span class="kw3">count</span><span class="br0">&#40;</span><span class="re1">$these_words</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$jmax</span><span class="sy0">&amp;</span>lt<span class="sy0">;</span><span class="nu0">1</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="co1">//no results</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="br0">&#125;</span> <span class="kw1">else</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$count</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$j</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//the replacements are done in the original text, on whole words only</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$origtext</span><span class="sy0">=</span> <span class="kw3">preg_replace</span><span class="br0">&#40;</span><span class="st0">&#39;/\b&#39;</span><span class="sy0">.</span><span class="kw3">preg_quote</span><span class="br0">&#40;</span><span class="re1">$words_select</span><span class="br0">&#91;</span><span class="re1">$i</span><span class="br0">&#93;</span><span class="sy0">,</span> <span class="st0">&#39;/&#39;</span><span class="br0">&#41;</span><span class="sy0">.</span><span class="st0">&#39;\b/i&#39;</span><span class="sy0">,</span> <span class="re1">$these_words</span><span class="br0">&#91;</span><span class="re1">$j</span><span class="br0">&#93;</span><span class="sy0">,</span> <span class="re1">$origtext</span><span class="sy0">,</span> <span class="nu0">-1</span><span class="sy0">,</span> <span class="re1">$count</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$total_switched</span><span class="sy0">+=</span><span class="re1">$count</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="br0">&#125;</span> <span class="co1">//have we reached the percentage ? </span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$total_switched</span><span class="sy0">&gt;=</span><span class="re1">$toswitch</span><span class="br0">&#41;</span> <span class="kw1">break</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="co1">//okay!</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">return</span> <span class="re1">$origtext</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="kw2">function</span> getsynonyms<span class="br0">&#40;</span><span class="re1">$keyword</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$pick</span><span class="sy0">=</span><span class="kw3">array</span> <span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$apikey</span> <span class="sy0">=</span> <span class="st0">&#39;get your own key at bighugelabs.com&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="re1">$xml</span><span class="sy0">=@</span><span class="kw3">file_get_contents</span><span class="br0">&#40;</span><span class="st0">&#39;http://words.bighugelabs.com/api/2/&#39;</span><span class="sy0">.</span><span class="re1">$apikey</span><span class="sy0">.</span><span class="st0">&#39;/&#39;</span><span class="sy0">.</span><span class="kw3">urlencode</span><span class="br0">&#40;</span><span class="re1">$keyword</span><span class="br0">&#41;</span><span class="sy0">.</span><span class="st0">&#39;/xml&#39;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="sy0">!</span><span class="re1">$xml</span><span class="br0">&#41;</span> <span class="kw1">return</span> <span class="re1">$pick</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw3">preg_match_all</span><span class="br0">&#40;</span><span class="st0">&#39;/&lt;w p=&quot;adjective&quot; r=&quot;syn&quot;&gt;(.*?)&lt;<span class="es0">\/</span>w&gt;/&#39;</span><span class="sy0">,</span> <span class="re1">$xml</span><span class="sy0">,</span> <span class="re1">$adj_syns</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">foreach</span><span class="br0">&#40;</span><span class="re1">$adj_syns</span><span class="br0">&#91;</span><span class="nu0">1</span><span class="br0">&#93;</span> <span class="kw1">as</span> <span class="re1">$adj_syn</span><span class="br0">&#41;</span> <span class="re1">$pick</span><span class="br0">&#91;</span><span class="br0">&#93;</span><span class="sy0">=</span><span class="re1">$adj_syn</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">return</span> <span class="re1">$pick</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="br0">&#125;</span></div>
</li>
</ol>
</div>
<p>Nothing fancy, a straightforward search-replace routine. A 1200 word text has about 150 candidates, and for 3% synonyms I need to replace 36 words, so it can do that. If I were to use it for real, I would build a table of non-returning terms and store often-used terms. That would speed up the synonymizing, allow the use of preferences, and take a load off the API.</p>
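<p>A minimal sketch of that caching idea, assuming a JSON file as the "table" (the file name, the function name <code>cached_synonyms</code> and the array layout are my own placeholders, not part of the routine above). An empty result is stored as well, so a non-returning term is only requested once:</p>

```php
<?php
// Hypothetical cache wrapper around any synonym lookup (e.g. getsynonyms).
// Cache layout: lowercased keyword => list of synonyms (empty list = non-returning term).
function cached_synonyms($keyword, callable $lookup, $cachefile = 'synonyms.cache.json') {
    $cache = array();
    if (is_file($cachefile)) {
        $decoded = json_decode(file_get_contents($cachefile), true);
        if (is_array($decoded)) {
            $cache = $decoded;
        }
    }
    $key = strtolower(trim($keyword));
    if (!array_key_exists($key, $cache)) {
        // Only hit the API for terms that were never looked up before.
        $cache[$key] = array_values($lookup($key));
        file_put_contents($cachefile, json_encode($cache), LOCK_EX);
    }
    return $cache[$key];
}
```

<p>Usage would be something like <code>$syns = cached_synonyms('big', 'getsynonyms');</code>. A mysql table would do the same job and scale better, the file is just the shortest way to show it.</p>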
]]></content:encoded>
			<wfw:commentRss>http://www.juust.org/index.php/synonymizer-with-api/2008/12/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>RedHat Seo : scraper auto-blogging</title>
		<link>http://www.juust.org/index.php/redhat-seo-christmas-edition/2008/12/</link>
		<comments>http://www.juust.org/index.php/redhat-seo-christmas-edition/2008/12/#comments</comments>
		<pubDate>Fri, 26 Dec 2008 18:07:01 +0000</pubDate>
		<dc:creator>juust</dc:creator>
				<category><![CDATA[google]]></category>
		<category><![CDATA[seo]]></category>
		<category><![CDATA[seo tips and tricks]]></category>
		<category><![CDATA[tool]]></category>
		<category><![CDATA[wordpress]]></category>
		<category><![CDATA[xml-rpc]]></category>
		<category><![CDATA[scrape]]></category>

		<guid isPermaLink="false">http://www.juust.org/?p=270</guid>
		<description><![CDATA[Just give us your endpoint and we&#8217;ll take it from there, sparky!
I was going to make one of these tools to scrape google and conjure a full blog out of nowhere, as Christmas special, RedHat Seo. The rough sketch has arrived, far from perfect, but it does produce a blog and doesn&#8217;t even look [...]]]></description>
			<content:encoded><![CDATA[<blockquote><p>Just give us your endpoint and we&#8217;ll take it from there, sparky!</p></blockquote>
<p>I was going to make one of these tools to scrape google and conjure a full blog out of nowhere, as a Christmas special: RedHat Seo. The rough sketch has arrived, far from perfect, but it does produce a blog and it doesn&#8217;t even look too shabby. I scraped a <a href="" rel="nofollow" target="_blank">small batch</a> of posts off of blogs, keeping the links intact and adding a tribute link. I hope they will pardon me for it. </p>
<h3>structure</h3>
<p>I use three main classes, </p>
<table>
<tbody>
<tr>
<td>BlogMaker    </td>
<td>     the application</td>
</tr>
<tr>
<td>Target         </td>
<td>     the blogs you aim for</td>
</tr>
<tr>
<td>WPContent   </td>
<td>     the scraped goodies</td>
</tr>
</tbody>
</table>
<p>&#8230;and two support classes</p>
<table>
<tbody>
<tr>
<td>SerpResult    </td>
<td>    scraped urls</td>
</tr>
<tr>
<td>Custom_RPC   </td>
<td>    a simple rpc-poster</td>
</tr>
</tbody>
</table>
<p>Target blogs have three texts, </p>
<table>
<tbody>
<tr>
<td>file</td>
<td>contents</td>
<td>maintenance</td>
</tr>
<tr>
<td>blog categories</td>
<td>category you post under</td>
<td>manual</td>
</tr>
<tr>
<td>blog tags</td>
<td> tags you list on the blog</td>
<td>manual</td>
</tr>
<tr>
<td>blog urls</td>
<td> urls already used for the blog</td>
<td>system</td>
</tr>
</tbody>
</table>
<h3>routine</h3>
<p>The BlogMaker class grabs a result list (up to 1000 urls per phrase) from Google, extracts the urls and stores them in SerpResult, scrapes the urls and extracts the <strong>entry</strong> divs, stores the div entries in the WPContent class (which has some basic functions to sanitize the text), and uses the BlogTarget definitions to post them to the blogs with xml-rpc.</p>
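<p>The div-extraction step could be sketched roughly like this. This is not the code from the source file: it uses PHP's DOM extension (DOMDocument and DOMXPath), and the function name <code>extract_divs</code> is a placeholder of mine. It returns the html of every div that carries the given class:</p>

```php
<?php
// Hypothetical helper: return the outer HTML of every div whose class
// attribute contains $class as a whole word ("entry", "post", ...).
function extract_divs($html, $class = 'entry') {
    if (trim($html) === '') {
        return array();
    }
    $doc = new DOMDocument();
    // Scraped blog markup is rarely valid, so collect libxml warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // The concat/normalize-space trick matches whole class names only,
    // so "entry" does not match "entry-meta".
    $query = sprintf(
        '//div[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );
    $found = array();
    foreach ($xpath->query($query) as $div) {
        $found[] = $doc->saveHTML($div);
    }
    return $found;
}
```

<p>Calling it with <code>'entry'</code> first and falling back to <code>'post'</code> when nothing comes back mirrors the order used in the main loop. Nested divs with the same class are returned separately, so a real version would want to filter those.</p>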
<h3>usage</h3>
<p>My highlighter tends to mess up text with div markers in it, so copying off the blog may not work;<br />
the full text source (about 500 lines) is <a href="http://serp.trismegistos.net/fastblog.txt" target="_blank" rel="nofollow">over here</a>. Underneath I&#8217;ll list the main program loop:</p>
<div class="geshi no php">
<ol>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//make main instance</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$Blog</span> <span class="sy0">=</span> <span class="kw2">new</span> BlogMaker<span class="br0">&#40;</span><span class="st0">&quot;keyword&quot;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//define a target blog, you can define multiple blogs and refer with code</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//then add rpc-url, password and user</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//and for every target blog three text-files </span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$T</span><span class="sy0">=</span><span class="re1">$Blog</span><span class="sy0">-&gt;</span><span class="me1">AddTarget</span><span class="br0">&#40;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="st0">&#39;blogcode&#39;</span><span class="sy0">,</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="st0">&#39;http://my.blog.com/xmlrpc.php&#39;</span><span class="sy0">,</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="st0">&#39;password&#39;</span><span class="sy0">,</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="st0">&#39;user&#39;</span><span class="sy0">,</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="st0">&#39;keyword.categories.txt&#39;</span><span class="sy0">,</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="st0">&#39;keyword.tags.txt&#39;</span><span class="sy0">,</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="st0">&#39;keyword.urls.txt&#39;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//read the tags, cats and url text files stored on the server </span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//all retrieved urls are tested, if the target blog already has that</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//scraped url, it is discarded.</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$T</span><span class="sy0">-&gt;</span><span class="me1">CSV_GetTags</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$T</span><span class="sy0">-&gt;</span><span class="me1">List_GetCats</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$T</span><span class="sy0">-&gt;</span><span class="me1">ReadURL</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//grab the google result list</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//use params (pages, keywords) to specify search</span></div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$Blog</span><span class="sy0">-&gt;</span><span class="me1">GoogleResults</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1"><span class="re1">$a</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="kw1">foreach</span><span class="br0">&#40;</span><span class="re1">$Blog</span><span class="sy0">-&gt;</span><span class="me1">Results</span> <span class="kw1">as</span> <span class="re1">$BlogUrl</span><span class="br0">&#41;</span> <span class="br0">&#123;</span> </div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="re1">$a</span><span class="sy0">++;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="kw3">echo</span> <span class="re1">$BlogUrl</span><span class="sy0">-&gt;</span><span class="me1">url</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//see if the url isnt used yet</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$T</span><span class="sy0">-&gt;</span><span class="me1">checkURL</span><span class="br0">&#40;</span><span class="kw3">trim</span><span class="br0">&#40;</span><span class="re1">$BlogUrl</span><span class="sy0">-&gt;</span><span class="me1">url</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">!=</span><span class="kw2">true</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="kw3">echo</span> <span class="st0">&#39;&#8230;checking &#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="kw3">flush</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//if not used, get the source</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$BlogUrl</span><span class="sy0">-&gt;</span><span class="me1">scrape</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//check for divs marked &quot;entry&quot;, if they arent there, check &quot;post&quot;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//some blogs use other indications for the content</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//but entry and post cover 40%</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$entries</span> <span class="sy0">=</span> <span class="re1">$BlogUrl</span><span class="sy0">-&gt;</span><span class="me1">get_entries</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="kw3">count</span><span class="br0">&#40;</span><span class="re1">$entries</span><span class="br0">&#41;</span><span class="sy0">&lt;</span><span class="nu0">1</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw3">echo</span> <span class="st0">&#39;no entries&#8230;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw3">flush</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$entries</span> <span class="sy0">=</span> <span class="re1">$BlogUrl</span><span class="sy0">-&gt;</span><span class="me1">get_posts</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;<span class="kw1">if</span><span class="br0">&#40;</span><span class="kw3">count</span><span class="br0">&#40;</span><span class="re1">$entries</span><span class="br0">&#41;</span><span class="sy0">&lt;</span><span class="nu0">1</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; <span class="kw3">echo</span> <span class="st0">&#39;no posts either&#8230;&#39;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//if no entry-post div, mark url as done</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp; <span class="re1">$T</span><span class="sy0">-&gt;</span><span class="me1">RegisterURL</span><span class="br0">&#40;</span><span class="re1">$BlogUrl</span><span class="sy0">-&gt;</span><span class="me1">url</span><span class="br0">&#41;</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="re1">$ct</span><span class="sy0">=</span><span class="nu0">0</span><span class="sy0">;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="kw1">foreach</span><span class="br0">&#40;</span><span class="re1">$BlogUrl</span><span class="sy0">-&gt;</span><span class="me1">WpContentPieces</span> <span class="kw1">as</span> <span class="re1">$WpContent</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//in the get_entries/get_post function the fragments are stored</span></div>
</li>
<li class="li1">
<div class="de1"><span class="co1">//as wpcontent</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="re1">$ct</span><span class="sy0">++;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;</div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="kw1">if</span><span class="br0">&#40;</span><span class="re1">$WpContent</span><span class="sy0">-&gt;</span><span class="me1">judge</span><span class="br0">&#40;</span><span class="nu0">2000</span><span class="sy0">,</span> <span class="nu0">200</span><span class="sy0">,</span> <span class="nu0">5</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;<span class="re1">$WpContent</span><span class="sy0">-&gt;</span><span class="me1">tribute</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="sy0">;</span> &nbsp;<span class="co1">//add tribute link</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;<span class="re1">$T</span><span class="sy0">-&gt;</span><span class="me1">settags</span><span class="br0">&#40;</span><span class="re1">$WpContent</span><span class="sy0">-&gt;</span><span class="me1">divcontent</span><span class="br0">&#41;</span><span class="sy0">;</span> <span class="co1">//add tags</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;<span class="re1">$T</span><span class="sy0">-&gt;</span><span class="me1">postCustomRPC</span><span class="br0">&#40;</span><span class="re1">$WpContent</span><span class="sy0">-&gt;</span><span class="me1">title</span><span class="sy0">,</span> <span class="re1">$WpContent</span><span class="sy0">-&gt;</span><span class="me1">divcontent</span><span class="sy0">,</span> <span class="nu0">1</span><span class="br0">&#41;</span><span class="sy0">;</span> <span class="co1">//1=publish, 0=draft</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;<span class="re1">$T</span><span class="sy0">-&gt;</span><span class="me1">RegisterURL</span><span class="br0">&#40;</span><span class="re1">$WpContent</span><span class="sy0">-&gt;</span><span class="me1">url</span><span class="br0">&#41;</span><span class="sy0">;</span> &nbsp;<span class="co1">//register use of url</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; &nbsp;<span class="kw3">usleep</span><span class="br0">&#40;</span><span class="nu0">20000000</span><span class="br0">&#41;</span><span class="sy0">;</span> &nbsp;<span class="co1">//20 seconds break, for sitemapping</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; &nbsp;<span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp; <span class="br0">&#125;</span></div>
</li>
<li class="li1">
<div class="de1">&nbsp;<span class="br0">&#125;</span></div>
</li>
</ol>
</div>
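<p>The url bookkeeping in that loop (test a url, skip it if the blog already has it, register it once it is used) can be sketched as below. This is a stand-in, assuming one url per line in the urls text file; the class name <code>UrlRegistry</code> and its methods are hypothetical and not the Target class from the source:</p>

```php
<?php
// Hypothetical stand-in for the url bookkeeping, one url per line in a text file.
class UrlRegistry {
    private $file;
    private $seen = array();

    public function __construct($file) {
        $this->file = $file;
        if (is_file($file)) {
            $lines = file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
            foreach ($lines as $url) {
                $this->seen[trim($url)] = true;
            }
        }
    }

    // True if the url was already used for this blog.
    public function has($url) {
        return isset($this->seen[trim($url)]);
    }

    // Remember the url in memory and append it to the file; false if empty or known.
    public function register($url) {
        $url = trim($url);
        if ($url === '' || $this->has($url)) {
            return false;
        }
        $this->seen[$url] = true;
        file_put_contents($this->file, $url . "\n", FILE_APPEND | LOCK_EX);
        return true;
    }
}
```

<p>Keeping the urls as array keys makes the lookup cheap, which matters once a phrase returns close to 1000 urls.</p>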
<h3>notes</h3>
<ul>
<li>xml-rpc needs to be activated explicitly on the wordpress dashboard under settings/writing.</li>
<li>categories must be present in the blog</li>
<li>url file must be writeable by the server (777)</li>
</ul>
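<p>For reference, a rough sketch of what an xml-rpc post request looks like on the wire. It assumes the <code>metaWeblog.newPost</code> method that WordPress exposes through xmlrpc.php; the helper names <code>rpc_value</code> and <code>build_newpost_request</code> are mine, and the blog id of 1 is an assumption for a single-blog install. It only builds the payload, it is not the Custom_RPC class:</p>

```php
<?php
// Encode a PHP value as an XML-RPC <value> element (strings, ints, bools, arrays).
function rpc_value($v) {
    if (is_bool($v)) {
        return '<value><boolean>' . ($v ? 1 : 0) . '</boolean></value>';
    }
    if (is_int($v)) {
        return '<value><int>' . $v . '</int></value>';
    }
    if (is_array($v)) {
        // List-style arrays become <array>, keyed arrays become <struct>.
        if ($v === array_values($v)) {
            return '<value><array><data>' . implode('', array_map('rpc_value', $v)) . '</data></array></value>';
        }
        $members = '';
        foreach ($v as $name => $item) {
            $members .= '<member><name>' . htmlspecialchars($name) . '</name>' . rpc_value($item) . '</member>';
        }
        return '<value><struct>' . $members . '</struct></value>';
    }
    return '<value><string>' . htmlspecialchars((string) $v) . '</string></value>';
}

// Build the request body for metaWeblog.newPost(blogid, user, password, content, publish).
function build_newpost_request($user, $password, $title, $body, array $categories, $publish) {
    $params = array(
        1, // blog id, assumed to be 1 here
        $user,
        $password,
        array('title' => $title, 'description' => $body, 'categories' => $categories),
        (bool) $publish,
    );
    $xml = '<?xml version="1.0"?><methodCall><methodName>metaWeblog.newPost</methodName><params>';
    foreach ($params as $p) {
        $xml .= '<param>' . rpc_value($p) . '</param>';
    }
    return $xml . '</params></methodCall>';
}
```

<p>The resulting string is sent as the body of an HTTP POST (content type text/xml) to the blog's xmlrpc.php. The category names in the struct are the ones that must already be present in the blog.</p>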
<p>It seems wordpress builds the sitemap as a background process: the standard google xml sitemap plugin will attempt to build it in the cache (which takes anywhere between 2 and 10 seconds), and apart from building a sitemap the posts also get pinged around. Giving the install 10 to 20 seconds between posts allows all the hooked-in functions to complete.</p>
<h3>period</h3>
<p>That&#8217;s about all,<br />
consider it gpl, I added some comments in the source but I will not develop this any further. A mysql backed blogfarm tool (euphemistically called &#8216;publishing tool&#8217;) is more interesting, besides, I am off to the wharves to do some painting.</p>
<p>if you use it, send some feedback,<br />
merry christmas dogheads</p>
]]></content:encoded>
			<wfw:commentRss>http://www.juust.org/index.php/redhat-seo-christmas-edition/2008/12/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
