Google Panda Latent Semantic Indexing Test

Panda’s…

Latent Semantic Indexing

Queries, or concept searches, against a set of documents that have undergone LSI will return results that are conceptually similar in meaning to the search criteria even if the results don’t share a specific word or words with the search criteria.

 

LSI Test

O my friend, Panda is something that has to be surpassed. In {speculation|guess|supposition|surmise|surmisal|possibility|hypothesis} and keeping silence shall the friend be a master: you should not wish to see everything. (Nietzsche, Also Sprach Zarathustra)

 

Id the_term the_type the_value
156875 c0njecture (noun) speculation
156876 ———- (noun) hypothesis (generic term)
156877 ———- (noun) possibility
156878 ———- (noun) theory (generic term)
156879 ———- (noun) guess
156880 ———- (noun) supposition
156881 ———- (noun) surmise
156882 ———- (noun) surmisal
156883 ———- (noun) speculation
156884 ———- (noun) hypothesis
156885 ———- (noun) opinion (generic term)
156886 ———- (noun) view (generic term)
156887 ———- (noun) reasoning (generic term)
156888 ———- (noun) logical thinking (generic term)
156889 ———- (noun) abstract thought (generic term)
156890 ———- (verb) speculate
156891 ———- (verb) theorize
156892 ———- (verb) theorise
156893 ———- (verb) hypothesize
156894 ———- (verb) hypothesise
156895 ———- (verb) hypothecate
156896 ———- (verb) suppose
156897 ———- (verb) expect (generic term)
156898 ———- (verb) anticipate (generic term)

 (source : semanthesaurus)
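The brace group in the quote above reads like spin syntax, where each {a|b|c} group stands for one of its alternatives. A minimal sketch of expanding such a group in PHP, assuming that is what the braces mean here; the function name spin and the $pick hook are made up for illustration:

```php
<?php
// Expand {a|b|c} groups by replacing each group with one of its alternatives.
// $pick is a hypothetical hook: it receives the list of alternatives and
// returns one of them. By default a random alternative is chosen.
function spin(string $text, ?callable $pick = null): string
{
    $pick = $pick ?? function (array $options): string {
        return $options[array_rand($options)];
    };
    // The pattern only matches groups without inner braces, so nested
    // groups are resolved from the inside out, one pass per nesting level.
    $pattern = '/\{([^{}]*)\}/';
    while (preg_match($pattern, $text)) {
        $text = preg_replace_callback($pattern, function (array $m) use ($pick): string {
            return $pick(explode('|', $m[1]));
        }, $text);
    }
    return $text;
}

$quote = 'In {speculation|guess|supposition|surmise|surmisal|possibility|hypothesis} '
       . 'and keeping silence shall the friend be a master';
echo spin($quote), "\n";
```

Passing a fixed picker instead of the random default makes the expansion repeatable, which helps when comparing one variant per page.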

 

Let’s see if the Panda gets it.

 

 

bing api with php and simplexml

About scraping results off of Bing: Bing use a set of about eight cookies. You can grab 200 results with PHP curl, as 20 pages of 10, but after the first 200 the Bing server checks for the cookie and, for lack of one, returns a blank page. I could fidget with the curl cookiejar, but Bing also offer a straightforward API.

Using the Bing API to list search results is easier.

Bing TOS : not for seo rank checks

In the last paragraph of the API guide, Bing give a quick recap of their TOS: you can do a maximum of 7 queries per second, and using the results for SEO rank checks is explicitly prohibited.

The following snippets (text source) are hence explicitly not to be used for Bing search engine result page (‘serp’) rank checks.
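The ceiling of 7 queries per second from the TOS recap can be respected by spacing requests evenly. A minimal sketch, assuming a single PHP process issues all the requests; the Throttle class and its method names are made up for illustration and are not part of any Bing library:

```php
<?php
// Keeps calls at or below a maximum rate by spacing them evenly.
// The 7-per-second default comes from the TOS recap in the text.
final class Throttle
{
    private float $minInterval;   // seconds between two calls
    private ?float $last = null;  // start time of the previous call

    public function __construct(float $maxPerSecond = 7.0)
    {
        $this->minInterval = 1.0 / $maxPerSecond;
    }

    // Seconds still to wait at time $now before the next call may start.
    public function delay(float $now): float
    {
        if ($this->last === null) {
            return 0.0;
        }
        return max(0.0, $this->last + $this->minInterval - $now);
    }

    // Record that a call started at time $now.
    public function mark(float $now): void
    {
        $this->last = $now;
    }

    // Sleep if needed, then record the start of the call.
    public function wait(): void
    {
        $pause = $this->delay(microtime(true));
        if ($pause > 0) {
            usleep((int) ceil($pause * 1000000));
        }
        $this->mark(microtime(true));
    }
}

$throttle = new Throttle(7.0);
foreach (['alkmaar', 'panda'] as $term) {
    $throttle->wait();
    // ... issue the API request for $term here ...
}
```

Splitting delay() and mark() from wait() keeps the timing arithmetic checkable without actually sleeping.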

bing api with simplexml

So here is one for web results using PHP simplexml. The web API (which uses namespaces) allows retrieving a maximum of 1000 results per term, at a maximum of 50 results per query. You can specify the number of results and the offset, i.e. where to start grabbing results.

    $Appid = "A_VERY_LONG_STRING";
    $Query = "seo rank check";
    $Numres = 50;  // max 50 per query
    $Offset = 1;   // up to 1000

    // Build the URL in one piece; urlencode() keeps the spaces
    // in the query from breaking the request.
    $url = 'http://api.search.live.net/xml.aspx?'
         . 'Appid=' . urlencode($Appid)
         . '&query=' . urlencode($Query)
         . '&sources=web'
         . '&web.count=' . $Numres
         . '&web.offset=' . $Offset;

    $feed = simplexml_load_file($url);
    if ($feed === false) {
        die('Could not load or parse the response.');
    }

    // use the web: namespace
    $children = $feed->children('http://schemas.microsoft.com/LiveSearch/2008/04/XML/web');
    foreach ($children->Web->Results->WebResult as $d) {
        echo $d->Title . '<br />';
        echo $d->Description . '<br />';
        echo $d->Url . '<br />';
        echo $d->DisplayUrl . '<br />';
    }
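The count and offset parameters can be stepped to walk through a longer result set within the limits quoted above (50 per query, 1000 per term). A minimal sketch that only computes the parameter pairs and makes no requests; the function name pageWindows is made up, and the first offset is left as a parameter because whether the API counts from 0 or 1 is an assumption either way:

```php
<?php
// Split a wanted number of results into (count, offset) pairs.
// $perQuery and $cap default to the limits quoted in the text.
// $firstOffset is an assumption: set it to whatever the API treats
// as the position of the first result.
function pageWindows(int $wanted, int $perQuery = 50, int $cap = 1000, int $firstOffset = 0): array
{
    $perQuery = max(1, $perQuery);          // avoid a zero step
    $wanted = max(0, min($wanted, $cap));   // never ask beyond the cap
    $windows = [];
    for ($done = 0; $done < $wanted; $done += $perQuery) {
        $windows[] = [
            'count'  => min($perQuery, $wanted - $done),
            'offset' => $firstOffset + $done,
        ];
    }
    return $windows;
}

// Print the parameter pairs needed for 120 results.
foreach (pageWindows(120) as $w) {
    echo 'web.count=', $w['count'], '&web.offset=', $w['offset'], "\n";
}
```

Each pair can be dropped into the $Numres and $Offset variables of the snippet above, one request per pair.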

..and one for the pictures using php simplexml :

    $Appid = "A_VERY_LONG_STRING";
    $Query = "alkmaar";
    $Numres = 10;
    $Offset = 1;

    $url = 'http://api.search.live.net/xml.aspx?';
    $url .= 'Appid=' . urlencode($Appid);
    $url .= '&query=' . urlencode($Query);
    $url .= '&sources=image';
    $url .= '&image.count=' . $Numres;
    $url .= '&image.offset=' . $Offset;

    $feed = simplexml_load_file($url);

    // use the mms: namespace
    $children = $feed->children('http://schemas.microsoft.com/LiveSearch/2008/04/XML/multimedia');

    echo('<ul ID="resultList">');

    foreach ($children->Image->Results->ImageResult as $d) {
        echo('<li class="resultlistitem"><a href="' . $d->DisplayUrl . '">' . $d->Title . '</a><br />');
        echo('<img src="' . $d->Thumbnail->Url . '" /><br />
            ' . $d->Thumbnail->ContentType . '<br />
            ' . $d->Thumbnail->Height . '<br />
            ' . $d->Thumbnail->Width . '<br />
            ' . $d->Thumbnail->FileSize . '<br />
            </li>');
    }
    echo("</ul>");
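The namespace handling in the two simplexml snippets can be tried without calling the API. A minimal sketch on a hand-written document: only the namespace URI and the element names are taken from the image snippet, while the values and the exact nesting of a real response are assumptions:

```php
<?php
// Hand-written stand-in for an API response, not a captured one.
$xml = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<SearchResponse xmlns:mms="http://schemas.microsoft.com/LiveSearch/2008/04/XML/multimedia">
  <mms:Image>
    <mms:Results>
      <mms:ImageResult>
        <mms:Title>Example one</mms:Title>
        <mms:Thumbnail><mms:Url>http://example.com/1.jpg</mms:Url></mms:Thumbnail>
      </mms:ImageResult>
      <mms:ImageResult>
        <mms:Title>Example two</mms:Title>
        <mms:Thumbnail><mms:Url>http://example.com/2.jpg</mms:Url></mms:Thumbnail>
      </mms:ImageResult>
    </mms:Results>
  </mms:Image>
</SearchResponse>
XML;

$feed = simplexml_load_string($xml);

// Plain property access only sees elements without a namespace,
// so the prefixed Image element is not found this way.
$plain = count($feed->Image);

// children() with the namespace URI exposes the prefixed elements.
$mms = $feed->children('http://schemas.microsoft.com/LiveSearch/2008/04/XML/multimedia');
$titles = [];
$thumbs = [];
foreach ($mms->Image->Results->ImageResult as $r) {
    $titles[] = (string) $r->Title;
    $thumbs[] = (string) $r->Thumbnail->Url;
}

echo $plain, ' without namespace, ', count($titles), " with namespace\n";
```

The casts to string matter when the values are stored or compared, since simplexml hands back element objects rather than plain strings.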

I actually like that api, I am going to use that.

bing api with json

Bing seem to prefer that you use JSON, as it uses less bandwidth. After their example in the API basics guide:

    $Appid = "A_VERY_LONG_STRING";
    $Numres = 10;
    $Offset = 1;
    $Query = 'alkmaar';

    $url = 'http://api.search.live.net/json.aspx?';
    $url .= 'Appid=' . urlencode($Appid);
    $url .= '&query=' . urlencode($Query);
    $url .= '&sources=image';
    $url .= '&image.count=' . $Numres;
    $url .= '&image.offset=' . $Offset;

    $response = file_get_contents($url);
    $jsonobj = json_decode($response);
    echo('<ul ID="resultList">');
    foreach ($jsonobj->SearchResponse->Image->Results as $value)
    {
        echo('<li class="resultlistitem"><a href="' . $value->Url . '">');
        echo('<img src="' . $value->Thumbnail->Url . '"></a></li>');
    }
    echo("</ul>");
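The JSON handling can also be checked offline. A minimal sketch on a hand-written payload whose nesting mirrors only the properties the snippet above reads; the values are invented, and renderImageList is a made-up helper that adds two things the snippet leaves out, a guard for a missing or malformed response and HTML escaping of the values:

```php
<?php
// Hand-written stand-in for a JSON response, not a captured one.
$response = <<<'JSON'
{"SearchResponse": {"Image": {"Results": [
  {"Title": "Example one", "Url": "http://example.com/1",
   "Thumbnail": {"Url": "http://example.com/1.jpg"}},
  {"Title": "Example two", "Url": "http://example.com/2?a=1&b=2",
   "Thumbnail": {"Url": "http://example.com/2.jpg"}}
]}}}
JSON;

// Turn a response body into a list of linked thumbnails.
function renderImageList(string $json): string
{
    $data = json_decode($json);
    // Empty body, malformed JSON, or a response without image results.
    if (!isset($data->SearchResponse->Image->Results)) {
        return '';
    }
    $html = '<ul id="resultList">';
    foreach ($data->SearchResponse->Image->Results as $item) {
        $html .= '<li class="resultlistitem">'
               . '<a href="' . htmlspecialchars($item->Url) . '">'
               . '<img src="' . htmlspecialchars($item->Thumbnail->Url) . '" />'
               . '</a></li>';
    }
    return $html . '</ul>';
}

echo renderImageList($response), "\n";
```

Returning the markup as a string instead of echoing inside the loop keeps the function usable in templates and easy to check.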

Of course there is the old RSS option, which doesn’t require an appid but also falls under the API 2.0 TOS, and a SOAP option.

other sources :
There is a bing api php class made over at routecafe, and a jquery bing plugin using json over at Einar Otto Stangvik’s blog.

about the trackback thing

The question about the trends script with trackbacks was whether a few hundred backlinks were worth the trouble, and they weren’t. I wrote a second routine to grab the most common significant words from excerpts, and do a second search to grab better results and up to five trackbacks per page.
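Grabbing the most common significant words from excerpts can be sketched as a word count with a stop list. This is not the routine used on the site, only an illustration of the idea; the function name, the tiny stop list and the three-letter minimum are all assumptions:

```php
<?php
// Count words across excerpts, skipping short words and stop words,
// and return the $limit most frequent ones.
function significantWords(array $excerpts, int $limit = 5): array
{
    // Illustrative stop list; a usable one would be much longer.
    $stop = array_flip(['the', 'and', 'for', 'with', 'that', 'this', 'from', 'are', 'was', 'you']);
    $counts = [];
    foreach ($excerpts as $excerpt) {
        // Strip markup, lowercase, and split on anything that is not a letter or digit.
        $words = preg_split('/[^a-z0-9]+/', strtolower(strip_tags($excerpt)), -1, PREG_SPLIT_NO_EMPTY);
        foreach ($words as $word) {
            if (strlen($word) < 3 || isset($stop[$word])) {
                continue;
            }
            $counts[$word] = ($counts[$word] ?? 0) + 1;
        }
    }
    arsort($counts);  // highest count first
    // Numeric words become integer keys in PHP arrays, so cast back to strings.
    return array_map('strval', array_slice(array_keys($counts), 0, $limit));
}

$excerpts = [
    'The panda update and the panda test',
    '<p>Panda test for the index</p>',
];
echo implode(' ', significantWords($excerpts, 2)), "\n";
```

The returned words can then be joined into the query for the second search.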

So I put that online. It grabbed 4000 backlinks in an hour and overloaded the host server.

Baidu, radian6 and Google had stepped up indexing after I added sitewide tags, and that didn’t show up in analytics. The site got the trackback validations and the crawlers, and the server went haywire. It is a shared host, and the resources are too limited to run that kind of site on. I put it on hold till I find a solution for the hosting.

Google of course penalised the site with PR0 and dropped the domain from the serp on its main keywords, but in Yahoo it ranks about 20 out of 360 million result pages and in MSN it ranks no. 1. I was thinking about adding a translator plugin to see if I can get some traffic from Baidu.