juust ~ php oddities

hands on xml-rpc : copying mysql tables

juust | 08/01/2009

I don’t have anything to blog on, so I will bore you all with a quick generic function to copy mysql tables from one host to another, using xml-rpc.

I use the Incutio xml-rpc library on both hosts, to handle the tedious stuff (xml formatting and parsing). That leaves only some snippets to send and receive table data and store it on a mysql database.

First : how to handle the table data on the sending end:

  • I take an associative array from a mysql query
  • I make an array to hold the records
  • I add each row as array
  • I make an IXR-client.
  • I add some general parameters
  • I hand these and the entire table array to my IXR-client.
  • send…

    // the SerpClient class is listed at the bottom of the post
    $ThisClient = new SerpClient('http://serp.trismegistos.net/db/xmlrpc.php', 'user', 'pass', 'sender');

    $tablename = "serp_tags_keys";
    $tableid = "id";
    $result = $serpdb->query("SELECT * FROM ".$tablename);
    $recordcount = mysql_num_rows($result);

    // collect every row as an associative array
    $records = array();
    while ($row = mysql_fetch_assoc($result)) {
        $records[] = $row;
    }

    $ThisClient->putTable($tablename, $recordcount, $tableid, $records);

I consider some additional fields necessary for basic integrity checks : I add “ID” as key field, so on the receiving end the server knows which field is my table’s auto-increment field. Other fields are a username, password, tablename and the batch recordcount.
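A minimal sketch of what such a check could look like on the receiving end. The function name batchLooksComplete is made up for illustration, it is not part of my script or of the Incutio library; it only compares the announced recordcount with the rows that arrived and makes sure every row carries the key field.

```php
<?php
// Hypothetical check, not part of the original script: compare the
// announced record count with what actually arrived, and make sure
// every row carries the key field.
function batchLooksComplete($announcedCount, $idField, array $rows)
{
    if (count($rows) !== (int) $announcedCount) {
        return false;
    }
    foreach ($rows as $row) {
        if (!is_array($row) || !array_key_exists($idField, $row)) {
            return false;
        }
    }
    return true;
}

$rows = array(
    array('id' => '4', 'tag' => 'ranking'),
    array('id' => '94', 'tag' => 'firm'),
);
var_dump(batchLooksComplete(2, 'id', $rows));
```

A server could refuse the whole batch when this returns false, instead of storing half a table.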

The IXR_Client then generates a tangled mess of xml-tags holding the entire procedure call and data. (You can put the client on ‘debug’, then it dumps the generated xml to the screen.)

The first part of the xml file contains the single parameters :

  • username
  • password
  • tablename
  • recordcount
  • id-field

<methodCall>
<methodName>serp.putTable</methodName>
<params>
<param><value><string>user</string></value></param>
<param><value><string>pass</string></value></param>
<param><value><string>serp_tags_keys</string></value></param>
<param><value><int>91</int></value></param>
<param><value><string>id</string></value></param>

Then the entire table is sent as one parameter in the procedure call.

That parameter is built from an array containing the table rows as ‘struct’. If I want to use the routine for any table, I need the fieldname-value pairs to compose a standard mysql insert statement. A struct type allows me to use key-value pairs in the xml-file that can be parsed back into an array.

<param><value><array>

<data>

<value><struct>
<member><name>id</name><value><string>4</string></value></member>
<member><name>tag</name><value><string>ranking</string></value></member>
<member><name>cat</name><value><string>alexa ranking seo internet ranking internet positi</string></value></member>
<member><name>date</name><value><string>200901</string></value></member>
</struct></value>

<value><struct>
<member><name>id</name><value><string>94</string></value></member>
<member><name>tag</name><value><string>firm</string></value></member>
<member><name>cat</name><value><string>firm seo</string></value></member>
<member><name>date</name><value><string>200901</string></value></member>
</struct></value>

</data>

</array></value></param>

That was the last of the param holding the table, so the entire tag-mess is closed :

</params></methodCall>
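To make the mapping from rows to the xml dump above concrete, here is a rough sketch that writes the array parameter by hand. The Incutio client does this for you; rowToStruct and rowsToParam are hypothetical helpers of my own, and I assume every value is sent as a string, as in the dump.

```php
<?php
// Hypothetical helper: turns one associative row into an XML-RPC <struct>.
// Every value is sent as <string>, as in the dump above.
function rowToStruct(array $row)
{
    $xml = '<value><struct>';
    foreach ($row as $name => $value) {
        $xml .= '<member><name>' . htmlspecialchars((string) $name) . '</name>'
              . '<value><string>' . htmlspecialchars((string) $value) . '</string></value></member>';
    }
    return $xml . '</struct></value>';
}

// Hypothetical helper: wraps all rows in a single <array> parameter.
function rowsToParam(array $rows)
{
    return '<param><value><array><data>'
         . implode('', array_map('rowToStruct', $rows))
         . '</data></array></value></param>';
}

$rows = array(
    array('id' => '4', 'tag' => 'ranking'),
    array('id' => '94', 'tag' => 'firm'),
);
echo rowsToParam($rows), "\n";
```

Each row becomes one struct, each field one member, so the receiver can rebuild the associative array without knowing the table layout.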

Then the second part : on the receiving end the Incutio class parses the whole tag-mess, and hands an array of the param sections as input to my function putTable.

    function putTable($args)
    {
        $user   = $args[0];
        $pass   = $args[1];
        $tname  = $args[2];
        $tcount = $args[3];
        $id     = $args[4];
        $table  = $args[5];

$table is a plain array; each item ($t) is an array built from one struct, holding the fieldname-value pairs. I turn each record’s key-value pairs into a mysql INSERT query :
$query = "INSERT INTO `".$tname."` (" field, field… ") VALUES (" fieldvalue, fieldvalue… ")";

All I have to do is add the fieldnames and fieldvalues to the mysql insert query.

        foreach ($table as $t) {

            // the fixed parts
            $query0 = 'INSERT INTO `'.$tname.'` (';
            $query2 = ") VALUES (";

            // the (`fieldname`, `fieldname`, `fieldname`) query-bit
            // and the ('fieldvalue', 'fieldvalue', 'fieldvalue') query-bit,
            // started empty for every row
            $query1 = '';
            $query3 = '';

            foreach ($t as $key => $value) {
                if ($key != $id) {
                    $query1 .= "`".$key."`, ";
                    // escape the value, it comes straight off the wire
                    $query3 .= "'".mysql_real_escape_string($value)."', ";
                }
            }

            // remove the trailing ", "
            $query1 = substr($query1, 0, -2);
            $query3 = substr($query3, 0, -2);

            // glue em up and add the final ")"
            $query0 .= $query1.$query2.$query3.")";

            // query…
            $this->connection->query($query0);
        }
    }

That generates mysql queries like
INSERT INTO `serp_tags_keys` (`tag`, `cat`, `date`) VALUES ('ranking', 'alexa ranking', '200901')
and copies the entire table.
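The same query can also be put together with placeholders instead of quoted values. This is a sketch of an alternative, not what my script does: buildInsert is a hypothetical helper, and it assumes the table name comes from a trusted list, since table and field names cannot be bound as parameters.

```php
<?php
// Hypothetical helper: builds a parameterised INSERT for one row,
// skipping the auto-increment field. Returns array($sql, $values).
// Assumption: $table comes from a trusted list.
function buildInsert($table, array $row, $idField)
{
    unset($row[$idField]);
    $fields = array();
    foreach (array_keys($row) as $name) {
        // allow only plain identifiers as field names
        if (!preg_match('/^[A-Za-z0-9_]+$/', (string) $name)) {
            throw new InvalidArgumentException("bad field name: $name");
        }
        $fields[] = '`' . $name . '`';
    }
    $marks = implode(', ', array_fill(0, count($fields), '?'));
    $sql = 'INSERT INTO `' . $table . '` (' . implode(', ', $fields) . ') VALUES (' . $marks . ')';
    return array($sql, array_values($row));
}

list($sql, $values) = buildInsert(
    'serp_tags_keys',
    array('id' => '4', 'tag' => 'ranking', 'cat' => 'alexa ranking', 'date' => '200901'),
    'id'
);
echo $sql, "\n";
```

With a PDO connection the pair would go to prepare($sql) and execute($values), which leaves the quoting of the values to the driver.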

That is how I handle the table data.

Of course I have to define two custom classes to process the serp.putTable procedure itself, using the Incutio class.

First the class for the sending script, which is pretty straightforward :

  • make an IXR_Client instance
  • hand the record set to it
  • have it formatted and sent

    // include the library
    include('class-IXR.php');

    // a custom class that wraps the IXR_Client
    class SerpClient
    {
        var $rpcurl;    // endpoint
        var $username;  // you go figure
        var $password;
        var $bClient;   // incutio ixr-client instance
        var $myclient;  // machine/host-id

        function SerpClient($rpcurl, $username, $password, $myclient)
        {
            // standard variables to send in the message
            $this->rpcurl   = (string) $rpcurl;
            $this->username = (string) $username;
            $this->password = (string) $password;
            $this->myclient = (string) $myclient;
            $this->connect();
        }

        function connect()
        {
            // basic client, it takes the endpoint url
            $this->bClient = new IXR_Client($this->rpcurl);
            return true;
        }

        // the function I use to send the data
        function putTable($tablename, $recordcount, $tableid, $array)
        {
            // first parameter is always the methodname, then the parameters, which are
            // added sequentially to the xml-file (with the appropriate tags for datatypes,
            // the script figures that out). note : it uses htmlentities on strings.
            return $this->bClient->query('serp.putTable', $this->username, $this->password, $tablename, $recordcount, $tableid, $array);
        }
    }

I use it in the snippets above with :

    $ThisClient = new SerpClient('http://serp.trismegistos.net/db/xmlrpc.php', 'user', 'pass', 'sender');
    //…
    $ThisClient->putTable($tablename, $recordcount, $tableid, $records);

Then, on the receiving end, my program has to know how to handle the xml containing the remote procedure call.

I define an extension on IXR_server and pass serp.putTable as new ‘method’ (callback function).

    // go away cookie…
    $_COOKIE = array();

    // make sure you get the posted crap, the ixr instance grabs its input from it
    if ( !isset( $HTTP_RAW_POST_DATA ) ) $HTTP_RAW_POST_DATA = file_get_contents( 'php://input' );
    if ( isset($HTTP_RAW_POST_DATA) ) $HTTP_RAW_POST_DATA = trim($HTTP_RAW_POST_DATA);

    // include the library
    include('class-IXR.php');

    // make an extended class
    class serp_xmlrpc_server extends IXR_Server {

        // use the same function name…
        function serp_xmlrpc_server() {

            // build an array of methods :
            // first the procedurename you use in the xml-text,
            // then which function in the extended class (this one) it maps to,
            // to be used as $this->method
            $this->methods = array('serp.putTable' => 'this:putTable');

            // hand em to the IXR server instance that will map it as callback
            $this->IXR_Server($this->methods);
        }

        // now the IXR_Server instance uses ($this->)putTable
        // to process incoming xml-text
        // containing serp.putTable as methodname
        function putTable($args)
        {
            // (for the routine : see the snippet above to store the xml data in mysql)
        }
    }

    // the only actual program line : it instantiates the extended class,
    // like any regular get-post php program, and that handles the posted xml
    $serp_xmlrpc_server = new serp_xmlrpc_server();

That’s all. I am not going to list a cut-and-paste version. You have to build some stuff with it, then you will come up with lots of stuff you can do with it.

Wordpress and iPhone built a plugin that receives pictures from iPhone. Wordpress uses Incutio so you can ‘piggyback’ on that and have an iPhone plugin for your own website in two days flat using an ajax lightbox gallery script. Or go monetize small websites with some seo oriented ‘optimisation’ functions like ChangeFooterLinks(array($paidurl, $anchortext)) :) or whatever… boring, isn’t it ?

Categories
optimisation, php, xml-rpc
Tags
optimisation, php, xml-rpc

RedHat Seo : scraper auto-blogging

juust | 26/12/2008

Just give us your endpoint and we’ll take it from there, sparky!

I was going to make one of these tools to scrape google and conjure a full blog out of nowhere, as Christmas special, RedHat Seo. The rough sketch has arrived, far from perfect, but it does produce a blog and doesn’t even look too shabby. I scraped a small batch of posts off of blogs, keeping the links intact and adding a tribute link. I hope they will pardon me for it.

structure

I use three main classes,

  BlogMaker    the application
  Target       the blogs you aim for
  WPContent    the scraped goodies

…and two support classes

  SerpResult   scraped urls
  Custom_RPC   a simple rpc-poster

Target blogs have three text files,

  file              contents                         maintenance
  blog categories   category you post under          manual
  blog tags         tags you list on the blog        manual
  blog urls         urls already used for the blog   system
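A rough sketch of how the url file could be handled, assuming one url per line. The functions urlIsRegistered and registerUrl are hypothetical stand-ins, not the methods from my source.

```php
<?php
// Hypothetical stand-ins for the url file handling: one url per line.
function urlIsRegistered($file, $url)
{
    if (!is_file($file)) {
        return false;
    }
    $known = file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    return in_array(trim($url), $known, true);
}

function registerUrl($file, $url)
{
    // append with a lock, so two runs do not clobber each other
    return file_put_contents($file, trim($url) . "\n", FILE_APPEND | LOCK_EX) !== false;
}

// demo on a temporary file; the real file would be keyword.urls.txt
$file = tempnam(sys_get_temp_dir(), 'urls');
$url  = 'http://example.com/some-post/';
if (!urlIsRegistered($file, $url)) {
    registerUrl($file, $url);
}
var_dump(urlIsRegistered($file, $url));
unlink($file);
```

Reading the whole file on every check is fine for a few thousand urls; beyond that a database table would be the saner choice.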

routine

The BlogMaker class grabs a result list (up to 1000 urls per phrase) from Google, extracts the urls and stores them in SerpResult. It then scrapes the urls, extracts the entry divs, and stores the div-entries in the WPContent class (which has some basic functions to sanitize the text). Finally it uses the Target definitions to post the content to the blogs with xml-rpc.
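The div extraction could be sketched with php’s DOM extension as below. This is an assumption about one way to do it, not a copy of my source: divsByClass is a hypothetical helper, and it matches divs whose class attribute contains the given word.

```php
<?php
// Hypothetical helper: returns the inner HTML of every div whose class
// attribute contains the given word (e.g. "entry" or "post").
function divsByClass($html, $class)
{
    $doc = new DOMDocument();
    // real-world markup is rarely valid, so silence the parser warnings
    @$doc->loadHTML($html);
    $xpath = new DOMXPath($doc);
    $query = sprintf(
        '//div[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );
    $found = array();
    foreach ($xpath->query($query) as $div) {
        $inner = '';
        foreach ($div->childNodes as $child) {
            $inner .= $doc->saveHTML($child);
        }
        $found[] = trim($inner);
    }
    return $found;
}

$html = '<html><body><div class="post"><p>hello</p></div><div class="side">x</div></body></html>';
$entries = divsByClass($html, 'entry');
if (count($entries) < 1) {
    $entries = divsByClass($html, 'post');   // fall back from "entry" to "post"
}
print_r($entries);
```

The class name is pasted into the xpath query as-is, so it should only ever be a fixed word like "entry" or "post", never user input.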

usage

My highlighter tends to mess up text with div markers in it, so copying off the blog may not work;
the full text source (about 500 lines) is over here. Underneath I’ll list the main program loop :

    // make main instance
    $Blog = new BlogMaker("keyword");

    // define a target blog, you can define multiple blogs and refer with code
    // then add rpc-url, password and user
    // and for every target blog three text-files
    $T = $Blog->AddTarget(
        'blogcode',
        'http://my.blog.com/xmlrpc.php',
        'password',
        'user',
        'keyword.categories.txt',
        'keyword.tags.txt',
        'keyword.urls.txt'
    );

    // read the tags, cats and url text files stored on the server
    // all retrieved urls are tested, if the target blog already has that
    // scraped url, it is discarded.
    $T->CSV_GetTags();
    $T->List_GetCats();
    $T->ReadURL();

    // grab the google result list
    // use params (pages, keywords) to specify search
    $Blog->GoogleResults();

    $a = 0;
    foreach ($Blog->Results as $BlogUrl) {
        $a++;
        echo $BlogUrl->url;

        // see if the url isnt used yet
        if ($T->checkURL(trim($BlogUrl->url)) != true) {
            echo '…checking ';
            flush();

            // if not used, get the source
            $BlogUrl->scrape();

            // check for divs marked "entry", if they arent there, check "post"
            // some blogs use other indications for the content
            // but entry and post cover 40%
            $entries = $BlogUrl->get_entries();
            if (count($entries) < 1) {
                echo 'no entries…';
                flush();
                $entries = $BlogUrl->get_posts();
                if (count($entries) < 1) {
                    echo 'no posts either…';
                    // if no entry-post div, mark url as done
                    $T->RegisterURL($BlogUrl->url);
                }
            }

            // in the get_entries/get_posts function the fragments are stored
            // as wpcontent
            $ct = 0;
            foreach ($BlogUrl->WpContentPieces as $WpContent) {
                $ct++;

                if ($WpContent->judge(2000, 200, 5)) {
                    $WpContent->tribute();                  // add tribute link
                    $T->settags($WpContent->divcontent);    // add tags
                    $T->postCustomRPC($WpContent->title, $WpContent->divcontent, 1); // 1=publish, 0=draft
                    $T->RegisterURL($WpContent->url);       // register use of url
                    usleep(20000000);                       // 20 seconds break, for sitemapping
                }
            }
        }
    }

notes

  • xml-rpc needs to be activated explicitly on the wordpress dashboard under settings/writing.
  • categories must be present in the blog
  • url file must be writeable by the server (777)

It seems wordpress builds the sitemap as a background process: the standard google xml sitemap plugin will attempt to build it in the cache (which takes anywhere between 2 and 10 seconds), and apart from building a sitemap the posts also get pinged around. Giving the install 10 to 20 seconds between posts allows all the hooked-in functions to complete.

period

That’s about all,
consider it gpl, I added some comments in the source but I will not develop this any further. A mysql backed blogfarm tool (euphemistically called ‘publishing tool’) is more interesting, besides, I am off to the wharves to do some painting.

if you use it, send some feedback,
merry christmas dogheads

Categories
google, seo, seo tips and tricks, tool, wordpress, xml-rpc
Tags
google, scrape, seo, seo tips and tricks, tool, wordpress, xml-rpc

an xml rpc endpoint

juust | 07/12/2008

(geek content:) Integrating the Incutio xml rpc class into a phpLinkDirectory install opens a lot of possibilities for running remote control automated networks. For a basic example I took the submit routine of phpLD and the xml-rpc routine from wordpress, deleted all nonsense, and ended up with a simple xml-rpc endpoint for my link directory.

On the sender side, I make an xml file that holds the methodName (which is the function I want to execute remotely : phpld.SubmitLink), and the array values I want to pass to the function as parameters. I attach the xml-string as post to a curl call and fire it at the xmlrpc endpoint.

    function getmyxml() {

        // make the $data array
        // normally phpLd makes it when someone submits a site
        $data = array();
        $data['LINK_TYPE']   = 1;
        $data['DESCRIPTION'] = 'DESCRIPTION';
        $data['TITLE']       = 'TITLE';
        $data['OWNER_NAME']  = 'OWNER_NAME';
        $data['URL']         = 'http://www.domain.com/xmlrpc.php';
        $data['ID']          = '';

        // put the data array in an xml string to post to the xmlrpc endpoint

        // make the xml header
        $myxml  = '<?xml version="1.0" encoding="UTF-8"?>';
        $myxml .= '<methodCall>';
        $myxml .= '<methodName>phpld.SubmitLink</methodName>';
        $myxml .= '<params>';

        // loop to add the $data elements as param-tags
        foreach ($data as $d) {
            $myxml .= '<param>';
            // escape &, < and > so a value cannot break the xml
            $myxml .= '<value><string>'.htmlspecialchars(trim($d)).'</string></value>';
            $myxml .= '</param>';
        }

        // finish the xml file :
        $myxml .= '</params>';
        $myxml .= '</methodCall>';

        // return it
        return $myxml;
    }

    // make the call to the endpoint
    $ch = curl_init('http://www.domain.com/xmlrpc.php');
    // use content-type text/xml as extra header
    curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: text/xml'));
    // get the xml-string and use it as post var
    curl_setopt($ch, CURLOPT_POSTFIELDS, getmyxml());
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_TIMEOUT, 1);
    $return = curl_exec($ch);
    curl_close($ch);
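What comes back in $return is an xml methodResponse. A minimal sketch for reading it with SimpleXML follows; readRpcResponse is a hypothetical helper of my own, and it assumes the server answers with a single string value or with a standard fault struct.

```php
<?php
// Hypothetical helper: pulls the first string out of an XML-RPC response,
// or returns the fault message if the server answered with a fault.
function readRpcResponse($xml)
{
    $doc = @simplexml_load_string($xml);
    if ($doc === false) {
        return 'unreadable response';
    }
    if (isset($doc->fault)) {
        foreach ($doc->fault->value->struct->member as $member) {
            if ((string) $member->name === 'faultString') {
                return 'fault: ' . (string) $member->value->string;
            }
        }
        return 'fault';
    }
    $value = $doc->params->param->value;
    // a bare <value>text</value> also counts as a string in xml-rpc
    return isset($value->string) ? (string) $value->string : (string) $value;
}

$ok = '<?xml version="1.0"?><methodResponse><params><param><value>'
    . '<string>DESCRIPTION entered</string></value></param></params></methodResponse>';
echo readRpcResponse($ok), "\n";
```

With a one second curl timeout the call may well return nothing at all, so the ‘unreadable response’ branch is not just decoration.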

So much for the sending end, now the receiving end : the xmlrpc.php endpoint file. I need the Incutio xml handler and the phpLinkdirectory database class, with the proper settings, and a function SubmitLink to add the data in the posted xml-file as a record to the database.

Some general settings :

    define('XMLRPC_REQUEST', true);
    $_COOKIE = array();
    if ( !isset( $HTTP_RAW_POST_DATA ) ) $HTTP_RAW_POST_DATA = file_get_contents( 'php://input' );
    if ( isset($HTTP_RAW_POST_DATA) ) $HTTP_RAW_POST_DATA = trim($HTTP_RAW_POST_DATA);

I include the Incutio class file, which is a general xml client/server class, and I include init.php from the phpLD script. In the init.php script the AdoDb database instance is declared, that handles the mysql connection, as well as the basic phpLd environment.

    include_once('include/class-IXR.php');
    include_once('init.php');
    $xmlrpc_logging = 0;

Let’s put them both to work for me : the IXR_Server has to know it has to run the SubmitLink function if that function is set as methodName in the xml file. In php you can call a method by a name held in a string, and in the function call($methodname, $args) the IXR_Server uses that to handle callbacks.
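To show the idea of such a callback map, here is a stripped-down illustration. It is not the Incutio code: MiniRpcServer, DemoServer and demo.echoFirst are names I made up, and the real library does more (xml parsing, faults, type handling).

```php
<?php
// Simplified illustration of a name => callback map; this is NOT the
// Incutio code, just the idea of resolving 'this:method' at run time.
class MiniRpcServer
{
    protected $callbacks = array();

    public function __construct(array $callbacks)
    {
        $this->callbacks = $callbacks;
    }

    public function call($methodName, array $args)
    {
        if (!isset($this->callbacks[$methodName])) {
            return 'fault: unknown method ' . $methodName;
        }
        $target = $this->callbacks[$methodName];
        if (strpos($target, 'this:') === 0) {
            $method = substr($target, 5);        // strip the 'this:' prefix
            return $this->$method($args);        // dynamic method call
        }
        return call_user_func($target, $args);   // plain function name
    }
}

class DemoServer extends MiniRpcServer
{
    public function __construct()
    {
        parent::__construct(array('demo.echoFirst' => 'this:echoFirst'));
    }

    public function echoFirst($args)
    {
        return $args[0];
    }
}

$server = new DemoServer();
echo $server->call('demo.echoFirst', array('hello')), "\n";
```

The methodName in the xml is only a label; which php function runs is decided entirely by the map, so nothing outside the map can be called remotely.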

In my phpLD_xmlrpc_server class, which extends IXR_Server, I add a function SubmitLink that stuffs the data in the phpld database, and I make an array with xml.methodName=>class:function as key=>value pair and pass that to the IXR_Server, that’s enough.

    class phpLD_xmlrpc_server extends IXR_Server {

        var $methods = array();

        function phpLD_xmlrpc_server() {

            // mapping the custom methods
            $this->methods['phpld.SubmitLink'] = 'this:SubmitLink';

            // handing the custom methods to the base class
            $this->IXR_Server($this->methods);
        }

When the IXR class parses the posted xml and finds the phpld.SubmitLink methodName, it executes the custom method ($this->)SubmitLink with the data posted in the xml param tags as input :

        function SubmitLink($params) {

            // use db and tables from include(init.php)
            global $db;
            global $tables;

            // map the param-values passed to the $data array
            $data = array();
            $data['LINK_TYPE']   = $params[0];
            $data['DESCRIPTION'] = $params[1];
            $data['TITLE']       = $params[2];
            $data['OWNER_NAME']  = $params[3];
            $data['URL']         = $params[4];
            $data['ID']          = '';

            // pass the data array to the adodb $db instance's Replace function :
            if ($db->Replace($tables['link']['name'], $data, 'ID', true) > 0) {
                return $data['DESCRIPTION']." entered";
            } else {
                return " refused";
            }
        }
    }
The AdoDb function Replace maps key-value to tablefield-value, so I use the database field names as keys for the $data array, and assign the $param-values to them. I pass the $data array to the adoDb function and that takes care of the rest.
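Since the params arrive as a plain numbered list, the mapping to field names depends purely on their order. A small sketch of doing that mapping in one go, assuming the field order used by the sender in this post; paramsToRecord is a hypothetical helper, not part of phpLD or AdoDb.

```php
<?php
// Hypothetical helper: pairs positional xml-rpc params with field names.
function paramsToRecord(array $fields, array $params)
{
    if (count($params) < count($fields)) {
        return false;   // too few values, refuse the call
    }
    // ignore any extra trailing params, then pair names with values
    return array_combine($fields, array_slice($params, 0, count($fields)));
}

// field order as sent by getmyxml()
$fields = array('LINK_TYPE', 'DESCRIPTION', 'TITLE', 'OWNER_NAME', 'URL');
$params = array('1', 'DESCRIPTION', 'TITLE', 'OWNER_NAME', 'http://www.domain.com/', '');
$data = paramsToRecord($fields, $params);
print_r($data);
```

Keeping the field list in one array makes it harder for the sender and the receiver to drift apart than a row of hand-numbered $params[n] assignments.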

After adding the includes, settings, and classes,
all I have to do is add a final call at the end to start a new instance of the extended class to handle the posted xml-data :

    $phpLD_xmlrpc_server = new phpLD_xmlrpc_server();

…and I have an xml-rpc endpoint :

    define('XMLRPC_REQUEST', true);
    $_COOKIE = array();
    if ( !isset( $HTTP_RAW_POST_DATA ) ) $HTTP_RAW_POST_DATA = file_get_contents( 'php://input' );
    if ( isset($HTTP_RAW_POST_DATA) ) $HTTP_RAW_POST_DATA = trim($HTTP_RAW_POST_DATA);

    include_once('include/class-IXR.php');
    include_once('init.php');

    $xmlrpc_logging = 0;

    class phpLD_xmlrpc_server extends IXR_Server {

        var $methods = array();

        function phpLD_xmlrpc_server() {
            $this->methods['phpld.SubmitLink'] = 'this:SubmitLink';
            $this->IXR_Server($this->methods);
        }

        function SubmitLink($args) {

            global $db;
            global $tables;

            // same field order as the sender uses
            $data = array();
            $data['LINK_TYPE']   = $args[0];
            $data['DESCRIPTION'] = $args[1];
            $data['TITLE']       = $args[2];
            $data['OWNER_NAME']  = $args[3];
            $data['URL']         = $args[4];
            $data['ID']          = '';

            if ($db->Replace($tables['link']['name'], $data, 'ID', true) > 0) {
                return $data['DESCRIPTION']." entered";
            } else {
                return " refused";
            }
        }
    }
    $phpLD_xmlrpc_server = new phpLD_xmlrpc_server();

Short and sweet.

The actual strength of xml-rpc is in defining a set of standard methodNames and parameters passed with the methodCall, for developing standard APIs (for instance for small website maintenance tasks and statistics reporting, an rpc.sms protocol as successor to Twitter, or standard socialgraph functions for ajax/javascript/widgets).

Categories
linkdir, php, xml-rpc
Tags
php, xml-rpc