juust ~ php oddities

optimisation

juust | 15/12/2008

Adriaan asked about handling a large mysql table for a traffic log :
I have a similar problem with a rank tracker: the table grows fast, and after a while queries become painfully slow. I want to show results per domain on screen in less than a second for the next two years (generally 5 to 20 urls per domain, with 200,000 urls for 80,000 domains per month).

warehouse

Some information systems offer a snapshot reporting system. Using it means shifting the focus from querying on raw entry tables towards aggregating the entry data into tables and reporting from these tables (technorati cosmos, google analytics, sap business warehouse). Successful systems minimize their costs (resources like database and application server capacity, and admin hours) whilst offering end users a rich data representation with short report generation time.

If I use the traditional “data function form” application model, I can optimise all three areas and avoid querying on large tables.

data

First the data layer :
For raw entries I use one archive table, one table to store entries of the current period,
and besides that, a set of aggregated data tables.

  • for raw entries : a small temp-table and basic insert-queries
  • for raw entries : an archive table with a periodic insert of the temp-table
  • for end user reporting : basic queries on the snapshot/warehouse tables
  • a procedure to archive raw entries
  • a procedure to aggregate to snapshot tables

The insert instructions are performed on a small table, which works faster. Transforming data into information is done in the aggregation procedure, once a day or once a month. These are the same queries I would otherwise run for end-user reports, but now their results are written to the snapshot ‘warehouse’ tables.

The work I would normally repeat on the whole set for every report query is done once, and its result is stored in the warehouse aggregates. After aggregating the raw entry data, I archive the records (= dump the temp table into the archive table) and clear the temp table for new input. With a bit of luck, I don’t have to access the archive anymore.
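
The aggregate-archive-clear cycle could look roughly like the sketch below. It is only an illustration: an in-memory SQLite database stands in for mysql so the example runs on its own, and every table, column and function name (rank_temp, rank_archive, rank_snapshot, aggregate_and_archive) is made up for the example, not taken from the real rank tracker.

```php
<?php
// Sketch only: SQLite in memory stands in for mysql, and all table,
// column and function names below are hypothetical.

function create_tables(PDO $db): void
{
    $cols = '(domain TEXT, url TEXT, position INTEGER, checked_on TEXT)';
    $db->exec("CREATE TABLE rank_temp $cols");     // small table, takes the inserts
    $db->exec("CREATE TABLE rank_archive $cols");  // grows forever, rarely read
    $db->exec('CREATE TABLE rank_snapshot (
        domain TEXT, period TEXT, url_count INTEGER, avg_position REAL,
        PRIMARY KEY (domain, period))');           // what the reports read
}

function aggregate_and_archive(PDO $db, string $period): void
{
    // On mysql the transaction only protects tables of a transactional
    // engine such as InnoDB.
    $db->beginTransaction();

    // 1. aggregate the current period into the snapshot table
    $stmt = $db->prepare(
        'INSERT INTO rank_snapshot (domain, period, url_count, avg_position)
         SELECT domain, :period, COUNT(DISTINCT url), AVG(position)
         FROM rank_temp GROUP BY domain'
    );
    $stmt->execute([':period' => $period]);

    // 2. dump the raw entries into the archive
    $db->exec('INSERT INTO rank_archive SELECT * FROM rank_temp');

    // 3. clear the temp table for new input
    $db->exec('DELETE FROM rank_temp');

    $db->commit();
}

$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
create_tables($db);

// a few raw entries for the current period
$insert = $db->prepare('INSERT INTO rank_temp VALUES (?, ?, ?, ?)');
$insert->execute(['example.com', '/a', 3, '2008-12-01']);
$insert->execute(['example.com', '/b', 7, '2008-12-01']);
$insert->execute(['example.org', '/', 12, '2008-12-01']);

aggregate_and_archive($db, '2008-12');

// the end-user report is now a primary-key lookup on a small table
$report = $db->query(
    "SELECT url_count, avg_position FROM rank_snapshot
     WHERE domain = 'example.com' AND period = '2008-12'"
)->fetch(PDO::FETCH_ASSOC);
```

The report query never touches the raw entries, so its cost depends on the number of domains and periods, not on the number of logged rows.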

functions

After bringing down query-time in the data layer, I can diminish the number of times functions are run by caching parts of pages to disk, as html tables or a serialized array. Instead of generating the same table every time a user hits a page, I can store the generated table and retrieve it from a server cache directory. That can also save some database connection handles.
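
A minimal sketch of such a disk cache, assuming a writable cache directory; the helper name cache_fragment(), the cache key and the one-hour lifetime are made up for the example.

```php
<?php
// Sketch only: cache_fragment() is a hypothetical helper, not an existing API.

/**
 * Return the cached html for $key if it is younger than $ttl seconds,
 * otherwise call $build(), store its output on disk and return it.
 */
function cache_fragment(string $dir, string $key, int $ttl, callable $build): string
{
    $file = $dir . '/' . md5($key) . '.html';

    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        return file_get_contents($file);        // cache hit: no query, no rendering
    }

    $html = $build();                           // cache miss: do the expensive work once
    file_put_contents($file, $html, LOCK_EX);
    return $html;
}

// a throwaway cache directory for the demonstration
$dir = sys_get_temp_dir() . '/fragment_cache_' . uniqid();
mkdir($dir, 0700, true);

$builds = 0;
$build = function () use (&$builds): string {
    $builds++;                                  // counts how often the table is generated
    return '<table><tr><td>example.com</td><td>3</td></tr></table>';
};

$first  = cache_fragment($dir, 'report:example.com', 3600, $build);
$second = cache_fragment($dir, 'report:example.com', 3600, $build);
```

The second call returns the stored file, so the table is generated only once within the hour.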

forms

In the form layer I can use the apache gzip module to compress the data sent to a client’s browser, which decreases traffic and page loading time.
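
The compression itself is done by Apache. The sketch below does not configure Apache; it only uses PHP’s zlib function gzencode() to get a feel for how much a repetitive html table shrinks. The table content is invented for the example.

```php
<?php
// Sketch only: measures gzip savings on a made-up html table.
// Requires the zlib extension (gzencode/gzdecode).

$rows = '';
for ($i = 1; $i <= 200; $i++) {
    $rows .= "<tr><td>example.com/page-$i</td><td>$i</td></tr>\n";
}
$html = "<table>\n$rows</table>\n";

$gz    = gzencode($html, 6);                    // 6 = middle-of-the-road compression level
$saved = 1 - strlen($gz) / strlen($html);       // fraction of bytes saved

printf("%d bytes -> %d bytes (%.0f%% saved)\n", strlen($html), strlen($gz), $saved * 100);
```

Report pages are mostly repeated markup, which is the kind of content gzip compresses well.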

evaluation

I am testing it with a site. For now it saves up to 80% in data traffic and page loading time; over the next few months I’ll see if it holds up. It seems to work fine for registration systems with a lot of static historic data, like a hit log, rank tracker, time sheets or banking statements.

Negative :

  • more tables
  • requires more programming and planning
  • I cannot easily add ‘on the fly’ reports

Positive :

  • minimal claim on database/application server resources
  • more control over transformation of data
  • good basis for a ‘rich’ frontend
  • scalable
  • fast pages
  • no large tables, fast queries
Categories
mysql, optimisation
Tags
mysql, optimisation

phpLD Mod : social bookmarking

juust | 19/08/2008

I was looking for a plugin to make it easy for people to bookmark their link-pages for my search engine optimized link directory. If they can add a bookmark and help get the page crawled, indexed, and ranked, I get a brutal backlink-count and fast increasing pagerank (free listing for a reciprocal).

So today I make a phpLD Bookmark-Mod ! I want one routine that puts the specific url of any page, with the category title, as a bookmark-link in the page. For the example I took ekstreme.com: they have a page with most major bookmarking sites, and from there you can add url+title to any ‘social’ site.

How?

By using the server variables that apache passes to php, $_SERVER["HTTP_HOST"] and $_SERVER["REQUEST_URI"].

phpLD uses index.php to generate all the link pages (it’s only one actual file). I will add some code to index.php to retrieve the ‘current’ category page url.

As my site uses seo-friendly mod_rewrite, the url contains the category names. I extract the path from http://links.trismegistos.net/health/senior_health/, replace the slashes and underscores with spaces, and it becomes my title “health senior health”.

  // full url of the page being requested
  $URL = "http://" . $_SERVER["HTTP_HOST"] . $_SERVER["REQUEST_URI"];
  // path part only, e.g. /health/senior_health/
  $PT = parse_url($URL, PHP_URL_PATH);
  // slashes and underscores become spaces: "health senior health"
  $PTT = trim(preg_replace('/[\/_]+/', ' ', $PT));
  $TITLE = "Trismegistos Links " . $PTT;

Now I need to get phpLD to list that anchor.

phpLD uses Templates (traditional data-functions-form (mysql, php, smarty)).

That means you cannot place php snippets directly in the forms. Luckily I have the $tpl-object in the php functions layer, which has a function to assign a value to a variable and store it in the template engine. Once the output file (the form) is made, any reference I make to the variable in the template is replaced with the value stored in it.

So I make the variable “socialize” in $tpl and assign the anchor I made to it.

  $TITLE = urlencode($TITLE);
  $URL = urlencode($URL);

  $socialize = "<a href=\"http://ekstreme.com/socializer/?url=$URL&amp;title=$TITLE\" rel=\"nofollow\" target=\"_blank\">Socialize!</a>";

  $tpl->assign('socialize', $socialize);

Now the anchor is stored in the $tpl-object, the template engine, and I can list the content of any variable stored in the ‘engine’ between {}. I pick FOOTER.TPL and add the reference {$socialize}.

  <table><tbody>
    <tr><td>pending</td><td>{$pending_links}</td><td>{$socialize}</td></tr>
  </tbody></table>
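
To show the mechanism, here is a toy stand-in for the template engine. It is not Smarty and not the phpLD code: the class MiniTemplate is made up for the example and only mimics the two steps used above, assign() to store a value and the replacement of {$name} references when the template is rendered.

```php
<?php
// Sketch only: MiniTemplate is a hypothetical toy, NOT the Smarty engine.

class MiniTemplate
{
    /** @var array<string, string> values stored with assign() */
    private array $vars = [];

    public function assign(string $name, string $value): void
    {
        $this->vars[$name] = $value;
    }

    public function render(string $template): string
    {
        // replace every {$name} with the stored value (empty string if unknown)
        return preg_replace_callback(
            '/\{\$(\w+)\}/',
            function (array $m): string {
                return $this->vars[$m[1]] ?? '';
            },
            $template
        );
    }
}

$tpl = new MiniTemplate();
$tpl->assign('pending_links', '4');
$tpl->assign('socialize', '<a href="#">Socialize!</a>');

// single quotes, so php itself leaves {$...} alone
$html = $tpl->render('<td>{$pending_links}</td><td>{$socialize}</td>');
```

The php layer never prints the anchor itself; it only hands the value to the engine, and the template decides where it appears.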

I store the php function in a file socialize.php in the root of the directory, and the last step is putting the extra routine in index.php :

  require_once 'socialize.php';

…right after require_once 'init.php'.

Every time a new page is loaded, the routine is executed: it retrieves the current url from the server, makes a new anchor and shows it on the page as a bookmark link. Exactly what I wanted.

Have a look at links.trismegistos.net – health – senior health, in the footer you see Socialize!, the link takes you to the ekstreme.com site where you get a list of all bookmarking sites available.

Feel free to cut-and-paste it, it’s only 8 lines of code.

  // build the page url and turn its path into a title
  $URL = "http://" . $_SERVER["HTTP_HOST"] . $_SERVER["REQUEST_URI"];
  $PT = parse_url($URL, PHP_URL_PATH);
  $PTT = trim(preg_replace('/[\/_]+/', ' ', $PT));
  $TITLE = "Trismegistos Links " . $PTT;
  // encode both values for use in the query string
  $TITLE = urlencode($TITLE);
  $URL = urlencode($URL);
  // make the anchor and hand it to the template engine
  $socialize = "<a href=\"http://ekstreme.com/socializer/?url=$URL&amp;title=$TITLE\" rel=\"nofollow\" target=\"_blank\">Socialize!</a>";
  $tpl->assign('socialize', $socialize);
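
If you want to try the routine outside phpLD, the same steps can be wrapped in a function that takes host and request uri as arguments instead of reading $_SERVER. This is a sketch, not part of the mod: the function name build_socialize_link() is made up, and it uses http_build_query() and htmlspecialchars() in place of the hand-built query string.

```php
<?php
// Sketch only: build_socialize_link() is a hypothetical wrapper around
// the routine above, so it can be tested without a web server.

function build_socialize_link(string $host, string $requestUri, string $siteName): string
{
    $url  = 'http://' . $host . $requestUri;
    $path = (string) parse_url($url, PHP_URL_PATH);

    // slashes and underscores become spaces: "/health/senior_health/" -> "health senior health"
    $words = trim(preg_replace('/[\/_]+/', ' ', $path));
    $title = trim($siteName . ' ' . $words);

    // http_build_query() url-encodes both values
    $query = http_build_query(['url' => $url, 'title' => $title], '', '&');
    $href  = 'http://ekstreme.com/socializer/?' . $query;

    // htmlspecialchars() turns & into &amp; for the html attribute
    return '<a href="' . htmlspecialchars($href, ENT_QUOTES)
         . '" rel="nofollow" target="_blank">Socialize!</a>';
}

$link = build_socialize_link(
    'links.trismegistos.net',
    '/health/senior_health/',
    'Trismegistos Links'
);
```

In socialize.php the call would then use $_SERVER["HTTP_HOST"] and $_SERVER["REQUEST_URI"] as the first two arguments, followed by $tpl->assign('socialize', $link).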
Categories
links, optimisation, pagerank
Tags
links, optimisation, pagerank, php

Cache Wordpress on IIS

admin | 18/08/2008

If you run Wordpress on an IIS Windows server with FastCGI and Wordpress is slow, you will need a cache to speed it up, and you might run into some problems. If so, check out
WP-Cache up and running on IIS on fanrastic.com: a very simple explanation of how to get the Wp-Cache plugin for Wordpress running on an IIS windows host, and it works.

Why use a cache ?

Most people skip sites with long load-times. Speeding up load times with a cache is a ‘must-have’ if you plan to develop and optimize a site aimed at getting its traffic through a search engine.

I installed Google Analytics a week ago and it confirms this: 85% of visitors left this site after one page, compared to 15% on another domain with normal load times. Pages took 4 seconds or more to load without a cache, and 1.5 seconds or less with a cache. The average was 1.14 pages per visitor for this site and 5.8 for the other.

For advertising purposes, a long loading time makes a site unattractive to advertisers. You don’t retain visitors, and the ‘bad rep’ your site gets also reflects on advertisers, especially with banner ads that work more visually/emotionally (image and experience are linked, so the poor performance of the site rubs off on the advertisers).

Webslug keeps a list of slow loading sites, some of them really extreme with 5 minute load times. Some advertising programs use a five second boundary: sites with higher loading times are considered unacceptable.

A cache also decreases server-load (which isn’t the foremost reason for wordpress bloggers to install it) and allows for a higher concurrent traffic volume. That matters once you become popular: without a cache, a few dozen visitors at the same time can overload your server.

Loads of reasons to install a cache.

——–

added 14-09

One other reason is MySQL, which can handle 2500 concurrent connections but only about 50 concurrent users. Problems arise when you have, for instance, a large file table or registered-user table for which MySQL builds a temporary index table (the ‘digg’ effect, where a sudden massive rush on the site makes MySQL opt out).

Using any caching mechanism diminishes the load on the mysql server, which effectively solves most problems with the site slowing down. For wordpress, removing unnecessary plug-ins helps, as does a page-cache. But neither solves the actual problem, the mysql database interaction.

Wordpress-MU might in time deliver the goods, a stable mysql query caching mechanism. For now it seems the ‘ease-of-use’ and the simplicity of the source-code dictate that a page-cacher will have to do for the standalone version.

So it depends on your plans for the website: Wordpress is the ideal tool for a site with up to, say, 1000 pages, but if you plan a larger, business-strength site, use a CMS that does have backend mysql query-caching.

Categories
seo, wordpress
Tags
optimisation, wordpress