Latest News: (loading..)

Archived

This topic is now archived and is closed to further replies.

Parikesit

Speed Up Database Query with MySQL 5+ (and more with MySQLi Extension)

17 posts in this topic

There are so many thread about speed up OsCommerce. In this thread I try to focus on MySQL queries optimization (without caching, addon, etc), just pure query technique based on MySQL 5+.

 

As a background, OsCommerce 2.X has been developed long time ago based on PHP 4+ and MySQL 3+ (or MySQL 4+, I'm not sure).

 

So there is a room for improvement just on query technique with current MySQL 5+ standard. And also another improvement with PHP MySQLi Extension.

 

 

Note:

Check your mysql version first before implement suggestion on this thread. Make sure you use MySQL-5+.

 

Cheers,

@zaenal

Share this post


Link to post
Share on other sites

I try to make a list of query technique available on MySQL 5+ before going to osCommerce php code.

  1. INSERT ... ON DUPLICATE KEY UPDATE Syntax (see: MySQL>Insert-Update)
  2. SELECT [sql_CALC_FOUND_ROWS] Syntax (see: MySQL>Select)
  3. Subqueries with ANY, IN, or SOME Syntax (see: MySQL>Subquery)
  4. Multi-query with PHP MySQLi Extension (see: PHP>MySQLi>Multi Query)

Any suggestions are welcome...

Share this post


Link to post
Share on other sites

IMPROVING QUERIES on SESSION HANDLING

-----------------------------------------------------------

 

First example of "INSERT ... ON DUPLICATE KEY UPDATE" Syntax, can be implemented on session handling. When using MySQL for storing session, at least this contributing 3 queries:

  • select current session value (if any);
  • determine session_key with select count;
  • and insert/update session value.

With "INSERT ON DUPLICATE KEY UPDATE", we can reduce it to 2 queries.

 

 

A. Open files below

  • catalog:/includes/functions/sessions.php
  • admin:/includes/functions/sessions.php

 

B. Edit Files

 

Find following code:

   function _sess_write($key, $val) {
  global $SESS_LIFE;
  $expiry = time() + $SESS_LIFE;
  $value = $val;
  $check_query = tep_db_query("select count(*) as total from " . TABLE_SESSIONS . " where sesskey = '" . tep_db_input($key) . "'");
  $check = tep_db_fetch_array($check_query);
  if ($check['total'] > 0) {
    return tep_db_query("update " . TABLE_SESSIONS . " set expiry = '" . tep_db_input($expiry) . "', value = '" . tep_db_input($value) . "' where sesskey = '" . tep_db_input($key) . "'");
  } else {
    return tep_db_query("insert into " . TABLE_SESSIONS . " values ('" . tep_db_input($key) . "', '" . tep_db_input($expiry) . "', '" . tep_db_input($value) . "')");
  }
   }

 

Replace with:

   function _sess_write($key, $val) {
  global $SESS_LIFE;
  $expiry = time() + $SESS_LIFE;
  $value = $val;
  tep_db_query("INSERT into " . TABLE_SESSIONS . " VALUES ('" . tep_db_input($key) . "', '" . tep_db_input($expiry) . "', '" . tep_db_input($value) . "' ON DUPLICATE KEY UPDATE expiry=VALUES(expiry), value=VALUES(value)");
   }

 

 

Cheers,

@zaenal

Share this post


Link to post
Share on other sites

IMPROVING QUERIES on SESSION HANDLING

-----------------------------------------------------------

...

Replace with:

function _sess_write($key, $val) {
  global $SESS_LIFE;
  $expiry = time() + $SESS_LIFE;
  $value = $val;
  tep_db_query("INSERT into " . TABLE_SESSIONS . " VALUES ('" . tep_db_input($key) . "', '" . tep_db_input($expiry) . "', '" . tep_db_input($value) . "' ON DUPLICATE KEY UPDATE expiry=VALUES(expiry), value=VALUES(value)");
}

 

Sorry, above query lack of close parenthesis ")" before ON DUPLICATE KEY ....

 

Here the right one:

function _sess_write($key, $val) {
  global $SESS_LIFE;
  $expiry = time() + $SESS_LIFE;
  $value = $val;

  tep_db_query("INSERT into " . TABLE_SESSIONS . " VALUES ('" . tep_db_input($key) . "', '" . tep_db_input($expiry) . "', '" . tep_db_input($value) . "') ON DUPLICATE KEY UPDATE expiry=VALUES(expiry), value=VALUES(value)");
}

 

Cheers,

@zaenal

Share this post


Link to post
Share on other sites

IMPROVING QUERIES on splitPageResults Class

-------------------------------------------

With: SELECT

[sql_CALC_FOUND_ROWS] Syntax


Before going further, here some notes:
  • splitPageResults class available both in catalog and admin area. But I recommend we don't touch the admin area. Beside the admin area will be less hits (only admin access it), optimizing splitPageResults in admin area will affects more files and also required compatibility with some addons.
  • When optimizing splitPageResults class in catalog, this will affects at least 6 files (as you can see in the end of this post).
  • I have tested the default queries versus the optimized queries on 200 rows, 130.000 rows, and 1.800.000 rows, the average results:
    => the optimized queries 15% faster than the default one (the number of rows does not affect the results).
  • More info can be found at MySQL website: Function:FOUND_ROWS().


A. Optimization
Note: You can edit it manually or you can download it below

The idea here is replacing select count(*) with select found_rows() in catalog:includes/classes/split_page_results.php.

//default query to calculate how many rows in result set
$count_query = tep_db_query("select count(" . $count_string . ") as total " . substr($this->sql_query, $pos_from, ($pos_to - $pos_from)));

 

Function found_rows() is demanding SQL_CALC_FOUND_ROWS to be added to the query and to be run before it. So we need to modify the class like below:

  • add variable $results to the class
  • modifiy function/constructor splitPageResults.

 class splitPageResults {
var $results, $sql_query, $number_of_rows, $current_page_number, $number_of_pages, $number_of_rows_per_page, $page_name;
/* class constructor */
function splitPageResults($query, $max_rows, $count_key = '*', $page_holder = 'page') {
  global $HTTP_GET_VARS, $HTTP_POST_VARS;

  //add SQL_CALC_FOUND_ROWS to the query
  $this->sql_query = preg_replace('@^SELECT (SQL_CALC_FOUND_ROWS)?@i', 'SELECT SQL_CALC_FOUND_ROWS ', trim($query));
  $this->page_name = $page_holder;
  if (isset($HTTP_GET_VARS[$page_holder])) {
	$page = $HTTP_GET_VARS[$page_holder];
  } elseif (isset($HTTP_POST_VARS[$page_holder])) {
	$page = $HTTP_POST_VARS[$page_holder];
  } else {
	$page = '';
  }
  if (empty($page) || !is_numeric($page)) $page = 1;
  $this->current_page_number = $page;
  $this->number_of_rows_per_page = $max_rows;

  $offset = ($this->number_of_rows_per_page * ($this->current_page_number - 1));

  $this->sql_query .= " limit " . max($offset, 0) . ", " . $this->number_of_rows_per_page;

  //RUN the query and get results object
  $this->results = tep_db_query($this->sql_query);

  //RUN FOUND_ROWS() to get total rows
  $count = tep_db_fetch_array(tep_db_query('SELECT FOUND_ROWS() as total'));
  $this->number_of_rows = intval($count['total']);

  $this->number_of_pages = ceil($this->number_of_rows / $this->number_of_rows_per_page);
  if ($this->current_page_number > $this->number_of_pages) {
	$this->current_page_number = $this->number_of_pages;
  }
}
//till the end...
//...
}

 

 

So the different between these 2 methods:

 

Standard class:

  • 1 query in the class: select count(*) ...
  • 1 query in the page that call this class
  • Return value: $obj->sql_query (string)

Optimized class:

  • 2 queries in the class: SELECT SQL_CALC_FOUND_ROWS... & SELECT FOUND ROWS()
  • 0 query in the page that call this class (because the query has been run in the class)
  • Return value: $obj->results (mysql object)

B. Example of Use

 

Below are 2 examples (as in catalog:includes/modules/product_listing.php) of how the default class and optimized class works.

 

=> The Old Way (Default Class)

//sample $listing_sql taken from product listing in index.php.
$listing_sql   = "select p.products_image, pd.products_name, p.products_id, p.manufacturers_id, p.products_price, p.products_tax_class_id, IF(s.status, s.specials_new_products_price, NULL) as specials_new_products_price, IF(s.status, s.specials_new_products_price, p.products_price) as final_price from products_description pd, products p left join manufacturers m on p.manufacturers_id = m.manufacturers_id left join specials s on p.products_id = s.products_id, products_to_categories p2c where p.products_status = '1' and p.products_id = p2c.products_id and pd.products_id = p2c.products_id and pd.language_id = '1' and p2c.categories_id = '10' order by pd.products_name";
$listing_split = new splitPageResults($listing_sql, MAX_DISPLAY_SEARCH_RESULTS, 'p.products_id');
if ($listing_split->number_of_rows > 0) {
 $rows = 0;
 $listing_query = tep_db_query($listing_split->sql_query);
 while ($listing = tep_db_fetch_array($listing_query)) {
$rows++;
echo $listing['products_name'] . "<br />\n";
 }
}

 

 

=> The New Way (Optimized Class)

//sample $listing_sql taken from product listing in index.php.
$listing_sql   = "select p.products_image, pd.products_name, p.products_id, p.manufacturers_id, p.products_price, p.products_tax_class_id, IF(s.status, s.specials_new_products_price, NULL) as specials_new_products_price, IF(s.status, s.specials_new_products_price, p.products_price) as final_price from products_description pd, products p left join manufacturers m on p.manufacturers_id = m.manufacturers_id left join specials s on p.products_id = s.products_id, products_to_categories p2c where p.products_status = '1' and p.products_id = p2c.products_id and pd.products_id = p2c.products_id and pd.language_id = '1' and p2c.categories_id = '10' order by pd.products_name";
$listing_split = new splitPageResults($listing_sql, MAX_DISPLAY_SEARCH_RESULTS);
if ($listing_split->number_of_rows > 0) {
 $rows = 0;
 //no need to run query again, but directly get the results from the objClass.
 while ($listing = tep_db_fetch_array($listing_split->results)) {
$rows++;
echo $listing['products_name'] . "<br />\n";
 }
}

 

 

C. Other Files to Edit

If you won't/don't edit below files, there is no need to optimize splitPageResults class. Because the optimized class add 1 query into it, and demanding you to remove 1 query from below files to make it more faster.

  • products_new.php
  • account_history.php
  • product_review.php
  • reviews.php
  • specials.php, and
  • includes/modules/product_listing.php

 

D. Download Optimized Class

Here the modified class:

split_page_results.php

 

Cheers,

@zaenal

Share this post


Link to post
Share on other sites

I have used the GROUP_CONCAT construct to support returning multiple attributes for a single product. It saves a lot of queries on getting those details; by making the listing query more complex... No idea by the way how it performs.

Paul

Share this post


Link to post
Share on other sites

Hi Paul,

 

That's GROUP_CONCAT should be listed for MySQL query enhancement.

 

Thanks,

@zaenal

Share this post


Link to post
Share on other sites

I'm not following what needs to be changed in the files listed in part C above. Can you be more clear?

Share this post


Link to post
Share on other sites

I'm not following what needs to be changed in the files listed in part C above. Can you be more clear?

 

All files listed in part C use splitPageResults class in the "old ways" not in the "new ways" as explained in part B.

 

Best,

@zaenal

Share this post


Link to post
Share on other sites

Have kept an eye on this thread as optimization is a keen hobby of mine... not much happening so although I'm usually not the sharing type (since I'm mean and horrible) I thought I'd post just a few of the 1000's of changes I've made to my core making it probably the fastest around. I've changed it so much a lot of the fixes and optimizations require other changes on top of other changes - but these are a couple I quickly checked against the current github code, and they appear to be drop in.

 

osCommerce, although a nice base to start out with and learn from, isn't well optimized and in some places not well written (SORRY HARALD, WE KNOW IT BUT LOVE YOUR EFFORT!) - My original base is late 2002 or early 2003 and has been transformed into an enterprise class (performance wise) site compatible with PHP 5.4, MySQL 5.5 etc, having sustained 16,000 concurrent users during a television appearance. Most pages are down to barely a few SQL calls, and they've all been heavily optimized with massive code rewrites and reworks to achieve complete performance.

 

THIS IS A NICE ONE - every entry in your shopping cart will cause an SQL query if you call $cart->calculate() to add a monetary value display somewhere on your page for instance. So 30 items = 30 SQL calls. Not any more:

 

includes/classes/shopping_cart.php calculate() - replace from the start of the function until //attributes price. My live one has more optimizations but this will work with stock osc. A few things: It's impossible to insert non (INT) product_id's to my cart so we're safe here using array_keys() before the security experts bark :) I didn't optimize the attributes section since I've removed that in favour of actual parent/child style attributes and options which helps with stock tracking, images, freight costings etc...

 

 

function calculate() {
	global currencies;
	$this->total = 0;
	$this->weight = 0;
	if ((!is_array($this->contents))||(sizeof($this->contents)==0)) return 0;
	$product_query = tep_db_query("select p.products_id, p.products_tax_class_id, p.products_weight, IF(s.specials_new_products_price, s.specials_new_products_price, p.products_price) as products_price
									FROM products p LEFT JOIN specials s ON p.products_id=s.products_id AND s.status=1
									WHERE p.products_id IN (".implode(',',array_keys($this->contents)) . ")");
	while ($product = tep_db_fetch_array($product_query)) {
		$products_tax = tep_get_tax_rate($product['products_tax_class_id']);
		$products_price = $product['products_price'];
		$products_weight = $product['products_weight'];
		$this->total += $currencies->calculate_price($products_price, $products_tax, $qty);
		$this->weight += ($qty * $products_weight);
	}

 

here's a few (should be) drop in replacements for includes/functions/general.php that uses more modern PHP features for significant performance gains:

 

EDIT: Whoa! Note: My tep_get_all_get_params() doesn't care about the session_id variable in $_GET, since that is long gone to me anyway. You may need to deal with this by pushing the session name to $exclude_array if you really actually need that crap (please dont!).

function tep_get_all_get_params($exclude_array = '') {
	if ((empty($exclude_array))||(!is_array($exclude_array))) $exclude_array = array();
	$get_url = http_build_query(array_diff_key($_GET,array_flip($exclude_array)));
	if (empty($get_url)) { return ''; } else { return $get_url.'&'; }
}
function tep_break_string($string, $len, $break_char = '-') {
	$str = explode(' ',$string);
	foreach($str as $key=>$val) {
			if (strlen($val)>$len) $str[$key]=substr(chunk_split($val, $len, $break_char),0,-(strlen($break_char)));
	}
	return implode(' ',$str);
}
function tep_round($number, $precision) { return round($number,$precision); }
function tep_array_to_string($array, $exclude = '') {
	if ((empty($exclude))||(!is_array($exclude))) $exclude = array();
	return http_build_query(array_diff_key($array,array_flip($exclude)));
}
function tep_not_null($value) { return !empty($value); }

 

 

These are a few that return the same as stock osc, a lot of functions are different in my base unfortunately due to new and obsoleted classes and functions, sorry! Why the empty() checks in front of is_array() you ask? Well... empty will fail through around 7 times faster than using is_array on an empty incoming value. SOURCE: http://www.phpbench.com/

 

If I get some time to compare more I'll post some up... enjoy.

Share this post


Link to post
Share on other sites

@@RaMoNsTeR

 

Would you mind revealing your method for negating the need for the session_id within the uri , or is it just through 'force cookie use'

Share this post


Link to post
Share on other sites

@@RaMoNsTeR

 

Would you mind revealing your method for negating the need for the session_id within the uri , or is it just through 'force cookie use'

 

Sure, really as you suspected, although I don't have a 'force cookie use' option anymore hehe! Cookies are forced but I've fixed the cookie code and session handling + cookie detection etc to be as robust, broad and transparent as possible - I have 2 big sites running on this core and never had anyone complain they can't purchase on the site due to cookie issues - its 2012 and the time for sessions in URLs and tolerating people who completely turn off cookies is gone for me.

 

In addition some pages which refused direct approach and threw the ugly "cookie screen" previously have been fixed with session/security in mind. I completely ridded the code of any trace of session URL adding, thus the omission in my tep_get_all_get_params() above. To be honest my site is at the point where variables aren't passed around much at all, its just so much cleaner.

 

Anyone looking at building an online shop with any kind of future in mind really shouldn't deploy a solution with "long vars" (i.e. $HTTP_GET....) and session ids stuffed to the end of URLs, I think OSC 2.x still has this off the shelf? Still, I wouldn't waste dev time on that 2.x base either from scratch =)

Share this post


Link to post
Share on other sites

Small incompatibility issue with tep_not_null() above. string type '0' returns incorrectly - use this instead:

 

function tep_not_null($value) { if (strlen(trim($value)) > 0) return true; return !empty($value); }

Share this post


Link to post
Share on other sites

Hi Parikesit,

 

While optimizing products_new.php & specials.php it returns errors. I believe its cause by array will you able to shed some light? Very nice codes by the way!!

 

Sam

Share this post


Link to post
Share on other sites

Since APC is standard in PHP 6+, and looking towards the future... there are *MASSIVE* gains for page load time to be had by lumping all the define() crud (database_tables.php / filenames.php / languages/<yours>) etc into an apc_constant_define(). Profiling with XDEBUG etc will confirm this, I now have category pages completing under the 4ms mark which is a milestone for me anyway.

 

Writing a cache function set for APC which allows the setup to run when APC isn't present is ideal, and gives functional support for flushes and other protection mechs. My application_top (as an example only - non working on std install as is) has:

 

// set the application parameters
    if (!cache_load_constants('configuration')) {
		    $config_array = array();
		    $configuration_query = tep_db_query('SELECT configuration_key AS cfgKey, configuration_value AS cfgValue FROM configuration');
		    while ($configuration = tep_db_fetch_array($configuration_query)) $config_array[$configuration['cfgKey']] = $configuration['cfgValue'];
		    cache_save_constants('configuration', $config_array);
    }

 

BEWARE: you need to be wary that these APC variables are PHP instance wide, so your wrapper should add something unique to the storage vars like the database name perhaps and whether its in test or production mode.

Share this post


Link to post
Share on other sites

Warning! I have tested the above suggestion for function calculate(). First it broke my site. So I fixed it to read:

 

   function calculate() {
  global $currencies;
  $this->total = 0;
  $this->weight = 0;
  if ((!is_array($this->contents))||(sizeof($this->contents)==0)) return 0;
  reset($this->contents);
   while (list($products_id, ) = each($this->contents)) {
    $qty = $this->contents[$products_id]['qty'];
// products price
    $product_query = tep_db_query("select p.products_id, p.products_tax_class_id, p.products_weight, IF(s.specials_new_products_price, s.specials_new_products_price, p.products_price) as products_price
								   FROM products p LEFT JOIN specials s ON p.products_id=s.products_id AND s.status=1
								   WHERE p.products_id IN (".implode(',',array_keys($this->contents)) . ")");
    while ($product = tep_db_fetch_array($product_query)) {
	 $products_tax = tep_get_tax_rate($product['products_tax_class_id']);
	 $products_price = $product['products_price'];
	 $products_weight = $product['products_weight'];
	 $this->total += $currencies->calculate_price($products_price, $products_tax, $qty);
	 $this->weight += ($qty * $products_weight);
	 }

 

Then, the next bug is that if you add items to your cart like so:

 

1 item in cart ===> subtotal is correct

add another item so you have 2 items in the cart ===> subtotal adds up and is multiplied by 2

add another item, so you have 3 items in the cart ===> subtotal adds up and is multiplied by 3...

 

O' my, O' my if the customers accepts that price... :)

 

Sara

Share this post


Link to post
Share on other sites