Jump to content

Archived

This topic is now archived and is closed to further replies.

Chemo

An example of how to debug a nasty query problem

Recommended Posts

I was contacted by a high volume site that is hosted on a dedicated server. The problem was that the MySQL process would randomly spike at 90% and hold for up to 30 seconds. The store owner seemed to believe it was somehow tied into the checkout process since there was a slight coorelation to the time an order was being placed.

 

So, the bug hunt began...

 

STEP 1

Since this was an obvious MySQL problem the first thing to do is create a debug script to pinpoint the nasty query. The standard logging simply will not work as the information contained is minimal at best. So, here is the modification to tep_query() that is located in *catalog*/includes/functions/database.php:

  function tep_db_query($query, $link = 'db_link') {
   global $$link;
$query_start = microtime();

   $result = mysql_query($query, $$link) or tep_db_error($query, mysql_errno(), mysql_error());

$_start = explode(' ', $query_start);
$_end = explode(' ', microtime());
$_time = number_format(($_end[1] + $_end[0] - ($_start[1] + $_start[0])), 6);

   if (defined('STORE_DB_TRANSACTIONS') && (STORE_DB_TRANSACTIONS == 'true') && ($_time > 5.00)) {
     if (SID) { $cookie_info = 'COOKIES: Not enabled'."\n"; }
  else { $cookie_info = 'COOKIES: Enabled'."\n"; }
  $ip_addy = 'IP ADDRESS: ' . $_SERVER["REMOTE_ADDR"] . "\n";
  error_log($cookie_info. $ip_addy . 'TIME: '.strftime('%m/%d/%Y %H:%M:%S')."\n".'EXECUTION TIME: '.$_time .' seconds'."\n".'SCRIPT: '.$_SERVER["PHP_SELF"]."\n".'QUERY ' . $query . "\n\n", 3, STORE_PAGE_PARSE_TIME_LOG);
   }
   return $result;
 }

Now we were getting somewhere! The log file was now being populated with each query THAT TOOK LONGER THAN 5 SECONDS TO EXECUTE. The difference is that the stock code logged everything so you would have to search through thousands of entries to find the offending query. This mod only logged the process killers.

 

In addition, each entry has in a readable format whether the customer has cookies enabled or not, their IP, SCRIPT EXECUTING THE QUERY, query execution time, and the actual query.

 

STEP 2

Now it was time to look at the log file. Almost immediately it was obvious where the problem was located. Here is a sample of the log file entry:

COOKIES: Enabled
IP ADDRESS: <removed>
TIME: 11/17/2004 17:23:56
EXECUTION TIME: 23.917675 seconds
SCRIPT: /product_info.php
QUERY select p.products_id, p.products_image from orders_products opa, orders_products opb, orders o, products p where opa.products_id = '124' and opa.orders_id = opb.orders_id and opb.products_id != '124' and opb.products_id = p.products_id and opb.orders_id = o.orders_id and p.products_status = '1' group by p.products_id order by o.date_purchased desc limit 6

Well look at that...it's not in the checkout process after all. It was confirmed when the log file was entirely populated with exact entiries.

 

It should be obvious to anyone that is familiar with osC as to the source of the offending query: the also_purchased.php module.

 

I asked if he had the cache turned on and it was verified that the setting was true. However, the cache files were not being written to file (and hence, used).

 

STEP 3

Now it was clear what the root cause was for the MySQL process spike. I looked over all the code and everything was in order for the box cache system...what now? It had to be a setting issue! I asked for the log directory listed in the admin control panel and the problem was solved! The log file directory did not contain an ending slash.

 

STEP 4

Now that the cache was fixed and confirmed to be functioning properly the log file was reset. An amount of time was allowed to pass so the most popular products would be cached. Once again, the log file was reset. The owner noted that the spikes were still happening but at a lower frequency. The log file confirmed this but now we had encountered another problem: the cache system only works for cookied customers! All those with cookies turned off still executed the HUGE query thereby spiking the MySQL process.

 

STEP 5

It was clear that the module needed to be cached as well or the server would suffer severe load at times. So, I implemented a cache code that is similiar to the page cache contribution but on a scaled down version. Basically, it buffers the output and strips the osCsid from each URL substituting a placeholder (<osCsid>). Then, I created a custom function to read the cache and replace the placeholder with the customer's osCsid THEN render the HTML code. Worked beautifully...no bugs.

 

STEP 6

Monitor the log file and server stats. So far there are no further occurrences and the owner commented that the load is extremely low compared to before the debugging.

 

Now, it should be noted that if your cache file system does not work and you have:

 

1. A large number of orders

or

2. A large nubmer of products

or

3. Both

 

...your site will present the same symptoms. The owner was lucky enough to have a dedicated server so as to monitor the process useage. If he were virtually hosted he would never know and probably never find out his customers were getting 30+ second load times on the product_info pages.

 

The moral of all this: enable the cache system and make sure it is writing files to disk properly.

 

So, that is how the story ends...

Share this post


Link to post
Share on other sites

Honestly, my site was hitting 95% mysql usage (on a dedicated p4-2.8ghz, this was a bit unheard of).. we hadn't pinpointed it to the orders until I noticed a small surge of orders coming in at a certain time and the mysql just sitting at 95% usage. You can imagine that on a dedicated server, this can kill other sites very easily, causing problems

 

It is important to note that we were working with a CRE Loaded 6.1a version of oscommerce, and that this modified pack seems to have more loose code than the stock oscommerce.

 

Thanks again Bobby, you're a lifesaver... literally. I think my webhost guy was gonna strangle me for having that high of a server usage :)

Share this post


Link to post
Share on other sites

'Tis what I do :)

 

Hopefully others will read this and get the take home lesson: enable the cache and make sure it's working!

 

BTW, your site is flying now!

Share this post


Link to post
Share on other sites
'Tis what I do :)

 

Hopefully others will read this and get the take home lesson: enable the cache and make sure it's working!

 

BTW, your site is flying now!

 

another great contribution...thx chemo :D yer on a roll!

 

should I do this if I'm using the page cache contrib?

Share this post


Link to post
Share on other sites

No need if using the page cache...but he was using a template system which is not compatible.

 

The main point is that one nasty query (if you have a large number of products or orders) can easily be cached BUT only if the cache is working correctly.

 

Also, the modification of the logging makes it a bit more useful for pinpointing the problem file and query.

 

Hopefully, someone gets some good info out of the post.

Share this post


Link to post
Share on other sites
Hopefully, someone gets some good info out of the post.

 

Thanks for posting, great information. Impressed to see you take the time to outline the problem and how you fixed it, just so that others may benefit from your work!

Share this post


Link to post
Share on other sites

I've been following chemo's posts and threads over the last couple of weeks. Chemo is finishing the year strong and should be forum member of the year ;)

Share this post


Link to post
Share on other sites
I've been following chemo's posts and threads over the last couple of weeks. Chemo is finishing the year strong and should be forum member of the year ;)

Much gratitude for the kind remarks...but it is bitter sweet for me.

 

I used to have 3 clients that paid me a monthly salary to webmaster their sites (create custom scripts, debug, optimize, SEO, etc). One of them recently decided that 4 months of stability on his site and strong rankings signalled the end of my service term. I guess I worked myself out of business with him :)

 

At any rate, that leaves a few hours daily to spend more time on my favorite forum. All I've been doing recently is giving back the few years of experience that I have with osC. You know...I've watched the project grow over the years from the old tep days to the current MS2. I have to say that I am absolutely floored with the radical MS3. The layers of abstraction are very useful and the new classes are elegant to say the least. I can't wait until hpdl let's us start posting MS3 contributions since I have already coded 3 good ones.

 

You know who should get the coder of the year award? The development team...we just contribute to the master code that is osC. If it weren't for their hard work and dedication none of us would be coding add-ons.

 

So, let's hear it for the Development Team! Cheers!

Share this post


Link to post
Share on other sites

WOW. this topic just saved me so much.

after having SQL load and oscommernce pages problems. it all came down to also_purchased.php module. maybe at some point i installed a contrib but.. jezez christ.

 

pages either loaded or took 50+sec to load vs the 0.85sec. in restored the also_purchased.php to stock. and it runs prefect. quries down to 40s from 70s.

 

who ever did the also_purchased.php contrib should be thrown in a hole.

Share this post


Link to post
Share on other sites

Chemo are you still around? those posts are 5 yars old.

 

I have this problem now and I suspect it is from the google code added to track statsitics for the adwords (I could be way off) because it started immediately after that was added and nothing more really changed.

 

I could remove the code to check it that is it but that defeats the purpose.

 

Any help it appreciated.

 

And the damb site is just starting to come together too!


The Site can be viewed at www.performanceautopartsonline.com

 

The site is live (despite these minor glitches) please respect that and do not sign up etc...

 

maybe a contribution one day when I get this site the way I want it.

 

I don't make spelling mistakes! I have dyslecsic fingers.

Share this post


Link to post
Share on other sites

Shoot I just noticed he was banned! Doh


The Site can be viewed at www.performanceautopartsonline.com

 

The site is live (despite these minor glitches) please respect that and do not sign up etc...

 

maybe a contribution one day when I get this site the way I want it.

 

I don't make spelling mistakes! I have dyslecsic fingers.

Share this post


Link to post
Share on other sites

×