Jump to content
  • Checkout
  • Login
  • Get in touch

osCommerce

The e-commerce.

Also Purchased query performance problem


mattsc

Recommended Posts

I'm getting a huge performance hit (on the order of 40 seconds!) on the query to retrieve the "Also Purchased" items. My orders_products table is just over 130k rows, orders is @ 36k rows, my and products and products description row count are both around 4k rows. This is of course hitting my most popular products the worst. For now, I've disabled the Also Purchased part of the product info page, but I'm hoping someone might have an idea here about how to fix this.

 

For debugging purposes, I've been running the query against the database directly, and sure enough, this query is my problem:

MariaDB [osc_v2.3.4_GOLD]> select
         p.products_id, p.products_image, pd.products_name
     from
         orders_products opa,
         orders_products opb,
         orders o,
         products p
     LEFT JOIN
         products_description pd
     on
         p.products_id = pd.products_id
     where
         opa.products_id = 512
         and opa.orders_id = opb.orders_id
         and opb.products_id != 512
         and opb.products_id = p.products_id
         and opb.orders_id = o.orders_id
         and p.products_status = '1'
         and pd.language_id = 1
     group by
         p.products_id order
         by o.date_purchased desc
         limit 3;
+-------------+---------------------------------+----------------------------------------------------------------------------+
| products_id | products_image                  | products_name                                                              |
+-------------+---------------------------------+----------------------------------------------------------------------------+
|        2536 | fuelsensornsr150sp-b41b50dc.jpg | Fuel Level Sensor, NSR150SP                                                |
|        3008 | 16950kw6901-142699a3.jpg        | Fuel Tap / Pet Cock, NSR150SP                                              |
|        3867 | yzset1-4f369a33.jpg             | Set, Exhaust Chamber, Stainless Steel with Stainless Steel Silencer, YZ250 |
+-------------+---------------------------------+----------------------------------------------------------------------------+
3 rows in set (41.16 sec)

When I run this against another product (changing product 512 to 521):

MariaDB [osc_v2.3.4_GOLD]> select
         p.products_id, p.products_image, pd.products_name
     from
         orders_products opa,
         orders_products opb,
         orders o,
         products p
     LEFT JOIN
         products_description pd
     on
         p.products_id = pd.products_id
     where
         opa.products_id = 521
         and opa.orders_id = opb.orders_id
         and opb.products_id != 521
         and opb.products_id = p.products_id
         and opb.orders_id = o.orders_id
         and p.products_status = '1'
         and pd.language_id = 1
     group by
         p.products_id order
         by o.date_purchased desc
         limit 3;
+-------------+----------------------------+------------------------------------------------------------------------+
| products_id | products_image             | products_name                                                          |
+-------------+----------------------------+------------------------------------------------------------------------+
|        3287 | rearstain-be840159.jpg     | Set, Pipe, Full Race System, Ninja 250R/300R, Round Stainless Silencer |
|        4281 | meshsliponset-57918fee.jpg | Set, Pipe, Slip On, Mesh, Slash Cut, Yamaha R25/R3                     |
|        2947 | bungs1-33783ffe.jpg        | Wheel Crash Bungs                                                      |
+-------------+----------------------------+------------------------------------------------------------------------+
3 rows in set (4.25 sec)
It's returning in just over 4 seconds... still not terrific, but not the 10X speed hit I'm getting on my popular products otherwise.

 

My SQL-Fu is failing me here as to how to try to optimize this query, or possibly throw a index on the table(s) to try and speed things up.
Link to comment
Share on other sites

The major problem here is that 130k orders_products table, combined with the 36k orders table. The tables are already indexed, so that's not a solution.

 

The easiest way to solve this is to purge your older orders. If you've been accumulating them for many years, clearing out all but the last year or two will help a lot. You shouldn't need to keep old orders online. Of course you should back up your database first, in case you need some of that old data.

 

The only other solution I can think of would be to rewrite the code to run the query manually in the Admin and store the results in a new database table. The catalog side query would then pull the results from this new table, which would be a great deal faster. The downside is that you need to manually run the query periodically to refresh the new table. That and the new code that would have to be written to do this of course.

 

I would love to see a better solution to this. Anybody?

 

Regards

Jim

See my profile for a list of my addons and ways to get support.

Link to comment
Share on other sites

The major problem here is that 130k orders_products table, combined with the 36k orders table. The tables are already indexed, so that's not a solution.

 

The easiest way to solve this is to purge your older orders. If you've been accumulating them for many years, clearing out all but the last year or two will help a lot. You shouldn't need to keep old orders online. Of course you should back up your database first, in case you need some of that old data.

 

The only other solution I can think of would be to rewrite the code to run the query manually in the Admin and store the results in a new database table. The catalog side query would then pull the results from this new table, which would be a great deal faster. The downside is that you need to manually run the query periodically to refresh the new table. That and the new code that would have to be written to do this of course.

 

I would love to see a better solution to this. Anybody?

 

Regards

Jim

 

So I may have a problem then with my tables NOT having indices (indexes? indiees? indixes? sounds like I'm trying to pluralize moose as mouses LOL! anyway!) on them. When I run show index from on the various tables, I'm seeing the primary keys on orders (orders_id) and orders_products (orders_products_id) for the tables, but no other form of an index on either of those two tables. Both products and products_description have multiple indicies (great scott that just feels wrong to 'say') on them, but appear to be pretty primitive in the form of their optimization for what I would expect to see. This database started it's life as a 2.2 install and has, through it's years, been wrangled and harassed into it's job for, well, obviously a while.

 

I don't particularly have any problem with creating a separate database copy from which to run reports. It would, as you note, be a bit of a PITA, but it's possible to re-write the reports to query against a view which included historical data as well as current sales. I certainly don't look forward to doing that work, but it's a possibility. I'd have to chat with my partners to see what their thoughts on the matter would be. For now, I've disabled the "also purchased" part of the product_info page... something else which isn't terribly ideal, but at least it got the page loading quickly again for the most popular products.

 

If I am missing an index or five, from the years of my marginally ham fisted attempts at being a dba, I'd love to be given a hint of "Oh dude, you're missing X Y and Z so go run fix_your_database.sql" and get the performance down from 40+ seconds back to realistic query times.

Link to comment
Share on other sites

@@mattsc Have you tried turning on the cache function?

 

Dan 

 

In a word... I have not. Part of this is historically the store hasn't used it and has been acceptably performant. I have been adverse to doing anything which I don't acceptably understand.... and thusly... fear. I have no better excuse as to why I haven't enabled cache. While I'm moderately accomplished at hacking y way through the OSC PHP code, and have some experience with database work, I am at best of days, a hack. I'm far more comfortable in the snuggly realms of Perl, where when I get in over my ass I can send up a call for help on perlmonks and get fifteen answers to any problem, all of which work, and not a one of the makes any sense. :)

Link to comment
Share on other sites

Try it Matt...it should solve your problem....if not you can always turn it off again.  There are settings for it in admin if you haven't run across them.

 

Dan

Link to comment
Share on other sites

If this is still a 2.2 version database, you are probably missing an index or three. I would compare the database schema for your database to the latest 2.3.4 and add in any missing indexes. It would probably help.

 

Cache is a horrible solution to anything. It should only be used as a temporary fix to keep the store running while you fix the problem. If you do use it, remember that you need to flush the cache every time you make a change or your change will be invisible until some random future time.

 

Regards

Jim

See my profile for a list of my addons and ways to get support.

Link to comment
Share on other sites

The other solution is to limit the number of orders the also_purchased searches through. eg 1 month orders ---  6 months etc

 

This needs a  line added in the database and a change to the also purchased module.

This is my also purchased

Copyright (c) 2010 osCommerce

  Released under the GNU General Public License
*/

  if (isset($HTTP_GET_VARS['products_id'])) {
    $orders_query = tep_db_query("select p.products_id, p.products_image, pd.products_name from " . TABLE_ORDERS_PRODUCTS . " opa, " . TABLE_ORDERS_PRODUCTS . " opb, " . TABLE_ORDERS . " o, " . TABLE_PRODUCTS . " p LEFT JOIN " . TABLE_PRODUCTS_DESCRIPTION . " pd on p.products_id = pd.products_id where opa.products_id = '" . (int)$HTTP_GET_VARS['products_id'] . "' and opa.orders_id = opb.orders_id and opb.products_id != '" . (int)$HTTP_GET_VARS['products_id'] . "' and opb.products_id = p.products_id and opb.orders_id = o.orders_id and p.products_status = '1' /*and p.products_quantity > '0'*/ and pd.language_id = '" . (int)$languages_id . "' and o.date_purchased >
SUBDATE( now( ) , INTERVAL " .
MAX_DISPLAY_TIME_ALSO_PURCHASED . " DAY ) group by p.products_id order by o.date_purchased desc limit " . MAX_DISPLAY_ALSO_PURCHASED);
    $num_products_ordered = tep_db_num_rows($orders_query);
    if ($num_products_ordered >= MIN_DISPLAY_ALSO_PURCHASED) {

      $also_pur_prods_content = NULL;

      while ($orders = tep_db_fetch_array($orders_query)) {
        $also_pur_prods_content .= '<div class="col-xs-6 col-sm-4 col-md-4">';
        $also_pur_prods_content .= '  <div class="thumbnail equal-height wrapper text-center">';
        $also_pur_prods_content .= '    <a href="' . tep_href_link(FILENAME_PRODUCT_INFO, 'products_id=' . $orders['products_id']) . '">' . tep_image(DIR_WS_IMAGES . $orders['products_image'], $orders['products_name'], SMALL_IMAGE_WIDTH, SMALL_IMAGE_HEIGHT) . '</a>';
        $also_pur_prods_content .= '    <div class="caption"><p class="text-center">';
        $also_pur_prods_content .= '      <a href="' . tep_href_link(FILENAME_PRODUCT_INFO, 'products_id=' . $orders['products_id']) . '">' . $orders['products_name'] . '</a>';
        $also_pur_prods_content .= '  </p></div>';
        $also_pur_prods_content .= '  </div>';
        $also_pur_prods_content .= '</div>';
      }

?>

  	<hr>

  <h3><?php echo TEXT_ALSO_PURCHASED_PRODUCTS; ?></h3>

  <div class="row">
    <?php echo $also_pur_prods_content; ?>
  </div>


<?php
    }
  }
?>

There is MAX_DISPLAY_TIME_ALSO_PURCHASED added in the database

In admin under configuration I have an input where i can say how many days to check.

 

I am unable to tell you how to add the necessary sql ( I dont know how to write the input) ... but maybe someone here can work backwards from the code above or show how to hard code the time frame in the above code.

 

Doug

Link to comment
Share on other sites

@@douglaswalker  That SQL scans the entire orders table looking for dates that fit in the required range. Without actually testing it, I would say that it is very likely to be slower than the stock code.

 

Regards

Jim

See my profile for a list of my addons and ways to get support.

Link to comment
Share on other sites

If this is still a 2.2 version database, you are probably missing an index or three. I would compare the database schema for your database to the latest 2.3.4 and add in any missing indexes. It would probably help.

 

Cache is a horrible solution to anything. It should only be used as a temporary fix to keep the store running while you fix the problem. If you do use it, remember that you need to flush the cache every time you make a change or your change will be invisible until some random future time.

 

Regards

Jim

 

It's been migrated to the 2.3.4-BS-Gold database. Now, if I did that CORRECTLY in entirety, I BELIEVE I did, but maybe a missing index or three could have happened in the migration of old data into the new structure. I've had to migrate data a few times in the ongoing development before the site went live (finally).

Link to comment
Share on other sites

The module doesn't use caching, which can't work properly for something that's supposed to give you the most recent three also-purchased for a product unless there's something really clever built into your checkout process.

 

Perhaps come at it from another direction and think about something that'll add more value to the store by selling what you want to sell rather than what people happen to have bought recently - a cross-sell or linked product addon. A lot of work to set up manually but if you can come up with a way to populate it from a query...

Contact me for work on updating existing stores - whether to Phoenix or the new osC when it's released.

Looking for a payment or shipping module? Maybe I've already done it.

Working on generalising bespoke solutions for Quickbooks integration, Easify integration and pay4later (DEKO) integration at 2.3.x

Link to comment
Share on other sites

The only other solution I can think of would be to rewrite the code to run the query manually in the Admin and store the results in a new database table. The catalog side query would then pull the results from this new table, which would be a great deal faster. The downside is that you need to manually run the query periodically to refresh the new table. That and the new code that would have to be written to do this of course.

 

This (as a new P_I content module).  I moved the stock also_purchased into its own module a little while back, so it should (for anyone on latest edge or updating as I make changes, to (assuming someone makes such a module, I know we're all super-busy) to simply turn off the stock one, upload the new one, and turn it on...

 

Plus a dashboard module that pops up (say) every 50 new orders to remind shopowner to "press the button" to refresh the also_purchased.

Link to comment
Share on other sites

What mySQL engine do you use @@mattsc?

:blink:
osCommerce based shop owner with minimal design and focused on background works. When the less is more.
Email managment with tracking pixel, package managment for shipping, stock management, warehouse managment with bar code reader, parcel shops management on 3000 pickup points without local store.

Link to comment
Share on other sites

@@BrockleyJohn

 

My also_purchased is not modular yet ( :blush: ) so the cache function still works. The cache resets itself every now and then.

It does work and I really don't notice any load issues.

 

@@kymation

 

I really like the idea to take it offline and into admin and just run it every now and then. 

Link to comment
Share on other sites

Hi there

 

http://addons.oscommerce.com/info/3294

 

found this very old addon which does what 

 

@@burt @@kymation were talking about .... re pre-populating database tables with ids for the also purchased products.

 

I have tested it in very limited form.. it works but I am unable to test speed etc

 

I wondered if anyone else would like to take it for a test drive and see what they think?

 

It talks a bout a cron job but i just ran the file to populate the fields.

 

Let me know what you think?

Link to comment
Share on other sites

I have about 20k in my orders table and 50k lines in orders_products.  When I run your first query using phpmyadmin, I get (3 total, Query took 0.0008 sec). 

I'm not really a dog.

Link to comment
Share on other sites

That is from the limit set on that query at the end of the query, "limit 3".

 

However, I'm using MySQL 5.6.35 and I noticed @@mattsc is using MariaDB.  I think Maria is supposed to be able to run all that MySQL runs.  On a side note, I see in my WHM of Cpanel that they are switching to Maria in the future instead of MySQL.

I'm not really a dog.

Link to comment
Share on other sites

1)

since the order_products table has a system generated auto incrementing key, you can just order by that key desc and avoid joining with the orders table completely.

select
         p.products_id, p.products_image, pd.products_name
     from
         orders_products opa,
         orders_products opb,
         products p
     JOIN
         products_description pd
     on
         p.products_id = pd.products_id
     where
         opa.products_id = 3048
         and opa.orders_id = opb.orders_id
         and opb.products_id != 3048
         and opb.products_id = p.products_id        
         and p.products_status = '1'
         and pd.language_id = 1
     group by
         p.products_id order
         by opb.orders_products_id desc
         limit 3;
2) have you ran an optimize on your database to rebuild the indexes ?

KEEP CALM AND CARRY ON

I do not use the responsive bootstrap version since i coded my responsive version earlier, but i have bought every 28d of code package to support burts effort and keep this forum alive (albeit more like on life support).

So if you are still here ? What are you waiting for ?!

 

Find the most frequent unique errors to fix:

grep "PHP" php_error_log.txt | sed "s/^.* PHP/PHP/g" |grep "line" |sort | uniq -c | sort -r > counterrors.txt

Link to comment
Share on other sites

1)

since the order_products table has a system generated auto incrementing key, you can just order by that key desc and avoid joining with the orders table completely.

select
         p.products_id, p.products_image, pd.products_name
     from
         orders_products opa,
         orders_products opb,
         products p
     JOIN
         products_description pd
     on
         p.products_id = pd.products_id
     where
         opa.products_id = 3048
         and opa.orders_id = opb.orders_id
         and opb.products_id != 3048
         and opb.products_id = p.products_id        
         and p.products_status = '1'
         and pd.language_id = 1
     group by
         p.products_id order
         by opb.orders_products_id desc
         limit 3;
2) have you ran an optimize on your database to rebuild the indexes ?

 

 

1) The re-written query returns 10 seconds faster, but instead of 42 seconds it's coming back in 32 seconds now. Not an insignificant change by any means, but still too slow. :( Ideally I'd have this query returning in < 5 seconds, tops.

 

2) I ran optimize table on products and order_products. No change in query performance. Still getting a 32 second execution time.

Link to comment
Share on other sites

Archived

This topic is now archived and is closed to further replies.

×
×
  • Create New...