ericksaint Posted July 29, 2018
Recently, a site I watch over for a friend had a sudden traffic spike that led his host to shut the site down for exceeding its traffic allowance. He has never had a traffic problem in 10 years. From the logs, it looks like bots are constantly downloading the entire image directory. Is there any way to protect this folder so the images can't be downloaded, but can still be accessed when people click on an item in the store? I've tried blocking IPs through the server, but the bots just come back with other IPs.
♥JcMagpie Posted July 29, 2018
Have you blocked the bots in your .htaccess file? It's not 100% effective, but it's still worth doing. I'm very surprised that a simple bot downloading images got your site blocked by your host! Perhaps you need to look further into what is happening. Also, most hosts will give you several warnings about bandwidth usage before blocking. If yours did not, perhaps you should look for a better host. Bot lists change all the time, so you need to see which one works for you; this is another one you may want to try: http://tab-studio.com/en/blocking-robots-on-your-page/
Add this to your .htaccess; you can add any other bots you find accessing your site. Be sure to back up your file first.
# Block bad bots
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
# The address in the next pattern is obscured in the source; the line is left
# disabled because the unescaped space would make the directive invalid.
# RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:[email protected] [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^ICG-AutoExploiterBoT [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
# Answer any matching user agent with 403 Forbidden
RewriteRule ^.* - [F,L]
# Prevent directory listings
Options -Indexes
Jack_mcs Posted July 29, 2018
@Ericksaint That is most likely caused by data skimmers: unfriendly bots that scour a site for data they can use or sell. Some will be hackers. If the site has been active for 10 years, it has had them before; it is just that now there are either more of them or they are all hitting at the same time. Often the cause is known bots such as yandex, baidu and mj12bot, among others. Blocking by user agent will sometimes work, but some bots leave that field empty or disguise it to get past such blocks. I suggest you install View Counter. It will let you see who is causing the problem and block them.
Support Links: For Hire: Contact me for anything you need help with for your shop: upgrading, hosting, repairs, code written, etc. Get the latest versions of my addons Recommended SEO Addons
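As a rough sketch of user-agent blocking for the crawlers named in this post, something like the following could go in the shop's .htaccess. It assumes mod_rewrite is enabled and that the crawlers identify themselves with the tokens YandexBot, Baiduspider and MJ12bot; as the post says, a bot that hides or fakes its user agent will not be caught, and blocking search crawlers also removes the shop from those search engines.

```apache
RewriteEngine On
# Unanchored, case-insensitive match: the token may appear anywhere
# in the User-Agent header.
# Assumption: the crawlers announce themselves with these tokens.
RewriteCond %{HTTP_USER_AGENT} (YandexBot|Baiduspider|MJ12bot) [NC]
# "-" means no substitution; F answers with 403 Forbidden
RewriteRule ^ - [F]
```

Checking the access log for 403 responses afterwards is an easy way to confirm the pattern matches what these bots actually send.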
ericksaint (Author) Posted July 29, 2018
5 hours ago, Jack_mcs said: I suggest you install View Counter. It will allow you to see who is causing the problem and to block them.
I actually installed View Counter last night after posting this. I just went into the monitor to see what happened overnight. When I click the "next page" button at the bottom right, I get the error below. The only change I made to the settings was to show 50 lines per page instead of the default. I'm pretty sure the name and email in the error are fake; I'm trying to track that down now.
Fatal error: Uncaught exception 'Exception' with message 'invalid data, remaining: :"navigationHistory":2:{s:4:"path";a:1:{i:0;a:4:{s:4:"page";s:14:"contact_us.php";s:4:"mode";s:6:"NONSSL";s:3:"get";a:1:{s:6:"action";s:4:"send";}s:4:"post";a:6:{s:6:"formid";s:32:"92046e082f1021b3d5689f048baa0ff4";s:4:"name";s:12:"VincentJeant";s:5:"email";s:18:"[email protected]";s:7:"enquiry";s:113:"????????? ???????????? ?????? ????? <a href=http://495realty.ru/>495realty.ru</a>";s:20:"g-recaptcha-response";s:0:"";s:6:"submit";s:0:"";}}}s:8:"snapshot";a:0:{}}' in /home/XXXXXXXXXXXX/includes/functions/view_counter.php:1727
Stack trace:
#0 /home/XXXXXXXXXXXX/includes/functions/view_counter.php(1325): UnserializeSession('sessiontoken|s:...')
#1 /home/XXXXXXXXXXXX/view_counter.php(679): ShowCart('84fcdf65122c221...', 459, '')
#2 {main}
thrown in /home/XXXXXXXXXXXX/includes/functions/view_counter.php on line 1727
ericksaint (Author) Posted July 29, 2018
11 hours ago, JcMagpie said: Have you blocked the bots in your .htaccess file? [...] most hosts will give you several warnings about bandwidth usage before blocking!
I have not blocked anything specific in .htaccess at this point, like the list above. From what I'm told, the host sent a couple of warnings about the increased traffic over about 3 days and then blocked the site. I just watch over the "code" part of the site for a friend. I don't get emails from his host; I only have access to the host's server-side interface and the admin panels, so I can help fix things when they go awry. I'm the "friend that knows about computers", if you will.
Jack_mcs Posted July 30, 2018
7 hours ago, ericksaint said: The only change I made to the settings was to show 50 lines per page instead of the default.
I've never seen an error like that from VC, and no one else has reported it. The link in it is from a hacker in Russia trying to cause problems. I suppose something in the session data for that visitor could cause the failure, but I can't reproduce it, so I can't say for sure. Just out of curiosity, what versions of oscommerce and oho are being used?
JAValeryon Posted July 30, 2018
On 29/7/2018 at 4:12 AM, ericksaint said: Is there any way to protect this folder so the images can't be downloaded, but they can still be accessed when people click on the item in the store?
I had a similar problem on my site. I was suffering from hotlinking (bandwidth leeching). https://httpd.apache.org/docs/2.4/rewrite/access.html
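For reference, hotlink protection along the lines of the Apache guide linked in this post might look like the sketch below. The domain example.com is a placeholder for the shop's own domain, and the extension list is an assumption. Requests that send no Referer at all are let through here so that privacy tools and direct visits keep working, which also means a bot that sends no Referer is not stopped by this alone.

```apache
RewriteEngine On
# Let requests without a Referer header through
RewriteCond %{HTTP_REFERER} !^$
# Let requests referred by the shop itself through
# (example.com is a placeholder, not the real site)
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
# Everything else asking for an image gets 403 Forbidden
RewriteRule \.(gif|jpe?g|png)$ - [F,NC]
```

Dropping the first condition makes the rule stricter, at the cost of refusing images to legitimate visitors whose browsers or proxies strip the Referer header.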
ericksaint (Author) Posted July 30, 2018
21 hours ago, Jack_mcs said: I've never seen an error like that for VC and no one else has reported it. [...] Just out of curiosity, what versions of oscommerce and oho are being used?
The error went away after I applied this fix from the VC support thread, in the "Unserialize" portion of the file. The osC version is 2.3.4. Sorry, I'm not sure what "oho" means.
Jack_mcs Posted July 31, 2018
I'm sorry. I meant php, not oho. But it doesn't matter now that it is working.
MrPhil Posted July 31, 2018
On 7/30/2018 at 7:39 AM, JAValeryon said: I had a similar problem on my site. I was suffering from hotlinking (bandwidth leeching). https://httpd.apache.org/docs/2.4/rewrite/access.html
Well, that depends on how the bots are accessing the image files. If they are linking directly to them, standard "hotlink" protection in your .htaccess could be enough to deny them access (by allowing access only when the "REFERER" is your own domain). If they are running your PHP code to have images sent to them, perhaps something can be done in .htaccess to ban access to .php files (except index.php), at least in that directory (REQUEST_URI), unless they're going through index.php to get to the images. If someone sharing your server is reading your image files directly, complain to your host and experiment with 700 permissions (make sure that doesn't break web access or osC/PHP's ability to read and write files). Finally, some hosts offer "data leech" protection in their control panels that may do what you want (I don't think the definition is standardized; it may instead refer to users handing out passwords for restricted areas to other people).
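The idea of banning .php requests in the image directory, except index.php, could be sketched roughly as follows. It assumes the rules live in the .htaccess at the shop root and that the image directory is named images/ directly beneath it; both are assumptions for illustration, not details confirmed in this thread.

```apache
RewriteEngine On
# Keep index.php reachable, as suggested in the post
RewriteCond %{REQUEST_URI} !/index\.php$ [NC]
# Refuse any other .php request under images/ with 403 Forbidden
# (images/ is an assumed directory name)
RewriteRule ^images/.+\.php$ - [F,NC]
```

Keeping the rule in the root file rather than in a separate images/.htaccess avoids one surprise: by default, rewrite rules in a subdirectory's .htaccess replace the parent's rules for that directory unless RewriteOptions Inherit is set, so any bot-blocking rules in the root file would stop applying to the images.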
Archived
This topic is now archived and is closed to further replies.