ericksaint

Protecting image directory


Recently the site I watch over for a friend had a sudden traffic spike that led to his host shutting down the site for exceeding its traffic limit. It never had a traffic problem in the previous 10 years. The logs look like bots are constantly downloading the entire image directory. Is there any way to protect this folder so the images can't be downloaded, but can still be accessed when people click on an item in the store? I've tried blocking IPs through the server, but they just come back with other IPs.


Have you blocked the bots in your .htaccess file? It's not 100% effective, but it's still worth doing.

I'm very surprised that a simple bot downloading images has got your site blocked by your host! Perhaps you need to look further at what is happening. Also, most hosts will give you several warnings about bandwidth usage before blocking. If yours did not, perhaps you should look for a better host.

Bot lists change all the time, so you need to see which one works for you. This is another one you may want to try: http://tab-studio.com/en/blocking-robots-on-your-page/

Add this to your .htaccess; you can add any other bots you find accessing your site. Be sure to back up your file first.

#Block bad bots
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^ICG-AutoExploiterBoT [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus

RewriteRule ^.* - [F,L]

#Prevent directory listings
Options -Indexes
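
For anyone extending a list like the one above, here is a minimal sketch of how a new entry fits into the chain. Every condition except the last needs the [OR] flag. A condition without it is ANDed with the one that follows, which can quietly stop the whole rule from matching. "ExampleGrabber" is a made-up name used only for illustration; substitute whatever user agent actually shows up in the access logs.

```apache
# Sketch only: "ExampleGrabber" is a hypothetical user agent, not a known bot.
RewriteEngine On
# Every condition except the last carries [OR].
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
# No leading ^, plus [NC], matches the name anywhere in the string, in any letter case.
RewriteCond %{HTTP_USER_AGENT} ExampleGrabber [NC]
# Last condition has no [OR]; matching requests get a 403 Forbidden.
RewriteRule ^ - [F]
```

The patterns anchored with ^ only match when the user agent string starts with that name, so a bot that puts "Mozilla/5.0 (compatible; ...)" in front of its name needs the unanchored form.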

 


 


@Ericksaint That is most likely caused by data skimmers: unfriendly bots that scour a site for data they can use or sell. Some will be hackers. If the site has been active for 10 years, it has had them before; it is just that now there are either more of them or they are all hitting at the same time. Many times it is due to known bots like yandex, baidu and mj12bot, among others. Blocking by user-agent will sometimes work, but some bots leave that field empty or disguise it so they can get past such blocks. I suggest you install View Counter. It will allow you to see who is causing the problem and to block them.
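
As a rough illustration of the user-agent blocking mentioned here, the sketch below refuses the three crawlers named above plus requests that send no user agent at all. It assumes the crawlers identify themselves with the strings YandexBot, Baiduspider and MJ12bot, and that mod_rewrite is enabled. As noted, a bot that disguises its user agent will get straight past it.

```apache
# Sketch, not a complete bot list.
RewriteEngine On
# Requests with an empty User-Agent header.
RewriteCond %{HTTP_USER_AGENT} ^$ [OR]
# One alternation instead of one line per bot; [NC] ignores letter case.
RewriteCond %{HTTP_USER_AGENT} (YandexBot|Baiduspider|MJ12bot) [NC]
RewriteRule ^ - [F]
```

Keep in mind that Yandex and Baidu are search engines, so blocking them also removes the store from their results.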

5 hours ago, Jack_mcs said:

I suggest you install View Counter. It will allow you to see who is causing the problem and to block them.

I actually did install View Counter last night after posting this. I just went into the monitor to see what had happened overnight. When I click the "next page" button at the bottom right, I get the error below. The only change I made to the settings was to show 50 lines per page instead of the default. I'm pretty sure the name and email in the error are fake; I'm trying to track it down now.

Fatal error: Uncaught exception 'Exception' with message 'invalid data, remaining: :"navigationHistory":2:{s:4:"path";a:1:{i:0;a:4:{s:4:"page";s:14:"contact_us.php";s:4:"mode";s:6:"NONSSL";s:3:"get";a:1:{s:6:"action";s:4:"send";}s:4:"post";a:6:{s:6:"formid";s:32:"92046e082f1021b3d5689f048baa0ff4";s:4:"name";s:12:"VincentJeant";s:5:"email";s:18:"akkucz9494@mail.ru";s:7:"enquiry";s:113:"????????? ???????????? ?????? ????? <a href=http://495realty.ru/>495realty.ru</a>";s:20:"g-recaptcha-response";s:0:"";s:6:"submit";s:0:"";}}}s:8:"snapshot";a:0:{}}' in /home/XXXXXXXXXXXX/includes/functions/view_counter.php:1727 Stack trace: #0 /home/XXXXXXXXXXXX/includes/functions/view_counter.php(1325): UnserializeSession('sessiontoken|s:...') #1 /home/XXXXXXXXXXXX/view_counter.php(679): ShowCart('84fcdf65122c221...', 459, '') #2 {main} thrown in /home/XXXXXXXXXXXX/includes/functions/view_counter.php on line 1727

 

11 hours ago, JcMagpie said:

Have you blocked the bots in your .htaccess file? It's not 100% effective, but it's still worth doing.

I'm very surprised that a simple bot downloading images has got your site blocked by your host! Perhaps you need to look further at what is happening. Also, most hosts will give you several warnings about bandwidth usage before blocking. If yours did not, perhaps you should look for a better host.


 

I have not blocked anything specific in .htaccess at this point, like the list above. From what I'm told, the host gave a couple of warnings about the increased traffic over about 3 days, then ended up blocking the site. I just watch over the "code" part of the site for a friend. I don't get emails from his host. I just have access to the host's server-side interface and admin panels so I can help fix things when they go awry. I'm the "friend that knows about computers", if you will.

7 hours ago, ericksaint said:

The only change I made to the settings was to show 50 lines per page instead of the default.

I've never seen an error like that for VC and no one else has reported it. The link in it is from a hacker from Russia trying to cause problems. I suppose that something in the session data for that visitor could cause the failure, but I can't reproduce it so I can't say for sure. Just out of curiosity, what versions of oscommerce and oho are being used?

On 29/7/2018 at 4:12 AM, ericksaint said:

Is there any way to protect this folder so the images can't be downloaded, but they can still be accessed when people click on the item in the store?

I had a similar problem on my site. I was suffering from hotlinking (bandwidth leeching).

https://httpd.apache.org/docs/2.4/rewrite/access.html
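
A minimal sketch of the referer-based approach described on that Apache page, adapted for a store's image files. The domain example.com is a placeholder and must be replaced with the site's own host name; the extension list is an assumption about which image types the store uses.

```apache
# Sketch of hotlink protection; example.com is a placeholder domain.
RewriteEngine On
# Let requests with no Referer through (some browsers and proxies strip it).
RewriteCond %{HTTP_REFERER} !^$
# Refuse image requests whose Referer is some other site.
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
RewriteRule \.(gif|jpe?g|png)$ - [F,NC]
```

This only stops other pages from embedding the images. A bot that sends no Referer, or fakes one from the store's own domain, will still be served, so it is worth checking the logs for what the offending requests actually send before relying on it.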

 

 

 

21 hours ago, Jack_mcs said:

I've never seen an error like that for VC and no one else has reported it. The link in it is from a hacker from Russia trying to cause problems. I suppose that something in the session data for that visitor could cause the failure, but I can't reproduce it so I can't say for sure. Just out of curiosity, what versions of oscommerce and oho are being used?

The error went away after I applied this fix, from the VC support thread, in the "Unserialize" portion of the file.

The osC version is 2.3.4, sorry I'm not sure what "oho" means.


I'm sorry. I meant php, not oho. But it doesn't matter now that it is working.

Edited by Jack_mcs

On 7/30/2018 at 7:39 AM, JAValeryon said:

I had a similar problem on my site. I was suffering from hotlinking (bandwidth leeching).

https://httpd.apache.org/docs/2.4/rewrite/access.html

Well, that depends on how the bots are accessing image files. If they are directly linking to them, some sort of standard "hotlink" protection in your .htaccess could be enough to deny them access (by allowing "REFERER" access only to your domain). If they are running your PHP code to get images sent to them, perhaps something can be done in .htaccess to ban access to .php files (except index.php), at least in that directory (REQUEST_URI), unless they're going through index.php to get to the images. If someone sharing your server is directly reading your image files, complain to your host and experiment with 700 permissions (make sure it doesn't break web access, or osC/PHP's ability to read/write files). Finally, some hosts offer a "data leech" protection in their control panels that may do what you want (I don't think the definition is standardized... it may refer to users handing out passwords to restricted areas to other people).
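
For the case where scripts are being requested inside the image directory, one possible sketch is a separate .htaccess placed in that directory that refuses any .php request there. This assumes Apache 2.4 (the Require directive), that the host allows such overrides, and that nothing in the store needs to call a PHP file in that folder directly; the folder name "images" is also an assumption.

```apache
# images/.htaccess (assumed location of the store's image directory)
# Refuse direct requests for PHP files in this directory.
<FilesMatch "\.php$">
    Require all denied
</FilesMatch>
```

On Apache 2.2 the equivalent inside the FilesMatch block would be "Order allow,deny" followed by "Deny from all". Image requests themselves are unaffected, so this does nothing about bandwidth from plain image downloads.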


If you are running the "official" osC 2.3.4 or 2.3.4.1 download, your installation is obsolete! Get (stable) Frozenpatches or (unstable) Edge. See also the naming convention and the latest community-supported responsive "Edge" release
