
extremely high traffic


17 replies to this topic

#1 richo3880

  • Community Member
  • 24 posts
  • Real Name:Richard

Posted 28 July 2011, 15:30

I just noticed that my osCommerce website has extremely high traffic, to the point where it exceeds the bandwidth limit; it is now at 77GB.
The traffic has been going up for the past 3 months.
AWStats says 10GB is viewed traffic, but the other 60-some GB is not-viewed traffic, and my FTP usage is just 58MB of the total.
How can I track what is going on here? It seems like a security issue...

any advice is greatly appreciated.
thank you

#2 richo3880

  • Community Member
  • 24 posts
  • Real Name:Richard

Posted 29 July 2011, 13:13

These were robots and spiders taking up all the bandwidth. I adjusted the robots.txt file yesterday and have noticed less bandwidth usage since then. Hopefully it stays this way, which would mean this was the issue.

#3 DunWeb

  • Community Sponsor
  • 10,464 posts
  • Real Name:Chris Dunn
  • Gender:Male
  • Location:Tecumseh, Ontario, Canada N8N 1X8

Posted 29 July 2011, 13:57

Richard,



If you get a hosting account with unlimited bandwidth usage, you won't have to worry about keeping track of it. There are some that start as low as $4 a month.




Chris

#4 Jack_mcs

  • Community Member
  • 24,454 posts
  • Real Name:Jack
  • Gender:Male

Posted 29 July 2011, 14:34

All the hosting companies I've checked that offer unlimited resources have strict guidelines in their TOS that limit usage, so unlimited is usually not unlimited. But even if it were, extreme usage shouldn't be ignored: if the bandwidth is that high, the server is busy serving that data and may be busy when a real customer comes along. I suggest you update the spiders file first. If the usage is still high, then look in the server logs to find who is doing all of the accessing and ban their IPs.

#5 richo3880

  • Community Member
  • 24 posts
  • Real Name:Richard

Posted 29 July 2011, 15:08

Sometimes I get error 509 Bandwidth Limit Exceeded and the site stops working, so I had to track my bandwidth, since I'm not the one who purchased the plan.
Looking at previous months, going back up to 3 years for the same site, the highest bandwidth usage was around 13GB, so a jump to 75GB within the past 2 months is strange.
In AWStats, the highest bandwidth usage for this month is from an "Unknown robot (identified by 'spider')", so I updated my robots.txt file to block "spider". Hopefully this makes it work, unless there is another file I'm not aware of that I should adjust as well.

thanks for the advice guys, you rock.

#6 Taipo

  • Community Member
  • 751 posts
  • Real Name:Te Taipo
  • Gender:Male

Posted 29 July 2011, 20:07

In the stats, are you finding that one or a few files are being continuously requested, or is it more of a general spidering by search servers trawling their way through all the files?
- Stop Oscommerce hacks dead in their tracks with osC_Sec (see discussion here)
- Another discussion about infected files ::here::
- A discussion on file permissions ::here::
- Site hacked? Should you upgrade or not, some thoughts ::here::
- Ignore this link - just a honeypot site to test my ideas out for osC_Sec and allow the site to be picked up by attackers.
- Fix the admin login bypass exploit here

#7 richo3880

  • Community Member
  • 24 posts
  • Real Name:Richard

Posted 29 July 2011, 21:04

I have no idea... the page requests look normal compared to previous months. It's general crawling/spidering, I guess. My stats show, and I quote, "Unknown robot (identified by 'spider')" on one line with one set of numbers. The other lines are for Google, Yahoo and other search engines.
Here is a screenshot of what I see:
[img]http://www.windsorbeautysupply.com/stats.jpg[/img]

#8 Dennisra

  • Community Member
  • 507 posts
  • Real Name:Joseph D. Jefferson
  • Gender:Male

Posted 29 July 2011, 21:51

Alter your robots.txt if you don't like it.

#9 Debs

  • Community Member
  • 133 posts
  • Real Name:Debs
  • Gender:Female
  • Location:Fargo, ND UNITED STATES

Posted 29 July 2011, 22:51

It wouldn't be the Yandex bot would it?

A misbehaved, hyper-aggressive Russian spider that ignores rules.
If left unchecked it can eat up to 5GB a day on some sites...

There are a few ways to block it... here's one that should work in your root .htaccess file:

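########## start block
# Flag any request whose User-Agent contains "Yandex" (case-insensitive).
# No leading ^: current Yandex agents start with "Mozilla/5.0 (compatible; YandexBot/...".
SetEnvIfNoCase User-Agent "Yandex" bad_bot

# Apache 2.2 access control: allow everyone except flagged requests.
<Limit GET POST>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>
########## end block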
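
Whether a User-Agent pattern is anchored with ^ changes which requests it catches. The sketch below uses Python's `re` module as a stand-in for Apache's regex engine (an assumption, though the two behave the same for patterns this simple) to compare an anchored and an unanchored pattern. The user-agent strings are examples of what Yandex crawlers have sent and are not taken from this site's logs.

```python
import re

def matches(pattern: str, user_agent: str) -> bool:
    """Case-insensitive search, similar in spirit to SetEnvIfNoCase."""
    return re.search(pattern, user_agent, re.IGNORECASE) is not None

# Example user-agent strings (illustrative, not from the poster's logs).
MODERN_UA = "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
OLD_UA = "Yandex/1.01.001 (compatible; Win16; I)"

# "^Yandex*" only matches strings that BEGIN with "Yande" (the * applies to the x).
anchored_hits = [ua for ua in (MODERN_UA, OLD_UA) if matches("^Yandex*", ua)]

# "Yandex" without the anchor matches the name anywhere in the string.
unanchored_hits = [ua for ua in (MODERN_UA, OLD_UA) if matches("Yandex", ua)]
```

Here the anchored pattern catches only the older-style agent, while the unanchored one catches both.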

#10 richo3880

  • Community Member
  • 24 posts
  • Real Name:Richard

Posted 29 July 2011, 23:41

Interesting, Debs. We have had encounters with IPs from Russia and Eastern Europe before, so it is not far-fetched. I'll check into this and see what happens.
By the way, I monitored my bandwidth all day today and noticed the difference: it hasn't exceeded 500MB since I changed the robots.txt file yesterday to block all spider agents.

thanks for the insights.

#11 Jack_mcs

  • Community Member
  • 24,454 posts
  • Real Name:Jack
  • Gender:Male

Posted 30 July 2011, 03:42

Blocking all spiders will pretty much ruin your site as far as getting traffic from web searches.
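
One way to keep search traffic is to disallow only the offending crawler and leave the rest alone. The sketch below is not from this thread: the robots.txt contents, the `Yandex` token and the `/admin/` path are illustrative assumptions, and Python's standard `urllib.robotparser` is used only to check how a rule-following crawler would read those rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one crawler, keep the others
# out of a single directory only.
ROBOTS_TXT = """\
User-agent: Yandex
Disallow: /

User-agent: *
Disallow: /admin/
"""

def can_fetch(agent: str, url: str) -> bool:
    """Return True if a rule-following crawler named `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)

blocked_bot = can_fetch("YandexBot", "/index.php")        # targeted crawler
search_engine = can_fetch("Googlebot", "/index.php")      # everyone else
admin_area = can_fetch("Googlebot", "/admin/login.php")   # disallowed for all
```

Note that robots.txt only affects crawlers that choose to honor it; a bot that ignores the file has to be blocked at the server instead.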

#12 richo3880

  • Community Member
  • 24 posts
  • Real Name:Richard

Posted 30 July 2011, 04:02

So how can I block specific ones? All I have on hand is "spider" and unknown.

#13 Jack_mcs

  • Community Member
  • 24,454 posts
  • Real Name:Jack
  • Gender:Male

Posted 30 July 2011, 11:16

richo3880, on 30 July 2011, 04:02, said:

So how can I block specific ones? All I have on hand is "spider" and unknown.

As I already mentioned, first update the spiders file. Then locate the IP that is causing the problem in the logs. Look up the domain info for the IP using one of the services on the web to get the IP range they use, and block the whole range.
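
Lookup services usually report a range as a first and last address, while Apache expects CIDR blocks. As a rough sketch (the helper name `deny_lines` is hypothetical, the addresses come from the reserved documentation ranges, and the `Deny from` syntax assumes Apache 2.2-style access control), Python's standard `ipaddress` module can do the conversion:

```python
import ipaddress

def deny_lines(first_ip: str, last_ip: str) -> list:
    """Turn a first/last address pair into Apache 2.2 'Deny from' lines."""
    start = ipaddress.ip_address(first_ip)
    end = ipaddress.ip_address(last_ip)
    # summarize_address_range yields the smallest set of CIDR networks
    # that exactly covers start..end.
    return [f"Deny from {net}" for net in ipaddress.summarize_address_range(start, end)]

# Placeholder range; substitute the range reported for the offending IP.
full_block = deny_lines("198.51.100.0", "198.51.100.255")
half_block = deny_lines("192.0.2.0", "192.0.2.127")
```

The resulting lines would go in the .htaccess file alongside an `Order Allow,Deny` / `Allow from all` pair.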

#14 richo3880

  • Community Member
  • 24 posts
  • Real Name:Richard

Posted 30 July 2011, 13:22

I'll try this and will let you know how it goes, thank you.

#15 richo3880

  • Community Member
  • 24 posts
  • Real Name:Richard

Posted 02 August 2011, 13:35

How can I find the bots' and spiders' IP addresses?

#16 Jack_mcs

  • Community Member
  • 24,454 posts
  • Real Name:Jack
  • Gender:Male

Posted 02 August 2011, 15:56

In your server logs. Explain to your host what you want to do and ask how you can see the logs. It varies with the server but they should be able to provide instructions, or the log.
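
Once you have the log, totalling the bytes sent per client shows who is using the bandwidth. This is a minimal sketch that assumes the Apache "combined" log format; your host may use a different format, and the sample lines below are made up, with documentation-only IP addresses and a fictional `ExampleSpider` agent.

```python
import re
from collections import Counter

# Matches the Apache "combined" format; referer and user agent are optional
# so that "common" format lines are counted too.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (?P<size>\S+)'
    r'(?: "[^"]*" "(?P<agent>[^"]*)")?'
)

def bandwidth_by_client(lines):
    """Return a Counter mapping (ip, user_agent) to total bytes sent."""
    totals = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        size = match.group("size")
        sent = int(size) if size.isdigit() else 0  # "-" means no body was sent
        totals[(match.group("ip"), match.group("agent") or "-")] += sent
    return totals

def top_clients(lines, n=10):
    """Return the n heaviest (client, bytes) pairs, largest first."""
    return bandwidth_by_client(lines).most_common(n)

# Made-up sample lines, not taken from any real log.
SAMPLE = [
    '192.0.2.10 - - [29/Jul/2011:13:13:00 -0400] "GET /index.php HTTP/1.1" 200 2000 "-" "ExampleSpider/1.0"',
    '192.0.2.10 - - [29/Jul/2011:13:13:02 -0400] "GET /product_info.php HTTP/1.1" 200 1000 "-" "ExampleSpider/1.0"',
    '198.51.100.7 - - [29/Jul/2011:13:14:00 -0400] "GET /index.php HTTP/1.1" 304 - "-" "Mozilla/5.0"',
]

heaviest = top_clients(SAMPLE)
```

For a real log, pass an open file object instead of `SAMPLE`. Grouping by both IP and user agent keeps a single aggressive crawler from hiding among ordinary visitors that share an agent string.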

#17 Taipo

  • Community Member
  • 751 posts
  • Real Name:Te Taipo
  • Gender:Male

Posted 02 August 2011, 22:21

As others have mentioned, an analysis of your logs is the first place to look. From there you should be able to see where the excessive bandwidth is being consumed, and then you can make a plan of attack. Until then, it is best to leave your options open as to how to fix this issue rather than zeroing in on what is, at this stage, still an assumption.

#18 richo3880

  • Community Member
  • 24 posts
  • Real Name:Richard

Posted 03 August 2011, 13:51

The robots.txt worked just fine.
thanks all for your help and advice.