Archived

This topic is now archived and is closed to further replies.

richo3880

extremely high traffic

18 posts in this topic

I just noticed that my osCommerce website has extremely high traffic, to the point where it exceeds the bandwidth limit; it is now at 77GB.

The traffic has been going up for the past 3 months.

Inside AWStats it says 10GB is "viewed traffic" but the other 60-some GB is "not viewed traffic", and my FTP usage is just 58MB of the total.

How can I track what is going on here? It seems like a security issue...

Any advice is greatly appreciated.

Thank you

These were robots and spiders taking up all the bandwidth. I adjusted the robots.txt file yesterday and I've noticed less bandwidth usage since then; hopefully this stays the same, which would mean this was the issue...
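
For anyone else hitting this, the relevant robots.txt pattern is just naming the crawler and disallowing it, roughly like the sketch below; the bot name here is only a placeholder for whichever crawler shows up in your stats, and only well-behaved crawlers actually obey robots.txt:

User-agent: SomeBadBot
Disallow: /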

Richard,

If you get a hosting account with unlimited bandwidth usage, you won't have to worry about keeping track of it. There are some that start as low as $4 a month.

Chris

All hosting companies that I've checked that offer unlimited resources have strict guidelines in their TOS that limit the usage, so unlimited is usually not unlimited. But even if it were, extreme usage shouldn't be ignored: if the bandwidth is that high, the server is busy serving that data, and that means it may be busy when a real customer comes along. I suggest you update the spiders file first. If the usage is still high, look in the server logs to find who is doing all of the accessing and ban their IPs.
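
To be clear about what I mean by the spiders file: in a stock osCommerce install it is includes/spiders.txt (the list used by the Prevent Spider Sessions setting), and it is simply lowercase user-agent keywords, one per line, roughly like the sample below; the exact entries in your copy will differ. Adding the offending bot's keyword stops osCommerce from giving it a session, which cuts down on the duplicate session-ID URLs a crawler can chew through.

googlebot
msnbot
slurp
yandex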

Sometimes I get a 509 Bandwidth Limit Exceeded error and the site stops working, so I had to track my bandwidth, since I'm not the one who purchased the plan.

Looking back at previous months, up to 3 years ago for the same site, the highest bandwidth usage was around 13GB, so jumping to 75GB within the past 2 months is strange.

On AWStats, the highest bandwidth usage for this month is attributed to an "Unknown robot (identified by 'spider')", so I updated my robots.txt file to block "spider"; hopefully that does it, unless there is another file I'm not aware of that I should adjust as well.

Thanks for the advice guys, you rock.

In the stats, are you finding that one or a few files are being continuously requested, or is it more of a general spidering by search servers trolling their way through all the files?

I have no idea... the page requests look normal compared to previous months. It's general crawling/spidering, I guess. My stats say, and I quote, "Unknown robot (identified by 'spider')" on one line with one set of numbers; the other lines are for Google, Yahoo and other search engines.

Here is a screenshot of what I'm seeing:

stats.jpg

It wouldn't be the Yandex bot, would it?

A misbehaved, hyper-aggressive Russian spider that ignores rules.

If left unchecked it can eat up to 5GB a day on some sites...

There are a few ways to block it... here's one that should work in your root .htaccess file:

########## start block
# flag any request whose User-Agent contains "Yandex" (case-insensitive)
SetEnvIfNoCase User-Agent "Yandex" bad_bot

<Limit GET POST>
# allow everyone except requests flagged as bad_bot
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>
########## end block
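
If other named bots show up in your stats later, the same pattern can be repeated: one SetEnvIfNoCase line per bot, all setting the same bad_bot flag, so the Limit block only needs to appear once.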

Interesting, Debs. We've had previous encounters from Russian and eastern European IPs before, so it's not far fetched. I'll check into this and see what happens.

By the way, I monitored my bandwidth all day today, and it hasn't exceeded 500MB since I changed the robots.txt file yesterday to block all spider agents.

Thanks for the insights.

Blocking all spiders will pretty much ruin your site as far as getting traffic from web searches.

So how can I block specific ones? All I have on hand is "spider" and unknown.

As I already mentioned, first update the spiders file. Then locate the IP that is causing the problem in the logs. Look up the domain info of the IP using one of the lookup services on the web to get the IP range they use, and block the whole range.
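
As a rough sketch, once you know the range, a block in the root .htaccess could look like this, following the same Allow/Deny style as the Yandex example above; the 198.51.100.x range is only a placeholder for whatever you actually find in your logs:

########## start block
<Limit GET POST>
Order Allow,Deny
Allow from all
# placeholder range; substitute the one found in your logs
Deny from 198.51.100.0/24
</Limit>
########## end block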

In your server logs. Explain to your host what you want to do and ask how you can see the logs. It varies with the server, but they should be able to provide instructions, or the log file itself.

An analysis of your logs is the first place to look, as mentioned by others. From there you should be able to see where the excessive bandwidth is being consumed and make a plan of attack; until then it is best to leave your options open as to how to fix this issue rather than zeroing in on what is, at this stage, still an assumption.
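
For reference, an entry in a typical Apache "combined" access log looks roughly like the made-up line below; the fields to watch are the requesting IP at the start, the byte count after the status code (41230 here), and the user-agent string at the end:

198.51.100.23 - - [14/Mar/2011:06:25:11 +0000] "GET /product_info.php?products_id=42 HTTP/1.1" 200 41230 "-" "Mozilla/5.0 (compatible; ExampleBot/1.0)"

Summing that byte field per IP or per user agent is what points you at the culprit.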
