stevel (Author), posted April 11, 2006: Correct. This contribution is simply an update for the spiders.txt file that is included in the osC distribution. It adds new spiders and is optimized. No code is changed. Steve (Contributions: Country-State Selector, Login Page a la Amazon, Protection of Configuration, Updated spiders.txt, Embed Links with SID in Description)
minuteman1970, posted April 12, 2006: Hi Steve, thanks for the quick response. I just installed your contribution a few minutes ago by copying the spiders.txt file into my /catalog/includes directory and setting "Prevent Spider Sessions" to TRUE. Is there any way that I can test to see that it is working OK? I don't want to "scare" away the spiders! Thanks, Ray
stevel (Author), posted April 12, 2006: Got Firefox? Install the "User Agent Switcher" extension and set the user agent to "Googlebot". That's how I test it. Be sure you close your browser session and reopen it to clear session cookies. Then try adding something to your cart. Note that this does not prevent spiders from indexing your store. All it does is keep them from obtaining sessions.
minuteman1970, posted April 12, 2006: Steve, I actually use IE on both my PCs. Is there any other way to test? One thing I just noticed while looking at "Who's Online" is that there are four "Mozilla" bots from the same IP checking out various links. Three of them have a "yes" under the session column and are checking out products, while the fourth has a "no". The one with the "no" is viewing "/catalog/cookie_usage.php". Is Who's Online an accurate way to gauge whether this is working OK or not? Thanks. Ray, www.specopstactical.com
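For readers who cannot use the Firefox extension, the same check can be approximated with a short script that requests a page while claiming to be Googlebot and then looks for signs of a session. This is only a sketch: the function names are hypothetical, and it assumes the store uses osCommerce's default session name, osCsid, which should be confirmed against the store's own configuration.

```python
import urllib.request

# Assumption: the store uses the default osCommerce session name.
SESSION_NAME = "osCsid"


def has_session_marker(set_cookie_headers, body, session_name=SESSION_NAME):
    """Return True if the response looks like it started a session.

    A session shows up either as a Set-Cookie header for the session
    name or as the session ID appended to links in the page body.
    """
    marker = session_name + "="
    in_cookie = any(h.strip().startswith(marker) for h in set_cookie_headers)
    in_links = marker in body
    return in_cookie or in_links


def fetch_as(url, user_agent):
    """Fetch url with the given User-Agent; return (Set-Cookie headers, body).

    No cookies are sent, so every call behaves like a freshly opened browser.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=10) as response:
        cookies = response.headers.get_all("Set-Cookie") or []
        body = response.read().decode("utf-8", errors="replace")
    return cookies, body
```

Passing the result of `fetch_as(store_url, "Googlebot")` to `has_session_marker` should give False when Prevent Spider Sessions is working, while an ordinary browser user agent would typically give True. Because each call starts without cookies, it stands in for closing and reopening the browser.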
stevel (Author), posted April 12, 2006: "Mozilla" is not a bot. Actually, if you see Mozilla there, you have no idea what it is, since just about every browser includes "Mozilla" in its UA. I tried your store and the Prevent Spider Sessions is working fine.
minuteman1970, posted April 12, 2006: Steve, thanks a million! I'm glad to hear that the mod is working fine. Now off to further osC refinements! -Ray
formmailer, posted April 15, 2006: Hi! I like your list! :) Could you please add the following: findlinks/1.1-a8 (+http://wortschatz.uni-leipzig.de/findlinks/), also known as findlinks/1.1.1-a1 (+http://wortschatz.uni-leipzig.de/findlinks/), and ilse. Thanks for the great work! - Jasper
stevel (Author), posted April 15, 2006 (edited April 15, 2006): Please post the complete user agent string as found in your access log. If "ilse" is the one I'm thinking of, it should already be covered by "crawl". findlinks is already there.
FixItPete, posted April 15, 2006: Why isn't Googlebot on the list? Confused. Pete (Signature: I find the fun in everything.)
stevel (Author), posted April 15, 2006: It is - Googlebot is covered by the string "ebot".
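The "ebot" answer makes sense once each spiders.txt line is read as a substring to look for rather than a full bot name. A minimal sketch of that kind of matching follows; it assumes the store lowercases the user agent and tests each non-empty line as a substring, which is consistent with this thread but should be checked against the store's own code. The function names are hypothetical.

```python
def load_patterns(spiders_text):
    """Split spiders.txt content into lowercase patterns, skipping blank lines."""
    return [line.strip().lower() for line in spiders_text.splitlines() if line.strip()]


def matching_pattern(user_agent, patterns):
    """Return the first pattern found inside the user agent, or None."""
    lowered = user_agent.lower()
    for pattern in patterns:
        if pattern in lowered:
            return pattern
    return None


# Two entries mentioned in this thread, used here as sample file content.
patterns = load_patterns("crawl\nebot\n")
googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
print(matching_pattern(googlebot, patterns))
```

Short fragments such as "ebot" cover many bots with one line, which is presumably what "optimized" refers to earlier in the thread, at the cost of having to make sure no ordinary browser contains the same fragment.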
DriWashSolutions, posted April 20, 2006: I've got two stores, both are OSCMAX. Both have an updated SPIDERS.TXT in the /includes folder, both have "prevent spider sessions" set to TRUE, yet one store can't see any bots, and the other does. Anything else I need to change? John Skurka
stevel (Author), posted April 20, 2006: I don't know "OSCMAX". There isn't anything else in a standard osC store to change. What do you mean by "can't see any bots"? You can add debug code to application_top.php to see if you can find out why one store is misbehaving. Note that, unless you're on a Windows host, the case of the filename is important - it is looking for spiders.txt not SPIDERS.TXT.
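The filename point can be checked mechanically rather than by eye. The sketch below lists a directory and reports any file whose name matches spiders.txt only when case is ignored; the function names are hypothetical and not part of osCommerce, and the includes path is whatever applies to the store in question.

```python
import os


def wrong_case_names(names, expected="spiders.txt"):
    """Return names that equal `expected` only when letter case is ignored."""
    return [n for n in names if n.lower() == expected.lower() and n != expected]


def check_includes_dir(directory, expected="spiders.txt"):
    """Describe whether `expected` exists in `directory` with the exact case."""
    names = os.listdir(directory)
    if expected in names:
        return "ok: found " + expected
    mismatched = wrong_case_names(names, expected)
    if mismatched:
        return "case mismatch: found " + ", ".join(mismatched)
    return "missing: no " + expected + " in " + directory
```

On a case-sensitive host, a "case mismatch" result means the store would not find the file even though it appears to be present in a directory listing.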
DriWashSolutions, posted April 20, 2006:
Quoting stevel: I don't know "OSCMAX". There isn't anything else in a standard osC store to change. What do you mean by "can't see any bots"?
I check via "Who's On-Line" and one store shows bots, along with their names in red. The other store never shows any bots. Always guests. The stores are identical (well, obviously there's something different). I also have a straight OSC MS 2.2 store, which works fine as well. I checked CHMOD settings, and all are identical as well. Hmmmm.
Quoting stevel: You can add debug code to application_top.php to see if you can find out why one store is misbehaving. Note that, unless you're on a Windows host, the case of the filename is important - it is looking for spiders.txt not SPIDERS.TXT.
I understand the caps - it is lower case on the server, just wanted to emphasize it in the post.
stevel (Author), posted April 20, 2006: Maybe there are no bots visiting the other store? If instead what you see is that there are visitors that are clearly bots but that have sessions, you have some further analysis to do to find out why. If you'll give me the URL of the store that is a problem, I can check to see if spiders get sessions.
DriWashSolutions, posted April 20, 2006:
Quoting stevel: If you'll give me the URL of the store that is a problem, I can check to see if spiders get sessions.
The store that works: www.atoolcrib.com. The store that doesn't: www.vehitronix.com. I know some of the "visitors" to Vehitronix are bots, based on the IP address of the visitor as reported in the Who's Online contrib.
stevel (Author), posted April 20, 2006: I tried your site with my user agent set to "Googlebot" and I did not get a session. So whatever issue you have with the "Who's Online" feature, it isn't related to use of spiders.txt.
DriWashSolutions, posted April 21, 2006:
Quoting stevel: So whatever issue you have with the "Who's Online" feature, it isn't related to use of spiders.txt.
You got it - I overwrote the "WOL" code with the most current version and everything is working correctly now! Thanks for your help in debugging this.
DVBHardware, posted April 26, 2006: Stevel, can you have a look at this problem concerning the latest two spiders.txt files on local dev machines using Firefox? I tried looking through past posts and saw no reference to it. http://www.oscommerce.com/forums/index.php?s=&...ndpost&p=845019 Jimmy (Signature: I'm not a coder just a splicer.)
stevel (Author), posted April 26, 2006: There would certainly be a problem with the 3/31 file but there shouldn't be with the newer ones. Please make sure that your spiders.txt does NOT contain the line: ox/
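The "ox/" line is a problem because it also occurs inside ordinary Firefox user agents ("Firefox/..."), so a store using it would treat Firefox visitors as spiders. A small audit along these lines can flag such entries before a file is deployed. This is a sketch under the assumption that entries are matched as lowercase substrings of the user agent; the function name and the sample browser strings are illustrative.

```python
def overbroad_patterns(patterns, browser_user_agents):
    """Map each pattern to the ordinary browser user agents it would catch."""
    hits = {}
    for pattern in patterns:
        needle = pattern.strip().lower()
        if not needle:
            continue  # ignore blank lines
        matched = [ua for ua in browser_user_agents if needle in ua.lower()]
        if matched:
            hits[needle] = matched
    return hits


# Sample user agents for two common browsers of the period.
FIREFOX = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.2) Gecko/20060308 Firefox/1.5.0.2"
MSIE = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

for needle, victims in overbroad_patterns(["ox/", "ebot", "crawl"], [FIREFOX, MSIE]).items():
    print(needle, "would also match", len(victims), "browser user agent(s)")
```

Any pattern reported here would deny sessions to real shoppers using that browser, so it should be removed or made more specific.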
DVBHardware, posted April 26, 2006 (edited April 26, 2006 by RI Downlink): You are correct, it is 3-31. jimmy, DVBHardware.com
sheepiedog, posted May 18, 2006: I have updated to the latest spiders.txt but have a spider at 64.124.140.15x that is making 4 - 5 connections at a time, 24 hours a day for the last few days, and is loading up the cart with each connection from what I can see in my Who's Online. Is there anything I can do about this?
stevel (Author), posted May 18, 2006: What's the user agent string for this spider? I can't associate a known spider with that IP. Get that from your access log.
sheepiedog, posted May 18, 2006: I am hoping this is correct; I have never done anything with my access logs before. I downloaded the access log file and found the correct IPs, and this is what it says. Is this the info you were asking for?
64.124.140.150 - - [17/May/2006:10:39:40 -0500] "GET /product_info.php?products_id=6129 HTTP/1.1" 200 50494 "-" "Mozilla/5.0 (compatible; FatBot 2.0; www.FatLens.com)"
64.124.140.151 - - [17/May/2006:03:40:39 -0500] "GET /product_info.php?products_id=3258 HTTP/1.1" 200 45109 "-" "Mozilla/5.0 (compatible; FatBot 2.0; www.FatLens.com)"
64.124.140.152 - - [17/May/2006:08:19:10 -0500] "GET /product_info.php?products_id=4023 HTTP/1.1" 200 27728 "-" "Mozilla/5.0 (compatible; FatBot 2.0; www.FatLens.com)"
64.124.140.153 - - [17/May/2006:08:10:01 -0500] "GET /product_info.php?products_id=4311 HTTP/1.1" 200 28744 "-" "Mozilla/5.0 (compatible; FatBot 2.0; www.FatLens.com)"
64.124.140.154 - - [17/May/2006:08:25:39 -0500] "GET /product_info.php?products_id=4309 HTTP/1.1" 200 28188 "-" "Mozilla/5.0 (compatible; FatBot 2.0; www.FatLens.com)"
64.124.140.176 - - [17/May/2006:14:37:50 -0500] "GET /index.php?cPath=63 HTTP/1.1" 200 41878 "-" "Mozilla/5.0 (compatible; FatBot 2.0; www.FatLens.com)"
64.124.140.177 - - [17/May/2006:14:43:57 -0500] "GET /privacy.php HTTP/1.1" 200 23373 "-" "Mozilla/5.0 (compatible; FatBot 2.0; www.FatLens.com)"
64.124.140.178 - - [17/May/2006:14:53:14 -0500] "GET /index.php?cPath=232 HTTP/1.1" 200 41679 "-" "Mozilla/5.0 (compatible; FatBot 2.0; www.FatLens.com)"
64.124.140.180 - - [17/May/2006:14:36:48 -0500] "GET /index.php?cPath=231 HTTP/1.1" 200 43138 "-" "Mozilla/5.0 (compatible; FatBot 2.0; www.FatLens.com)"
64.124.140.181 - - [17/May/2006:17:37:35 -0500] "GET /index.php?cPath=222 HTTP/1.1" 200 41866 "-" "Mozilla/5.0 (compatible; FatBot 2.0; www.FatLens.com)"
sheepiedog, posted May 18, 2006: When I look up fatlens.com it is an event ticket search site. I don't even know why they are on my site that sells dog breed merchandise....
stevel (Author), posted May 18, 2006: Actually, fatlens is a site under construction. Add the string "tbot" to spiders.txt.
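Pulling the user agent out of an access log, as requested above, can be scripted when the log is large. The sketch below assumes the log uses the combined format seen in the lines posted in this thread (the user agent is the last quoted field); the function names are hypothetical. It tallies requests per user agent and then lists the agents a proposed spiders.txt entry such as "tbot" would cover.

```python
import re
from collections import Counter

# Combined log format: ip, two ident fields, [time], "request",
# status, size, "referer", "user agent".
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def count_user_agents(lines):
    """Count requests per user agent, skipping lines that do not parse."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            counts[match.group("agent")] += 1
    return counts


def agents_covered_by(pattern, counts):
    """Return the user agents that contain `pattern`, ignoring case."""
    needle = pattern.lower()
    return [agent for agent in counts if needle in agent.lower()]
```

Running `count_user_agents` over the open log file and calling `most_common()` on the result shows the busiest agents first. "FatBot" lowercases to "fatbot", which contains "tbot", so that single short entry covers this spider.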