bharathiphp Posted March 4, 2009

Hi,

I have installed osCommerce 2.2 along with the SID Killer and Ultimate SEO URL contributions, but nothing is happening. I want to remove the SID and have the URLs changed to SEO-friendly ones. Can anybody suggest how to solve the problem?

Thanks,
Bharathi
stevel (Author) Posted March 4, 2009

Bharathi,

You want to ask about this in a different thread, as it has nothing to do with spiders.txt.

Steve
Contributions: Country-State Selector, Login Page a la Amazon, Protection of Configuration, Updated spiders.txt, Embed Links with SID in Description
jezjames Posted March 24, 2009

Quote: A replacement for catalog/includes/spiders.txt - updated with newly seen spiders and optimized for quicker processing. For 2.2-MS2 or later. Comments, questions and suggestions welcomed here. http://www.oscommerce.com/community/contributions,2455

I am getting the following errors after installing the Who's Online Enhancement.

In my browser:

1054 - Unknown column 'hostname' in 'field list'

insert into whos_online (customer_id, full_name, session_id, ip_address, hostname, time_entry, time_last_click, last_page_url, http_referer, user_agent) values ('0', 'Guest', '068c9e28d9d03f07cfffc77be2792723', '127.0.0.1', 'localhost', '1237867301', '1237867301', '/osctest1/', '', 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) ; .NET CLR 1.1.4322)')

[TEP STOP]

In my admin panel:

1054 - Unknown column 'hostname' in 'field list'

select customer_id, full_name, ip_address, hostname, time_entry, time_last_click, last_page_url, http_referer, user_agent, session_id from whos_online order by time_last_click DESC

[TEP STOP]
SteveDallas Posted March 24, 2009

Quote: I am getting the following errors after installing the Who's Online Enhancement.

Who's Online Enhancement has its own support thread, which you will find at http://www.oscommerce.com/forums/index.php?showtopic=124853.

The above link appears in the description of most full uploads to Who's Online Enhancement.

--Glen
Guest Posted April 1, 2009

Steve,

First and foremost, you da Man!!! :thumbsup:

I wanted to make you aware that the current spiders list will send a browser with the Teoma Toolbar to the cookies_usage page. Based on my research, Teoma's robot reports itself as:

Mozilla/2.0 (compatible; Ask Jeeves/Teoma)

I would recommend just adding a / in front of the current entry for teoma. I added this to my site and successfully tested the Teoma Toolbar.

I think I'm seeing Teoma getting sessions and adding stuff to carts now (based on the IPs I see on Who's Online). Has anyone else seen this, or am I wrong? Should the / be removed? Can it be listed both with and without the /?
stevel (Author) Posted April 1, 2009

If you leave off the slash, then users with the toolbar won't see your store. What's a sample entry from the access log?
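The effect of the leading slash can be sketched in a few lines of Python. This is only an illustration, assuming that each spiders.txt entry is treated as a case-insensitive substring of the user agent. The robot string is the one quoted above. The toolbar string is hypothetical, invented here to stand for a normal browser whose user agent happens to mention Teoma.

```python
def is_spider(user_agent, fragments):
    """True if any fragment occurs anywhere in the lowercased user agent."""
    ua = user_agent.lower()
    return any(fragment in ua for fragment in fragments)


# Robot user agent as quoted in this thread.
robot_ua = "Mozilla/2.0 (compatible; Ask Jeeves/Teoma)"

# HYPOTHETICAL browser user agent with a toolbar token (made up for this sketch).
toolbar_ua = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Teoma Toolbar)"

for entry in ("teoma", "/teoma"):
    print(entry, is_spider(robot_ua, [entry]), is_spider(toolbar_ua, [entry]))
```

Under these assumptions the bare entry flags both visitors, while the entry with the slash flags only the robot, because the slash is part of the robot's "Ask Jeeves/Teoma" token.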
Guest Posted April 1, 2009

Quote: If you leave off the slash, then users with the toolbar won't see your store. What's a sample entry from the access log?

66.235.127.136 - - [01/Apr/2009:06:00:56 -0400] "GET /catalog/index.php?cPath=326&sort=products_sort_order&action=buy_now&products_id=735 HTTP/1.1" 302 680 "-" "RedCarpet/1.4 (http://www.pronto.com/robots.html)"

Maybe it wasn't Teoma after all. This is what made me think it was:

NetRange: 66.235.112.0 - 66.235.127.255
CIDR: 66.235.112.0/20
OriginAS: AS16798
NetName: ASK-DOT-COM-NETWORK
NetHandle: NET-66-235-112-0-1
Parent: NET-66-0-0-0-0
NetType: Direct Assignment
NameServer: NAME1.ASK.COM
NameServer: NAME2.ASK.COM
NameServer: NAME5.ASK.COM
NameServer: NAME6.ASK.COM
Comment: http://www.ask.com

Need to add RedCarpet?

Edited April 1, 2009 by baddog
stevel (Author) Posted April 1, 2009

No, because the user agent string includes "obot". Note that there is no osCsid in the URL. Unless this is an unusual robot that accepts cookies, it cannot add things to the cart. And because the UA has "obot", it should not be getting a session at all if you have properly enabled Prevent Spider Sessions.
Guest Posted April 1, 2009

Quote: No, because the user agent string includes "obot". Note that there is no osCsid in the URL. Unless this is an unusual robot that accepts cookies, it cannot add things to the cart. And because the UA has "obot", it should not be getting a session at all if you have properly enabled Prevent Spider Sessions.

Well, I have Prevent Spider Sessions set to True and I can tell you that this thing puts stuff in a cart.
stevel (Author) Posted April 1, 2009

I don't know what to tell you, then, other than to suggest debugging this with the Firefox add-in User Agent Switcher: "pretend" that you are this robot and see how your store behaves.
Guest Posted April 2, 2009

Quote: I don't know what to tell you, then, other than to suggest debugging this with the Firefox add-in User Agent Switcher: "pretend" that you are this robot and see how your store behaves.

I've never used that tool before. I'll play around with it. Can you suggest what I should enter in the boxes to set up RedCarpet as a new User Agent?

Description:
User Agent:
App Name:
App Version:
Platform:
Vendor:
Vendor Sub:

Thanks.
stevel (Author) Posted April 2, 2009

The only field you need is the User Agent, set to:

RedCarpet/1.4 (http://www.pronto.com/robots.html)
dangermouse1981 Posted April 14, 2009

Is there any particular reason why Googlebot and its variations are missing from this file?
Guest Posted April 14, 2009

Quote: Is there any particular reason why Googlebot and its variations are missing from this file?

I think you will find that they are covered. Do they get a session?
dangermouse1981 Posted April 15, 2009

The latest version of spiders.txt, which I downloaded from http://addons.oscommerce.com/info/2455, clearly doesn't include the user agents specified by Google at http://www.google.com/support/webmasters/b...mp;answer=40364. However, when I spoof my user agent as Googlebot with cookies disabled, session IDs are indeed removed from links. Strange.
SteveDallas Posted April 15, 2009

Quote: The latest version of spiders.txt, which I downloaded from http://addons.oscommerce.com/info/2455, clearly doesn't include the user agents specified by Google at http://www.google.com/support/webmasters/b...mp;answer=40364. However, when I spoof my user agent as Googlebot with cookies disabled, session IDs are indeed removed from links. Strange.

Not strange at all. In order to match the largest number of bots with the fewest and shortest comparisons, the spiders.txt file contains string fragments that are compared against the User Agent strings. Googlebot is matched by the string "ebot" in the file. The file is optimized to put the most common strings at the top, to reduce overhead.

--Glen

Edited April 15, 2009 by SteveDallas
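The fragment matching described above can be sketched in Python. This is a minimal illustration and not the store's own code: it assumes each line of the file is a fragment that is tested as a case-insensitive substring of the user agent, in file order. The three fragments are ones mentioned in this thread, not the full file, and the function names are made up for the sketch.

```python
def load_fragments(text):
    """Return the non-empty lines of a spiders.txt-style file, lowercased."""
    return [line.strip().lower() for line in text.splitlines() if line.strip()]


def matching_fragment(user_agent, fragments):
    """Return the first fragment found inside the user agent, or None."""
    ua = user_agent.lower()
    for fragment in fragments:  # earlier lines are tried first
        if fragment in ua:
            return fragment
    return None


# Three fragments mentioned in this thread (an excerpt, not the real file).
SAMPLE = "ebot\nnbot\nobot\n"
fragments = load_fragments(SAMPLE)

googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
redcarpet = "RedCarpet/1.4 (http://www.pronto.com/robots.html)"
msie = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"

print(matching_fragment(googlebot, fragments))  # ebot
print(matching_fragment(redcarpet, fragments))  # obot
print(matching_fragment(msie, fragments))       # None
```

Because the loop stops at the first hit, putting the most common fragments at the top of the file means most spider requests are resolved after only a few comparisons.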
girl Posted May 17, 2009

Thanks for the update. I hope this works. Googlebot is causing havoc on my download site. Every time I get visitors, it seems to be tagging along and messing up their downloads.
Black Jack 21 Posted August 5, 2009

I've found an unlisted spider: gonzo*

It's from a German search site, "suchen.de". Please add this to your updates.

Thank you for the great contrib!
nudylady Posted August 13, 2009

I uploaded the new spiders.txt. Will this make my site a bit slow?

In catalog/includes/configure.php there is this line:

define('STORE_SESSIONS', 'mysql');

Do I have to change mysql to session?

Edited August 13, 2009 by nudylady
smiler99 Posted August 24, 2009

Is this the latest spiders.txt file (28/12/2008)? I cannot find any other variants.

Bing (65.55.109.244, 65.55.110.23, 64.233.173.2, msnbot-65-55-110-23.search.msn.com) and AOL? (195.93.21.68, cache-los-XXXX.proxy.aol.com) are picking up session IDs and are not being reported as bots in Visitors Tracking or Super Tracker.
stevel (Author) Posted August 31, 2009

Andreas, I cannot find any reference to a spider named "gonzo". What is the user agent string?

nudylady, typically you want to store sessions in MySQL. Using spiders.txt will not noticeably slow down your pages, and it will help by preventing search engine spiders from getting session IDs and, in most cases, from adding items to carts. It also prevents session IDs from showing up in search results.

Chris, yes, that is the latest I have updated. I have not seen any new spiders come along in a while. IPs aren't useful to me - I need to see user agent strings from the access log. msnbot is definitely picked up by spiders.txt. The AOL IPs are from AOL users, not spiders.
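For anyone unsure how to pull user agent strings out of an access log, here is a rough Python sketch. It assumes the log uses the common "combined" format, where the user agent is the last quoted field on each line. The sample line is the one posted earlier in this thread, and the helper names are made up for illustration.

```python
import re
from collections import Counter

# Combined log format: ip - - [time] "request" status size "referer" "user agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def user_agent(line):
    """Return the user agent field of one log line, or None if it does not parse."""
    match = LOG_LINE.match(line.strip())
    return match.group("agent") if match else None


def count_agents(lines):
    """Tally how often each user agent appears, skipping unparseable lines."""
    agents = (user_agent(line) for line in lines)
    return Counter(agent for agent in agents if agent is not None)


sample = (
    '66.235.127.136 - - [01/Apr/2009:06:00:56 -0400] '
    '"GET /catalog/index.php?cPath=326&sort=products_sort_order'
    '&action=buy_now&products_id=735 HTTP/1.1" 302 680 "-" '
    '"RedCarpet/1.4 (http://www.pronto.com/robots.html)"'
)

print(user_agent(sample))
print(count_agents([sample, sample, "not a log line"]))
```

To run it against a real log, pass an open file object to count_agents in place of the list. The path to the log depends on the server setup.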
Black Jack 21 Posted September 1, 2009

Quote: Andreas, I cannot find any reference to a spider named "gonzo". What is the user agent string?

Hi,

it's gonzo*, e.g.

"GET /robots.txt HTTP/1.1" 200 391 "-" "gonzo2[P] +http://www.suchen.de/faq.html"

cheers
stevel (Author) Posted September 1, 2009

Ok, thanks. Odd that I can't find this in lists or discussions of known spiders.
smiler99 Posted September 1, 2009

Steve,

Thanks for your reply. msnbot was not being picked up. I understand that if a user agent contains any of the strings in spiders.txt, then it is regarded as a spider. An extract of your spiders.txt (below) contains "nbot", which in theory should pick up msnbot. However, I have had to specifically add "msnbot" for it to be recognised in Super Tracker and Visitors Tracking (it may be that those mods are not using spiders.txt correctly).

lbot
mbot
nbot
pbot
rbot
sbot
tbot
vbot
ybot
zbot
bot.
bot/
_bot
.bot
/bot
-bot
:bot
stevel (Author) Posted September 1, 2009

As you say, the string "nbot" should pick up msnbot, and it does in my tests. I can't speak for those other mods.