Guest Posted July 10, 2006

A common scenario: store owners who were not aware of the "Prevent Spider Sessions" option end up with several URLs indexed by spiders with the session ID appended. This situation is troublesome, and there are a few options for handling referrals that arrive through a "wild" session ID URL. However, the true solution to the problem is to REMOVE THE SESSION IDs from the search engine index! So, how hard is it? Pretty easy!

In includes/application_top.php find this code:

// include the language translations
require(DIR_WS_LANGUAGES . $language . '.php');

Under that, paste this code:

if ($spider_flag == true) {
  if (eregi(tep_session_name(), $_SERVER['REQUEST_URI'])) {
    $location = tep_href_link(basename($_SERVER['SCRIPT_NAME']), tep_get_all_get_params(array(tep_session_name())), 'NONSSL', false);
    header("HTTP/1.0 301 Moved Permanently");
    header("Location: " . $location); // redirect...bye bye
  }
}
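A note for readers on newer PHP: eregi() was deprecated in PHP 5.3 and removed in 7.0, so the snippet above will not run as-is on a modern server. Below is a self-contained sketch of the same idea with the tep_* helpers replaced by plain PHP. strip_session_param() is an illustrative name, not an osCommerce function.

```php
<?php
// Illustrative stand-in for the tep_href_link()/tep_get_all_get_params()
// pair above: rebuild the request URI with the session parameter removed.
function strip_session_param(string $uri, string $session_name): string
{
    $parts = parse_url($uri);
    $params = array();
    if (isset($parts['query'])) {
        parse_str($parts['query'], $params);
        unset($params[$session_name]);
    }
    $query = http_build_query($params);
    return $parts['path'] . ($query !== '' ? '?' . $query : '');
}

// A spider requesting /index.php?cPath=22&osCsid=abc123 would be
// 301-redirected to /index.php?cPath=22 like so:
$location = strip_session_param('/index.php?cPath=22&osCsid=abc123', 'osCsid');
// header('HTTP/1.1 301 Moved Permanently');
// header('Location: ' . $location);
```

On the detection side, a simple isset($_GET[tep_session_name()]) check, or stripos() against $_SERVER['REQUEST_URI'], is a drop-in replacement for the deprecated eregi() test.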
Guest Posted July 10, 2006

That is not going to help you much, as spiders can cache results for a very long time. There are other alternatives if you check the contributions.
Guest Posted July 10, 2006

What other alternatives are there in the contribution area? Sending a 301 header signals the search engines to remove that entry from the index and replace it with the new one:

domain.com/index.php?osCsid=xxx => [ 301 header ] => domain.com/index.php

Bobby
Guest Posted July 10, 2006

http://www.oscommerce.com/community/contributions,952
http://www.oscommerce.com/community/contributions,2819

and there are threads in the Tips and Tricks forum. I personally use this one:

http://www.oscommerce.com/community/contributions,4112

because it alters the session during login/create account, so even if a store has backlinks with the session appended, a new session is created. On top of everything else, I redirect spiders to the cookie usage page if I detect that they try to access any pages that go to SSL.

Competition between the search engines is stiff, so they keep their caches for a very long time: the more content they hold, the more popular they look. Also, the REQUEST_URI could be manipulated, and you may have visitors who have already stored the URL in their favorites. It is also subject to the server your store is running on.
Guest Posted July 10, 2006

Fair enough... but what if the store is hosted on a Windows (IIS) server? Would the .htaccess method still be viable? The answer is no. Allow me to address each of the contributions you mentioned.

SID Killer
It is urban legend that this contribution is even needed anymore. It has its roots back in the MS1 days (yes, don't let the newly registered account fool you... I've been around since TEP). Back then the application had issues with spider sessions, but the MS2 code is reliable as long as you keep the spiders.txt file updated (thanks to Steve).

Spider Session Remover
Although this is a nice modification, it does not work across operating systems. Further, if there are restrictions on AllowOverride it may not work at all. However, I will concede that most *nix servers can use this option, and it addresses the issue most directly.

Session Regeneration
Nice coding and great implementation... but why not just use session_regenerate_id()? I understand that search engines keep a cached copy of the URL; however, that is what 301 headers are for :)

REFERENCE: RFC 2616 Status Code Definitions
"The requested resource has been assigned a new permanent URI and any future references to this resource SHOULD use one of the returned URIs. Clients with link editing capabilities ought to automatically re-link references to the Request-URI to one or more of the new references..."

With respect to your comment about manipulating SERVER global vars: please provide an example of an exploit that is capable of manipulating this global. With respect to customers and other non-spider visitors, the code would not affect them, as the very first conditional verifies that the visitor is a bot.
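For readers unfamiliar with the session_regenerate_id() call mentioned above, here is a minimal sketch of regenerating a session ID after login, so that an ID leaked into a search engine index can no longer be used against an authenticated session. This uses plain PHP sessions, not the osCommerce tep_session_* wrapper.

```php
<?php
// Start (or resume) the session, then swap its ID after a
// privilege change such as a successful login.
session_start();
$old_id = session_id();

// true = delete the old session data, so the leaked ID is dead.
session_regenerate_id(true);
$new_id = session_id();

// The session data survives under the new ID; only the ID changed,
// so a spider-indexed URL carrying $old_id is now worthless.
```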
skylla Posted July 27, 2006

Elvis!
Chance Posted July 28, 2006

Moved to Tips and Tricks forum. My advice comes in two flavors; pick the one that won't offend you.

Hard and Cynical: How to Make a Horrible osCommerce Site
Warm and Fuzzy: How to Make an Awesome osCommerce Site
Guest Posted August 22, 2006

Unfortunately, Google views this as cloaking and bans your site, because you are returning output to a spider that is different from the output the user sees. Believe me, I am talking from experience here. The instructions in the links below will get you kicked out of Google.

http://www.oscommerce.com/community/contributions,952
http://www.oscommerce.com/community/contributions,2819
http://www.oscommerce.com/community/contributions,4112
boxtel Posted August 22, 2006

"Unfortunately, Google views this as cloaking and bans your site... The instructions in the links below will get you kicked out of Google."

nonsense.

Treasurer MFC
excell Posted August 22, 2006

"nonsense."

I'm not convinced :-" - I don't think the concern is nonsense. Not that I know a lot, but I think this is a big issue for how to handle such things in a search-engine-friendly way, and if anything, we need to look at what they would say about it and what the best way to go would be... Can anyone provide a reference? Has anyone asked Matt Cutts or similar?
boxtel Posted August 22, 2006

"I'm not convinced... can anyone provide a reference, has anyone asked Matt Cutts or similar?"

If you were talking from a little more experience, you would know that these facilities are there to prevent spiders from obtaining sessions and to remove already-indexed links with session IDs from search engine indexes. Both are a must, and both have absolutely nothing to do with providing altered content specifically targeted at search engines.

"has anyone asked Matt Cutts or similar"

I suggest you take this on your plate and get back to us all.
Guest Posted August 24, 2006

"Unfortunately, Google views this as cloaking... The instructions in the links below will get you kicked out of Google."

I don't think so, but if you think it will help, go ahead and compromise your customers' private info by exposing sessions to search engines. See what happens.
matrix2223 Posted September 1, 2006

OK, I'm getting mixed reports on this. Could someone kindly tell me what one has to do to 1) get rid of the osCsids already listed and 2) keep them from listing future ones? As far as coding goes, I have in place what boxtel was kind enough to share with us in this thread:

http://www.oscommerce.com/forums/index.php?sho...&st=0

Is this everything I need, or do I need a contribution of some sort?

Thanks,
Eric
matrix2223 Posted September 7, 2006

So does this work? Do I need anything else to go along with it to get the already-indexed links un-listed in the search engines? I also need something like a redirect, so that when a link from a search engine is clicked it gives the customer a new session ID.

Anyone?

Thanks,
Eric
Debs Posted September 23, 2006

"So does this work, do I need anything else to go along with this..."

Hi Eric,

Use this in your .htaccess:

# Skip the next two RewriteRules if NOT a spider
RewriteCond %{HTTP_USER_AGENT} !(msnbot|slurp|googlebot) [NC]
RewriteRule .* - [S=2]

# Case: leading and trailing parameters
RewriteCond %{QUERY_STRING} ^(.+)&osCsid=[0-9a-z]+&(.+)$ [NC]
RewriteRule (.*) $1?%1&%2 [R=301,L]

# Case: leading-only, trailing-only or no additional parameters
RewriteCond %{QUERY_STRING} ^(.+)&osCsid=[0-9a-z]+$|^osCsid=[0-9a-z]+&?(.*)$ [NC]
RewriteRule (.*) $1?%1%2 [R=301,L]
Rachael w. Posted June 3, 2007

At what point in time would one remove this code from application_top.php?

In includes/application_top.php find this code:

// include the language translations
require(DIR_WS_LANGUAGES . $language . '.php');

Under that paste this code:

if ($spider_flag == true) {
  if (eregi(tep_session_name(), $_SERVER['REQUEST_URI'])) {
    $location = tep_href_link(basename($_SERVER['SCRIPT_NAME']), tep_get_all_get_params(array(tep_session_name())), 'NONSSL', false);
    header("HTTP/1.0 301 Moved Permanently");
    header("Location: " . $location); // redirect...bye bye
  }
}

I added this way back when I first set up the site. Should I remove it now? I don't have any pages indexed with the osCsid any longer.
troubleshooter2000 Posted August 19, 2010

"...the true solution to the problem is to REMOVE THE SESSION IDs from the search engine index! ... In includes/application_top.php find this code..."

Can someone confirm whether this actually works?
matrix2223 Posted August 16, 2011

It would be far easier, I think, just to add this to your robots.txt file (more can be found via a Google search for robots.txt):

User Agent:
Disallow: /osCid.*

The .* is a wildcard for everything. Also see the post above.
kymation Posted August 17, 2011

Unfortunately, it's not that easy. Your code will only block the search engines from accessing a file named osCid.* in the root of the site.

Regards
Jim
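For what it's worth, Google's crawler does support wildcard patterns in robots.txt, so a rule that actually targets the session parameter (rather than a file literally named osCid.*) would look something like the sketch below. Note that Disallow only stops crawling; it does not remove URLs that are already indexed.

```
User-agent: Googlebot
Disallow: /*osCsid=
```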
toyicebear Posted August 17, 2011

Just sign up at Google Webmaster Central, add your site, and then add osCsid to the "ignore" list.

Note: this will affect Google search listings only.
ggrant3 Posted November 15, 2011

"Just sign up at Google Webmaster Central... this will affect Google search listings only."

Is there something (a setting, contribution, code change, etc.) that will remove the osCsid from the URLs? I have been looking for an answer to this, and all I find is confusing posts back and forth about whether post A will work, whether post B is better, or whether post A will get you banned from Google.

In Admin > Configuration > Sessions, "Prevent Spider Sessions" and "Recreate Sessions" are both set to true, but I still have the osCsid at the end of my URLs.

*Edit* I just looked at the URLs that Google has for my items (in my Google Merchant account) and they show fine (without the osCsid at the end). But when I browse my site I see the osCsid in my URLs. There is a way to get rid of the osCsid in the actual URLs, right? I don't know if Google is ignoring them or what, but my concern is that they still appear when browsing the site.
Jack_mcs Posted November 15, 2011

You just need to set up your configure file correctly: http://www.oscommerce.com/forums/index.php?showtopic=193738&hl=
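To make the above concrete: the usual culprit when osCsid will not go away is the cookie-related defines in includes/configure.php. The sketch below shows only the relevant lines, with www.example.com and /catalog/ as placeholders for your own values. If the cookie domain or path does not match how the store is actually reached, the browser rejects or never sends the session cookie, and osCommerce falls back to appending osCsid to every link.

```php
<?php
// Session-related defines from includes/configure.php (sketch only;
// 'www.example.com' and '/catalog/' are placeholders for your values).
define('HTTP_SERVER', 'http://www.example.com');
define('HTTP_COOKIE_DOMAIN', 'www.example.com');
define('HTTP_COOKIE_PATH', '/catalog/');
define('DIR_WS_HTTP_CATALOG', '/catalog/');
```

The cookie domain must match the host customers actually browse (with vs. without www), and the cookie path must cover DIR_WS_HTTP_CATALOG; a mismatch in either is enough to break cookie-based sessions.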
ggrant3 Posted November 15, 2011

"You just need to setup your configure file correctly: http://forums.oscomm...topic=193738="

I was told this in another thread (I think by you), and I have gone through that thread (and just did it again), but I don't see where anything is set wrong. Is there a specific line/command I should be focusing on? I just get overwhelmed looking at all the similar lines/commands and URL info. Maybe I am continually missing something.
matrix2223 Posted December 20, 2011

It depends on what you are trying to accomplish. If you merely want to change the name osCsid to something like sid or id, that is on line 133 of application_top.php:

// set the session name and save path
tep_session_name('osCsid'); // <<<<--------- This line here
tep_session_save_path(SESSION_WRITE_DIRECTORY);

If you want to remove the SID altogether, use one of the SEO URL addons.

I always recommend locking down a site so only I can view it, if I am unable to develop it locally. To do this, use this bit in your .htaccess file in the directory where your catalog's index.php is located. (I also use this to add another layer of security to the admin folder.)

AuthUserFile /dev/null
AuthGroupFile /dev/null
AuthName "Access Control"
AuthType Basic
order deny,allow
deny from all
# IP address of my 2nd home computer
allow from xxx.xxx.xxx.xxx
# IP addresses of my two work computers
allow from 00.000.000.000

Like I said at the beginning, though, it's all in what you're trying to accomplish. Hope this helps some.
Archived
This topic is now archived and is closed to further replies.