krnl

Spider sessions Mod Rewrite

I am having problems with spiders continuing to be assigned session IDs even after setting "Prevent Spider Sessions" to yes and updating .../includes/spiders.txt.

 

Then I learned about the Spider Session Remover contribution which uses mod_rewrite to remove the session ID when the user-agent matches a specific string (googlebot, msnbot, slurp, etc).

 

When I added the lines from the contribution to my .htaccess, I started getting Internal Server Errors (HTTP 500). I copied the rules exactly from the contribution, but they won't work.

 

Has anyone else had this problem, or can anyone provide guidance?

 

Here are the rewrite rules from the contribution:

 

RewriteEngine on
RewriteBase /
#
# Skip the next two rewriterules if NOT a spider
RewriteCond %{HTTP_USER_AGENT} !(msnbot|slurp|googlebot) [NC]
RewriteRule .* - [s=2]
#
# case: leading and trailing parameters
RewriteCond %{QUERY_STRING} ^(.+)&osCsid=[0-9a-z]+&(.+)$ [NC]
RewriteRule (.*) $1?%1&%2 [R=301,L]
#
# case: leading-only, trailing-only or no additional parameters
RewriteCond %{QUERY_STRING} ^(.+)&osCsid=[0-9a-z]+$|^osCsid=[0-9a-z]+&?(.*)$ [NC]
RewriteRule (.*) $1?%1 [R=301,L]

 

 

Thanks,

Rick
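
A side note on the last rule as quoted in this post: its RewriteCond has two capture groups, one in each branch of the alternation. When the query string starts with osCsid, the remaining parameters end up in %2 rather than %1, so the substitution $1?%1 appears to drop them. This is unrelated to the 500 error. A minimal sketch of one possible adjustment, assuming the intent is to keep whatever parameters surround the session ID (this variant is not part of the contribution):

```apache
# Sketch only, not from the contribution: same condition as the last rule above.
# %1 holds the parameters before osCsid (first branch of the alternation),
# %2 holds the parameters after it (second branch). Only one branch can match,
# so at most one of the two is non-empty and emitting both keeps either case.
RewriteCond %{QUERY_STRING} ^(.+)&osCsid=[0-9a-z]+$|^osCsid=[0-9a-z]+&?(.*)$ [NC]
RewriteRule (.*) $1?%1%2 [R=301,L]
```

When osCsid is the only parameter, both back-references are empty and the target ends in a bare ?, which tells mod_rewrite to discard the query string.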


Maybe you didn't edit the file properly. Did you edit your .htaccess file with Notepad? Did you leave empty spaces? Did you try the original file supplied with the contribution?

In any case, I tried the original as well as modifying my own.

But I am unable to see any session IDs being stripped off. My site works normally, but spiders still have the IDs added to their URLs.


Kind regards

 

Hakan Haknuz


Set some options

Options +FollowSymLinks

Options -Indexes

 

Try this; I had to modify it to get it to work. Put the line Options +FollowSymLinks above the other line, as shown above. The original file had the lines the other way around. I also had to change the - to a plus sign (+). It has something to do with the server, so it should work if you're using an Apache server.

 

See if that helps, or do a Google search; there is lots of help there.

 

Don


After working with tech support at my hosting company, I found that the reason for the 500 Internal Server error when accessing SSL was because they did not have the mod_rewrite module loaded in their secure server config. They fixed that and now SSL is working fine. I haven't had a chance to see if sessions are still being assigned to crawlers yet though.
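
Given that finding, one defensive option is to wrap the rules in an IfModule block, so that a server or virtual host without mod_rewrite ignores them instead of failing. The following is a minimal sketch, assuming the module is identified by its usual source file name mod_rewrite.c; the guard is an addition and is not part of the contribution:

```apache
# Sketch: apply the rewrite rules only when mod_rewrite is actually loaded.
# Without the guard, RewriteEngine is an unknown directive on a server that
# lacks the module, and requests to this directory fail with a 500 error.
<IfModule mod_rewrite.c>
    RewriteEngine on
    RewriteBase /
    # (the RewriteCond/RewriteRule lines from the first post go here unchanged)
</IfModule>
```

The trade-off is that the rules then do nothing, silently, wherever the module is missing, so spiders would still receive session IDs there and no error would point to the cause.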


Just so you know, it can take weeks or even months for the session ID to be taken out of search engine links. Just keep checking them once in a while. Do you have a robots.txt file? You may want to install one as well. If you don't have one, you can get one at the contributions site and then modify it for what you need.

 

Don
