spooks

Remove & Prevent duplicate content with the canonical tag


product_info.php?cPath=0_26&products_id=123

 

and it should be:

product_info.php?cPath=31_26&products_id=123

 

 

Google must have found that link at some point, but I needn't worry: the canonical means Google will soon realise what the link should be and remove the invalid one.

 

Some here have reported cases of competitors generating bad links for sites in an effort to drop a site's rank, but with that tactic too, the canonical will correct the link and reverse the attack.

 

 

Remember also that this only removes params from the URI; it does not modify them.
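To make the "removes, never modifies" point concrete, here is a minimal PHP sketch. It is not the contribution's own code (which, going by later posts in this thread, works with regular expressions on the request URI); the function name `canonical_uri_sketch` and the sample parameter lists are assumptions for illustration only.

```php
<?php
// Illustrative sketch only: the function name and the remove list are hypothetical.
// Listed params are dropped whole; every other name=value pair is passed
// through untouched, so a "wrong" value such as cPath=0_26 is never corrected.
function canonical_uri_sketch($request_uri, array $remove) {
  $parts = explode('?', $request_uri, 2);
  if (count($parts) < 2) {
    return $request_uri;               // no query string, nothing to remove
  }
  $kept = array();
  foreach (explode('&', $parts[1]) as $pair) {
    if ($pair === '') {
      continue;
    }
    $bits = explode('=', $pair, 2);
    if (!in_array($bits[0], $remove, true)) {
      $kept[] = $pair;                 // kept exactly as requested
    }
  }
  return $kept ? $parts[0] . '?' . implode('&', $kept) : $parts[0];
}

// cPath is removed, products_id is kept as-is:
echo canonical_uri_sketch('product_info.php?cPath=0_26&products_id=123', array('cPath'));
// prints product_info.php?products_id=123
```

If cPath is not in the remove list for a page, a request carrying cPath=0_26 keeps that value in the canonical.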


Sam

 

Remember, What you think I ment may not be what I thought I ment when I said it.

 

Contributions:

 

Auto Backup your Database, Easy way

 

Multi Images with Fancy Pop-ups, Easy way

 

Products in columns with multi buy etc etc

 

Disable any Category or Product, Easy way

 

Secure & Improve your account pages et al.


 

Sam,

I think it's the sitemap SEO on the product pages, sorry. I will go ask Jack. Bother, serves me right for not sticking to my principle of changing one thing at a time!

 

But in Who's Online, when Google comes to see me, it will show the full URL (not the canonical); is that correct?


I'm feeling lucky today......maybe someone will answer my post!

I do try and answer a simple post when I can just to give something back.

------------------------------------------------

PM me? - I'm not for hire


 

 

Yes. This creates the canonical tag and does nothing else; it cannot change any links used.


Sam

 


 

ok, cool, well it's a good job I had this installed before the sitemap went live or I'd be in more trouble than I am now :thumbsup:

 

thanks again



Hi Sam,

 

I'm having the canonical issue of www.mysite.com/ and www.mysite.com/index.php having duplicate title and meta description tags according to Google Webmaster Tools. I'm a bit confused as to my options to resolve this kind of problem. Should I edit FILENAME_DEFAULT everywhere, as some people in this forum have done?

 

http://forums.oscommerce.com/index.php?showtopic=155079

 

Or I see in the installation notes for this add-on that you mention the problem I have, giving these instructions:

 

It's possible that you will get a duplicate content issue with my-domain.com and my-domain.com/index.php, as these are one and the same page. You can use .htaccess to deal with this:

 

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\ HTTP/

RewriteRule ^index\.php$ http://www.my-domain.com/ [R=301,L]

 

That is normally sufficient; however, on some servers there can still be an issue. If so, enable removal of index.php from the URI by setting $rem_index to true in the code.

 

But going back to Robert's reference to Matt Cutts' advice, does using a redirect cause any SEO problems? Or is this the appropriate course of action?

 

Thanks for the advice,

 

Nick


 

 

 

Where you have duplicate content, a redirect is appropriate. Changing FILENAME_DEFAULT can create new issues depending on your site, the most common being "Unable to determine the page link!" errors on some links.

Remember the canonical is still effectively a redirect.

Personally I always use that .htaccess snippet to avoid the issue.


Sam

 


 

 

Ok...good to know. Thanks!


Apologies if this has been asked before - I have searched but not found an answer.

 

I tried to implement this code today but have the following issue.

 

If my url is, for example, http://www.mysite.co.uk/catalog/product-name-c-10_100.html

 

the canonical tag is being written as http://www.mysite.co.uk/catalog/product-name-c-10 100.html

 

and this of course causes issues - but I can't see (in the code) why the '_' is being replaced with a space.

 

Can you help please?

 

Many Thanks

 

Stuart


 

There is nothing in the code to do that. It does not search for the underscore and it does not replace anything with a space, so you have some other code doing that.

What SEO package are you using? Underscores are rarely used as param separators because osC filenames contain underscores, which can cause issues; usually only dashes are used.


Sam

 


Sam

 

you're correct - my issue was with a change I had made to the SEO code (and not noticed the error). I saw the error after adding your contribution and did a 2+2=5!

 

Many thanks and thanks for sharing your code

 

Stuart


Can you help me with this?

It seems to happen when there are %20 (spaces) or similar within the links, as in OVRAW etc.

 

hxxp://www.website.com/index.php?utm_source=Yahoo&utm_medium=cpc&utm_term=xbox%20live%20points%20kaufen&utm_content=7777777777&utm_campaign=666666666&OVRAW=xbox%20live%20points%20kaufen&OVKEY=xbox%20live%20points%20kaufen&OVMTC=standard&OVADID=1111111111&OVKWID=22222222222&OVCAMPGID=44444444444&OVADGRPID=5555555555

 

 

These values are not removed completely:

 

<link rel="canonical" href="http://www.website.com/index.php?%20live%20points%20kaufen%20live%20points%20kaufen%20live%20points%20kaufen" >

 

Regards,

Stephan


 

You have something else doing that, not this; perhaps some SEO add-on. If I put your URI into my test site the result is:

 

<link rel="canonical" href="http:/ /my_site.co.uk/?utm_source=Yahoo&utm_medium=cpc&utm_term=xbox%20live%20points%20kaufen&utm_content=7777777777&utm_campaign=666666666&OVRAW=xbox%20live%20points%20kaufen&OVKEY=xbox%20live%20points%20kaufen&OVMTC=standard&OVADID=1111111111&OVKWID=22222222222&OVCAMPGID=44444444444&OVADGRPID=5555555555" >

 

 

 

i.e. as expected. The code does not look for %20 or vary its output based on content; it removes only the requested params and looks only for standard and SEO separators.

 

 

 

 

Perhaps you made an error adding to the removed list; what have you added exactly?

 


Sam

 


 

$remove_array = array('OVRAW', 'OVADGRPID', 'OVCAMPGID', 'OVKEY', 'OVMTC', 'OVKWID', 'OVADID', 'utm_content', 'utm_term', 'utm_source', 'utm_medium', 'utm_campaign', 'action', 'gclid', 'currency','language','main_page','page','sort','ref','affiliate_banner_id','max');

 

I have "Ultimate seo urls 5 r141 stable" installed

 

Currently I'm using a workaround. In html_output.php, right before the definition of $remove_array:

 

$request_uri = preg_replace("([^a-zA-Z0-9äöüÄÖÜ\/\-\.\=\?\&\_])", "", $request_uri);

 

Regards,

Stephan

 

PS: have you tested the string "xbox live points kaufen" with actual spaces instead of %20 too?


spaces in uri

 

 

Hi, yes there is an issue, I should have checked more thoroughly.

This should fix the issue:

 

replace:

$search[] = '/&*' . $value . '[=\/]+\w*\/?/i';

with:

$search[] = '/&*' . $value . '[=\/]+[\w%..\+]*\/?/i';

 

Let me know how you get on.
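For anyone wanting to see the difference before editing their install, the two patterns can be compared in a stand-alone script. This is only a sketch: the helper name `strip_with`, the shortened test URI and the two-entry remove list are assumptions made for the example, and only the two regex fragments are taken from the replace/with lines above.

```php
<?php
// Stand-alone comparison of the old and new value patterns.
// strip_with() is a hypothetical helper, not part of the contribution.
function strip_with($value_class, array $remove, $uri) {
  $search = array();
  foreach ($remove as $value) {
    // Same shape as the line in the contribution; only the value class varies.
    $search[] = '/&*' . $value . '[=\/]+' . $value_class . '*\/?/i';
  }
  return preg_replace($search, '', $uri);
}

$uri    = 'index.php?utm_term=xbox%20live%20points%20kaufen&OVRAW=xbox%20live%20points%20kaufen';
$remove = array('utm_term', 'OVRAW');

// Old pattern: \w stops at the first '%', so the rest of each value is left behind.
$old = strip_with('\w', $remove, $uri);

// New pattern: '%', '.' and '+' are accepted as part of the value.
$new = strip_with('[\w%..\+]', $remove, $uri);

echo $old . "\n"; // index.php?%20live%20points%20kaufen%20live%20points%20kaufen
echo $new . "\n"; // index.php?
```

The old result has the same shape as the broken canonical reported earlier in the thread. The trailing "?" in the new result is left as-is here; any tidying of an empty query string is outside what this sketch covers.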

 

PS Are you aware that spaces should be avoided in URIs wherever possible? They are considered 'unsafe' due to indeterminate behaviour.


Sam

 


Is it possible to have an exclusion array just for the category pages?

 

 

Category pages are listed within index.php. An empty array already exists in the code for it; just add to that.


Sam

 


 

Sorry, I missed that! Works great, thanks.


Hi Sam

 

I see Google Webmaster Tools reports a lot of duplicate meta descriptions and title tags after I waited for Google to update my site content.

They are all similar (canonical) links, like the following:

e.g.

Title tags:

index.php?cPath=46_61

and index.php?cPath=61

 

Meta descriptions:

/product_info.php?cPath=84&products_id=256

/product_info.php?cPath=79&products_id=256

 

They end up on the same page though.

 

Do we need to insert

#redirect index.php to root 
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\ HTTP/ 
RewriteRule ^index\.php$ http://www.yourdomain.com/ [R=301,L]

into the .htaccess file

 

Is there something I must correct and change?

 

Thanks


Thanks to all source contributors, 2.3.4 Edge just works fine

https://github.com/gburton/Responsive-osCommerce/archive/master.zip


Yes sure.

 

What has to be remembered is that canonical tags are not the holy grail, very useful to be sure but not good for everything.

 

canonical tags are effectively to the bots a soft 301 redirect. So if you take split page results pages (a typical duplicate content example) . . adding a canonical tag where there is e.g. page and sort in the querystring informs the bots that this is a non page and that only the main page is the canonical version .. this in turn means that the links on those pages will not be followed possibly leading to those products not being indexed.

 

So look at what we actually want from these pages .. we don't want them indexed as they are "non pages" with no true content nor meaning .. we do however want the bots to follow the links and index the pages they find .. so .. in comes the ..

 

<meta name="robots" content="noindex, follow" />

 

It does exactly this .. tells the bots not to index the page but to follow the links and index those it finds .. perfect!

 

Hi Sam,

 

With the above points in mind, I want to make a slight tweak to your contribution so that when you're on a split page results page it shows this:

 

echo '<meta name="robots" content="noindex, follow">' . "\n";

 

Instead of the canonical link.

 

I've had a go at this but had no joy. I use a header tags contribution, so I was thinking that where I put "CanonicalLink( $xhtml = false, 'SSL' );" I could add an if/else statement like this:

 

if (on a split page results page) {
  echo '<meta name="robots" content="noindex, follow">' . "\n";
} else {
  CanonicalLink( $xhtml = false, 'SSL' );
}

 

I just can't work out what to put in that "if" statement so the system knows I'm on a split page results page.

Could you offer any guidance on this please, mate?
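One possible condition, offered as a sketch rather than a tested change to the contribution: treat a request as a split page results page when it carries a page or sort parameter, as in the quoted explanation above. The function name `is_split_page_request` is hypothetical, and passing in the GET array (rather than reading a global) is a choice made so the example runs on its own.

```php
<?php
// Hypothetical helper: decides from the query parameters alone.
// Assumption: split page results are requested with 'page' and/or 'sort'.
function is_split_page_request(array $get) {
  return isset($get['page']) || isset($get['sort']);
}

// In the header tags file it might then be used roughly like this
// (CanonicalLink() comes from the contribution and is not defined here):
//
//   if (is_split_page_request($_GET)) {
//     echo '<meta name="robots" content="noindex, follow">' . "\n";
//   } else {
//     CanonicalLink( $xhtml = false, 'SSL' );
//   }

var_dump(is_split_page_request(array('cPath' => '46_61', 'page' => '2'))); // bool(true)
var_dump(is_split_page_request(array('cPath' => '46_61')));                // bool(false)
```

Note that this also matches page=1, which shows the same listing as the plain category URL; whether that case should get noindex or the canonical is a judgment call this sketch does not settle.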


 

Meta descriptions:

/product_info.php?cPath=84&products_id=256

/product_info.php?cPath=79&products_id=256

 

 

cPath is removed from the URI for product_info.php in the default setup.

 

 

Title tags:

index.php?cPath=46_61

and index.php?cPath=61

 

That says you have multiple paths to subcategories on your site. It would require some code additions to address that, as param values are not touched.


Sam

 


 

 

I am yet to be convinced there is any issue that needs addressing. The canonical does not say 'do not visit these pages', nor does it say 'do not follow links'; it just says 'this is the page that should appear in the index'.


Sam

 


Thanks for reply

 

Umm.. a bit confused now

 

cpath is removed from the uri for product_info.php in the default setup
That says you have multiple paths to sub categories on your site, it would require some code additions to address that, as param values are not touched.

 

Where do I start and what to look for in code? :o



 

 

Do you know PHP and how to write a regular expression?

 

The current category id is set in $current_category_id, so you must search the URI for cPath and, if it is there, replace its value with $current_category_id.
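A rough sketch of that idea, assuming the URI is available as a plain string: the function name `normalise_cpath_sketch` is made up for illustration, and this is not the update to the contribution, only one way the search-and-replace could look.

```php
<?php
// Hypothetical helper: rewrites the cPath value, if present, to the
// current category id, so cPath=46_61 and cPath=61 give the same URI.
function normalise_cpath_sketch($uri, $current_category_id) {
  // ${1} keeps the matched "?cPath=" or "&cPath=" prefix; only the
  // value (digits and underscores) is replaced.
  return preg_replace(
    '/([?&]cPath=)[0-9_]+/i',
    '${1}' . (int)$current_category_id,
    $uri
  );
}

echo normalise_cpath_sketch('index.php?cPath=46_61', 61) . "\n";        // index.php?cPath=61
echo normalise_cpath_sketch('index.php?cPath=61&page=2', 61) . "\n";    // index.php?cPath=61&page=2
```

A URI without cPath comes back unchanged. SEO-rewritten URLs that encode the path differently (for example with dashes) would need their own pattern.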

 

 

I will update the code when I've time to look at this, but it's only a minor issue so it's not high on my to-do list.


Sam

 


 

If I see the code I know sort of what it's supposed to do

 

Let's leave it till the update is available.

The main reason for posting is that only my homepage is ranked at 2. All keywords and descriptions are formed correctly for the products, but they have no rank.

Not really worried much at this stage; my shop is showing alongside my competitors, but obviously the top would be no. 1 :thumbsup:

