webshoptimizer

Canonical tag implementation


OK, tonight I wanted to try to make a contribution for the new canonical tag that the major search engines released in the battle against "duplicate content".

 

The idea is that you point the search engine to the one correct URL for a page, so it doesn't index the same page again and again under irregular URLs (which is what causes duplicate content).

 

You only have to adjust two files: index.php and product_info.php.

 

1. Open both files and add the following code before require('includes/application_top.php');

$string = $_SERVER['REQUEST_URI'];

$search = '&osCsid.*|\?osCsid.*';

$replace = '';

 

2. Add within your <head> section the following code to generate the correct URL:

<link rel="canonical" href="<?php echo 'http://www.yourdomain.com' . ereg_replace( $search, $replace, $string ); ?>" />

 

3. Don’t forget to replace yourdomain.com with your actual domain name.
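A note for anyone on a newer server: ereg_replace() was deprecated in PHP 5.3 and removed in PHP 7.0, so the snippet above will not run there. Below is a minimal sketch of the same idea using preg_replace(). The function name strip_session_id is made up for illustration and is not part of the contribution.

```php
<?php
// Hypothetical helper (name is an assumption, not from the original post).
// Drops everything from "&osCsid" or "?osCsid" to the end of the URI,
// which is what the POSIX pattern in the steps above does.
function strip_session_id(string $requestUri): string
{
    return preg_replace('/[&?]osCsid.*/', '', $requestUri);
}
```

In the <head> it could then be used as: echo 'http://www.yourdomain.com' . htmlspecialchars(strip_session_id($_SERVER['REQUEST_URI']));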

 

In my opinion the code is not optimal. I'm not a programmer so I was already quite happy with the result.

What I want:

- Replace the hard-coded URL with a dynamic one (like how the base URL is generated)

- Optimize the code

 

Read the full story on my blog (explaining the canonical tag): http://tinyurl.com/br8n8f

 

I would love feedback!


Hi

 

Conceptually this is excellent. One issue is that the current code does not cope with catalogs that are not in the root directory.

 

// starts canonical tag function - www.webshoptimizer.com
function CanonicalUrl() {
  global $request_type; // set in application_top.php; not visible inside a function without this
  $domain = substr((($request_type == 'SSL') ? HTTPS_SERVER : HTTP_SERVER), 0, -1); // gets the base URL minus the trailing slash
  $string = $_SERVER['REQUEST_URI']; // gets the url
  $search = '\&osCsid.*|\?osCsid.*'; // searches for the session id in the url
  $replace = ''; // replaces with nothing i.e. deletes
  echo $domain . ereg_replace( $search, $replace, $string ); // merges the variables and echoes them
}

 

The issue is that the HTTPS_SERVER/HTTP_SERVER constants come from the configure.php file and include the domain URL and the directory in which the catalog is installed.

 

The $_SERVER['REQUEST_URI'] also includes the directory where the catalog is installed. Consequently if the catalog is in a directory instead of the root, then the directory name is doubled up as a result.
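To make the doubling concrete, here is a rough sketch of joining a base URL and a request URI without repeating the catalog directory. The function name join_base_and_request and the example.com URLs are assumptions for illustration only; it also assumes the base URL may or may not end in a slash.

```php
<?php
// Hypothetical helper: joins a configured base URL and REQUEST_URI,
// removing the catalog directory from the request if the base already has it.
function join_base_and_request(string $base, string $requestUri): string
{
    $base = rtrim($base, '/'); // tolerate a base with or without a trailing slash
    $basePath = (string) parse_url($base, PHP_URL_PATH); // '' when the base has no directory
    if ($basePath !== '' && strpos($requestUri, $basePath . '/') === 0) {
        // The request already starts with the catalog directory: drop it once.
        $requestUri = substr($requestUri, strlen($basePath));
    }
    return $base . $requestUri;
}
```

With a base of http://www.example.com/catalog/ and a request of /catalog/index.php?cPath=1_9, the directory appears only once in the result.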

 

The solution, I think, is to take the same code and use different variables.

 

function CanonicalUrl() {
  $string = $_SERVER['QUERY_STRING']; // gets everything that is a variable to the right of the ?
  $search = '\&osCsid.*|\?osCsid.*'; // searches for the session id in the query string
  $replace = ''; // replaces with nothing i.e. deletes
  echo $_SERVER['SCRIPT_URI'] . '?' . ereg_replace( $search, $replace, $string ); // uses the script_uri instead and appends the revised query string (QUERY_STRING does not include the ? itself)
}

 

The configure.php settings are not required. Just use the QUERY_STRING which is everything to the right of the ? in the URL in the replace process.

 

Use the SCRIPT_URI which is the server/directory/script name, or everything to the left of the ? and add the QUERY_STRING dropping off the osCsid value.
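The same "script URL plus cleaned query string" idea can be sketched without a regular expression by parsing the query string into an array. This is only an illustration: the function names are invented here, and the script URL is passed in as a parameter rather than read from SCRIPT_URI, since that variable is not available everywhere.

```php
<?php
// Hypothetical helper: removes the session parameter from a raw query string.
function remove_session_param(string $queryString, string $sessionName = 'osCsid'): string
{
    parse_str($queryString, $params);        // "cPath=1_9&osCsid=abc" becomes an array
    unset($params[$sessionName]);            // drop the session id wherever it appears
    return http_build_query($params, '', '&');
}

// Hypothetical helper: appends the cleaned query string to the script URL,
// adding the "?" only when something is left to append.
function canonical_from_parts(string $scriptUrl, string $queryString): string
{
    $query = remove_session_param($queryString);
    return $query === '' ? $scriptUrl : $scriptUrl . '?' . $query;
}
```

Unlike the regular expression, this removes osCsid even when it is the first or a middle parameter, and keeps whatever follows it.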

 

Hope this helps.


cheers
Tony
******************************
My oscMax Store RecoverToy :: Antique Toy Car Parts
Tony's Tech Blog
WrenMaxwell WebManagement
******************************


Hi

 

Hold that thought. SCRIPT_URI is not consistently supported on all server / PHP combos. For that matter, neither is REQUEST_URI.

 

More work required.


cheers
Tony

The code builds the canonical URL based on whether the SID is found or not, which isn't the problem in a properly set-up shop. The URLs that need to be treated are those like

http://somedomain.com/index.php?cPath=1_9

and

http://somedomain.com/index.php?cPath=1_9&...e=1&sort=2d

 

Jack


I installed this contribution. Unfortunately, when I tested it and compared the URL in the address bar with the one shown in the source, I found that my domain in the source is shown as (www.MYDOMAIN.co) instead of (www.MYDOMAIN.com) !!

 

Any idea why the "m" was removed and how to get this fixed?

Thanks in advance.


Couldn't we somehow just add something like this to product info?

 

<link rel="stylesheet" type="text/css" href="stylesheet.css">
<link rel="canonical" href="automated product link">

 

Is the above possible? If so, does anyone know how to get the href to produce the current product link?


Try putting this in the <head> of product_info.php:

<link rel="canonical" href="<?php echo tep_href_link(FILENAME_PRODUCT_INFO, 'products_id=' . (int)$_GET['products_id'],'NONSSL',false); ?>" />

That should achieve it, but I've not looked that closely at this new tag yet, so I can't be certain there are no issues.


Sam

 

Remember, What you think I ment may not be what I thought I ment when I said it.

 

Contributions:

 

Auto Backup your Database, Easy way

 

Multi Images with Fancy Pop-ups, Easy way

 

Products in columns with multi buy etc etc

 

Disable any Category or Product, Easy way

 

Secure & Improve your account pages et al.

Spooks' solution works in all cases I've tried. I am wary of the tag, but I need to try to resolve some of the duplicate content that stems from cPath and manufacturers_id. All I can do is wait to see if it works the way it is intended.

 

As far as this contribution is concerned, I do not see the need to modify any files other than product_info or selected pages based on your particular shop setup.


Is it possible to implement this canonical tag for categories using the similar type of build Spooks came up with? For use in categories such as http://www.mysite.com/optimized-category-6.html. After just a few days several ?cpath and ?manufacturers_id pages are showing up in Google Webmaster Tools as restricted. I cannot imagine it is coincidental.


There are further issues here, as categories are output from index.php, which is also your 'home' page.

You could try this. It removes any sort, language, currency, page and filter parameters and only outputs a tag for manufacturer or category pages. It may need to be developed further.

 

 

<?php 
if (isset($_GET['manufacturers_id']) && tep_not_null($_GET['manufacturers_id'])) { ?>
<link rel="canonical" href="<?php echo tep_href_link(FILENAME_DEFAULT, 'manufacturers_id=' . $_GET['manufacturers_id'],'NONSSL',false); ?>" />
<?php } elseif (isset($_GET['cPath']) && tep_not_null($_GET['cPath'])) { ?>
<link rel="canonical" href="<?php echo tep_href_link(FILENAME_DEFAULT, 'cPath=' . $_GET['cPath'],'NONSSL',false); ?>" />
<?php } ?>
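The selection logic in the snippet above can also be pulled into a small function that is easy to test on its own. This is a sketch under assumptions: the function name canonical_index_params is invented, and the input checks (integer cast, digits-and-underscores test) are additions that the original snippet does not have.

```php
<?php
// Hypothetical helper: decides which single parameter the canonical URL for
// index.php should carry. Sort, page, language, currency and filter
// parameters are ignored simply by never being read.
function canonical_index_params(array $get): ?string
{
    if (isset($get['manufacturers_id']) && $get['manufacturers_id'] !== '') {
        return 'manufacturers_id=' . (int) $get['manufacturers_id'];
    }
    // Assumption: a cPath is a list of numeric ids joined by underscores, e.g. "1_9".
    if (isset($get['cPath']) && is_string($get['cPath'])
        && preg_match('/^\d+(_\d+)*$/', $get['cPath'])) {
        return 'cPath=' . $get['cPath'];
    }
    return null; // home page or unrecognised input: output no canonical tag
}
```

When the result is not null, it could be passed as the second argument to tep_href_link(FILENAME_DEFAULT, ..., 'NONSSL', false), as in the snippet above.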

 

;)


Sam

 


I'm having the same issue.


OK, I figured out a rough way to fix the problem.

It looks like the author wanted to remove a trailing slash from the URL. However, my server doesn't add this trailing slash, so the code was turning my .com or .net into .co and .ne.

 

$domain = substr((($request_type == 'SSL') ? HTTPS_SERVER : HTTP_SERVER), 0, -1); // gets the base URL minus the trailing slash

 

Remove the [, -1] from the code

 

so it looks like this

 

$domain = substr((($request_type == 'SSL') ? HTTPS_SERVER : HTTP_SERVER), 0); // gets the base URL unchanged (nothing is trimmed any more)

 

 

 

works fine for me now.
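A variant that should cope with both kinds of configuration is to strip the slash only when it is actually there. A minimal sketch, assuming the base URL is passed in as a string; the function name is made up for illustration.

```php
<?php
// Hypothetical helper: removes a trailing slash if present, and otherwise
// leaves the base URL untouched (so ".com" never becomes ".co").
function base_without_trailing_slash(string $base): string
{
    return rtrim($base, '/');
}
```

Inside CanonicalUrl() this would replace the substr(..., 0, -1) call, for example rtrim((($request_type == 'SSL') ? HTTPS_SERVER : HTTP_SERVER), '/').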


As Jack said, this contrib is fixing a problem that shouldn't be an issue. If you apply the tags I detailed earlier in this thread instead, that should suffice (they also remove the SID anyway).

 

;)


Sam

 


Hi.

 

Thanks for the code, however, it may need a little tweak

 

mysite.co.uk/index.php?cPath=36_37

and

mysite.co.uk/index.php?cPath=37

 

I think both should have the same canonical url?

 

<link rel="canonical" href="http://www.mysite.co.uk/index.php?cPath=37">

 

I’ve amended the code with

 

 

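<?php
$path = $_GET['cPath'];
$pos = strrpos($path, '_'); // position of the last underscore, or false if there is none
if ($pos !== false) { $path = substr($path, $pos + 1); } // keep only the final category id, e.g. 36_37 becomes 37
?>
<link rel="canonical" href="<?php echo tep_href_link(FILENAME_DEFAULT, 'cPath=' . $path,'NONSSL',false);?>">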

 

which seems to work.

 

All comments welcome

 

Regards.

 

Ken.



I can't figure out where to paste Sam's code, though.

 

Can you explain exactly?

 

Thanks


 

 

Yes, you're right, but a better solution to that and to my earlier one is:

 

<?php if (isset($_GET['manufacturers_id']) && tep_not_null($_GET['manufacturers_id'])) { ?>
<link rel="canonical" href="<?php echo tep_href_link(FILENAME_DEFAULT, 'manufacturers_id=' . $_GET['manufacturers_id'],'NONSSL',false); ?>" />
<?php } elseif ($current_category_id) { ?>
<link rel="canonical" href="<?php echo tep_href_link(FILENAME_DEFAULT, 'cPath=' . $current_category_id,'NONSSL',false); ?>" />
<?php } ?>

 

To gamerstronics and anyone else: this code must be placed between the <head> and </head> tags on your index.php page ;)


Sam

 


@ Spooks

If you are using STS should this be placed in between the head tags in the sts_template instead of the directory index.php?



Shouldn't it be mysite.co.uk/index.php?cPath=36_37 rather than mysite.co.uk/index.php?cPath=37 as this would be the correct path ?
