Meta Robot NoIndex Header Tag Module Proposal for v2.3.3


33 replies to this topic

#1   Harald Ponce de Leon

Harald Ponce de Leon

    Healthy Giraffe

  • Core Team
  • 3,952 posts
  • Real Name:Harald Ponce de Leon
  • Gender:Male
  • Location:Solingen, Germany

Posted 27 July 2012 - 08:26 PM

Hi All..

I just pushed out a proposed header tag module that adds the meta robot noindex tag to a list of specified pages. By default the pages are:



account.php
account_edit.php
account_history.php
account_history_info.php
account_newsletters.php
account_notifications.php
account_password.php
address_book.php
address_book_process.php
checkout_confirmation.php
checkout_payment.php
checkout_payment_address.php
checkout_process.php
checkout_shipping.php
checkout_shipping_address.php
checkout_success.php
cookie_usage.php
create_account.php
create_account_success.php
login.php
logoff.php
password_forgotten.php
password_reset.php
product_reviews_write.php
shopping_cart.php
ssl_check.php
tell_a_friend.php

This would stop search engine robots from indexing those pages.
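
As a rough illustration of the idea (not the committed module itself), the check comes down to comparing the current script's file name with the configured list. The function name `robots_noindex_tag()` and its parameters are hypothetical and only used for this sketch; `$script_path` stands in for `$PHP_SELF`.

```php
<?php
// Hypothetical helper (not part of osCommerce): returns the robots meta tag
// for a script, or an empty string when the script is not in the list.
function robots_noindex_tag($script_path, array $noindex_pages)
{
    // Compare on the file name only, so the catalog directory does not matter.
    if (in_array(basename($script_path), $noindex_pages, true)) {
        return '<meta name="robots" content="noindex" />' . "\n";
    }
    return '';
}

$pages = array('login.php', 'shopping_cart.php', 'checkout_success.php');
echo robots_noindex_tag('/catalog/login.php', $pages); // emits the meta tag
echo robots_noindex_tag('/catalog/index.php', $pages); // emits nothing
```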

The commit can be seen here:

https://github.com/haraldpdl/oscommerce2/commit/c19e80ef86fbb7f7202434120618a5cadede7c93

Should this be included in v2.3.3 and should it be installed by default?
Harald Ponce de Leon

#2   FWR Media

FWR Media
  • Community Sponsor
  • 6,837 posts
  • Real Name:Robert Fisher
  • Gender:Male
  • Location:Stowmarket - Suffolk - UK

Posted 27 July 2012 - 09:30 PM

Harald Ponce de Leon, on 27 July 2012 - 08:26 PM, said:

Should this be included in v2.3.3 and should it be installed by default?

Most certainly, very nice!

Not intended to lessen the value of this great core addition, but as an aside, canonical elements would also have been great for handling those nasty listings, or failing that, noindex,follow on _GET keys such as page and sort.
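
A minimal sketch of the noindex,follow variant, assuming the only keys of interest are page and sort. The helper name `listing_robots_content()` is hypothetical, and `$get` stands in for `$_GET`.

```php
<?php
// Hypothetical helper (not part of osCommerce): decide the robots meta
// content for a listing page from its query parameters.
function listing_robots_content(array $get, array $keys = array('page', 'sort'))
{
    foreach ($keys as $key) {
        if (isset($get[$key]) && $get[$key] !== '') {
            // Keep the page out of the index but let robots follow its links.
            return 'noindex, follow';
        }
    }
    return null; // no robots tag needed
}

$content = listing_robots_content(array('cPath' => '3_10', 'sort' => '2a'));
if ($content !== null) {
    echo '<meta name="robots" content="' . $content . '" />' . "\n";
}
```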

Edited by FWR Media, 27 July 2012 - 09:31 PM.


#3   Juto

Juto
  • Members
  • 369 posts
  • Real Name:Sara
  • Gender:Female

Posted 27 July 2012 - 10:08 PM

I agree with Robert, and they should be installed by default.
Sara

#4   kymation

kymation

    Believers

  • Community Sponsor
  • 6,690 posts
  • Real Name:Jim Keebaugh
  • Gender:Male
  • Location:Aberdeen WA USA

Posted 27 July 2012 - 10:34 PM

Why not just include a robots.txt file in the distribution?

Regards
Jim
My Addons

Banners Box 2.3.x  Support
Categories Accordion Box 2.3.x  Support
Categories Images Box 2.2x  2.3.x  Support
Closest Shipper 2.2x  Support
Document Manager 2.2x  Support
Generic Box 2.3.x  Support
Get 1 Free 2.2x  Support
jQuery Banner Rotator 2.2x  2.3.x  Support
Modular Front Page 2.3.x  Support
Modular SEO Header Tags 2.3.x  Support
MVS 2.2x  Support
PDF Datasheet 2.3.x  Support
Price Updater 2.2x
Products Specifications 2.2x  2.3.x  Development Version  Support  Bugs/Suggestions
Request a Review 2.2x - 2.3.x  Support
Similar Products Box 2.2x
Specials Image Overlay 2.3x Support
Theme Switcher 2.3.x  Support

#5   Harald Ponce de Leon

Posted 27 July 2012 - 11:06 PM

kymation, on 27 July 2012 - 10:34 PM, said:

Why not just include a robots.txt file in the distribution?

If you know what you're doing, that would be the better way to go, and the module could be uninstalled. However, the robots.txt file needs to be in the root directory with the files correctly specified by relative paths, so problems will arise if robots.txt already exists or if the installation is moved to another directory.

The only way to achieve that is through the installation routine. Doing this in a module is more flexible, can be used by both new and existing store owners, and can be safely used with existing robots.txt files.
Harald Ponce de Leon
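
To illustrate the path problem described in the post above: robots.txt rules are matched from the site root, so the same page list yields different rules depending on where the shop is installed. This is only a sketch; `build_robots_rules()` and the `/catalog/` path are hypothetical examples.

```php
<?php
// Hypothetical helper (not part of osCommerce): build robots.txt rules for a
// catalog installed under $catalog_path, e.g. '/' or '/catalog/'.
function build_robots_rules(array $pages, $catalog_path = '/')
{
    // Normalise the path so it starts and ends with exactly one slash.
    $prefix = '/' . trim($catalog_path, '/');
    if ($prefix !== '/') {
        $prefix .= '/';
    }
    $lines = array('User-agent: *');
    foreach ($pages as $page) {
        $lines[] = 'Disallow: ' . $prefix . $page;
    }
    return implode("\n", $lines) . "\n";
}

// The rules would have to be regenerated if the shop moved out of /catalog/.
echo build_robots_rules(array('login.php', 'shopping_cart.php'), '/catalog/');
```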

#6   Jack_mcs

Jack_mcs
  • Members
  • 25,310 posts
  • Real Name:Jack York
  • Gender:Male
  • Location:Michigan

Posted 27 July 2012 - 11:24 PM

I think it is a good idea. I suggest adding popup files to the list. Actually, if you're looking for extra work :), it would be nice to list all pages in admin so the shop owner could select which to skip, since files get added over time that should not be listed.

#7   burt

burt

    Code Monkey

  • Community Team
  • 7,745 posts
  • Real Name:G Burton
  • Gender:Male
  • Location:UK/DEV/on

Posted 28 July 2012 - 08:16 AM

Definitely should be in 2.3.3 by default.   Nice work Harald.

I tinkered with the idea of creating a noindex module for individual products some time back, and that also worked very well.  But I'd suggest it would not be needed in 99.9% of shops.



Gary

Edited by burt, 28 July 2012 - 08:17 AM.

Dummies guide to designing osCommerce 2.3 Click Me

Or maybe a ready made theme for your shop ??

Warning: My posts may contain Horsemeat.

#8   toyicebear

toyicebear
  • Community Sponsor
  • 6,054 posts
  • Real Name:Nick
  • Location:World Citizen

Posted 28 July 2012 - 09:11 AM

Sounds like a great tool that should definitely be included.
Basics for osC 2.2 Design - Basics for Design V2.3+ - Seo & Sef Url's - Meta Tags for Your osC Shop - Steps to prevent Fraud... - MS3 and Team News... - SEO, Meta Tags, SEF Urls and osCommerce

Check out my profile [click here] for information on professional services, custom coding, templates, SEO optimization, modifications, commercial support and help.

#9   Gergely

Gergely
  • Community Team
  • 529 posts
  • Real Name:Gergely Tóth
  • Gender:Male

Posted 28 July 2012 - 01:56 PM

@Harald Ponce de Leon
I like the array edit function. I didn't see this before. It works!

@Jack_mcs
We have to use @foxp2's tep_cfg_select_pages() function for it.
http://addons.oscommerce.com/info/7691
Header Footer Content Modules
SCM
v3

and some rewrites :-)

#10   foxp2

foxp2

    strong as a Twig

  • Members
  • 303 posts
  • Real Name:Laurent
  • Gender:Male
  • Location:France

Posted 28 July 2012 - 10:38 PM

@Gergely
tep_cfg_select_pages() or :
<?php
/*
$Id$
osCommerce, Open Source E-Commerce Solutions
http://www.oscommerce.com
Copyright (c) 2012 osCommerce
Released under the GNU General Public License
*/
  class ht_robot_noindex {
	var $code = 'ht_robot_noindex';
	var $group = 'header_tags';
	var $title;
	var $description;
	var $sort_order;
	var $enabled = false;  
	function ht_robot_noindex() {
	  $this->title = MODULE_HEADER_TAGS_ROBOT_NOINDEX_TITLE;
	  $this->description = MODULE_HEADER_TAGS_ROBOT_NOINDEX_DESCRIPTION;
	
	  if ( defined('MODULE_HEADER_TAGS_ROBOT_NOINDEX_STATUS') ) {
		$this->sort_order = MODULE_HEADER_TAGS_ROBOT_NOINDEX_SORT_ORDER;
		$this->enabled = (MODULE_HEADER_TAGS_ROBOT_NOINDEX_STATUS == 'True');
	  }
	}
	function execute() {
	  global $PHP_SELF, $oscTemplate;
	  if (tep_not_null(MODULE_HEADER_TAGS_ROBOT_NOINDEX_PAGES)) {
		$pages_array = array();
		// The admin form below stores the selected pages separated by ';', so split on that
		foreach (explode(';', MODULE_HEADER_TAGS_ROBOT_NOINDEX_PAGES) as $page) {
		  $page = trim($page);
		  if (!empty($page)) {
			$pages_array[] = $page;
		  }
		}
		if (in_array(basename($PHP_SELF), $pages_array)) {
		  $oscTemplate->addBlock('<meta name="robots" content="noindex" />' . "\n", $this->group);
		}
	  }
	}
	function isEnabled() {
	  return $this->enabled;
	}
	function check() {
	  return defined('MODULE_HEADER_TAGS_ROBOT_NOINDEX_STATUS');
	}
	function install() {
	  tep_db_query("insert into " . TABLE_CONFIGURATION . " (configuration_title, configuration_key, configuration_value, configuration_description, configuration_group_id, sort_order, set_function, date_added) values ('Enable Robot NoIndex Module', 'MODULE_HEADER_TAGS_ROBOT_NOINDEX_STATUS', 'True', 'Do you want to enable the Robot NoIndex module?', '6', '1', 'tep_cfg_select_option(array(\'True\', \'False\'), ', now())");
	  tep_db_query("insert into " . TABLE_CONFIGURATION . " (configuration_title, configuration_key, configuration_value, configuration_description, configuration_group_id, sort_order, use_function, set_function, date_added) values ('Pages', 'MODULE_HEADER_TAGS_ROBOT_NOINDEX_PAGES', '', 'The pages to add the meta robot noindex tag to.', '6', '0', 'ht_robot_noindex_show_pages', 'ht_robot_noindex_edit_pages(', now())");
	  tep_db_query("insert into " . TABLE_CONFIGURATION . " (configuration_title, configuration_key, configuration_value, configuration_description, configuration_group_id, sort_order, date_added) values ('Sort Order', 'MODULE_HEADER_TAGS_ROBOT_NOINDEX_SORT_ORDER', '0', 'Sort order of display. Lowest is displayed first.', '6', '0', now())");
	}
	function remove() {
	  tep_db_query("delete from " . TABLE_CONFIGURATION . " where configuration_key in ('" . implode("', '", $this->keys()) . "')");
	}
	function keys() {
	  return array('MODULE_HEADER_TAGS_ROBOT_NOINDEX_STATUS', 'MODULE_HEADER_TAGS_ROBOT_NOINDEX_PAGES', 'MODULE_HEADER_TAGS_ROBOT_NOINDEX_SORT_ORDER');
	}  
  }
  function get_default_pages() {
	  return array('account.php',
				   'account_edit.php',
				   'account_history.php',
				   'account_history_info.php',
				   'account_newsletters.php',
				   'account_notifications.php',
				   'account_password.php',
				   'address_book.php',
				   'address_book_process.php',
				   'checkout_confirmation.php',
				   'checkout_payment.php',
				   'checkout_payment_address.php',
				   'checkout_process.php',
				   'checkout_shipping.php',
				   'checkout_shipping_address.php',
				   'checkout_success.php',
				   'cookie_usage.php',
				   'create_account.php',
				   'create_account_success.php',
				   'login.php',
				   'logoff.php',
				   'password_forgotten.php',
				   'password_reset.php',
				   'product_reviews_write.php',
				   'shopping_cart.php',
				   'ssl_check.php',
				   'tell_a_friend.php');
  }
  
  function tep_list_catalog_files() {
	$d = dir(DIR_FS_CATALOG);
	$result = array();
	$exclude = array('redirect.php', 'popup_search_help.php', 'popup_image.php', 'opensearch.php', 'info_shopping_cart.php', 'download.php', 'checkout_process.php');
	while (false !== ($file = $d->read())) {
	  // is_dir() needs the full path; a bare file name is resolved against the current working directory
	  if (!is_dir(DIR_FS_CATALOG . $file) && (substr($file, -4) == '.php') && !in_array($file, $exclude)) {
		$result[] = $file;
	  }
	}
	$d->close();
	return $result;
  }
  
  function ht_robot_noindex_edit_pages($key_value, $key= '') {
	$name = ((tep_not_null($key)) ? 'configuration[' . $key . '][]' : 'configuration_value');
	$default_array = get_default_pages();
	$select_array = tep_list_catalog_files();
	$selected_array = explode(';', $key_value);
	$string = ''; // initialise before appending to avoid an undefined variable notice
	for ($i=0, $n=sizeof($select_array); $i<$n; $i++) {	  
	  $string .= '&nbsp;&nbsp;<input type="checkbox" name="' . $name . '" value="' . $select_array[$i] . ';"';
		  if(isset($selected_array))
			{					
			foreach($selected_array as $value){		  
			   if ($select_array[$i] == $value) $string .= ' CHECKED';			  
			   }
			foreach($default_array as $default) {
			   if ($select_array[$i] == $default) $string .= ' CHECKED class="default"';			  
			}
		  
			}
	  $string .= '>' . $select_array[$i] . '<br />';	
	  }

	 return $string;	
	}

  function ht_robot_noindex_show_pages($text) {	  
		$page = "";	  
		foreach(explode(";", $text) as $file){	  
		$page .= $file . '<br />';
		}
	return $page;
  }
?>

and also, in catalog\admin\modules.php, replace:
	  case 'save':
		reset($HTTP_POST_VARS['configuration']);
		while (list($key, $value) = each($HTTP_POST_VARS['configuration'])) {
		  tep_db_query("update " . TABLE_CONFIGURATION . " set configuration_value = '" . $value . "' where configuration_key = '" . $key . "'");
		}
		tep_redirect(tep_href_link(FILENAME_MODULES, 'set=' . $set . '&module=' . $HTTP_GET_VARS['module']));
		break;

with :

	  case 'save':
		reset($HTTP_POST_VARS['configuration']);
		while (list($key, $value) = each($HTTP_POST_VARS['configuration'])) {
					
		  if((is_array($value)) && (!empty($value))){
			// Each checkbox value already ends with ';', so joining the values gives the stored list
			$pages = implode('', $value);
			tep_db_query("update " . TABLE_CONFIGURATION . " set configuration_value = '" . $pages . "' where configuration_key = '" . $key . "'");
		  }
	  
		  else
		  {		
		  tep_db_query("update " . TABLE_CONFIGURATION . " set configuration_value = '" . $value . "' where configuration_key = '" . $key . "'");
		  }
		}
		tep_redirect(tep_href_link(FILENAME_MODULES, 'set=' . $set . '&module=' . $HTTP_GET_VARS['module']));
		break;
[attachment=1419:RobotNoIndex.PNG]

Edited by foxp2, 28 July 2012 - 10:45 PM.

-------------------

#11   foxp2

Posted 29 July 2012 - 12:17 AM

Hmm, two mistakes.
First, to get the default list:
	   tep_db_query("insert into " . TABLE_CONFIGURATION . " (configuration_title, configuration_key, configuration_value, configuration_description, configuration_group_id, sort_order, use_function, set_function, date_added) values ('Pages', 'MODULE_HEADER_TAGS_ROBOT_NOINDEX_PAGES', '" . implode(";", get_default_pages()) . "', 'The pages to add the meta robot noindex tag to.', '6', '0', 'ht_robot_noindex_show_pages', 'ht_robot_noindex_edit_pages(', now())");

Second, the comparison algorithm:
  function ht_robot_noindex_edit_pages($key_value, $key = '') {
	$name = ((tep_not_null($key)) ? 'configuration[' . $key . '][]' : 'configuration_value');
	$select_array = tep_list_catalog_files();
	$selected_array = explode(';', $key_value);
	$string = '';
	for ($i=0, $n=sizeof($select_array); $i<$n; $i++) {
	  $string .= '&nbsp;&nbsp;<input type="checkbox" name="' . $name . '" value="' . $select_array[$i] . ';"';
	  // Tick the box once when the file is in the saved selection; the defaults arrive through the installed value
	  if (in_array($select_array[$i], $selected_array)) $string .= ' CHECKED';
	  $string .= '>' . $select_array[$i] . '<br />';
	}
	return $string;
  }

-------------------

#12   Harald Ponce de Leon

Posted 29 July 2012 - 01:10 AM

@Jack_mcs @foxp2

Jack, fantastic idea!
Laurent, wow - thanks for the great concept code! :thumbsup: :thumbsup: for the fast proposal and late night coding ;)

This has been added to the module with some jQuery magic to fill a hidden configuration field with the selected files - this avoids changes being made to admin/modules.php and keeps API compatibility.

https://github.com/haraldpdl/oscommerce2/commit/cbd0fb5ac7605dab1e8e66fe82619a287576f093
Harald Ponce de Leon

#13   Juto

Posted 29 July 2012 - 06:28 AM

Harald, please do not invoke jQuery for this. jQuery is buggy, and from v2.0 onwards there will be no support for legacy browsers like IE<10.

I don't think jQuery will ever be fully debugged.

Plain ECMAScript can do what jQuery can.

Sara

#14   foxp2

Posted 29 July 2012 - 09:16 AM

@Harald Ponce de Leon : great work ! :thumbsup:

With the get_default_pages() function defined outside the ht_robot_noindex class, we could separate the default files from the other files:

	$output = '';
	$output .= 'Pages that do not need SEO (by default):<br />';
	foreach (get_default_pages() as $file) {
	  $output .= tep_draw_checkbox_field('ht_robot_noindex_file[]', $file, in_array($file, $values_array)) . '&nbsp;' . tep_output_string($file) . '<br />';
	}
	$output .= '<br />';   
	$output .= 'Other pages in Catalog:<br />';
	foreach (array_diff($files_array,get_default_pages()) as $file) {
	  $output .= tep_draw_checkbox_field('ht_robot_noindex_file[]', $file, in_array($file, $values_array)) . '&nbsp;' . tep_output_string($file) . '<br />';
	}

[attachment=1421:RobotNoIndexWithArrayDiff.PNG]
o:)
-------------------

#15   burt

Posted 29 July 2012 - 09:35 AM

@Harald Ponce de Leon
@foxp2

Great work!

#16   burt

Posted 29 July 2012 - 09:36 AM

Juto, on 29 July 2012 - 06:28 AM, said:

Harald, please do not invoke jQuery for this. jQuery is buggy, and from v2.0 onwards there will be no support for legacy browsers like IE<10.

I don't think jQuery will ever be fully debugged.

Plain ECMAScript can do what jQuery can.

Sara

How much do I wish there could be a "DISLIKE this" button.

#17   Juto

Posted 29 July 2012 - 11:25 AM

@burt
Dear Burt, am I not allowed to express my views on this?
Could you please elaborate yours?
Regards
Sara

#18   Gergely

Posted 29 July 2012 - 01:46 PM

Hi @Harald Ponce de Leon

Thanks for the work! I think we need file filters for this. Some uninteresting files are displayed that do not use template_top.php, e.g. redirect.php.

Thanks @foxp2

The file listing does not check whether a file contains:
require(DIR_WS_INCLUDES . 'template_top.php');

Edited by Gergely, 29 July 2012 - 01:46 PM.

#19   Gergely

Posted 29 July 2012 - 02:09 PM

I use this code:
  function tep_list_catalog_files () {
	$result = array();
	$exclude = array('redirect.php', 'popup_search_help.php', 'popup_image.php', 'opensearch.php', 'info_shopping_cart.php', 'download.php', 'checkout_process.php');
	if ($handle = opendir(DIR_FS_CATALOG)) {
	  // Compare against false explicitly so a file named "0" does not end the loop early
	  while (false !== ($file = readdir($handle))) {
		// is_dir() needs the full path, not just the file name
		if(!is_dir(DIR_FS_CATALOG . $file) && (strtolower(substr($file, -4, 4)) === ".php") && (!in_array($file ,$exclude)) && (IsTemplateFile(DIR_FS_CATALOG.$file))) {
		  $result[] = $file;
		}
	  }
	  closedir($handle);
	}
	return $result;
  }
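
The IsTemplateFile() helper called above was not posted. One possible reading, offered only as a sketch, is a check for the template_top.php include in the file's source. The name `is_template_file()` and the temporary demo files below are hypothetical and not part of osCommerce.

```php
<?php
// Hypothetical stand-in for IsTemplateFile(): treats a file as a template
// page when its source mentions template_top.php.
function is_template_file($path)
{
    if (!is_file($path) || !is_readable($path)) {
        return false;
    }
    $source = file_get_contents($path);
    return ($source !== false) && (strpos($source, 'template_top.php') !== false);
}

// Demo with two throwaway files in a temporary directory.
$dir = sys_get_temp_dir() . '/noindex_demo_' . uniqid();
mkdir($dir);
file_put_contents($dir . '/login.php', "<?php require(DIR_WS_INCLUDES . 'template_top.php');");
file_put_contents($dir . '/redirect.php', "<?php tep_redirect('index.php');");

var_dump(is_template_file($dir . '/login.php'));    // bool(true)
var_dump(is_template_file($dir . '/redirect.php')); // bool(false)

// Clean up the demo files.
unlink($dir . '/login.php');
unlink($dir . '/redirect.php');
rmdir($dir);
```

A plain substring match like this would also accept files that only mention template_top.php in a comment, so it is a coarse filter at best.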

#20   germ

germ
  • Members
  • 13,921 posts
  • Real Name:Jim
  • Gender:Male
  • Location:USA (GMT-6)

Posted 29 July 2012 - 05:04 PM

kymation, on 27 July 2012 - 10:34 PM, said:

Why not just include a robots.txt file in the distribution?

Regards
Jim

I'm with the other Jim.

I just think the unparalleled coding talent that comprises the osC project would be better spent if directed at items that can't be solved with a simple text file. (See, I can suck up to the boss as well as anyone else in this thread).

But obviously we are a minority.

Excuse me while I STFU...
:blush:
If I suggest you edit any file(s) make a backup first - I'm not perfect and neither are you.

"Given enough impetus a parallelogramatically shaped projectile can egress a circular orifice."
- Me -

"Headers already sent" - The definitive help

"Cannot redeclare ..." - How to find/fix it

SSL Implementation Help

Like this post? "Like" it again over there >