
Googlebot ignoring Robots.txt ?!


3 replies to this topic

#1   chrisab

chrisab
  • Members
  • 4 posts
  • Real Name:Chris

Posted 16 January 2012 - 11:43 AM

Hi,
Here is my robots.txt file:
Disallow: /includes
Disallow: /cgi-bin
Disallow: /account.php
Disallow: /account_edit.php
Disallow: /account_history.php
Disallow: /account_history_info.php
Disallow: /account_password.php
Disallow: /add_checkout_success.php
Disallow: /address_book.php
Disallow: /address_book_process.php
Disallow: /advanced_search.php
Disallow: /checkout_confirmation.php
Disallow: /checkout_payment.php
Disallow: /checkout_payment_address.php
Disallow: /checkout_process.php
Disallow: /checkout_shipping.php
Disallow: /checkout_shipping_address.php
Disallow: /checkout_success.php
Disallow: /cookie_usage.php
Disallow: /create_account.php
Disallow: /create_account_success.php
Disallow: /login.php
Disallow: /password_forgotten.php
Disallow: /popup_image.php
Disallow: /shopping_cart.php
Disallow: /product_reviews_write.php

If I go to Google Webmaster Tools and use 'Fetch as Googlebot' on, say, .co.uk/shopping_cart.php, it fetches the whole page. Any reason why robots.txt is not working?!

Thanks
Chris

#2   Jack_mcs

Jack_mcs
  • Members
  • 25,318 posts
  • Real Name:Jack York
  • Gender:Male
  • Location:Michigan

Posted 16 January 2012 - 01:19 PM

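If that is the whole file, you are missing the User-agent line. Disallow rules only take effect inside a User-agent group, so without that line they are ignored. Google has a robots.txt checker that should show the problem.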
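
The effect of the missing User-agent line can be checked locally. This is a minimal sketch using Python's standard-library `urllib.robotparser`, which is not Google's own parser but treats ungrouped rules the same way. The `is_allowed` helper and the `www.example.com` host are assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# A Disallow rule with no preceding User-agent line belongs to no group.
without_agent = "Disallow: /shopping_cart.php\n"
# The same rule inside a group that applies to every crawler.
with_agent = "User-agent: *\nDisallow: /shopping_cart.php\n"

url = "http://www.example.com/shopping_cart.php"  # hypothetical host

print(is_allowed(without_agent, url))  # True: the rule is ignored
print(is_allowed(with_agent, url))     # False: the rule now applies
```

A live crawler may keep using a cached copy of robots.txt for a while, so a local check like this only confirms that the file itself is well formed.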

#3   chrisab

chrisab
  • Members
  • 4 posts
  • Real Name:Chris

Posted 16 January 2012 - 02:31 PM

Ahh good point! I downloaded the file from the robots contribution on here!

I've changed it to this. Google is still reading the page, but I presume it needs time to re-read robots.txt?

User-agent: *
Disallow: /includes
Disallow: /cgi-bin
Disallow: /account.php
Disallow: /account_edit.php
Disallow: /account_history.php
Disallow: /account_history_info.php
Disallow: /account_password.php
Disallow: /add_checkout_success.php
Disallow: /address_book.php
Disallow: /address_book_process.php
Disallow: /advanced_search.php
Disallow: /checkout_confirmation.php
Disallow: /checkout_payment.php
Disallow: /checkout_payment_address.php
Disallow: /checkout_process.php
Disallow: /checkout_shipping.php
Disallow: /checkout_shipping_address.php
Disallow: /checkout_success.php
Disallow: /cookie_usage.php
Disallow: /create_account.php
Disallow: /create_account_success.php
Disallow: /login.php
Disallow: /password_forgotten.php
Disallow: /popup_image.php
Disallow: /shopping_cart.php
Disallow: /product_reviews_write.php

#4   ErikMM

ErikMM
  • Members
  • 314 posts
  • Real Name:Erik M
  • Gender:Male

Posted 13 August 2012 - 07:07 AM

I think you need a trailing / after the directory names.

Change:

Disallow: /includes
Disallow: /cgi-bin

to:

Disallow: /includes/
Disallow: /cgi-bin/

According to Google webmaster support:

  • To block a directory and everything in it, follow the directory name with a forward slash.
    Disallow: /junk-directory/
  • To block a page, list the page.
    Disallow: /private_file.html
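
For what it's worth, Disallow values are matched as path prefixes, so the two forms differ mainly in how much else they catch. The sketch below uses Python's standard-library `urllib.robotparser` as a stand-in for a real crawler; the `blocked` helper and both example paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def blocked(rule: str, path: str) -> bool:
    """Return True if `Disallow: <rule>` under `User-agent: *` blocks `path`."""
    parser = RobotFileParser()
    parser.parse(["User-agent: *", f"Disallow: {rule}"])
    return not parser.can_fetch("Googlebot", path)


in_directory = "/includes/application_top.php"  # a file inside the directory
lookalike = "/includes_old.php"                 # a file that only shares the prefix

# Without the trailing slash, anything starting with "/includes" is blocked.
print(blocked("/includes", in_directory))   # True
print(blocked("/includes", lookalike))      # True

# With the trailing slash, only paths under the directory are blocked.
print(blocked("/includes/", in_directory))  # True
print(blocked("/includes/", lookalike))     # False
```

Under this reading, both forms keep crawlers out of the directory; the trailing slash mostly avoids blocking unrelated paths that happen to start with the same characters.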

I don't know that I would include

Disallow: /admin/

The admin directory should be renamed to something obscure anyway. I suppose you could disallow the new obscure name, but you may just be pointing bad bots to the place you don't want them.

Do add your sitemap(s) to robots.txt:

Sitemap: http://www.yoursite.com/sitemap.xml
Sitemap: http://www.yoursite.com/sitemap2.xml
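
To confirm that the Sitemap lines are picked up, here is a small sketch using `site_maps()` from Python's standard-library `urllib.robotparser` (available from Python 3.8). The `www.example.com` URLs are placeholders, not a real site.

```python
from urllib.robotparser import RobotFileParser

# Placeholder robots.txt content; the URLs are assumptions for illustration.
robots_txt = """\
User-agent: *
Disallow: /shopping_cart.php

Sitemap: http://www.example.com/sitemap.xml
Sitemap: http://www.example.com/sitemap2.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns the listed sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
```

Sitemap lines sit outside any User-agent group, so they can go anywhere in the file.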

Edited by ErikMM, 13 August 2012 - 07:14 AM.