
Archived

This topic is now archived and is closed to further replies.

jasonsfa98

Google (Search Engines) revisited


Search-engine-safe URLs are the least of your concerns nowadays. As long as the Google spider is not tripped up and timed out by session IDs, your pages from allprods.php should work better than if you served your database with emulated HTML pages (URLs built with "/").

 

The search engines are aware of this technique by now, because webmasters have been using it for over two years. They are trying to find content that is worth indexing and less repetitive. Focus on your page content and title, and play around with allprods.php to include absolute links.

 

Too many people are worrying about "search engine safe" URLs; the search engines have evolved just as we have.
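
On the session ID point above, here is a minimal sketch of stripping a session parameter from a link before a spider sees it. The function name `strip_session_id` is made up for illustration, and the parameter name `osCsid` is an assumption about how the store names its session ID; check your own URLs before relying on it.

```php
<?php
// Hypothetical helper: remove a session parameter from a URL so that a
// spider sees one stable address per page.
// Assumption: the session parameter is called "osCsid" in your store.
function strip_session_id(string $url, string $param = 'osCsid'): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['query'])) {
        return $url; // no query string, nothing to strip
    }

    parse_str($parts['query'], $query);
    unset($query[$param]);

    // Rebuild the URL from the pieces parse_url() returned.
    $rebuilt = '';
    if (isset($parts['scheme'])) { $rebuilt .= $parts['scheme'] . '://'; }
    if (isset($parts['host']))   { $rebuilt .= $parts['host']; }
    if (isset($parts['port']))   { $rebuilt .= ':' . $parts['port']; }
    $rebuilt .= $parts['path'] ?? '';
    if ($query) { $rebuilt .= '?' . http_build_query($query); }
    if (isset($parts['fragment'])) { $rebuilt .= '#' . $parts['fragment']; }

    return $rebuilt;
}
```

The remaining parameters pass through `http_build_query()`, so their encoding may be normalized compared with the input.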

 

H


Well,

 

Looking through my raw logs from Webalizer, it looks as though Googlebot hit my site on August 17 and again on Sept 4. Both times it must have found something it didn't like, because I'm STILL not listed in their index. :(

 

This time, though, I have added Linda's metatag contrib, the robot redirector that Jason spoke of, a robots.txt file, and the allprods contrib.

 

If I am understanding correctly, Googlebot goes out again in about 2 weeks. If there is anything else I can do to get listed with them, I need to find it before then.


-------------------------------------------------------------------------------------------------------------------------

NOTE: As of Oct 2006, I'm not as active in this forum as I used to be, but I still work with osC quite a bit.

If you have a question about any of my posts here, your best bet is to contact me through either Email or PM in my profile, and I'll be happy to help.


Jason,

 

I have patience; I just want to make sure I'm doing everything I can in the meantime to expedite my listing.

 

How long does it usually take? I've heard of guys that have gotten listed in just a few days.



Is it a bad idea to submit more than once?

 

And after you get listed, is it helpful for your ranking to re-submit every so often?

 

Sorry to bombard you with questions. I'm just ready to take my store to the 'next level'.



First, for those of you reading who haven't figured it out: "search engine friendly URLs" basically means that PHP and Apache change the URL from something like:

 

http://localhost/catalog/product_info.php?...?products_id=16

 

to this:

 

http://rhobbs/catalog/product_info/products_id/16

 

Some search engines can't or don't parse anything after the '?' sign, which basically means your site won't get indexed well, if at all.
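
To make the rewrite above concrete, here is a rough sketch of how a path like the second URL could be turned back into ordinary parameters. It is not the code osCommerce uses; `parse_friendly_path` is a hypothetical name, and the layout (script name first, then alternating key/value segments under `/catalog/`) is an assumption read off the example URL.

```php
<?php
// Hypothetical helper: split "/catalog/product_info/products_id/16" into a
// script name plus key/value parameters.
// Assumption: the first segment after $base names the script, and the rest
// are alternating key/value pairs.
function parse_friendly_path(string $path, string $base = '/catalog/'): array
{
    if (strncmp($path, $base, strlen($base)) === 0) {
        $path = (string) substr($path, strlen($base));
    }

    // Drop empty segments produced by leading, trailing, or doubled slashes.
    $segments = array_values(array_filter(explode('/', $path), 'strlen'));
    if (count($segments) === 0) {
        return ['script' => null, 'params' => []];
    }

    $script = array_shift($segments) . '.php';
    $params = [];
    for ($i = 0; $i + 1 < count($segments); $i += 2) {
        $params[$segments[$i]] = $segments[$i + 1];
    }

    return ['script' => $script, 'params' => $params];
}
```

A trailing key with no value is silently ignored in this sketch.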

 

Google can and does parse PHP/ASP/CFM/etc. pages fairly well, but Google is not the only search engine, folks (although it is the best!). You will need to make a (hopefully informed) decision about search engine optimizations, because each engine has its own rules and its own methodology for ranking pages. But to answer your questions, wizards&wars (for the most part):

 

Is it a bad idea to submit more than once?

Yes. That's baaaaadd. It can be very negative for you.

 

And after you get listed, is it helpful for your ranking to re-submit every so often?

Some engines don't mind you resubmitting after "X" time has passed. Others want you to resubmit if changes have been made. Google does not require you to resubmit. If you are on their list to spider, you are on their list, and if you haven't screwed something up somewhere, you will get indexed.

 

I'm not a search engine guru but I've been doing web design & the like for 7 years now (my first copy of photoshop fit on 4 floppy disks).

 

There are tons of articles out there from search engine experts. Most of the time you are better off hand-coding the HTML. There is stuff out there about gateway/doorway pages, reciprocal links, and keyword repetition; just do a search on ... Google, of course!


Hi Met00,

Did a search on this forum for the thread you spoke of. What is the title of the thread? Cannot find it.

 

Can you list the HOW TO portion of your solution to front door allprods.php for Google?

 

Thanks!

 

Bobby


Jared Geesey


Hi Met00,

 

Is this the correct code? Shouldn't it have a "<?php" at the beginning, like I have placed here in this script?

 

And if I understand the installation correctly... just make an index.php page of this code. Right?

 

CODE FOR SCRIPT:

 

<?php

// Quick script to 'redirect' spiders/robots to 'search engine friendly' page.
// By Dan Zambonini [dan@boxuk.com], Sep 2000. V 1.0
// Please send alterations/comments to dan@boxuk.com
// Modified from someone else's script, can't remember which one...Sorry!
// Major search engines match either $spider_footprint or $spider_ip.

$spider_footprint = array( "rawler", "pider", "obot", "eek", "canner", "lurp",
"cooter", "rachnoidea", "KIT", "ulliver", "arvest");

 

$spider_ip = array( "204.123.", "204.74.103.", "203.108.10.", "195.4.183.", "195.242.46.", "198.3.97.", "204.62.245.", "193.189.227.", "209.1.12.", "204.162.96.", "204.162.98.", "194.121.108.", "128.182.72.", "207.77.91.", "206.79.171.", "207.77.90.", "208.213.76.", "194.124.202.", "193.114.89.", "193.131.74.", "131.84.1.", "208.219.77.", "206.64.113.", "195.186.1.", "195.3.97.", "194.191.121.", "139.175.250.", "209.73.233.", "194.191.121.", "198.49.220.", "204.62.245.", "198.3.99.", "198.2.101.", "204.192.112.", "206.181.238", "208.215.47.", "171.64.75.", "204.162.98.", "204.162.96.", "204.123.9.52", "204.123.2.44", "204.74.103.39", "204.123.9.53", "204.62.245.", "206.64.113.", "204.138.115.", "94.22.130.", "164.195.64.1", "205.181.75.169", "129.170.24.57", "204.162.96.", "204.162.96.", "204.162.98.", "204.162.96.", "207.77.90.", "207.77.91.", "208.200.146.", "204.123.9.20", "204.138.115.", "209.1.32.", "209.1.12.", "192.216.46.49", "192.216.46.31", "192.216.46.30", "203.9.252.2");

 

$agent = getenv('HTTP_USER_AGENT');
$host_ip = getenv('REMOTE_ADDR');
$is_spider = 0;

// Is it a spider?
$i = 0;
while ($i < (count($spider_footprint)))
{
    if (strstr($agent, $spider_footprint[$i]))
    {
        $is_spider = 1;
        break;
    }
    $i++;
}

if (! $is_spider)
{
    $i = 0;
    while ($i < (count($spider_ip)))
    {
        if (strstr($host_ip, $spider_ip[$i]))
        {
            $is_spider = 1;
            break;
        }
        $i++;
    }
}

// Re-direct to correct page
// Change the files below to your search-engine 'keyword' page and normal index page.
if ($is_spider)
{
    readfile("allprods_stripped.php");
}
else
{
    readfile("default.php");
}
?>

 

 

Thanks

 

Bobby


Jared Geesey
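
A side note on the matching in the script above: `strstr()` finds each IP entry anywhere in the address, so "204.123." would also match an address such as 10.204.123.7. A tighter variant might look like the sketch below. The function name `looks_like_spider` and the rule "entries ending in a dot are prefixes, anything else must match exactly" are assumptions for illustration, not part of the original script. Also, none of the footprints in the posted list occur in the string "Googlebot", so with that list Googlebot would be caught only by its IP address.

```php
<?php
// Sketch: decide whether a request looks like a spider.
// Footprints may appear anywhere in the user agent string.
// Assumption: IP entries ending in "." are prefixes; other entries are
// complete addresses and must match exactly.
function looks_like_spider(string $agent, string $ip, array $footprints, array $ipEntries): bool
{
    foreach ($footprints as $footprint) {
        if (strpos($agent, $footprint) !== false) {
            return true;
        }
    }

    foreach ($ipEntries as $entry) {
        if (substr($entry, -1) === '.') {
            // Prefix entry: compare only the leading characters.
            if (strncmp($ip, $entry, strlen($entry)) === 0) {
                return true;
            }
        } elseif ($ip === $entry) {
            return true;
        }
    }

    return false;
}
```

With this rule, an entry like "206.181.238" (no trailing dot) would need a dot added before it could act as a prefix.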


<eof>


The king of kluge...

Do you really know what is and isn't working in your store?


Met00,

I replaced the <?php with the <eof> at the beginning of the script.

 

How can I tell if it is installed correctly? When I go to the index.php page from my browser, I just see the script word for word.

 

Also, I have this index.php file located in my /catalog/ directory. My root has an index.html, and this storefront has a link into the catalog/default.php page located in the catalog directory.

 

What should I do? Will this work as I have it now? Or does this redirect script need to be in the root directory?

 

Thanks!

 

Bobby


Jared Geesey


<eof> should NOT go at the beginning of the script. I actually forgot what that means but it's Internet lingo.

 

<?

 

or

 

<?php

 

needs to go at the beginning.

 

This page should be at the VERY beginning of your website: the first page ANYONE will go to.


Jason


If this index.php file is placed in my root directory, will it pass people on to my store entrance? Right now when I go to this index.php page I just see the script you gave, and I am not forwarded on to my store entrance. Am I missing something?


Jared Geesey


Looks like I need to remove the index.html file from the same directory so that the index.php file will work.

 

ALSO:

 

With the code as such:

 

if ($is_spider)
{
    readfile("allprods.php");
}

 

CAN I MAKE THE PATH READ

 

readfile("http://www.mydomain.com/catalog/allprods.php");

 

The reason is that allprods.php does not reside in my root. It resides in the catalog directory. How do I make sure I am pointing Googlebot in the right direction?
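
One thing worth knowing for the path question above: `readfile()` on a local file sends the file's contents as they are, PHP source included, without running it. Fetching the page through an `http://` URL goes through the web server, so the output is the rendered page, but that relies on the `allow_url_fopen` setting being enabled. A rough alternative for a local file is to include it and capture the output. The helper name `render_local_page` is hypothetical, and the sketch assumes the file is a trusted path on your own disk.

```php
<?php
// Sketch: return what a local PHP page prints when it runs, instead of
// its source code.
// Assumption: $file is a trusted local path; never pass user input here.
function render_local_page(string $file): string
{
    ob_start();          // collect everything the page echoes
    include $file;       // executes the PHP in the file
    return ob_get_clean();
}
```

It might be called as `echo render_local_page(__DIR__ . '/catalog/allprods.php');`. Relative includes inside the page may resolve differently when it is pulled in from another directory, so treat this as a starting point rather than a drop-in fix.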

 

Thanks!

 

Bobby


Jared Geesey


I tested it and the following path seems to work:

 

if ($is_spider)
{
    readfile("http://www.mydomain.com/catalog/allprods_stripped.php");
}

 

NOW IS THERE A CURRENT LIST OF IP Addresses for the Google Bots?

 

Thanks for the help!

 

Bobby


Jared Geesey


Hi Bobby,

 

The full path will always work no matter where you are. It sounds to me like you might not know what the script is doing. Let me try to draw it out in plain text.

 

<?

 

If you're a spider crawling my site, do this:

 

CRAWL this page- {insert path to your all products page}

 

OTHERWISE

 

You must be a human browsing my site, so do this:

 

GOTO-{my main site which is usually default.php}

 

?>

 

Now the point of this is to have all that code in a page called index.php (or html), because a web server will usually serve that page up when someone calls www.yourdomain.com. That page basically does the detective work to see if you're a human being with a browser or a Google spider crawling the site. From there it directs traffic based on what it sees.

 

So as long as you have that page in the ROOT (very beginning) of your website, you will be fine.
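
The decision described above can be boiled down to a few lines. This is only a sketch: `choose_page` is a made-up name, `$isSpider` is assumed to come from a check like the one in the script earlier in the thread, and both file names are placeholders for your own pages.

```php
<?php
// Sketch of the choice the root index.php makes.
// Assumption: the default paths below are placeholders, not fixed
// osCommerce locations.
function choose_page(bool $isSpider,
                     string $spiderPage = 'catalog/allprods.php',
                     string $humanPage = 'catalog/default.php'): string
{
    // Spiders get the all-products listing; everyone else gets the store.
    return $isSpider ? $spiderPage : $humanPage;
}
```

A root index.php could then pass the result to whatever delivery method you settle on, for example a `header('Location: ...')` redirect for human visitors.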

 

And FYI: <? and ?> are analogous to a capital letter starting a sentence and a period ending it. So if you want to use any PHP code, enclose it in <? and ?> so it will work.

 

Now if you still don't understand, there are some great books at Barnes and Noble that will teach you a lot. If you're serious about using this shopping cart, I would invest in the books.


Jason


Got it clear.

 

But how do you manage the various IP addresses used by Google to spider your site?

 

Do I just keep an eye on my Who's Online page for multiple IP addresses?

And then add them to the list in the index.php ?
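
An alternative to maintaining the list by hand is to check the visitor's hostname: look up the name for the IP address, see whether it ends in a Google crawler domain, then confirm that the name resolves back to the same address. The sketch below is one way that could look. The function name `is_verified_googlebot` is hypothetical, the two domain suffixes are an assumption about which hostnames to accept, and the resolver arguments exist only so the logic can be tried without network access.

```php
<?php
// Sketch: confirm a claimed Googlebot with a reverse DNS lookup followed
// by a forward lookup.
// Assumption: crawler hostnames end in ".googlebot.com" or ".google.com".
function is_verified_googlebot(string $ip, ?callable $reverse = null, ?callable $forward = null): bool
{
    $reverse = $reverse ?? 'gethostbyaddr';
    $forward = $forward ?? 'gethostbyname';

    $host = $reverse($ip);
    if (!is_string($host) || $host === $ip) {
        return false; // gethostbyaddr() gives back the IP (or false) on failure
    }

    $host = strtolower($host);
    $trusted = false;
    foreach (['.googlebot.com', '.google.com'] as $suffix) {
        if (substr($host, -strlen($suffix)) === $suffix) {
            $trusted = true;
        }
    }

    // Forward-confirm: the hostname must resolve back to the same address.
    return $trusted && $forward($host) === $ip;
}
```

DNS lookups are slow, so a real page would probably cache the answer per address rather than resolve on every request.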

 

Thanks!

 

Bobby


Jared Geesey


I am now listed with EACH of my 350 products in the top 3 of Google when I do a search specific to the product name.

 

Very cool.

 

Bobby


Jared Geesey

Share this post


Link to post
Share on other sites

$spider_ip = array( "204.123.", "204.74.103.", "203.108.10.", "195.4.183.", "195.242.46.", "198.3.97.", "204.62.245.", "193.189.227.", "209.1.12.", "204.162.96.", "204.162.98.", "194.121.108.", "128.182.72.", "207.77.91.", "206.79.171.", "207.77.90.", "208.213.76.", "194.124.202.", "193.114.89.", "193.131.74.", "131.84.1.", "208.219.77.", "206.64.113.", "195.186.1.", "195.3.97.", "194.191.121.", "139.175.250.", "209.73.233.", "194.191.121.", "198.49.220.", "204.62.245.", "198.3.99.", "198.2.101.", "204.192.112.", "206.181.238", "208.215.47.", "171.64.75.", "204.162.98.", "204.162.96.", "204.123.9.52", "204.123.2.44", "204.74.103.39", "204.123.9.53", "204.62.245.", "206.64.113.", "204.138.115.", "94.22.130.", "164.195.64.1", "205.181.75.169", "129.170.24.57", "204.162.96.", "204.162.96.", "204.162.98.", "204.162.96.", "207.77.90.", "207.77.91.", "208.200.146.", "204.123.9.20", "204.138.115.", "209.1.32.", "209.1.12.", "192.216.46.49", "192.216.46.31", "192.216.46.30", "203.9.252.2", "209.185.253.167", "209.185.253.168", "209.185.253.169", "209.185.253.170", "209.185.253.171", "209.185.253.172", "209.185.253.173", "209.185.253.174", "209.185.253.175", "209.185.253.176", "209.185.253.177", "209.185.253.178", "209.185.253.179", "209.185.253.180", "209.185.253.181", "209.185.253.182", "209.185.253.183", "209.185.253.184", "209.185.253.185", "209.185.253.186", "209.185.253.187", "209.185.253.188", "209.185.108.138", "209.185.108.139", "209.185.108.140", "209.185.108.141", "209.185.108.142", "209.185.108.143", "209.185.108.144", "209.185.108.145", "209.185.108.146", "209.185.108.147", "209.185.108.148", "209.185.108.149", "209.185.108.150", "209.185.108.151", "209.185.108.152", "209.185.108.153", "209.185.108.154", "209.185.108.155", "209.185.108.156", "209.185.108.157", "209.185.108.158", "209.185.108.159", "209.185.108.160", "209.185.108.161", "209.185.108.162", "209.185.108.163", "209.185.108.164", "209.185.108.165", "209.185.108.134", "209.185.108.135", 
"64.209.181.52", "64.208.33.33", "64.209.181.53", "64.68.82.22", "64.68.82.23", "64.68.82.24", "64.68.82.25", "64.68.82.26", "64.68.82.27", "64.68.82.28", "64.68.82.29", "64.68.82.30", "64.68.82.1", "64.68.82.2", "64.68.82.3", "64.68.82.4", "64.68.82.5", "64.68.82.6", "64.68.82.7", "64.68.82.8", "64.68.82.9", "64.68.82.10", "64.68.82.11", "64.68.82.12", "64.68.82.13", "64.68.82.14", "64.68.82.15", "64.68.82.16", "64.68.82.17", "64.68.82.18", "64.68.82.19", "64.68.82.20", "64.68.82.21", "64.68.82.50", "64.68.82.51", "64.68.82.52", "64.68.82.53", "64.68.82.54", "64.68.82.55", "64.68.82.56", "64.68.82.57", "64.68.82.58", "64.68.82.59", "64.68.82.60", "64.68.82.31", "64.68.82.32", "64.68.82.33", "64.68.82.34", "64.68.82.35", "64.68.82.36", "64.68.82.37", "64.68.82.38", "64.68.82.39", "64.68.82.40", "64.68.82.41", "64.68.82.42", "64.68.82.43", "64.68.82.44", "64.68.82.45", "64.68.82.46", "64.68.82.47", "64.68.82.48", "64.68.82.49", "64.68.82.76", "64.68.82.77", "64.68.82.78", "64.68.82.79", "64.68.82.80", "64.68.82.61", "64.68.82.62", "64.68.82.63", "64.68.82.64", "64.68.82.65", "64.68.82.66", "64.68.82.67", "64.68.82.68", "64.68.82.69", "64.68.82.70", "64.68.82.71", "64.68.82.72", "64.68.82.73", "64.68.82.74", "64.68.82.75");


Jared Geesey
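
The list above spells out long runs of consecutive addresses one by one; for example, it contains every address from 64.68.82.1 through 64.68.82.80. A range check can express such a run in one line. This is a sketch only: `ip_in_range` is a hypothetical helper, it handles IPv4 only, and it assumes a 64-bit PHP build where `ip2long()` returns non-negative integers.

```php
<?php
// Sketch: test whether an IPv4 address lies inside an inclusive range,
// instead of listing every address separately.
// Assumption: 64-bit PHP, so ip2long() values compare in address order.
function ip_in_range(string $ip, string $start, string $end): bool
{
    $n  = ip2long($ip);
    $lo = ip2long($start);
    $hi = ip2long($end);

    if ($n === false || $lo === false || $hi === false) {
        return false; // not a valid dotted-quad IPv4 address
    }

    return $n >= $lo && $n <= $hi;
}
```

For instance, `ip_in_range($host_ip, '64.68.82.1', '64.68.82.80')` would stand in for eighty entries of the array.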
