allprods.php re-visited



The mods below change spaces, periods, and line breaks (<br>) to underscores, resulting in:

http://test.solardirect.com/product_info.p...products_id=102

 

My questions:

 

1> Is the "?" after product_info.php in the url needed or harmful to spiders? How can I eliminate it or change it to a "/"?

 

2> Is the "&name=" necessary or can I just remove it from the code?

 

3> Is the "&products_id" necessary? I tried removing it and it results in an error. Can it be removed or changed to something like "id="?

 

4> If the "&products_id" is included, then would it not make sense to have an underscore in front of it (as I have done above) so that the last word in the product description does not run into it, or doesn't it matter?

 

I have allprods installed on a 05/18/03 snapshot. I am using a slightly modified version of code from JenRed (see http://www.oscommerce.com/forums/viewtopic.php?t=40584&postdays=0&postorder=asc&highlight=prod&start=40).

 

You can see my allprods page at: http://test.solardirect.com/allprods.php

 

My modified code:

// Pull the fields for the current product out of the listing array.
$this_id = $products_array[$i]['id'];
$this_name = $products_array[$i]['name'];
$this_model = $products_array[$i]['model'];
$this_price = $products_array[$i]['price'];
$this_special = $products_array[$i]['special'];
$this_tax = $products_array[$i]['tax'];

// Replace spaces, periods and <br> tags in the name with underscores.
$this_newname = str_replace(array(" ", ".", "<br>"), array("_", "_", "_"), $this_name);

// Build the link: the keyword-bearing "name" parameter first, then products_id.
$this_url = tep_href_link(FILENAME_PRODUCT_INFO,
 '&name=' . urlencode($this_newname) . '_&products_id=' . $this_id .
 (($this_language_code == DEFAULT_LANGUAGE) ? '' : ('&language=' . $this_language_code)),
 'NONSSL', false);

// Output one table row linking to the product.
echo "<tr $row_col>";
echo "<td class='productListing-data' align='left'>
 <a href='$this_url'>$this_name</a></td>";

 

Boom... Big butta boom.


1> Is the "?" after product_info.php in the url needed or harmful to spiders? How can I eliminate it or change it to a "/"?

 

The '?' indicates the beginning of the parameter section of the URL. You can try turning on "search engine safe URLs" in the admin section if you wish, but the '?' is not harmful to spiders at all.

 

2> Is the "&name=" necessary or can I just remove it from the code?

 

No, it is not necessary. You can remove it. It is only there to improve your index placement for the product name. In other words, if someone searches for the name of the product, your index placement will benefit from having the keyword in the URL.

 

3> Is the "&products_id" necessary? I tried removing it and it results in an error. Can it be removed or changed to something like "id="?

Yes, it is necessary. This is how the URL tells the server which product to display.

 

4> If the "&products_id" is included, then would it not make sense to have an underscore in front of it (as I have done above) so that the last word in the product description does not run into it, or doesn't it matter?

 

Well, parameters in a query string are usually delimited with an '&' or a '/', so no, it would not make sense to do so.
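To make the answers to questions 2 to 4 concrete, here is a minimal sketch of how PHP itself splits such a query string. The product name and ID are made-up values modelled on the URL in the first post, not taken from a real shop.

```php
<?php
// Hypothetical query string modelled on the URL in the first post.
$query = 'name=Solar_Panel_&products_id=102';

// parse_str() splits on '&' and '=' the same way PHP fills $_GET.
parse_str($query, $params);

// The trailing underscore simply becomes part of the "name" value;
// the '&' alone is what separates the two parameters.
echo $params['name'], "\n";        // Solar_Panel_
echo $params['products_id'], "\n"; // 102

// Dropping the "name=" label turns the product name into a parameter
// *name* with an empty value; products_id is still read correctly.
parse_str('Solar_Panel&products_id=102', $bare);
var_dump($bare);
```

The extra parameter rides along harmlessly because the script only looks up products_id; anything else in the query string is simply ignored unless some code reads it.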

-------------------------------------------------------------------------------------------------------------------------

NOTE: As of Oct 2006, I'm not as active in this forum as I used to be, but I still work with osC quite a bit.

If you have a question about any of my posts here, your best bet is to contact me through either Email or PM in my profile, and I'll be happy to help.


Thanks Chris. Just one clarification...

 

No, it is not necessary. You can remove it. It is only there to improve your index placement for the product name. In other words, if someone searches for the name of the product, your index placement will benefit from having the keyword in the URL.

 

So by this you mean I can remove the "&name=" from the url and since the actual product name is in the url it will get indexed? The "&name=" does not help the index process or will having it (meaning the phrase itself "&name") help?

 

The '?' indicates the beginning of the parameter section of the URL. You can try turning on "search engine safe URLs" in the admin section if you wish, but the '?' is not harmful to spiders at all.

 

I remember reading something about spiders no longer reading after a "=" or "?" or some such character???


So by this you mean I can remove the "&name=" from the url and since the actual product name is in the url it will get indexed? The "&name=" does not help the index process or will having it (meaning the phrase itself "&name") help?

 

You can remove the "&name=<product_name>" or you can remove the "name=" leaving the "&<product_name>". But if you leave the actual product name, you have to leave the ampersand as well.

 

This whole 'fake' parameter was an idea of mine to try to improve index ranking on Google. One heavily weighted criterion for your index ranking is "keywords in URL". This parameter is not currently used by any process in osC. However, to include the product name, you have to delimit the parameter with the '&'.
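As a rough sketch of the idea, the link could be assembled like this. The helper keyword_product_url() is hypothetical and stands in for the tep_href_link() call in the first post; the product name is invented. Putting the keyword parameter directly after the '?' also avoids the "?&" sequence that the original snippet's leading '&' produces.

```php
<?php
// Hypothetical helper, not part of osCommerce: build a product link that
// carries an unused keyword parameter ahead of the real products_id.
function keyword_product_url(string $base, string $name, int $id): string
{
    // Same substitution as the first post: spaces, periods and <br>
    // tags become underscores.
    $slug = str_replace([' ', '.', '<br>'], '_', $name);

    // rawurlencode() escapes whatever else is unsafe in a URL,
    // for example an apostrophe in the product title.
    return $base . '?name=' . rawurlencode($slug) . '&products_id=' . $id;
}

echo keyword_product_url('product_info.php', "Sam's 12 mil. Pool Cover", 102), "\n";
// product_info.php?name=Sam%27s_12_mil__Pool_Cover&products_id=102
```

Note the double underscore where a period is followed by a space; collapsing repeated underscores would be an easy refinement if the URLs should look tidier.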

 

I remember reading something about spiders no longer reading after a "=" or "?" or some such character???

 

Not true. Some older spiders have this problem, but most are fixed now. Here's proof. I think we're about 4th on this page.

 

http://www.google.com/search?hl=en&lr=&ie=...y+frozen+throne


I think this is a wonderful contribution. It looks like it could really help improve rankings. I am considering using this contribution, but I want to make sure I know of any possible bugs or complications before doing so.

 

Are there any common bugs that have come up?

Will this work with all different types of URLs? (I have close to 9000 different product pages, some of which include ' in the title.)

 

What are the advantages of an _ as opposed to a +?

 

Thank you and great work with this contrib.


Well, if you are talking about the All-Prods contribution itself, I'm sorry to say that I have found it to be somewhat counterproductive.

 

It seems that Google does not like a page with nothing but links on it. If the page does not have a certain percentage of regular content, it will ignore every link on the page.


It seems that Google does not like a page with nothing but links on it. If the page does not have a certain percentage of regular content, it will ignore every link on the page.

 

Chris, never thought of that, good idea. Would you mind sharing your allprods.php code to show how you did it? Of course it would not be that difficult to do, just would make it easier to see yours. Thanks.


In the above URL, the "&" does not show up before "pc_game_title=". How did you do it? Maybe I am being too picky??? But it looks weird in mine: ...product_info.php?&12_mil... with the ?&.

 

OK, in any given URL there are basically 2 parts, the domain and the parameters. The domain and the parameters are separated by a '?'. The parameters are separated from each other by an '&'.

 

So, in my urls, I have my domain...

 

www.wizardsandwars.com

 

after which there always has to be a question mark....

 

?

 

And then the parameters.....

 

product_name=whatever&product_color=red&product_id=5

 

Notice that the parameters are all separated by ampersands. You can change the order of the parameters to anything you like.

 

Basically, my URL looks like that one because I made that parameter the first one in the URL.
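The breakdown above can be checked with PHP's own URL functions. This is only a sketch: the URL is a hypothetical one assembled from the pieces listed above, with a product_info.php path added for illustration.

```php
<?php
// Hypothetical URL assembled from the parts described above.
$url = 'http://www.wizardsandwars.com/product_info.php'
     . '?product_name=whatever&product_color=red&product_id=5';

// parse_url() separates the address from the query string at the '?'.
$parts = parse_url($url);
echo $parts['host'], "\n";  // www.wizardsandwars.com
echo $parts['query'], "\n"; // product_name=whatever&product_color=red&product_id=5

// parse_str() then splits the query string at each '&'.
parse_str($parts['query'], $a);

// The same parameters in a different order produce the same set of values.
parse_str('product_id=5&product_name=whatever&product_color=red', $b);
var_dump($a == $b); // bool(true)
```

Because the receiving script looks parameters up by name, the keyword parameter can be moved to the front of the query string without affecting which product is shown.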


Chris,

 

Can't you add some content to the allprods page to make it appear as a true content page?

 

Are there any known issues where putting the product title into the URL can cause problems? I am not a programmer, but it seems that this is a significant change, and I am worried that it would bring some minor or major glitches.

 

Thanks again.


yes.

 

no.


1> Is the "?" after product_info.php in the url needed or harmful to spiders? How can I eliminate it or change it to a "/"?

 

The '?' indicates the beginning of the parameter section of the URL. You can try turning on "search engine safe URLs" in the admin section if you wish, but the '?' is not harmful to spiders at all.

 

2> Is the "&name=" necessary or can I just remove it from the code?

 

No, it is not necessary. You can remove it. It is only there to improve your index placement for the product name. In other words, if someone searches for the name of the product, your index placement will benefit from having the keyword in the URL.

 

I've used search engine safe URLs & Greg's suggestion of raw urlencoding sitewide to add keywords to my URLs with no adverse effect on my cart. However, Google is just starting to index category & product pages after 1 month, and then, only a few at a time. Inktomi & Fast/Lycos crawler (which I paid for automatic inclusion) have yet to index the site beyond my home pages. When experimenting with an email extractor spider today, I discovered that this particular spider doesn't see my individual product pages or categories, but it does index other OSC sites & perl-based sites with a "?" in the URL. So now, after a month of diligent optimization & submissions, I'm thinking of turning SES off. I've done it on my test server, but still see "/"s everywhere, even on a test page with an unmodified module.

 

Question: Where is the "search engine safe URL" function defined?

 

Has anyone else gotten a deep crawl from Inktomi or FAST with search engine safe URLs? I know others have reported good results with Google, and I do seem to be getting some success on that engine with my current settings, but it's taking SOOO LONG....!

 

P.S. The Poodle Predictor & other spider sims have no problem seeing my SES URLs, so I presumed it was OK to proceed.

 

 

3> Is the "&products_id" necessary? I tried removing it and it results in an error. Can it be removed or changed to something like "id="?

Yes, it is necessary. This is how the URL tells the server which product to display.

 

4> If the "&products_id" is included, then would it not make sense to have an underscore in front of it (as I have done above) so that the last word in the product description does not run into it, or doesn't it matter?

 

Well, parameters in a query string are usually delimited with an '&' or a '/', so no, it would not make sense to do so.


The most important thing to remember about search engine optimization is that it takes *a* *very* *long* *time*.

 

It can take Google *several* months before it has all of your links in its index. And if you do things like change the URL of a product, or change the meta titles, those listings can be dropped for a few months before showing up again.

 

Once you get them set, Google likes to see URLs and meta tags remain constant. Any changes can result in losing the listing for several months.

 

Inktomi (Slurp), FastCrawl, Alexa, and all of the other daily spidering bots will have no problems spidering the SES URLs whatsoever. You may want to make sure that each page you want indexed has a unique set of meta tags, and that those pages are not disallowed in your robots.txt file.


Thanks for the reply. My header tags are different for all category and product pages but the same for most of the other pages. I used Linda's header tags throughout. But it's these dynamic product & category pages that my extractor spider doesn't see & which worries me, especially since Inktomi & Fast aren't picking them up.

 

Here's part of my robots.txt file. Aside from the global folders restricted at the start, all individual spiders have a blank 'Disallow' line. The file verifies as OK. Do you see anything out of place here?

 

User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /admin/
Disallow: /googlestats/

User-agent: Mozilla/3.0 (compatible;miner;mailto:[email protected])
Disallow:


No, that robots file looks fine to me.
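One rough way to sanity-check the rules quoted above is to treat each Disallow value as a path prefix. The helper is_blocked() is a hypothetical, simplified sketch: it only covers the "User-agent: *" record and ignores per-agent records and wildcard extensions that real crawlers may support.

```php
<?php
// Disallow rules from the "User-agent: *" record quoted in the thread.
$disallow = ['/cgi-bin/', '/images/', '/admin/', '/googlestats/'];

// Simplified check (hypothetical helper): a path is blocked when it
// starts with any non-empty Disallow value. An empty value blocks nothing.
function is_blocked(string $path, array $rules): bool
{
    foreach ($rules as $rule) {
        if ($rule !== '' && strncmp($path, $rule, strlen($rule)) === 0) {
            return true;
        }
    }
    return false;
}

var_dump(is_blocked('/product_info.php', $disallow)); // bool(false)
var_dump(is_blocked('/admin/login.php', $disallow));  // bool(true)
```

Under that reading, product and category pages at the site root are not excluded by any of the listed rules.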

 

What is this 'extractor spider' that you are referring to?

 

Normally, if an older spider has trouble picking up a site with dynamic content, it will be because of the '?' and '&', although nearly all of the newer spiders have now corrected that problem.

 

I've never heard of a spider having trouble picking up a 'Search Engine Safe' URL.


It's called Web Data Extract. It grabs meta tags & contact info from sites.

 

I can't remember, does the URL encoding (Greg's contribution) work with search engine safe turned OFF? I turned it off on my test server, but can't seem to get a "?" to show up because all the links have this rawurlencode implemented. I was thinking of setting my real server back to the old way for a day or so to see if Fast or Inktomi goes deeper.

 

Here are the other relevant tags I added to all pages. Maybe that FirstPage should only be on index.php (my renamed default.php)?

 

<META NAME="robot" CONTENT="index,follow">

<META NAME="FirstPage" content="Y">

<META NAME="revisit-after" CONTENT="2">

<meta name="MSSmartTagsPreventParsing" content="TRUE">


I'm not familiar with the URL encoding, so I'm not sure what your url's look like.

 

I'm also not familiar with *any* of those meta tags, although I know that there are several that you can use to your benefit.

 

I do know that I have SEF URLs turned off, URL encoding, and Inktomi has spidered all 350+ of my pages every day for the past 7 or 8 months, and I generally enjoy terrific Google index placement.


Wow - daily deep crawls? What could be my problem??? My site is www.taylormadetreasures.net. Look at any of my links & you'll see the keywords separated by underscores.

 

Do you submit all your pages to Google or did it find them all? With Inktomi & Fast, you have to pay for each URL, as you probably know, so multiple page submission isn't practical. How long did it take before you started getting deep crawls? I've asked Lycos/Fast support about the deep crawl several times, and they just say that a site with a good link structure should get crawled. I submitted to Fast & Inktomi on 6/20, so it hasn't been a month yet. Google was submitted in early June.

 

What other tags do you suggest? And how often do you submit? Do you use an auto-submitter?

 

Sorry for all the questions, but I really need to get this site to start making some money!


Well, I'll take a look at your site, but first, I should mention that it took over 3 full months before I was fully indexed by anyone. I don't pay a dime to Slurp (Inktomi) or Fastbot, and they crawl me every day. It took them about 2 full months to index me. Google was even longer, at 3 months, and I submitted every week. dmoz is even harder to get into, it seems; I've submitted weekly for about 4 months, and there is still no sign of my URL. I've never used an auto-submitter, and I think that they are mostly just a waste of money.

 

The thing you are going to need the most in dealing with search engines is patience.

 

Now, I'm off to look at your website.


Well, it looks to me like you are fine. Inktomi does have a couple of your URLs in place, and I don't see any problems with your URLs or your meta tags.

 

Just have some patience, and you'll get indexed properly. I'd give it at least 3 months.
