The Google sitemap



I have searched but do not see a thread for this contribution; I am sorry if I missed it. Anyhow, I have installed the Google Sitemap contribution (http://www.oscommerce.com/community/contributions,3226/category,all/search,sitemap), and it runs perfectly. The problem is that I want to use a cron job to run the google_sitemap.php script every night, which I would like to do by executing

 

/usr/bin/php /html/root/path/google_sitemap.php | /bin/gzip > /html/root/path/sitemap.gz

This causes application_top to get included, which triggers the session, so I get...

 

Content-type: text/html
X-Powered-By: PHP/4.3.2
Set-Cookie: osCsid=7af8dd2b6e8b13c43522952465b06a34; path=/; domain=www.rangerjoes.com
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache

These headers are added to the top of the file.

 

I have tried the -q switch, but that gives...

 

<br />
<b>Warning</b>:  session_start(): Cannot send session cookie - headers already sent in <b>/html/root/path/includes/functions/sessions.php</b> on line <b>67</b><br />
<br />
<b>Warning</b>:  session_start(): Cannot send session cache limiter - headers already sent in <b>/var/www/html/includes/functions/sessions.php</b> on line <b>67</b><br />

Does anyone have a suggestion?


Raphael,

 

Nice coding for the contribution, but it would be best to get away from the tep API and roll your own database abstraction layer. Then skip the usual includes for application_top and application_bottom, which gets away from the sessions altogether.

 

I've been working on a Google sitemap class for a day or so now (great minds think alike) and will have something out there pretty soon. I plan on integrating the solution into the adminCP and having options such as sitemap index files (if more than one is needed, i.e. separate products versus categories to get around that 50K limit), automatic GZ compression format file save, and automatic HTTP request submission (webmasters/sitemaps/ping?sitemap=sitemap_url).

 

However, the overall point is that unless you get away from the tep API on this one the CRON generation will be fairly difficult to get around.

 

Bobby


Hi,

 

"However, the overall point is that unless you get away from the tep API on this one the CRON generation will be fairly difficult to get around."

 

There is no problem with the sessions if you use the CLI SAPI for PHP. This is the only correct way to run PHP from the shell.

If you don't have shell access, you could call the script over HTTP.
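
One way to guard against cron picking up the CGI binary is to have the script check which SAPI it is running under. This is a minimal sketch, assuming a current PHP release rather than the PHP 4 shown earlier in the thread; the helper name is_cli_sapi is hypothetical and not part of the contribution.

```php
<?php
// Hypothetical helper: true when the given SAPI name is the command-line SAPI.
// Defaults to the SAPI of the running interpreter (the PHP_SAPI constant).
function is_cli_sapi(string $sapi = PHP_SAPI): bool
{
    return $sapi === 'cli';
}

if (!is_cli_sapi()) {
    // Under CGI, PHP prints HTTP headers before the script output, which is
    // how they can end up at the top of a piped sitemap file.
    error_log('Not the CLI SAPI (' . PHP_SAPI . '); output may start with HTTP headers.');
}
```

Running php -v from the shell also shows the SAPI name in parentheses after the version number, for example (cli) or (cgi).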

 

"... having options such as sitemap index files (if more than one is needed, i.e. separate products versus categories to get around that 50K limit), automatic GZ compression format file save..."

 

The new version supports a new option, auto.

 

Usage:

shell: php google_sitemap.php -za

OR

http://domain.com/google_sitemap.php?auto=true&gzip=true

 

This generates multiple compressed sitemaps in the catalog folder when necessary (more than 50,000 URLs or a file size over 10MB), plus a corresponding index file.
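
The splitting idea can be sketched roughly as follows. This is not the contribution's actual code: the function names are hypothetical, a current PHP release is assumed, only the 50,000 URL limit is handled (the 10MB size check is left out), and the sitemaps.org namespace is an assumption about which schema the index should declare.

```php
<?php
// Hypothetical helper: split a flat list of URLs into groups of at most
// $maxUrls entries, one group per sitemap file.
function split_sitemap_urls(array $urls, int $maxUrls = 50000): array
{
    return array_chunk($urls, $maxUrls);
}

// Hypothetical helper: build the XML for an index file that lists the
// given sitemap file URLs.
function build_sitemap_index(array $sitemapUrls): string
{
    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($sitemapUrls as $url) {
        // Escape &, <, > and quotes so the URL is valid inside XML.
        $loc  = htmlspecialchars($url, ENT_QUOTES | ENT_XML1, 'UTF-8');
        $xml .= '  <sitemap><loc>' . $loc . '</loc></sitemap>' . "\n";
    }
    return $xml . '</sitemapindex>' . "\n";
}
```

Each group returned by split_sitemap_urls would be written to its own sitemap file, and the URLs of those files passed to build_sitemap_index.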

 

Regards,

Raphael


Hi,

 

I've added a new option, p, to notify Google about your new sitemap.

 

Usage:

shell: php google_sitemap.php -zap

OR

http://domain.com/google_sitemap.php?auto=true&gzip=true&ping=true

 

 

All Features:

- supports multilingual categories and products

- supports Search-Engine Safe URLs (osC default)

- can be accessed by HTTP or command line

- writes to file or standard output

- autogenerates multiple sitemaps for sites with over 50,000 URLs

- autogenerates multiple sitemaps if the file size exceeds 10MB

- autogenerates an index file if necessary

- writes files compressed or uncompressed

- auto-notifies Google about the sitemap or index file

 

See readme.txt for details.

 

Regards,

Raphael

Edited by rvullriede

I'm experiencing the following error when it tries to submit to google:

 

Warning: fopen() expects at least 2 parameters, 1 given in /xxxxx/xxxxxxxx/public_html/google_sitemap.php on line 301

 

Any ideas?


Hi,

 

Thanks for your feedback.

Replace line 301

 

    fopen('http://www.google.com/webmasters/sitemaps/ping?sitemap='.urlencode($notify_url));

 

with

 

    fopen('http://www.google.com/webmasters/sitemaps/ping?sitemap='.urlencode($notify_url), 'r');
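
The warning comes from fopen() requiring the mode as its second argument. A slightly more defensive sketch of the same call is shown here; the function names are hypothetical, a current PHP release is assumed, and the ping URL is simply the one used in this thread.

```php
<?php
// Hypothetical helper: build the ping URL for a given sitemap location.
function build_ping_url(string $sitemapUrl): string
{
    return 'http://www.google.com/webmasters/sitemaps/ping?sitemap=' . urlencode($sitemapUrl);
}

// Hypothetical helper: request the ping URL and report success or failure.
// fopen() can only open HTTP URLs when allow_url_fopen is enabled.
function ping_google(string $sitemapUrl): bool
{
    $handle = @fopen(build_ping_url($sitemapUrl), 'r'); // mode argument is required
    if ($handle === false) {
        return false; // request failed, or URL wrappers are disabled
    }
    fclose($handle);
    return true;
}
```

Checking the return value means a failed ping can be logged instead of passing silently.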

 

Regards,

Raphael


Wow! It is great to see just how active this community is. Anyhow, I was just curious whether Google has actually requested anyone's sitemap. I have had mine in place for about 48 hours, but there is no request in my Apache logs.


Hi,

 

"Anyhow, I was just curious if Google has actually requested anyone's sitemap? I have had mine in place for about 48 hours, but no request in my apache logs."

 

I submitted my sitemap with my Google account (https://www.google.com/webmasters/sitemaps/login) and Google downloaded the file within 2 hours. A few hours later the sitemap was parsed and the status changed from 'pending' to 'o.k.'

 

Regards,

Raphael


Raphael wrote:

"I submitted my sitemap with my Google-Account (https://www.google.com/webmasters/sitemaps/login) and Google downloaded the file within 2 hours. A few hours later the sitemap was parsed and the status has changed from 'pending' to 'o.k.'"

 

The same happened to me.

Should I now expect to see Google deeply crawling ALL the URLs in the sitemap within a very short time?

 

Cheers

Franco


  • 3 weeks later...

I was wondering if anyone has had problems with this contrib and "big" stores.

 

I have *almost* 26,000 products, so my sitemap generates to about 9MB uncompressed.

 

I also have a smaller store, with about 55 products.

 

I submitted the sitemap for the smaller store, and Google gave it the "OK". However, I've RE-submitted my LARGER store's sitemap multiple times with different options (gzipped, not gzipped, etc.) and I keep getting an "Invalid Date" error for that sitemap.

And of course, Google doesn't give a line number or anything, so I can't just go in and fix it if it's only one page or so coming up wrong.

 

Any ideas, suggestions, or similar problems?
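
Since the report gives no line number, one way to narrow this down is to scan the generated file for <lastmod> values that are not valid W3C Datetime strings, which is the format the sitemap protocol expects for that element. The sketch below is only a diagnostic aid with hypothetical function names, written for a current PHP release. Whether bad dates are the actual cause here is a guess; values such as 0000-00-00 from empty date columns would be one candidate.

```php
<?php
// Hypothetical helper: true if $value is a plausible W3C Datetime string
// (YYYY, YYYY-MM, YYYY-MM-DD, or a date plus time and timezone designator).
function is_w3c_datetime(string $value): bool
{
    $pattern = '/^\d{4}(-\d{2}(-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?)?)?$/';
    if (!preg_match($pattern, $value)) {
        return false;
    }
    // Reject impossible calendar dates such as 0000-00-00 or 2005-02-30.
    $year  = (int) substr($value, 0, 4);
    $month = strlen($value) >= 7 ? (int) substr($value, 5, 2) : 1;
    $day   = strlen($value) >= 10 ? (int) substr($value, 8, 2) : 1;
    return checkdate($month, $day, $year);
}

// Hypothetical helper: returns "line N: value" for each suspicious <lastmod>.
// $lines can be an array of lines or an SplFileObject opened on the sitemap.
function find_invalid_lastmod(iterable $lines): array
{
    $problems = [];
    $lineNo = 0;
    foreach ($lines as $line) {
        $lineNo++;
        if (preg_match_all('#<lastmod>(.*?)</lastmod>#', $line, $matches)) {
            foreach ($matches[1] as $value) {
                if (!is_w3c_datetime(trim($value))) {
                    $problems[] = "line $lineNo: " . trim($value);
                }
            }
        }
    }
    return $problems;
}
```

For an uncompressed file, something like print_r(find_invalid_lastmod(new SplFileObject('sitemap.xml'))) would list the offending lines (the file name is a placeholder). A gzipped sitemap would need to be decompressed first, for example by passing the result of gzfile() instead.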
