
What's Google doing?


This topic has been archived. This means that you cannot reply to this topic.
16 replies to this topic

#1   Juto

Juto
  • Members
  • 369 posts

Posted 21 July 2012 - 13:05

Hi, for some days now Google has been trying to access HTML pages like this:

www.mysite.com/veuxkhbgzwab.html

The file name varies (it seems to be random), but it's always an .html file.

This of course gives a 404 error.

So, what's Google up to?

Sara

#2   Jack_mcs

Jack_mcs
  • Members
  • 26,409 posts

Posted 21 July 2012 - 16:24

How do you know it is google?
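
One way to answer that question is the check Google itself describes for its crawlers: do a reverse DNS lookup on the IP, confirm the host name ends in googlebot.com or google.com, then do a forward lookup on that host name and confirm it returns the same IP. Below is a minimal Python sketch of that idea, not a definitive implementation. The function name and the injectable lookup parameters are my own assumptions, added so the logic can be tried without network access, and the forward lookup shown handles IPv4 only.

```python
import socket

# Suffixes Google documents for its crawlers' reverse-DNS host names.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_google_crawler(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if `ip` reverse-resolves to a Google host name and that
    host name resolves back to the same IP (forward confirmation)."""
    # Hypothetical injectable lookups; by default the real DNS is queried.
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip)
        if not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
            return False
        return ip in forward_lookup(host)
    except OSError:
        # No PTR record, or the name did not resolve: treat as unverified.
        return False


# Example call (needs network access; use an IP taken from your own access log):
# print(is_google_crawler("66.249.66.1"))
```

A user-agent string alone proves nothing, since anyone can send "Googlebot" in the header. The forward confirmation matters because the owner of an IP range controls its reverse DNS and could point it at any name.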

#3 ONLINE   kymation

kymation

    Code Monkey

  • Community Sponsor
  • 8,019 posts

Posted 21 July 2012 - 16:56

Google follows links, so somewhere there is a link to those pages. Use your Google Webmaster Tools page to find where the links are coming from.

That type of random file is typical of one type of hack. Check your site for hacked files and changes to the root .htaccess file.
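
If you have shell or script access, one rough way to start that check is to list the files that changed recently and look at anything you did not change yourself, the root .htaccess included. The Python sketch below is only an illustration: the function name and the seven-day window are assumptions, and since an attacker can reset modification times, a clean result proves nothing on its own.

```python
import os
import time


def recently_modified(root, days=7, now=None):
    """Return paths under `root` whose modification time is within `days` days."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getmtime(path) >= cutoff:
                    hits.append(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return sorted(hits)


# Example call (the path is a placeholder for your own document root):
# for path in recently_modified("/path/to/your/store", days=7):
#     print(path)
```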

Regards
Jim

My Addons

Banners Box Download Support
Categories Accordion Box Download Support
Closest Shipper 2.2x Support
Document Manager 2.2x Support
Generic Box Download Support
Get 1 Free 2.2x Support
Price in Cart Only/MAPP Download Support
Modular Front Page Download Support
Modular SEO Header Tags Download Support
MVS 2.2x Support
PDF Datasheet Download Support
Price Updater 2.2x
Products Specifications 2.3.x Development Version Support Bugs/Suggestions
Request a Review Download Support

Shopping List Download Support New!
Specials Image Overlay Download Support
Superfish Categories Box Download Support
Theme Switcher 2.3+ Support  Updated


#4   Juto

Juto
  • Members
  • 369 posts

Posted 21 July 2012 - 17:13

Hi, and thanks for your answer. The IP is Google's.

Oh my, is it really a hack? In that case I've got three sites affected.

I don't have Google's Webmaster Tools; I never thought it could be of use to me.
So I don't know how to access it or use it.

Could you please advise?

Sara

#5 ONLINE   kymation

kymation

    Code Monkey

  • Community Sponsor
  • 8,019 posts

Posted 21 July 2012 - 17:44

Google Webmaster Tools. BTW, the Google search engine is really good at finding web pages that Google owns.

Once you set up an account, you need to verify your site. The easiest way is to add a file to your site. Then you can look under Health -> Crawl Errors -> Not Found to see those 404 pages. Click on a link and select the Linked From tab to see where it's coming from. That may or may not be helpful, but it's worth trying.

Regards
Jim



#6   Juto

Juto
  • Members
  • 369 posts

Posted 21 July 2012 - 18:33

Hi Jim, I have created an account, verified it and all that.
I have tried for half an hour to find "Health"... Couldn't... :(

So, more specifically, where is it?

You are worth a hug! :)
Sara

#7 ONLINE   kymation

kymation

    Code Monkey

  • Community Sponsor
  • 8,019 posts

Posted 21 July 2012 - 19:05

From the tools home page, you may need to click the Manage Site button on the far right. I have multiple sites, so my page may be different. Then look in the left column for Health.

Regards
Jim



#8   Juto

Juto
  • Members
  • 369 posts

Posted 21 July 2012 - 20:04

Hi Jim, I have tried again.
Google reported no errors and no 404 problems. My web host has also checked the site over SSH and could not find any problems: "A common hack is to inject code which is only shown to the googlebot, but your homepage does not appear to have this hack". :)

A while ago someone tried this (It was stopped)

guid%E2%80%8Bes.php

What does that mean?

So, all in all, I need to add something to my .htaccess that redirects any random HTML page to index.php.

How should I do that?

Then, someone is trying to reach this file: /1 which of course doesn't exist. How do I redirect that sort to index.php?

Thanks for your help!

Sara

#9   DunWeb

DunWeb

    The Censored One

  • Members
  • 13,084 posts

Posted 21 July 2012 - 20:34

@Juto

I use this in my .htaccess file


RewriteEngine On
ErrorDocument 404 /index.php


Chris

#10   Juto

Juto
  • Members
  • 369 posts

Posted 21 July 2012 - 21:44

Hi Chris and thanks for your answer. The thing is that I have a custom error page, which of course I'd like to keep.

So I need something specific to those random HTML pages. I found this on Stack Overflow:

I see 4 main approaches (you choose which one suits you more - it depends on your website rewrite rules logic):

1. Add global exclusion rule that will prevent ANY further rewrite operations on that file:

RewriteCond %{HTTP_HOST} ^website.com
RewriteRule (.*) http://www.website.com/$1 [R=301,L]
# do not do any rewriting to this file
RewriteRule somefile\.html$ - [L]
RewriteRule ^(.+)\.html$ http://www.website.com/$1.php [R=301,L]

If you wish you can specify full path to the file to be more specific (useful to exclude only 1 specific file if there are more than 1 file with such name but in different folders -- such URL should start with NO leading slash):

# do not do any rewriting to this file
RewriteRule ^full/path/to/somefile\.html$ - [L]

2. Add global exclusion rule that will prevent ANY further rewrite operations on ANY existing file or folder:

RewriteCond %{HTTP_HOST} ^website.com
RewriteRule (.*) http://www.website.com/$1 [R=301,L]
# do not do anything for already existing files
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule .+ - [L]
RewriteRule ^(.+)\.html$ http://www.website.com/$1.php [R=301,L]

3. Add exclusion condition that will deny rewriting this particular .html file to .php:

RewriteCond %{REQUEST_URI} !somefile\.html$
RewriteRule ^(.+)\.html$ http://www.website.com/$1.php [R=301,L]

If you wish you can specify full path to the file to be more specific (useful to exclude only 1 specific file if there are more than 1 file with such name but in different folders -- such URL should start with leading slash):

RewriteCond %{REQUEST_URI} !^/full/url/path/to/somefile\.html$

4. Add exclusion condition that will only allow rewriting .html to .php if such .html file does not exist:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.+)\.html$ http://www.website.com/$1.php [R=301,L]

ALL RULES ABOVE intended to be placed in .htaccess in website root folder. If placed elsewhere some small tweaking may be required.

A nice set of rules. But, which one is appropriate for me?

Sara

#11 ONLINE   kymation

kymation

    Code Monkey

  • Community Sponsor
  • 8,019 posts

Posted 21 July 2012 - 23:01

guid%E2%80%8Bes.php translates to guid​es.php. Probably another hacker file. The links Google is reporting could be someone checking for the result of a hack that failed. In any case, you're on the hacker's list, so make certain you have all of your security patches up to date. That includes the ones in osCommerce 2.3.2 that just came out.

I would not redirect 404s to any normal site page. Normally Google will see that veuxkhbgzwab.html returns a 404 Not Found and will remove it from their index. If you redirect that page to index.php, then Google will see it as a normal web page and keep it in their index. Since it looks just like your store's front page, it is ranked as a duplicate. Depending on how it is ranked, that page may have a higher ranking than the regular front page, or high enough to damage the real page's ranking. It's better to redirect to a custom 404 page, and make certain that page returns a 404 in the header.
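
To see what a crawler would actually receive for one of those random names, you can request it yourself and look at the status code and the final URL. The Python sketch below is a rough illustration under my own assumptions (the function names and the wording of the result strings are made up here): it treats 404 or 410 as a correct answer and anything else as a "soft 404".

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Request `url` and return (status_code, final_url) after any redirects."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.geturl()
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the status code is what we want.
        return err.code, getattr(err, "url", url)


def classify_missing_page(status, requested_url, final_url):
    """Describe how a server answered a request for a page that should not exist."""
    if status in (404, 410):
        return "ok: reported as missing"
    if final_url != requested_url:
        return "soft 404: redirected to " + final_url
    return "soft 404: status %d for a missing page" % status


# Example call (needs network access; substitute your own domain):
# url = "http://www.mysite.com/veuxkhbgzwab.html"
# status, final_url = fetch_status(url)
# print(classify_missing_page(status, url, final_url))
```

A custom error page passes this check as long as the response still carries the 404 status, which is the point made above.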

I'm not an expert on Apache rewrite rules, so I'm not going to give an opinion on that batch.

Regards
Jim



#12   Juto

Juto
  • Members
  • 369 posts

Posted 22 July 2012 - 08:38

Hi Jim, and thanks. I think I have very tight security, but who knows?

The only hack attempt prior to the garbage HTML requests was the one on the file "guides".
So presumably it was the result of that hack attempt that the hacker was trying to verify.

So, what I will do is keep an eye on Google's Webmaster Tools, just to find out the IP of the hacker.

A hug to you :)
Sara

#13   MrPhil

MrPhil
  • Members
  • 5,149 posts

Posted 22 July 2012 - 13:27

guid%E2%80%8Bes.php translates to guid​es.php.

That would appear to be a UTF-8 sequence for a U+200B Zero Width Space hidden within a filename. I don't think that's normally a valid character for a file name. It's possible that it's an innocent mistake in transcribing a name, but it's also possible that the intent is to hide something.

Probably another hacker file.

You should check if such a file exists on your site. If not (the name probably isn't even valid), a hacker is probing your defenses, looking for a vulnerability on the server or in your applications. You should certainly block their IP address(es) and inform your host.
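
For anyone who wants to see the hidden character for themselves, here is a small Python sketch that percent-decodes the requested name and lists every character outside printable ASCII. The function name is just an illustration; `unquote` and `unicodedata.name` are standard library calls.

```python
import unicodedata
from urllib.parse import unquote


def describe_request_name(encoded):
    """Percent-decode a request path and list (index, code point, Unicode name)
    for every character outside printable ASCII."""
    decoded = unquote(encoded)  # %XX bytes are decoded as UTF-8 by default
    odd = [
        (index, "U+%04X" % ord(ch), unicodedata.name(ch, "UNKNOWN"))
        for index, ch in enumerate(decoded)
        if not (" " <= ch <= "~")
    ]
    return decoded, odd


decoded, odd = describe_request_name("guid%E2%80%8Bes.php")
print(repr(decoded))  # 'guid\u200bes.php' (repr shows the invisible character)
print(odd)            # [(4, 'U+200B', 'ZERO WIDTH SPACE')]
```

Printed normally, the decoded name looks identical to guides.php, which is why it is worth using `repr()` or the code point list when reading logs.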

#14   Juto

Juto
  • Members
  • 369 posts

Posted 22 July 2012 - 21:06

Hi Phil, thanks for your guidance. I have searched my site and there was no guid​es.php file. I also searched for guidâ etc.;
no such string was found either. So, if it was a hack attempt, it failed. :)

Kindest
Sara

Edited by Juto, 22 July 2012 - 21:06.


#15   MrPhil

MrPhil
  • Members
  • 5,149 posts

Posted 23 July 2012 - 00:07

Did you search for guid and es.php separately? Both as strings in files and as file names? There's no telling how the stuff in between would be treated in a search. The intent of using a ZWSP in the middle of a name is that presumably it would look in a log like an innocent guides.php. I don't know if they were trying to access a file by such a name that they had planted earlier, or were sending such bogus file names in an effort to break code (application or PHP itself). Either way, it would be good to let your host know about it, so they can try blocking these guys system-wide.
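
Because such a name looks innocent in a log viewer, one option is to let a script flag any request whose decoded path contains invisible characters. This Python sketch assumes Apache "common" or "combined" style log lines with the request quoted as "METHOD /path HTTP/x.y"; the regular expression and function names are my own assumptions and may need adjusting for your host's log format.

```python
import re
import unicodedata
from urllib.parse import unquote

# Assumed log layout: the request appears quoted as "METHOD /path HTTP/x.y".
REQUEST_RE = re.compile(r'"[A-Z]+ (\S+) HTTP/[\d.]+"')


def has_invisible_chars(text):
    """True if `text` contains Unicode format (Cf) or control (Cc) characters."""
    return any(unicodedata.category(ch) in ("Cf", "Cc") for ch in text)


def suspicious_requests(log_lines):
    """Yield (line_number, decoded_path) for requests hiding invisible characters."""
    for number, line in enumerate(log_lines, start=1):
        match = REQUEST_RE.search(line)
        if match:
            path = unquote(match.group(1))
            if has_invisible_chars(path):
                yield number, path


# Example call (the file name is a placeholder for your own access log):
# with open("access.log", encoding="utf-8", errors="replace") as handle:
#     for number, path in suspicious_requests(handle):
#         print(number, repr(path))
```

The line numbers it reports give you the IP addresses and timestamps to pass on to your host.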

#16   Juto

Juto
  • Members
  • 369 posts

Posted 23 July 2012 - 09:00

Hi Phil, now I have... There was no file guid.php or es.php

This morning I Googled and found this: http://ghh.sourceforge.net/userfaq.php

As I understand it, the purpose is to find vulnerabilities for an attack.

Since this failed, I am secure... Confirmed by a hacker! :)


Kindest

Sara

#17   MrPhil

MrPhil
  • Members
  • 5,149 posts

Posted 23 July 2012 - 16:13

I meant to search for filenames starting with guid or ending with es.php. Likewise, to search in your files (at least, the .php files and .htaccess) for what appear to be filenames with the same characteristics.
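
A rough Python sketch of both searches follows, in case a script is easier than the host's file manager. The function names and default arguments are assumptions for illustration only. Expect noise: in an osCommerce store plenty of legitimate files end in es.php (categories.php, for example), so every hit needs a manual look.

```python
import os


def find_name_matches(root, prefix="guid", suffix="es.php"):
    """Return files under `root` whose name starts with `prefix` or ends with `suffix`."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            lowered = name.lower()
            if lowered.startswith(prefix) or lowered.endswith(suffix):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)


def find_content_matches(root, needles=("guid", "es.php"),
                         extensions=(".php", ".htaccess")):
    """Return (path, line_number, needle) for each needle found in a matching file."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        for needle in needles:
                            if needle in line:
                                hits.append((path, number, needle))
            except OSError:
                continue  # unreadable file; skip it
    return sorted(hits)


# Example calls (the path is a placeholder for your own document root):
# print(find_name_matches("/path/to/your/store"))
# print(find_content_matches("/path/to/your/store"))
```

Searching for the two halves separately sidesteps the question of how a search tool treats the invisible character in the middle.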