I recently installed IP_Trap on a client's 2.2rc2 site. The installation went well, but when I tested it I found that (as others have reported) I was being redirected to the store's ../index.php whenever I tried to visit "catalog/personal", rather than being banned. After analyzing the code a bit, I realized that "catalog/personal/index.php" loops through the whitelist and redirects the user back to "../index.php" if the visitor's IP address matches an entry in the "banned/Whitelist.txt" file. Yet I was SURE my IP address was NOT in that file, and a quick check confirmed it was NOT there.
Puzzled by this situation, I then made a small patch to the "foreach" loop code which originally looked like this:
foreach( $ipw as $whiteip )
{
    $test = strcmp($whiteip,$ip);
    if($test == 1)
    {
        header ("location:"."../index.php");
        exit;
    }
}
I changed it to print the two values being compared instead of redirecting:
foreach( $ipw as $whiteip )
{
    $test = strcmp($whiteip,$ip);
    if($test == 1)
    {
        // header ("location:"."../index.php");
        print $whiteip . ' ' . $ip;
        exit;
    }
}
To my great surprise, when I tested with this code my new "print" command produced the following:
8.6.48 75.161.49.58
In short, my IP address (75.161.49.58) was producing a false positive match against the Whitelist.txt entry 8.6.48.
Say WHAT?
So I removed the 8.6.48 entry from Whitelist.txt and tested again. On the very next test I was banned, as I should have been all along.
I noted that one user theorized that ANY 3 part IP address in Whitelist.txt causes false positives, while another suspected the cause might be a missing standard DOS CR/LF line ending on some line of the Whitelist file. Checking further, I found that the foreach loop had NOT produced false positives on ANY of the 3 part IPs that occurred in Whitelist.txt BEFORE the 8.6.48 entry. I also double-checked with a hex file editor and confirmed that there was a standard DOS CR/LF pair on the line containing 8.6.48 AND on all lines before and after it.
After checking and confirming that 8.6.48 was one of the IP blocks used by the Google spider, I decided I could NOT leave that IP block out of Whitelist.txt. Lacking any further explanation for why the FOREACH loop concluded 8.6.48 was the same as 75.161.49.58, I decided to replace 8.6.48 with the 255 IP addresses from 8.6.48.1 to 8.6.48.255. That didn't work either, because 8.6.48.1 was "judged" the same as 75.161.49.58! So much for the theory that the problem had something to do with 3 part IP addresses!
Next, I tried changing my IP address to see what happened. That resulted in my NEW IP address matching an entirely different 3 part IP number in Whitelist.txt. Arghhhhh!
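In hindsight, every one of these symptoms is consistent with how strcmp() works: it returns 0 only when the two strings are identical, a negative number when the first string sorts before the second, and a positive number when it sorts after. So the test ($test == 1) does not mean "same IP"; on a PHP build where the positive result is exactly 1 (which the print output above suggests is the case on this server), it fires for any whitelist entry that sorts AFTER the visitor's IP as a string. The small check below is only a sketch to illustrate that. cmp_sign() is a made-up helper name, and 66.249.64 is a made-up whitelist entry added for contrast; the other values are the ones from this post.

```php
<?php
// Hypothetical helper: reduce strcmp()'s result to -1, 0 or 1 so the
// output is the same no matter which PHP version runs it.
function cmp_sign($a, $b)
{
    $cmp = strcmp($a, $b);
    return $cmp > 0 ? 1 : ($cmp < 0 ? -1 : 0);
}

$ip = '75.161.49.58';

// '66.249.64' is an invented entry for contrast; the rest come from the post.
foreach (array('8.6.48', '8.6.48.1', '66.249.64', '75.161.49.58') as $whiteip) {
    // "8..." sorts after "75..." because '8' > '7', so the sign is 1
    // even though the addresses have nothing in common.
    echo str_pad($whiteip, 14), ' sign=', cmp_sign($whiteip, $ip), "\n";
}
```

Only the last entry, the identical string, gives 0. Both 8.6.48 and 8.6.48.1 give 1, which would explain why splitting the block into single addresses changed nothing, and why a different IP address "matched" a different entry.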
Then I took a deep breath, stopped, sat back and thought about the situation for a minute. By definition, if a visitor is wandering through these directories they're disobeying my stated robots.txt rules. AFAIK, NO spiders should EVER be allowed to disobey those rules. Frankly, I don't want to grant ANY spider, even Google's spider, a "pass" if it's trying to search my forbidden areas. So in the end, I decided it was best to delete ALL rows in Whitelist.txt and thus block every visitor that tries to access those areas.
Conclusion for FIMBLE: At best, this Whitelist code is buggy as hell and needs to be totally rewritten so that it's reliable.
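For anyone who does want to keep the whitelist, here is a minimal sketch of what a more reliable check might look like. It is NOT the contribution's code: ip_is_whitelisted() is a name I made up, and it assumes that $ipw holds the lines of Whitelist.txt (for example as returned by file()) and that a 3 part entry such as 8.6.48 is meant to cover the whole block. It compares for equality instead of testing strcmp() against 1, and it trims the CR/LF from each line before comparing.

```php
<?php
// Hypothetical helper (not part of IP_Trap): returns true when $ip is
// whitelisted by an exact entry or by a dotted prefix such as "8.6.48".
function ip_is_whitelisted($ip, array $entries)
{
    foreach ($entries as $entry) {
        $entry = trim($entry);          // drop the CR/LF kept on each line
        if ($entry === '') {
            continue;                   // skip blank lines
        }
        if ($entry === $ip) {
            return true;                // exact match on a full address
        }
        // Prefix match on a whole-octet boundary: "8.6.48" covers
        // "8.6.48.17" but not "8.6.481.1".
        $prefix = rtrim($entry, '.') . '.';
        if (strncmp($ip, $prefix, strlen($prefix)) === 0) {
            return true;
        }
    }
    return false;
}
```

In "catalog/personal/index.php" the existing foreach loop would then shrink to a single test along the lines of: if (ip_is_whitelisted($ip, $ipw)) { header("Location: ../index.php"); exit; }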
Question: Am I overlooking something critical here, or thinking about blocking disobedient spiders the wrong way? If so, can someone please explain why I need this Whitelist feature at all?
Thanks!