chris23, on Mar 3 2008, 01:13 PM, said:
I agree with you about getting rid of % from the allowed characters. As you were asking for ideas, rather than get rid of it completely, would a new regexp of allowed encoded characters be any better, say allowing %20 for space but explicitly excluding %3C and %3E (<> definite no-nos)?
This is very much in mind; however, %3C on its own isn't enough. You need %253C as well because of double encoding.
$get_var = str_replace(array('<', '%3C', '%253C', '>', '%3E', '%253E'), '', $get_var);
And then there are octal encodings, and then hexadecimal ones... I'm still looking at these.
Quote
I appreciate this would be difficult in view of the different payment modules but just getting rid of the worst suspects would be an improvement IMHO.
This won't really be worthy of its name until it DOES start breaking more scripts. The idea should be that it sanitizes as much as possible, and only "trusted" scripts are allowed through as exceptions.
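As a rough sketch of the "cleanse everything, exempt only trusted scripts" idea, something like the following could work. The function name needs_cleansing() and the file name payment_callback.php are made up for illustration; they are not taken from the actual contribution.

```php
<?php
// Hypothetical exception list: scripts that are allowed to skip cleansing.
$trusted_scripts = array('payment_callback.php');

// Returns true when the named script must have its query string cleansed.
// Anything not explicitly listed as trusted is cleansed by default.
function needs_cleansing($script_path, array $trusted_scripts)
{
    return !in_array(basename($script_path), $trusted_scripts, true);
}
```

The point of the design is that the default is to cleanse; a script has to be added to the list deliberately to be let through.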
The % should definitely be removed from the characters allowed by the preg_replace; otherwise all you need to do (well, nearly) is urlencode a query string to bypass cleansing. That, or urldecode before cleansing.
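For the "urldecode before cleansing" route, here is a minimal sketch that also covers double encoding by decoding until the value stops changing. The names decode_fully() and clean_get_var() and the five-round limit are assumptions for illustration, and rawurldecode() is used rather than urldecode() only so that a literal + is left alone.

```php
<?php
// Hypothetical helper: decode repeatedly until the value stops changing,
// so %253C -> %3C -> < is fully unwrapped before anything is stripped.
function decode_fully($value, $max_rounds = 5)
{
    for ($i = 0; $i < $max_rounds; $i++) {
        $decoded = rawurldecode($value);
        if ($decoded === $value) {
            break; // nothing left to decode
        }
        $value = $decoded;
    }
    return $value;
}

// Hypothetical cleanser: decode first, then drop angle brackets and any
// '%' that is still present after decoding.
function clean_get_var($value)
{
    $value = decode_fully($value);
    return str_replace(array('<', '>', '%'), '', $value);
}
```

With this, clean_get_var('%253Cscript%253E') comes back as 'script', whereas a single pass that only looks for the literal < and > would let the encoded version through untouched.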