comment on

When reading in CGI form fields from a multilingual, utf8 web application, it is not feasible to use the standard idiom for stripping evil characters:

$string =~ s/[^\w\s\.\,]//g; #plus any other metachars you want
# OR
$rawstring = m/([\w\s\.\,]+)/; #plus any others...
$string = $1;
[download]

Since users will be giving me all kinds of high bytes in order to give double-byte utf8 stuff, I need to be more accepting, as I understand it.

However, there persists the CGI Security and the null byte problem issue. Since the null byte can be used to fool various resources, I am tempted to subclass CGI and have the param() method do a s/\x00//g on *everything*. Is this ill-advised -- meaning, might the null byte ever show up in valid utf-8 text?

Remember, CGI uploads of binary files are handled through a different mechanism, so those would not be affected by overriding param(). Does a wise man always strip null bytes from param() returns, and if so, why isn't that the default behavior?

In reply to Extra CGI.pm safety by stripping \x00 bytes? by rlucas

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.