Avoid thinking in terms of 'bad' characters. If you have reason to believe that some characters may cause problems, you are better off keeping only the characters that you know to be good. That way you cannot accidentally let through a character that you hadn't thought of.
The filter doesn't need to be a subroutine (it's a single line), but wrapping it in one keeps the behavior consistent across different points in the program (or across programs, depending on how you share it).
There are normally three ways to handle characters outside the known-good set: remove them, replace them, or encode them reversibly.
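sub remove {
    my $string = shift;
    # Delete every character outside the allowed set.
    $string =~ s/[^a-zA-Z0-9.\-_]//g;
    return $string;
}
sub replace {
    my $string = shift;
    # Swap every character outside the allowed set for an underscore.
    $string =~ s/[^a-zA-Z0-9.\-_]/_/g;
    return $string;
}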
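sub reversable {
    my $string = shift;
    # Encode each disallowed byte as '=' plus exactly two hex digits; the fixed width keeps decoding unambiguous.
    $string =~ s/([^a-zA-Z0-9.\-_])/sprintf('=%02X',unpack('C',$1))/eg;
    return $string;
}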
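A reversible encoding is only useful if something can undo it, so here is a minimal sketch of a round trip. The names `encode_safe` and `decode_safe` are hypothetical, and the sketch assumes byte strings (characters 0-255) and escapes written as '=' followed by exactly two hex digits. Because '=' is itself outside the allowed set, it always gets escaped, so every '=' in the encoded string marks the start of an escape.

```perl
use strict;
use warnings;

# Hypothetical helper names; assumes byte strings (characters 0-255).
sub encode_safe {
    my $string = shift;
    # Escape everything outside the allowed set as '=' plus two hex digits.
    $string =~ s/([^a-zA-Z0-9._-])/sprintf('=%02X', ord($1))/eg;
    return $string;
}

sub decode_safe {
    my $string = shift;
    # Turn each '=XX' escape back into the original character.
    $string =~ s/=([0-9A-Fa-f]{2})/chr(hex($1))/eg;
    return $string;
}

my $original = "my file=1\t(draft).txt";
my $encoded  = encode_safe($original);
my $decoded  = decode_safe($encoded);

print "$encoded\n";    # my=20file=3D1=09=28draft=29.txt
print $decoded eq $original ? "round trip ok\n" : "round trip FAILED\n";
```

With a variable-width escape such as `%x`, a tab followed by the letter "a" would encode to "=9a", which a decoder cannot tell apart from a single byte 0x9A; the fixed two-digit width avoids that.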