in reply to Remove u200b unicode From String

If you can print the character somewhere you can copy it from, e.g. an xterm, you can paste it straight into your regular expression and it should work. For example, using code point 478 (U+01DE), which is an A with a diaeresis and a macron above:
perl -we '$chr = "Ǟ"; $s = "abc" . $chr . "xyz"; print "$s\n"; $s =~ s/$chr/ /g; print "$s\n"'
outputs
abcǞxyz
abc xyz
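Pasting is awkward when the character is invisible, as a zero-width space is. A minimal sketch of the same substitution that names the code point with a \x{...} escape instead; strip_zwsp is a made-up helper name, and the sketch assumes the string already holds decoded characters rather than raw UTF-8 bytes:

```perl
use strict;
use warnings;

# Hypothetical helper: strip every ZERO WIDTH SPACE (U+200B) from a
# string of decoded characters.
sub strip_zwsp {
    my ($text) = @_;
    $text =~ s/\x{200B}//g;
    return $text;
}

my $s     = "abc\x{200B}xyz";   # 7 characters, one of them invisible
my $clean = strip_zwsp($s);

# Print lengths rather than the strings, since the character cannot be seen.
printf "%d -> %d characters\n", length($s), length($clean);   # 7 -> 6 characters
```

If the data arrives as raw bytes (from a file or socket opened without an encoding layer), it would need to be decoded first, otherwise the pattern will not match.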
Alternatively, you can do something like the following to find the characters outside the ASCII range:
use Encode;

my $s     = get_s_from_somewhere();   # raw UTF-8 bytes
my $chars = decode("UTF-8", $s);      # bytes -> characters

my %non_ascii;
for my $i (0 .. length($chars) - 1) {
    my $c = substr($chars, $i, 1);
    $non_ascii{$c}++ if ord($c) > 127;   # count each non-ASCII character
}
do_something_with_non_ascii(\%non_ascii);
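For something that runs as-is, here is a rough sketch of the same scan with a hard-coded sample string in place of the placeholder functions. count_non_ascii is a name I made up, and the sample bytes are only an illustration (U+200B and U+01DE written out as UTF-8):

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: count the non-ASCII characters in a UTF-8 byte string.
# Returns a hash reference mapping each such character to its count.
sub count_non_ascii {
    my ($bytes) = @_;
    my $chars = decode('UTF-8', $bytes);
    my %seen;
    $seen{$_}++ for grep { ord($_) > 127 } split //, $chars;
    return \%seen;
}

# Sample input: "a", U+200B as UTF-8 bytes, "b", U+01DE as UTF-8 bytes.
my $found = count_non_ascii("a\xE2\x80\x8Bb\xC7\x9E");

# Report code points instead of the characters, so invisible ones show up.
for my $c (sort { ord($a) <=> ord($b) } keys %$found) {
    printf "U+%04X x %d\n", ord($c), $found->{$c};
}
```

Printing the code points rather than the characters themselves makes something like U+200B visible in the report, and the reported value can then go into a substitution as \x{200B}.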