
No idea, short of printing out every character. There's millions of them, though, so going through the list could take time.

Isn't that kind of arbitrary? Why would you remove characters if you have no idea what they are? It would make more sense to find out what the characters are and add support for them.

You could do that as follows:

open(my $fh, '<:encoding(UTF-8)', $ARGV[0])
   or die("Can't open input file \"$ARGV[0]\": $!\n");
$_ = do { local $/; <$fh> };
s/([^\x0A\x20-\x7E])/ sprintf '<U+%04X>', ord($1) /eg;
print;

With this, an input file containing

My name is Éric.
I don't speak 日本語.

would show up as

My name is <U+00C9>ric.
I don't speak <U+65E5><U+672C><U+8A9E>.

(Replace the encoding as appropriate.)
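
If the file is large, a tally of the distinct characters may be easier to go through than the full annotated text, since a file with millions of characters usually contains only a handful of different unexpected ones. Here is a rough sketch of that idea, assuming the same character class as above (anything other than a line feed or printable ASCII) and a Perl recent enough to have the // operator (5.10+). The helper names tally_unusual and report_lines are made up for illustration; charnames::viacode is the core function that maps a code point to its Unicode name.

```perl
use strict;
use warnings;
use charnames ();   # loaded only for charnames::viacode

# Hypothetical helper: count every character that is neither
# a line feed nor printable ASCII, keyed by code point.
sub tally_unusual {
    my ($text) = @_;
    my %count;
    $count{ ord($1) }++ while $text =~ /([^\x0A\x20-\x7E])/g;
    return \%count;
}

# Hypothetical helper: one report line per distinct code point,
# showing the code point, its Unicode name (if any), and the count.
sub report_lines {
    my ($count) = @_;
    return map {
        sprintf 'U+%04X  %-40s %d',
            $_, charnames::viacode($_) // '(no name)', $count->{$_}
    } sort { $a <=> $b } keys %$count;
}

# Only read a file when one is given on the command line.
if (@ARGV) {
    open(my $fh, '<:encoding(UTF-8)', $ARGV[0])
        or die("Can't open input file \"$ARGV[0]\": $!\n");
    my $text = do { local $/; <$fh> };
    print "$_\n" for report_lines(tally_unusual($text));
}
```

Run against the two-line sample above, this should list four code points, one line each, with É reported as LATIN CAPITAL LETTER E WITH ACUTE. As before, replace the encoding as appropriate.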

Update: Added means of identifying characters.