Not exactly on topic, but when I'm dealing with lots of old, weird files and data containing characters that kill my scripts or cause other odd behavior, I often just strip out every character I don't need. It's faster than trying to pinpoint which character is causing the problem.
$string =~ s/[^A-Za-z0-9]//g;
That's when all I need is ASCII letters and digits.
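If a few more characters need to survive (dots and underscores in filenames, say), the same whitelist idea can be wrapped in a small helper. This is only a rough sketch: `keep_only` and its `$extra` argument are names I made up for illustration, and the sample string is invented.

```perl
use strict;
use warnings;

# Hypothetical helper: keep ASCII letters, digits, and any extra
# characters the caller explicitly allows; delete everything else.
sub keep_only {
    my ($string, $extra) = @_;
    $extra = '' unless defined $extra;

    # quotemeta escapes characters such as '-' or ']' so they are
    # treated literally inside the character class.
    my $allowed = quotemeta $extra;

    $string =~ s/[^A-Za-z0-9${allowed}]//g;
    return $string;
}

# Invented sample containing a Latin-1 byte, a NUL, and a tab.
my $dirty = "caf\xE9 r\x00eport_2019\t(final).txt";

print keep_only($dirty), "\n";          # cafreport2019finaltxt
print keep_only($dirty, '._ '), "\n";   # caf report_2019final.txt
```

For the plain letters-and-digits case, `tr/A-Za-z0-9//cd` should do the same job as the substitution, since `c` complements the list and `d` deletes the matches.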