The page at the URL above declares an encoding of ISO-8859-1, but contains some characters from the cp1252 (Windows-1252) character set: specifically the curly quotes, hex values 91-94, and the en and em dashes, x96 and x97. In ISO-8859-1 those byte values are control characters, not punctuation. Swap them for their ASCII equivalents and your problem should disappear:
tr/\x93-\x94/\x22/;  # curly double quotes -> "
tr/\x91-\x92/\x27/;  # curly single quotes -> '
tr/\x96-\x97/\x2d/;  # en and em dash -> -
Note: AFAIK you should be able to do this with the Encode or Text::Iconv modules instead of messing with the byte values directly, but somehow that didn't work for me when I tried it on this text (possibly because of the mixed encoding).
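For reference, here is a minimal sketch of what the Encode route might look like. It assumes the input is still raw, undecoded bytes, and it treats the whole string as cp1252, which covers the printable ISO-8859-1 range plus the extra punctuation. The sub name cp1252_punct_to_ascii and the sample string are made up for illustration; I have not run this against the page in question.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: takes raw bytes labelled ISO-8859-1 that may contain
# cp1252 punctuation, and returns text with that punctuation mapped to ASCII.
sub cp1252_punct_to_ascii {
    my ($bytes) = @_;

    # Decode as cp1252 so bytes 0x91-0x97 become the intended characters
    # instead of control codes.
    my $text = decode('cp1252', $bytes);

    # Map the typographic characters to their plain ASCII equivalents.
    $text =~ tr/\x{201C}\x{201D}/"/;   # curly double quotes
    $text =~ tr/\x{2018}\x{2019}/'/;   # curly single quotes
    $text =~ tr/\x{2013}\x{2014}/-/;   # en dash, em dash

    return $text;
}

# Made-up sample input containing the problem bytes.
print cp1252_punct_to_ascii("\x93quoted\x94 \x96 it\x92s"), "\n";
```

Decoding first and then translating by code point keeps the mapping readable, but it only helps if nothing upstream has already decoded the bytes as ISO-8859-1.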
Update: added the dash characters (x96 and x97).
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. -- Brian W. Kernighan