in reply to Re^2: Strange Characters - Different Encoding?
in thread Strange Characters - Different Encoding?

The page at the URL above declares its encoding as ISO-8859-1, but contains some characters from the CP1252 (Windows-1252) character set: specifically the curly quotes (hex values 91-94) and the en and em dashes (x96 and x97), which are replaced here with a plain minus sign. Swap those out for ASCII characters and your problem should disappear:

tr/\x93-\x94/\x22/; tr/\x91-\x92/\x27/; tr/\x96-\x97/\x2d/;
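As a self-contained sketch of the same substitution (the sub name and the sample string are made up for illustration, and it assumes the text is held as raw bytes rather than decoded characters):

```perl
use strict;
use warnings;

# Hypothetical helper: map the CP1252 punctuation bytes to ASCII.
# When the replacement list is shorter than the search list,
# tr/// repeats its last character, so one target char is enough.
sub cp1252_punct_to_ascii {
    my ($bytes) = @_;
    $bytes =~ tr/\x91\x92/'/;   # single curly quotes -> apostrophe
    $bytes =~ tr/\x93\x94/"/;   # double curly quotes -> double quote
    $bytes =~ tr/\x96\x97/-/;   # en dash and em dash -> minus sign
    return $bytes;
}

# Made-up sample input containing the problem bytes.
my $sample = "\x93quoted\x94 \x96 it\x92s";
print cp1252_punct_to_ascii($sample), "\n";
```

Working on a copy inside the sub leaves the caller's string untouched, which makes it easy to compare before and after.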

Note: AFAIK you should be able to do this with the Encode or Text::Iconv modules instead of messing with the character values directly, but this didn't work for me when I tried it on the text (possibly because of the mixed encoding).
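For reference, a minimal sketch of what the Encode route could look like. It assumes the whole input can be treated as CP1252 bytes (CP1252 matches ISO-8859-1 everywhere except the range 0x80-0x9F), and the helper name is made up for illustration. It has not been verified against the page in question, so treat it as a starting point only:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: interpret raw bytes as CP1252 and
# return the same text encoded as UTF-8 bytes.
sub cp1252_bytes_to_utf8 {
    my ($bytes) = @_;
    my $chars = decode('cp1252', $bytes);   # bytes -> Perl characters
    return encode('UTF-8', $chars);         # characters -> UTF-8 bytes
}
```

Unlike the tr approach, this keeps the curly quotes and dashes as their proper Unicode characters instead of flattening them to ASCII.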

Update: added minus sign.



Re^4: Strange Characters - Different Encoding?
by JukeBox (Initiate) on Nov 06, 2005 at 02:14 UTC
    Thanks for your help tirwhan, much appreciated.