This subject is discussed in the Perl XML FAQ. Given that you don't know which encoding (or encodings) the original data used, you might find the 'sanitise' function in the FAQ useful.
I regularly hit this problem when people paste text from MS Word, since its 'smart quote' characters are not in the ISO-8859-1 set. They come from Windows-1252 (CP1252), which puts printable characters in the 0x80-0x9F range that ISO-8859-1 reserves for control codes.
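For illustration, here is a minimal sketch of the general idea, not the FAQ's actual code. The name `to_utf8` is hypothetical, and the CP1252 fallback is an assumption about where the stray bytes came from: try a strict UTF-8 decode first, and if that fails, treat the bytes as CP1252 so that pasted smart quotes survive.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not taken from the FAQ): return UTF-8 bytes for
# input whose encoding is unknown.
sub to_utf8 {
    my ($bytes) = @_;

    # First guess: the input is already valid UTF-8.
    # FB_CROAK makes decode() die on malformed input instead of
    # silently substituting replacement characters.
    my $copy = $bytes;    # decode() may modify its argument when CHECK is set
    my $text = eval { decode('UTF-8', $copy, Encode::FB_CROAK) };

    # Fallback (an assumption): treat the bytes as Windows-1252, which
    # covers the MS Word smart quotes at 0x91-0x94.
    $text = decode('cp1252', $bytes) unless defined $text;

    return encode('UTF-8', $text);
}

# "\x93" and "\x94" are the CP1252 curly double quotes.
my $clean = to_utf8("\x93quoted\x94");
```

This is only a heuristic: a byte string can be valid in more than one encoding, so input mixed from several sources may still need checking by hand.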
In reply to Re: Generating UTF-8 from nasty high ASCII input by grantm, in thread Generating UTF-8 from nasty high ASCII input by samtregar