in reply to EncodingConversion in perl
It will help to have tools that let you view utf8 text data in a more detailed, explicit manner. For example (shameless plug), I posted a couple of command-line scripts that can help both with confirming that data is valid utf8 and with diagnosing faulty data: tlu -- TransLiterate Unicode, and unichist -- count/summarize characters in data. They might help you work out a suitable idiom for handling your text data without corrupting it.
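If you just want a quick check before grabbing those scripts, here is a minimal sketch of the same two ideas (strict validation, then a per-character count) using only the core Encode module. To be clear, this is not how tlu or unichist are implemented; the sub name char_histogram and the sample string are made up for illustration.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: returns a hash ref mapping "U+XXXX" to the number
# of times that code point occurs, or undef if the bytes are not valid UTF-8.
sub char_histogram {
    my ($bytes) = @_;

    # FB_CROAK makes decode() die on malformed input instead of quietly
    # substituting U+FFFD; LEAVE_SRC keeps decode() from modifying $bytes.
    my $text = eval {
        Encode::decode('UTF-8', $bytes,
            Encode::FB_CROAK() | Encode::LEAVE_SRC());
    };
    return unless defined $text;

    my %count;
    $count{ sprintf 'U+%04X', ord $_ }++ for split //, $text;
    return \%count;
}

# Made-up sample: "caf\xC3\xA9" is the UTF-8 byte sequence for "café".
my $hist = char_histogram("caf\xC3\xA9");
if ($hist) {
    printf "%s\t%d\n", $_, $hist->{$_} for sort keys %$hist;
}
else {
    print "not valid UTF-8\n";
}
```

If a string you believed was clean comes back undef, or the counts show pairs like U+00C3 followed by U+00A9 where you expected a single U+00E9, the data was probably encoded twice somewhere along the way.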