in reply to How best to avoid mojibake, when attempting to automatically convert documents to utf-8?

Chris,

You may want to read these two threads: What's the best way to detect character encodings, Windows-1252 v. UTF-8? and What's the best way to detect character encodings? (Redux).

Unfortunately, I haven't yet found a Perl module to do character encoding detection (guessing) that I like or can use.

Jim

Re^2: How best to avoid mojibake, when attempting to automatically convert documents to utf-8?
by taint (Chaplain) on Dec 23, 2013 at 06:23 UTC
    Jim, thank you very much for the reply.

    Indeed, those were very pertinent nodes. It seems you also struggle with the lack of such a utility/module/{...}. :)

    Honestly, I can't for the life of me understand why this sort of thing hasn't already been solved, which is why I felt it worth all the work and research likely involved. After all, the world is now very much a world of UTF-8. It's no longer a "concept".

    Thank you again for the resources and the reply, Jim ++

    --Chris

    ¡λɐp ʇɑəɹ⅁ ɐ əʌɐɥ puɐ ʻꜱdləɥ ꜱᴉɥʇ ədoH