http://qs1969.pair.com?node_id=870650


in reply to Problem with Encode::Guess

Newer versions should return

Encodings too ambiguous: iso-8859-1 or utf8

instead of

iso-8859-1 or utf8

Keep in mind that valid UTF-8 is also valid iso-latin-1. Encode::Guess favours UTF-8 if the document starts with a UTF-8-encoded BOM. Otherwise, I think valid UTF-8 will also be considered possible iso-latin-1.
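To illustrate the ambiguity, here is a minimal sketch. It assumes iso-8859-1 is passed as an extra suspect, as the message above suggests was done; the sample string is my own. The same two bytes decode cleanly under both encodings, so the guess cannot be resolved, and the exact wording of the returned message depends on your Encode version.

```perl
use strict;
use warnings;
use Encode qw(encode decode);
use Encode::Guess;

# "é" (U+00E9) encoded as UTF-8 is the two bytes C3 A9.
my $bytes = encode('UTF-8', "\x{E9}");

# Both decodes succeed: the same bytes are valid in either encoding.
# LEAVE_SRC keeps decode() from modifying $bytes.
my $check     = Encode::FB_CROAK() | Encode::LEAVE_SRC();
my $as_utf8   = decode('UTF-8',      $bytes, $check);
my $as_latin1 = decode('iso-8859-1', $bytes, $check);

printf "as UTF-8:   %d character(s)\n", length $as_utf8;      # 1
printf "as Latin-1: %d character(s)\n", length $as_latin1;    # 2

# guess_encoding returns an encoding object on success,
# or a plain string describing the problem on failure.
my $guess = guess_encoding($bytes, 'iso-8859-1');
print ref($guess)
    ? "guessed: " . $guess->name . "\n"
    : "no guess: $guess\n";
```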

but in practice, only accented Latin letters in the 80 to A5 range are common in cp437 text files, and only characters in the C0 to FF range are common in Latin-1 files.

Encode::Guess just isn't that fuzzy. It's actually very simplistic. It does not appear to be suitable for your task.
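If you want the kind of fuzzy guess described in the quoted text, you would have to write it yourself. A rough sketch follows, assuming the byte ranges quoted above are a fair description of your files. The function name guess_cp437_or_latin1 is hypothetical and is not part of Encode::Guess; treat it as a starting point rather than a reliable detector.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Encode::Guess.
# Counts bytes in the ranges the quoted heuristic describes and
# returns whichever encoding has more hits, or nothing on a tie.
sub guess_cp437_or_latin1 {
    my ($bytes) = @_;

    # Accented Latin letters in cp437 sit in 80..A5.
    my $cp437_like = () = $bytes =~ /[\x80-\xA5]/g;

    # Accented letters in Latin-1 sit in C0..FF.
    my $latin1_like = () = $bytes =~ /[\xC0-\xFF]/g;

    return 'cp437'      if $cp437_like > $latin1_like;
    return 'iso-8859-1' if $latin1_like > $cp437_like;
    return;    # tie, including pure ASCII: no opinion
}

# Byte 82 is "é" in cp437; byte E9 is "é" in Latin-1.
for my $sample ("caf\x82", "caf\xE9", "cafe") {
    my $guess = guess_cp437_or_latin1($sample);
    print defined $guess ? "$guess\n" : "undecided\n";
}
```

Read the file in binary mode before calling it, so that you are counting raw bytes and not decoded characters.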