in reply to Re^5: convert files to ansi (8859-1)
in thread convert files to ansi (8859-1)

Okay. So Perl works internally with a single-byte encoding?

But why doesn't the eval around decode with iso-8859-1 throw an error if the input file is UTF-8?

Re^7: convert files to ansi (8859-1)
by Corion (Patriarch) on Mar 29, 2017 at 08:40 UTC

    Every file is valid ISO-8859-1, because ISO-8859-1 is a single-byte encoding in which every byte value (0x00 to 0xFF) maps to a character, so decoding can never fail.
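To illustrate the point, here is a minimal sketch that decodes all 256 possible byte values twice, once as ISO-8859-1 and once as strict UTF-8, both with `Encode::FB_CROAK`. The variable names are made up for the example; only `decode` and `FB_CROAK` come from the Encode module.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Every possible byte value, 0x00 through 0xFF, in one string.
my $all_bytes = join '', map { chr } 0 .. 255;

# With FB_CROAK, decode() may modify its input, so work on a copy.
my $copy = $all_bytes;

# ISO-8859-1: every byte maps to a character, so this never dies.
my $latin1_ok = eval { decode('iso-8859-1', $copy, Encode::FB_CROAK); 1 } ? 1 : 0;

# Strict UTF-8: the same bytes contain malformed sequences, so this dies.
$copy = $all_bytes;
my $utf8_ok = eval { decode('UTF-8', $copy, Encode::FB_CROAK); 1 } ? 1 : 0;

print "latin1: $latin1_ok, utf8: $utf8_ok\n";   # latin1: 1, utf8: 0
```

So an eval around a Latin-1 decode cannot serve as a validity check, whereas a strict UTF-8 decode can at least rule UTF-8 out.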

      Well, that explains a lot ... so I need to look for another way to validate the encoding. Is there any known way to do this?

      I read about Encode::Guess, maybe I should take a look at it?
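For reference, a minimal sketch of what a call to Encode::Guess might look like. `guess_encoding` is the function the module exports; `guessed_name` is a hypothetical helper invented for this example. With only the default suspects the module has no way to confirm Latin-1, and its documentation warns that single-byte encodings in general are hard to guess, so treat this as a starting point rather than a solution.

```perl
use strict;
use warnings;
use Encode::Guess;   # exports guess_encoding

# Hypothetical helper: returns the guessed encoding name,
# or undef when Encode::Guess cannot decide.
sub guessed_name {
    my ($bytes) = @_;
    my $enc = guess_encoding($bytes);   # default suspects only
    # On failure guess_encoding returns an error string, not an object.
    return ref($enc) ? $enc->name : undef;
}

my $name = guessed_name("\xC3\xBCber");   # "über" as UTF-8 bytes
print defined $name ? "guessed: $name\n" : "no guess\n";
```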

        My approach to guessing the encoding would be to look for well-known phrases/trigrams. For example, if you know the language of the text, look for trigrams (or longer sequences) that indicate the encoding.

        "über" would be a good German word: it appears commonly (enough) in text, and depending on the encoding you get

        "\xFCber"     # ANSI / Latin-1
        "\xC3\xBCber" # UTF-8
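The marker-word idea could be sketched roughly like this, assuming the file has been read as raw bytes (for example through a `:raw` filehandle). `guess_by_marker` is a hypothetical helper name, and a real check would likely use several marker words rather than just "über".

```perl
use strict;
use warnings;

# Hypothetical helper: guess the encoding from how "über" appears
# in the raw, undecoded bytes of a text.
sub guess_by_marker {
    my ($bytes) = @_;
    return 'UTF-8'      if $bytes =~ /\xC3\xBCber/;   # ü encoded as two bytes
    return 'ISO-8859-1' if $bytes =~ /\xFCber/;       # ü encoded as one byte
    return 'unknown';                                 # marker word not present
}

print guess_by_marker("\xC3\xBCber uns"), "\n";   # UTF-8
print guess_by_marker("\xFCber uns"), "\n";       # ISO-8859-1
print guess_by_marker("about us"), "\n";          # unknown
```

If the marker never appears, the sketch simply reports "unknown", so it works best on texts long enough to contain the chosen words.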