ThreeMonks has asked for the wisdom of the Perl Monks concerning the following question:

Hi, dear monks. When I read a Notepad-generated UTF-8 text file, I found that something at the beginning of the file cannot be decoded correctly. So I wrote a script to generate another UTF-8 text file with the same contents as the first one, and compared the two files in a hex editor. The Notepad-generated file has 3 extra bytes at the beginning: ef bb bf. What are these three bytes, and how do I handle them in Perl?

Re: How to read notepad generated utf8 files?
by almut (Canon) on May 31, 2009 at 13:50 UTC
    What are these three bytes,

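    That's a BOM (byte order mark): the bytes ef bb bf are the UTF-8 encoding of the character U+FEFF, which Notepad writes at the start of files it saves as UTF-8.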

    and how to handle them in perl?

    Use File::BOM.
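
    For illustration, here is a minimal sketch of what handling the BOM amounts to, using only core Perl instead of File::BOM. It assumes the file really is UTF-8: once the input is decoded, the three bytes ef bb bf appear as the single character U+FEFF at the start of the first line, and can be stripped there. The helper name read_utf8_without_bom and the temporary demo file are made up for this example and are not from the thread.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read a UTF-8 text file and drop a leading BOM, if any.
sub read_utf8_without_bom {
    my ($path) = @_;
    open(my $fh, '<:encoding(UTF-8)', $path)
        or die "Cannot open $path: $!";
    my @lines = <$fh>;
    close($fh);

    # After decoding, the bytes EF BB BF are the single character U+FEFF.
    $lines[0] =~ s/\A\x{FEFF}// if @lines;
    return @lines;
}

# Demo: write a small file that starts with a BOM, the way Notepad does.
my ($out, $path) = tempfile(UNLINK => 1);
binmode($out);                        # write raw bytes, no translation
print {$out} "\xEF\xBB\xBFhello\n";
close($out);

my @lines = read_utf8_without_bom($path);
print $lines[0];
```

    This only covers UTF-8. File::BOM, as suggested above, is the more general route when the input may be in other Unicode encodings as well.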

Re: How to read notepad generated utf8 files?
by Anonymous Monk on May 31, 2009 at 13:51 UTC