uva has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks,
Whenever I open UTF-8 encoded files, I skip the BOM manually.
Similarly, when I write UTF-8 files, I write the BOM to the file manually.
Is there a way to do that automatically?
I used the following code to open a UTF-8 encoded file, but it does not skip the BOM when I read from the filehandle:
open FH,"<:utf8",$filename;
print <FH>;
close FH;
__DATA__
The above code prints the BOM along with the text inside the file.
UPDATE: I tried the following code as well, but the problem is the same:
open FH,"<:encoding(utf8)",$filename;
print <FH>;
close FH;
__DATA__
This code also prints the BOM along with the text inside the file.
Can anyone explain in detail what is wrong with the code and what is happening in it?

Replies are listed 'Best First'.
Re: help needed in unicode files
by idsfa (Vicar) on Mar 28, 2006 at 15:57 UTC

    Yes. File::BOM will do this for you. Not that you should be writing a BOM for UTF-8 (it is hard to get confused about byte order when the code units are one byte wide), but I expect you are dealing with (broken) Microsoft apps.

    Updated: You should really read the documentation. There are several examples in it of how to do what you appear to be asking for. Here's one:

    # Read
    open(HANDLE, '<:via(File::BOM)', $filename)
    # Write
    open(HANDLE, '>:encoding(UTF-8):via(File::BOM)', $filename)
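For comparison, here is a minimal core-only sketch of what the layer saves you from doing by hand. The :utf8 and :encoding(UTF-8) layers simply decode the bytes EF BB BF to the character U+FEFF and hand it back as ordinary data, which is why the code in the question prints it. The helper names read_utf8_lines and write_utf8_lines_with_bom are made up for this illustration and are not part of File::BOM.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of File::BOM): read a UTF-8 file and
# return its lines with a leading BOM (U+FEFF) removed, if present.
sub read_utf8_lines {
    my ($filename) = @_;
    open(my $fh, '<:encoding(UTF-8)', $filename)
        or die "Cannot open $filename: $!";
    my @lines = <$fh>;
    close($fh);
    # The encoding layer has already turned the bytes EF BB BF into
    # the single character U+FEFF; strip it from the first line only.
    $lines[0] =~ s/\A\x{FEFF}// if @lines;
    return @lines;
}

# Hypothetical helper: write lines as UTF-8 with a leading BOM.
sub write_utf8_lines_with_bom {
    my ($filename, @lines) = @_;
    open(my $fh, '>:encoding(UTF-8)', $filename)
        or die "Cannot open $filename: $!";
    # U+FEFF is encoded by the layer as the bytes EF BB BF.
    print {$fh} "\x{FEFF}", @lines;
    close($fh) or die "Cannot close $filename: $!";
    return;
}
```

Something like `print read_utf8_lines($filename);` would then print the file contents without the BOM. If you can install modules, the File::BOM layer above is the less fiddly route.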

    The intelligent reader will judge for himself. Without examining the facts fully and fairly, there is no way of knowing whether vox populi is really vox dei, or merely vox asinorum. — Cyrus H. Gordon

      While there's only one byte order for utf8 (unlike utf16), the BOM might still make sense to easily distinguish utf8 from other encodings.
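A rough sketch of that kind of check, using only core Perl: look at the first three raw bytes of the file and compare them with the UTF-8 BOM, EF BB BF. The function name has_utf8_bom is invented for this example. Treat the result as a hint only, since plenty of valid UTF-8 files carry no BOM at all.

```perl
use strict;
use warnings;

# Hypothetical helper: report whether a file begins with the UTF-8
# BOM, i.e. the three bytes EF BB BF.
sub has_utf8_bom {
    my ($filename) = @_;
    # Open without any decoding layer so we see the bytes as stored.
    open(my $fh, '<:raw', $filename)
        or die "Cannot open $filename: $!";
    my $read = read($fh, my $head, 3);
    die "Cannot read $filename: $!" unless defined $read;
    close($fh);
    # Files shorter than three bytes cannot start with the BOM.
    return $read == 3 && $head eq "\xEF\xBB\xBF";
}
```

A caller might use it as `my $layer = has_utf8_bom($filename) ? '<:encoding(UTF-8)' : '<:raw';` and fall back to some other guess when no BOM is found.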