in reply to Re^6: UTF-8 text files with Byte Order Mark
in thread UTF-8 text files with Byte Order Mark
> This is a BOM for UTF-16 Big Endian-encoded files.
You are mistaken. It's the BOM, period. FEFF is a character (U+FEFF), not a byte sequence, and it can be encoded using UTF-8 or UTF-16le just as easily as using UTF-16be.
    $ perl -MEncode -e'print encode("UTF-8", chr(0xFEFF))' | od -t x1
    0000000 ef bb bf
    0000003

    $ perl -MEncode -e'print encode("UTF-16be", chr(0xFEFF))' | od -t x1
    0000000 fe ff
    0000002

    $ perl -MEncode -e'print encode("UTF-16le", chr(0xFEFF))' | od -t x1
    0000000 ff fe
    0000002
| Value (hex) | Meaning |
|---|---|
| FEFF | BOM (the character U+FEFF itself) |
| 2B,2F,76,38,2D | BOM encoded using UTF-7 |
| EF,BB,BF | BOM encoded using UTF-8 |
| FE,FF | BOM encoded using UTF-16be |
| FF,FE | BOM encoded using UTF-16le |
| 00,00,FE,FF | BOM encoded using UTF-32be |
| FF,FE,00,00 | BOM encoded using UTF-32le |
So you won't find the bytes FE,FF at the start of a UTF-8 file, but you can find an encoded FEFF there (as EF,BB,BF), just as you can in a UTF-16be file (as FE,FF).