in reply to Re^2: Conversion from UTF-8 to windows-1256 encoding
in thread Conversion from UTF-8 to windows-1256 encoding
"\x{feff}" does not map to cp1256, <IN> line 1. And the character \x{feff} is displayed at the beginning of the file
U+FEFF is the Unicode code point of the BOM (Byte Order Mark). cp1256 has no equivalent for it, which is why the conversion complains. You just have to ignore it (i.e. skip over it or remove it from the input).
(With UTF-8 the BOM has no real use, since the byte order is always the same, but on Windows it is commonly written at the start of a file to identify the file as Unicode-encoded.)
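For illustration, here is a minimal sketch of the "remove it from the input" approach. The helper name utf8_to_cp1256 is hypothetical (it is not from the original post), and the sketch assumes the whole input is already in memory as raw UTF-8 bytes. It only uses decode and encode from the core Encode module.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not from the original post): takes raw UTF-8 bytes,
# drops a leading BOM if there is one, and returns cp1256 bytes.
sub utf8_to_cp1256 {
    my ($bytes) = @_;

    # Decode the bytes into Perl characters; die on malformed UTF-8.
    my $text = decode('UTF-8', $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC());

    # Remove U+FEFF only when it is the very first character.
    $text =~ s/\A\x{FEFF}//;

    # Encode to cp1256; die on any character that cp1256 cannot represent.
    return encode('cp1256', $text, Encode::FB_CROAK());
}
```

The substitution is anchored with \A, so a U+FEFF anywhere else in the data is left alone and will still be reported as unmappable rather than silently dropped.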
Re^4: Conversion from UTF-8 to windows-1256 encoding
by ikegami (Patriarch) on Oct 29, 2007 at 17:48 UTC