Re^3: Conversion from UTF-8 to windows-1256 encoding

"\x{feff}" does not map to cp1256, <IN> line 1. And the character \x{feff} is displayed at the beginning of the file

U+FEFF is the Unicode code point of the BOM (Byte Order Mark). It has no equivalent in cp1256, so you just have to ignore it (i.e., skip over it or remove it from the input).

(In UTF-8 the BOM has no real use as a byte-order indicator, since the byte order is always the same, but on Windows it is commonly used to mark a file as Unicode-encoded.)
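A minimal sketch of the "remove it from the input" approach, assuming the whole input is available as a string of UTF-8 bytes. The helper name utf8_to_cp1256 is made up for illustration; only the Encode functions are real.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not from the original post): convert UTF-8 bytes
# to windows-1256 bytes, dropping a leading BOM (U+FEFF) if present.
sub utf8_to_cp1256 {
    my ($utf8_bytes) = @_;

    # Decode bytes into characters; FB_CROAK dies on malformed UTF-8.
    my $text = decode('UTF-8', $utf8_bytes, Encode::FB_CROAK);

    # Strip the BOM only when it is the very first character.
    $text =~ s/\A\x{FEFF}//;

    # Encode to cp1256; FB_CROAK dies on characters cp1256 cannot represent.
    return encode('cp1256', $text, Encode::FB_CROAK);
}
```

When reading line by line through an :encoding(UTF-8) layer instead, the same substitution can be applied to the first line only (for example, when $. == 1).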

Re^4: Conversion from UTF-8 to windows-1256 encoding by ikegami (Patriarch) on Oct 29, 2007 at 17:48 UTC
Not just Windows. At least one World Wide Web Consortium spec has a similar convention: CSS user agents should check for the various encodings of the BOM, including its UTF-8 encoding.
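To illustrate what "checking for various encodings of the BOM" can look like on raw bytes, here is a rough sketch. It is not the CSS detection algorithm itself, and the helper name sniff_bom is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: guess an encoding from a leading BOM in raw bytes.
# Returns an encoding name, or nothing (undef in scalar context) if no
# BOM is found.
sub sniff_bom {
    my ($bytes) = @_;

    # Test the 4-byte UTF-32 marks first: the UTF-32LE mark starts with
    # the same two bytes as the UTF-16LE mark.
    return 'UTF-32BE' if $bytes =~ /\A\x00\x00\xFE\xFF/;
    return 'UTF-32LE' if $bytes =~ /\A\xFF\xFE\x00\x00/;
    return 'UTF-8'    if $bytes =~ /\A\xEF\xBB\xBF/;
    return 'UTF-16BE' if $bytes =~ /\A\xFE\xFF/;
    return 'UTF-16LE' if $bytes =~ /\A\xFF\xFE/;
    return;
}
```

The absence of a BOM says nothing about the encoding, so a caller would still need a fallback (a declared charset or a default).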