jbert, when you open a file in your editor, how does the editor know whether two bytes next to each other represent 2 separate characters or one "utf8" character?
Simple Answer: It doesn't. Depending on the editor, it either has to be told the encoding, expects one specific encoding, or assumes the file is in the system's default encoding.
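Complex Answer: Editors can tell the Unicode encodings apart (UTF-8, UTF-16 and UTF-32, including byte order) *if* the file starts with a Byte Order Mark. A BOM says nothing about non-Unicode encodings, and a file without one still has to be guessed at or declared. File::BOM can help you in the BOM case.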
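To make the BOM check concrete, here is a minimal sketch in Python of what an editor might do with the first few bytes of a file. This is not File::BOM's API; the names `sniff_bom` and `decode_with_bom` are made up for illustration, and the UTF-8 fallback for BOM-less input is an assumption, not something every editor does.

```python
import codecs

# Check the 4-byte UTF-32 BOMs first: the UTF-32 LE BOM (FF FE 00 00)
# begins with the same two bytes as the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding, bom_length), or (None, 0) if no BOM is present."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def decode_with_bom(data: bytes, fallback: str = "utf-8") -> str:
    """Decode using the BOM's encoding; use the (assumed) fallback otherwise."""
    encoding, skip = sniff_bom(data)
    # Drop the BOM bytes so they do not show up as a leading U+FEFF.
    return data[skip:].decode(encoding or fallback)
```

Without a BOM, `sniff_bom` returns `(None, 0)` and the caller is back to the simple answer: someone has to supply the encoding. There is also one inherent ambiguity: a UTF-16 LE file whose first character after the BOM is U+0000 starts with the same four bytes as a UTF-32 LE BOM.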
Update: Added to the simple answer.