jbert, when you open a file in your editor, how does the editor know whether two bytes next to each other represent 2 separate characters or one "utf8" character?

Simple Answer: It doesn't. Depending on the editor, either you have to tell it the encoding, it only accepts one specific encoding, or it assumes the file is in the system's default encoding.

Complex Answer: Editors can tell the Unicode encodings (UTF-8, UTF-16, UTF-32) apart, but not non-Unicode encodings, *if* the file starts with a Byte Order Mark (BOM), because each of those encodings writes the BOM as a different byte sequence. File::BOM can help you in that case.
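
To make the BOM check concrete, here is a minimal sketch that uses only core Perl. It is not how File::BOM is implemented, and sniff_bom and sniff_file are hypothetical names made up for this illustration. It only compares the leading bytes of a file against the standard BOM signatures:

```perl
use strict;
use warnings;

# BOM signatures, longest first: the UTF-32LE BOM begins with the same
# two bytes as the UTF-16LE BOM, so it has to be tested before it.
my @BOMS = (
    [ 'UTF-32BE', "\x00\x00\xFE\xFF" ],
    [ 'UTF-32LE', "\xFF\xFE\x00\x00" ],
    [ 'UTF-8',    "\xEF\xBB\xBF"     ],
    [ 'UTF-16BE', "\xFE\xFF"         ],
    [ 'UTF-16LE', "\xFF\xFE"         ],
);

# Hypothetical helper: return the encoding name implied by a leading BOM,
# or undef when the bytes do not start with one.
sub sniff_bom {
    my ($bytes) = @_;
    for my $entry (@BOMS) {
        my ($name, $bom) = @$entry;
        return $name if index($bytes, $bom) == 0;
    }
    return;
}

# Hypothetical helper: read the first four bytes of a file and sniff them.
sub sniff_file {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    read $fh, my $head, 4;
    close $fh;
    return sniff_bom($head // '');
}
```

Calling sniff_file on a path returns a name such as 'UTF-16LE' that you could pass to an :encoding(...) layer, or undef when there is no BOM, in which case you are back to the simple answer: something has to tell you the encoding. Note that a file without a BOM can still be perfectly valid UTF-8, so undef here means "unknown", not "not Unicode".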

Update: Added to the simple answer.