Everyone seems to have leapt to the assumption that your "text file with some weird characters" in it (a non-ASCII character, which is not so weird really) is encoded in Unicode. It may be, or it may be in the Windows-1252 character encoding. Unicode encoding forms such as UTF-8 use more than one byte for characters outside ASCII, whereas Windows-1252 is a single-byte encoding that always uses exactly one byte per character. The difference is fundamental: the same bytes produce different text depending on which encoding you assume. So, first, you need to know whether your text file is in some encoding form of Unicode (e.g., UTF-8) or in the Windows-1252 character encoding, or even possibly in some other legacy encoding.
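As a rough illustration of why this matters, here is a minimal Python sketch. It assumes the character in question is something like "é" (the actual character in your file is not known here), and `guess_encoding` is a hypothetical helper name, not a library function. It only checks whether the bytes are valid UTF-8 and falls back to Windows-1252; it is a heuristic, not a reliable detector.

```python
def guess_encoding(data: bytes) -> str:
    """Heuristic guess: 'utf-8', 'cp1252', or 'unknown' (hypothetical helper)."""
    try:
        # Strict decoding fails on byte sequences that are not valid UTF-8.
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass
    try:
        # Python's cp1252 codec rejects the few bytes Windows-1252 leaves undefined.
        data.decode("cp1252")
        return "cp1252"
    except UnicodeDecodeError:
        return "unknown"


# Assumed example character: U+00E9 (e with acute accent).
as_utf8 = "caf\u00e9".encode("utf-8")     # two bytes for the last character: C3 A9
as_cp1252 = "caf\u00e9".encode("cp1252")  # one byte for the last character: E9

print(as_utf8.hex(" "), "->", guess_encoding(as_utf8))
print(as_cp1252.hex(" "), "->", guess_encoding(as_cp1252))
```

Note that pure ASCII text is reported as UTF-8, since ASCII is a subset of both encodings, and that most other single-byte legacy encodings will also decode "successfully" as Windows-1252, so a "cp1252" result still needs a sanity check by eye.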