Re^4: Encoding changed from Greek to something else

Re^5: Encoding changed from Greek to something else by daxim (Curate) on Jul 20, 2007 at 18:14 UTC
"Encoding (conversion from bytes to chars)"

Nah, that's decoding. :) Anyway, that file has been double encoded. Unfortunately, the conversion was lossy, so it is not possible to get 100% of the original text back. Run these two commands:

`iconv -c -f utf8 -t windows-1253 < problem.txt > problem-1253.txt`
`iconv -c -f utf8 -t ISO-8859-7 < problem.txt > problem-7.txt`

iconv for Windows: http://gnuwin32.sourceforge.net/packages/libiconv.htm

The two resulting files are in UTF-8. Open them as UTF-8 in your editor. I can recognise some words, like Νίκος and Νικόλα, in fragments. You will have to spend some time piecing the two files together.

As for your editor: Windows editors generally write out a BOM for UTF-8 too, even though it is only necessary for the two kinds of UTF-16 encoding. On Windows, a UTF-8 BOM even has some usefulness in normal text files. However, on Unix, and in source code files in general, a UTF-8 BOM is unwanted, mostly because it interferes with the shebang. You should switch off the BOM for UTF-8 files in your editor; if that is not possible, get a better editor.
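For anyone who would rather script this than run iconv, here is a rough Python sketch of what those two commands do, plus a BOM stripper. It is only an illustration, not a tested recovery tool: the function names `undo_double_encoding` and `strip_utf8_bom` are made up for this example, and I am assuming that `errors="ignore"` is a fair stand-in for iconv's `-c` flag (silently drop whatever cannot be converted).

```python
# Sketch only: mirrors "iconv -c -f utf8 -t <legacy>" in Python.
# Function names are hypothetical, chosen for this illustration.

UTF8_BOM = b"\xef\xbb\xbf"


def undo_double_encoding(data: bytes, legacy: str = "cp1253") -> str:
    """Peel one layer off text that was UTF-8 encoded twice.

    The damage happened when UTF-8 bytes were misread as a legacy Greek
    charset and then encoded to UTF-8 again. To reverse it, decode the
    outer UTF-8 layer, then map each character back to the single legacy
    byte it came from. Those bytes are the original UTF-8.
    """
    # Outer layer: decode UTF-8, dropping invalid sequences
    # (assumed equivalent to iconv -c on the input side).
    outer = data.decode("utf-8", errors="ignore")
    # Map characters back to legacy bytes, dropping anything the legacy
    # charset cannot represent (iconv -c on the output side).
    inner = outer.encode(legacy, errors="ignore")
    # Inner layer: these bytes should be UTF-8 again. Sequences broken by
    # the dropped characters show up as U+FFFD replacement characters.
    return inner.decode("utf-8", errors="replace")


def strip_utf8_bom(data: bytes) -> bytes:
    """Remove a leading UTF-8 BOM so that a shebang line comes first."""
    if data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    return data


if __name__ == "__main__":
    with open("problem.txt", "rb") as fh:
        raw = strip_utf8_bom(fh.read())
    # Try both Greek charsets, as with the two iconv commands.
    for legacy, out_name in (("cp1253", "problem-1253.txt"),
                             ("iso8859_7", "problem-7.txt")):
        with open(out_name, "w", encoding="utf-8") as out:
            out.write(undo_double_encoding(raw, legacy))
```

As with iconv, the two output files will differ wherever one charset has a character the other lacks, so you still have to compare them by eye and pick the better fragment.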