Encoding(conversion from bytes to chars)
Nah, that's decoding. :)
Anyway, that file has been double encoded. Unfortunately, that was a lossy conversion, so it is not possible to get 100% of the original text back.
Run these two commands:
iconv -c -f utf8 -t windows-1253 < problem.txt > problem-1253.txt
iconv -c -f utf8 -t ISO-8859-7 < problem.txt > problem-7.txt
iconv for Windows: http://gnuwin32.sourceforge.net/packages/libiconv.htm
The two resulting files are in UTF-8. Open them as UTF-8 in your editor. I can recognise fragments of words such as Νίκος and Νικόλα. You will have to spend some time piecing the two files together.
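If you want to see why the text only comes back in fragments, here is a rough Python sketch of what the two iconv commands do. It assumes the damage happened by reading the UTF-8 bytes as a Greek single-byte charset and saving the result as UTF-8 again; I cannot confirm that from the file alone. The names undo_double_encoding and simulate_damage are made up for this illustration.

```python
def undo_double_encoding(garbled: str, legacy: str) -> str:
    """Reverse one 'UTF-8 read as legacy charset, saved again as UTF-8' round."""
    # Like `iconv -c -f utf8 -t <legacy>`: map each character back to its
    # legacy byte, silently dropping characters the legacy charset lacks.
    raw = garbled.encode(legacy, errors="ignore")
    # Those bytes should be the original UTF-8. Sequences that were damaged
    # along the way show up as U+FFFD replacement characters.
    return raw.decode("utf-8", errors="replace")


def simulate_damage(text: str, legacy: str) -> str:
    """Hypothetical helper: mimic the faulty step that garbled the file."""
    # Bytes that have no meaning in the legacy charset are lost here for
    # good, which is what makes the conversion lossy.
    return text.encode("utf-8").decode(legacy, errors="ignore")


if __name__ == "__main__":
    # Try both candidate charsets, as with the two iconv commands.
    for legacy in ("cp1253", "iso8859_7"):
        damaged = simulate_damage("Νίκος", legacy)
        print(legacy, repr(undo_double_encoding(damaged, legacy)))
```

The two charsets differ in which byte values they define, so each one can rescue characters that the other loses. That is why it is worth comparing both outputs.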
As for your editor: Windows editors generally write out a BOM for UTF-8 too, even though it is only necessary for the two kinds of UTF-16 encoding. On Windows, a UTF-8 BOM even has some usefulness in normal text files. However, on Unix, and in source code files in general, a UTF-8 BOM is unwanted, mostly because it interferes with the shebang. You should switch off the BOM for UTF-8 files in your editor; if that is not possible, get a better editor.
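If a file already has a BOM and you cannot re-save it from a better editor, a minimal sketch like the following removes it. The function names are my own, and the script assumes the file is small enough to read into memory in one go.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (bytes EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def remove_bom_in_place(path: str) -> bool:
    """Rewrite the file at `path` without its BOM; return True if one was removed."""
    with open(path, "rb") as handle:
        data = handle.read()
    cleaned = strip_utf8_bom(data)
    if cleaned == data:
        return False  # no BOM, leave the file untouched
    with open(path, "wb") as handle:
        handle.write(cleaned)
    return True
```

Working on raw bytes keeps the rest of the file exactly as it was, so the only change is that the shebang line really starts at byte 0 again.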