in reply to A Character Set Enquiry
What character set does perl use?
When you read strings in perl, shuffle them around and don't do much more, perl treats the strings as binary data.
Your Ω in UTF-8 looks like this:
echo -n "Ω" | hexdump -C
00000000  e2 84 a6
(The Omega character in the paste isn't showing correctly in code examples, imagine it being there instead of the HTML escape sequence)
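To see the "binary data" point from inside perl, here is a minimal sketch. It assumes the three bytes from the hexdump above (written as escapes so the example does not depend on how the source file is saved); the variable names are only for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The three UTF-8 bytes from the hexdump, as a plain byte string.
my $bytes = "\xE2\x84\xA6";

# Without decoding, perl sees three one-byte characters.
my $byte_length = length $bytes;

# After decoding from UTF-8, perl sees a single character.
my $chars       = decode('UTF-8', $bytes);
my $char_length = length $chars;

printf "undecoded length: %d, decoded length: %d\n",
    $byte_length, $char_length;
printf "code point: U+%04X\n", ord $chars;
```

Until you call decode (or set an I/O layer), perl has no idea those three bytes are meant to be one character.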
When you import that into a Latin1 database, the database interprets those bytes as a sequence of Latin1 characters, which is "âè¦" in your case.
You said you then converted that to UTF-8. In that conversion a Latin1 "\x{e2}" becomes the two bytes c3 a2, which is still â as a character.
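The two steps so far can be reproduced with Encode. This is only a sketch: it assumes the database used plain ISO-8859-1 and that the conversion happened exactly once, which may not match what your tools actually did.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The original character as UTF-8 bytes.
my $utf8_bytes = "\xE2\x84\xA6";

# Step 1: the bytes are misread as three Latin1 characters
# (assumption: plain ISO-8859-1, not cp1252 or something else).
my $misread = decode('ISO-8859-1', $utf8_bytes);

# Step 2: those three characters are converted to UTF-8,
# so each one turns into two bytes.
my $double = encode('UTF-8', $misread);

# Show the result as hex, one byte per pair.
my $hex = join ' ', unpack '(H2)*', $double;
print "$hex\n";    # c3 a2 c2 84 c2 a6
```

Three bytes have become six, and the first two are the c3 a2 mentioned above.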
Now you have to reverse that process step by step. I wish you much patience, and a good read of Encode, perluniintro and perlunicode.
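For the simplest case, reversing looks roughly like this. It is a sketch under the assumption that the data went through exactly one round of "UTF-8 read as ISO-8859-1, then converted to UTF-8"; if your data went through a different charset or more than one round, the steps will differ, so check against a few known values first.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Double-encoded bytes as they would come out of the database
# (assumed input, matching the single-round case described above).
my $double = "\xC3\xA2\xC2\x84\xC2\xA6";

# Undo the conversion to UTF-8: back to three Latin1 characters.
my $misread = decode('UTF-8', $double);

# Undo the misreading: turn those characters back into raw bytes.
my $original_bytes = encode('ISO-8859-1', $misread);

# Now decode the bytes as what they really were: UTF-8.
my $omega = decode('UTF-8', $original_bytes);

printf "recovered %d character(s): U+%04X\n", length $omega, ord $omega;
```

When you write the repaired string back out, encode it explicitly as UTF-8 again (or use a :encoding(UTF-8) layer on the filehandle).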
Or, if you have the chance, restore your data from a backup and load it into a UTF-8 database in the first place.