Locutus has asked for the wisdom of the Perl Monks concerning the following question:
Given a text file encoded in UTF-8, I have to replace each pair of combined characters (e.g., 0x61 0xCC 0x88 = LATIN SMALL LETTER A + COMBINING DIAERESIS) with the corresponding single pre-combined character (in the example: 0xC3 0xA4 = LATIN SMALL LETTER A WITH DIAERESIS) if one exists. This problem sounds made for Perl, but I haven't been able to find anything useful on CPAN yet. Can you point me in the right direction, please?
Best regards
Locutus
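The conversion described in the question corresponds to Unicode Normalization Form C (NFC, canonical composition). What follows is only a minimal sketch, assuming the core modules Unicode::Normalize and Encode are available; the helper name `compose_utf8` is hypothetical and is not taken from any reply in this thread.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use Unicode::Normalize qw(NFC);

# Hypothetical helper: takes a string of UTF-8 bytes and returns UTF-8 bytes
# in which combining sequences are replaced by their pre-combined form
# wherever Unicode defines one.
sub compose_utf8 {
    my ($bytes) = @_;
    my $text = decode('UTF-8', $bytes);    # bytes -> Perl character string
    return encode('UTF-8', NFC($text));    # compose, then characters -> bytes
}

# The example from the question: "a" followed by COMBINING DIAERESIS.
my $in  = "\x61\xCC\x88";
my $out = compose_utf8($in);

# Show the resulting bytes in hex.
print join(' ', map { sprintf '%02X', ord } split //, $out), "\n";
```

Sequences with no pre-combined equivalent (for example "q" plus COMBINING DIAERESIS) come back unchanged, which matches the "if such exists" condition. Note that NFC is a full normalization, so it may also rewrite a few characters that were not combining sequences to begin with, such as replacing ANGSTROM SIGN with LATIN CAPITAL LETTER A WITH RING ABOVE.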
Replies are listed 'Best First'.
Re: Conversion of combined into pre-combined Unicode characters
by ikegami (Patriarch) on Mar 25, 2010 at 16:34 UTC
by Locutus (Beadle) on Mar 25, 2010 at 16:44 UTC