Given a text file encoded in UTF-8, I have to replace each pair of combined characters (e.g., 0x61 0xCC 0x88 = LATIN SMALL LETTER A + COMBINING DIAERESIS) with the corresponding single precomposed character (in the example: 0xC3 0xA4 = LATIN SMALL LETTER A WITH DIAERESIS), if one exists. This problem sounds tailor-made for Perl, but I haven't been able to find anything useful on CPAN yet. Can you point me in the right direction, please?
Best regards
Locutus
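
For reference, what is described above closely matches Unicode Normalization Form C (NFC), which Perl exposes through the core module Unicode::Normalize. The following is only a minimal sketch, assuming the whole input fits in memory and that NFC is acceptable for the data; the helper name `compose_utf8` is made up for illustration. Note that NFC composes whole sequences rather than only pairs, and it may also normalize a few other characters, so it is worth checking the result against your data.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use Unicode::Normalize qw(NFC);

# Hypothetical helper: takes a string of UTF-8 encoded bytes and returns
# UTF-8 encoded bytes in which base character + combining mark sequences
# are replaced by precomposed characters wherever Unicode defines one.
sub compose_utf8 {
    my ($bytes) = @_;

    # Bytes -> Perl character string (malformed input becomes U+FFFD).
    my $text = decode('UTF-8', $bytes);

    # Canonical decomposition followed by canonical composition.
    my $composed = NFC($text);

    # Perl character string -> UTF-8 bytes.
    return encode('UTF-8', $composed);
}
```

For a quick conversion from the command line, something like `perl -CS -MUnicode::Normalize -pe '$_ = NFC($_)' < in.txt > out.txt` should be enough, since `-CS` makes STDIN and STDOUT UTF-8.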