in reply to What Voodoo Encoding does RTF use for > ASCII Chars?
RTF is fundamentally an ANSI text file. A character set specification is used to set the code page and may be one of:
However, the encoded characters you show suggest that the code page in use is compatible with cp1252 (the Wikipedia article's mention of cp1256 is misleading and, as far as I can tell, wrong). The following may be the voodoo you are looking for:
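    use strict;
    use warnings;
    use utf8;    # assumes this script is saved as UTF-8, so the literals below are characters
    use Encode;

    my $x = "foo à, è, ì, ò, ù bar";

    # Convert to cp1252 bytes, then escape control and high bytes as \'hh
    my $x1 = encode('cp1252', $x);
    $x1 =~ s<([\x00-\x1F\x7F-\xFF])> <"\\'" . unpack("H2", $1)>eg;
    print $x1;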
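If you want to check the result, the same idea can be wrapped in a pair of helpers and round-tripped. This is only a sketch: the names `rtf_escape` and `rtf_unescape` are made up for illustration, it assumes the document's code page really is cp1252, and it assumes the script itself is saved as UTF-8. It does not deal with characters outside cp1252 or with escaping `\`, `{` and `}`.

```perl
use strict;
use warnings;
use utf8;                          # assumption: this script is saved as UTF-8
use Encode qw(encode decode);

# Hypothetical helper: escape a character string as RTF \'hh sequences,
# assuming the target code page is cp1252.
sub rtf_escape {
    my ($text) = @_;
    my $bytes = encode('cp1252', $text);
    $bytes =~ s/([\x00-\x1F\x7F-\xFF])/sprintf("\\'%02x", ord $1)/eg;
    return $bytes;
}

# Hypothetical inverse: turn \'hh sequences back into cp1252 bytes,
# then decode those bytes into a character string.
sub rtf_unescape {
    my ($rtf) = @_;
    (my $bytes = $rtf) =~ s/\\'([0-9A-Fa-f]{2})/chr hex $1/eg;
    return decode('cp1252', $bytes);
}

my $escaped = rtf_escape("foo à, è bar");
print "$escaped\n";                # prints: foo \'e0, \'e8 bar
```

If `rtf_unescape($escaped)` does not give back the original string, the code page guess is probably wrong for your file.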