in reply to What Voodoo Encoding does RTF use for > ASCII Chars?
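Old thread, but in case anyone else stumbles upon it as I did, here is an addition that might be useful.
The RTF specification allows Unicode characters that fall outside the document's ANSI code page, escaped as \uN with the code point in decimal and followed by a fallback character (here, ?). So if your string contains such characters, you also need to convert them to escapes. Fortunately, Encode can do this for you: its third argument can be a callback that is invoked for every character the target encoding cannot represent. Here is an example: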
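    use Modern::Perl;
    use utf8;
    use Encode;

    my $text = "Gen 1:1 Ἐν ἀρχῇ ἐποίησεν ὁ θεὸς τὸν οὐρανὸν καὶ τὴν γῆν.";

    # Encode to cp1253 (the Greek code page); characters it cannot
    # represent are passed to the callback and become \uN? escapes.
    $text = encode('cp1253', $text, sub { sprintf "\\u%d\?", shift });

    # Escape the remaining control and high bytes as \'hh.
    $text =~ s/([\x00-\x1F\x7F-\xFF])/"\\'" . (unpack("H2", $1))/eg;

    say $text;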
Output will be:
Gen 1:1 \u7960?\'ed \u7936?\'f1\'f7\u8135? \u7952?\'f0\'ef\u8055?\'e7\'f3\'e5\'ed \u8001? \'e8\'e5\u8056?\'f2 \'f4\u8056?\'ed \'ef\u8016?\'f1\'e1\'ed\u8056?\'ed \'ea\'e1\u8054? \'f4\u8052?\'ed \'e3\u8134?\'ed.
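One caveat, offered as a sketch rather than a definitive implementation: as far as I understand the RTF specification, \uN takes a signed 16-bit decimal value, so code points from 32768 to 65535 have to be written as negative numbers, and characters beyond U+FFFF are usually written as a UTF-16 surrogate pair. The Greek example above is not affected, because all of its code points are below 32768. The helper names rtf_u and rtf_escape below are my own, not part of Encode or of the original code.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: format one code point as an RTF \uN? escape.
sub rtf_u {
    my ($cp) = @_;
    if ($cp > 0xFFFF) {
        # Outside the BMP: split into a UTF-16 surrogate pair.
        my $v = $cp - 0x10000;
        return rtf_u(0xD800 + ($v >> 10)) . rtf_u(0xDC00 + ($v & 0x3FF));
    }
    # \uN is a signed 16-bit value, so 32768..65535 wrap to negatives.
    $cp -= 0x10000 if $cp > 0x7FFF;
    return sprintf "\\u%d?", $cp;
}

# Hypothetical helper: escape a decoded string, using $codepage
# as the 8-bit encoding for everything it can represent.
sub rtf_escape {
    my ($text, $codepage) = @_;
    # Characters missing from the code page go through rtf_u().
    my $bytes = encode($codepage, $text, sub { rtf_u(shift) });
    # Remaining control and high bytes become \'hh hex escapes.
    $bytes =~ s/([\x00-\x1F\x7F-\xFF])/sprintf "\\'%02x", ord $1/eg;
    return $bytes;
}

print rtf_escape("\x{1F18}\x{3BD}", 'cp1253'), "\n";   # \u7960?\'ed
print rtf_escape("\x{FB01}", 'cp1253'), "\n";          # \u-1279?
```

The first call reproduces the start of the output above (Ἐ is not in cp1253, ν is). The second shows the signed wrap-around: U+FB01 is 64257 in decimal, and 64257 - 65536 = -1279.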