The man page for RTF::Writer doesn't discuss this, but what if you want to create an RTF file containing Unicode text in some language with a non-Latin writing system? I took a sample Unicode file (in Thai), loaded it into MS Word and did "Save as ... RTF"; the result showed me how RTF expresses Unicode characters. So all we have to do is express our non-ASCII characters the same way, pass the result to RTF::Writer, and we're done...
#!/usr/bin/perl
use strict;
use RTF::Writer;

die "Usage: $0 file.txt\n (this will create file.rtf)\n"
    unless ( @ARGV and $ARGV[0] =~ /\.txt$/ and -f $ARGV[0] );

# input file is expected to be utf8
open( I, "<:utf8", $ARGV[0] ) or die "$ARGV[0]: $!";
my $utf = do { local $/; <I> };  # slurp it

# here's the magic part: replace each wide character with
# "\uN\'5f", where "N" is the decimal numeric codepoint
# (the \'5f is an underscore, shown by readers that ignore \u):
$utf =~ s/([^[:ascii:]])/sprintf("\\u%d\\'5f",ord($1))/eg;

( my $out = $ARGV[0] ) =~ s/txt$/rtf/;
my $rtf = RTF::Writer->new_to_file( $out );
my @pars = split( /\n+/, $utf );
$rtf->prolog( title => $out );
for my $par ( @pars ) {
    # need to pass $par by reference, so that RTF::Writer treats it
    # as raw RTF and leaves our backslash escapes alone
    $rtf->paragraph( \$par );
}
$rtf->close;
Is that easy, or what?
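
For anyone who wants to be a bit more defensive, here is a minimal sketch of the same substitution as a standalone sub. It rests on a few assumptions that go beyond what I tested above: as I read the RTF spec, the N in \uN is a signed 16-bit decimal, so codepoints above 32767 should be written as negative numbers; for characters beyond U+FFFF I am assuming a UTF-16 surrogate pair, which is what Word appears to write; and because the text goes to RTF::Writer as raw RTF, any literal backslash or brace in the input has to be escaped by hand. The names rtf_escape_unicode and _u16 are made up for this example and are not part of RTF::Writer.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of RTF::Writer): turn a decoded Perl
# string into raw RTF, using \uN escapes for everything non-ASCII.
sub rtf_escape_unicode {
    my ($text) = @_;
    my $out = '';
    for my $ch ( split //, $text ) {
        my $cp = ord $ch;
        if ( $ch =~ /[\\{}]/ ) {
            $out .= "\\$ch";    # RTF syntax characters need a backslash
        }
        elsif ( $cp < 128 ) {
            $out .= $ch;        # plain ASCII passes through untouched
        }
        elsif ( $cp > 0xFFFF ) {
            # assumption: beyond the BMP, emit a UTF-16 surrogate pair
            my $v = $cp - 0x10000;
            $out .= _u16( 0xD800 + ( $v >> 10 ) )
                  . _u16( 0xDC00 + ( $v & 0x3FF ) );
        }
        else {
            $out .= _u16($cp);
        }
    }
    return $out;
}

# Format one 16-bit value as \uN followed by the \'5f ("_") fallback.
# Values above 32767 are written as negative numbers.
sub _u16 {
    my ($n) = @_;
    $n -= 65536 if $n > 32767;
    return sprintf "\\u%d\\'5f", $n;
}

# Thai KO KAI (U+0E01) between two ASCII letters
print rtf_escape_unicode("a\x{0E01}b"), "\n";    # prints a\u3585\'5fb
```

In the script above, this would take the place of the s///eg line: call rtf_escape_unicode() on each paragraph and still pass the result to paragraph() by reference.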

Update: Caveat emptor: YMMV, especially depending on what language you're dealing with (I've only tried Thai so far). I don't know how/whether RTF handles bidirectional text (Arabic- or Hebrew-based writing systems), and various Indic scripts (used for Hindi, Tamil, etc) might have weird problems if the RTF recipient doesn't know the "special rules" for rendering them. But the "normal, tidy, linear" scripts (Cyrillic, Greek, Chinese, Korean, etc) should be fine (assuming you have the appropriate Unicode fonts available).


In reply to Writing Unicode to an RTF file by graff
