in reply to Re: RTF'ing unicode
in thread RTF'ing unicode

It doesn't seem to for me. I get

<u>&#141;&#76;&#141;&#144;OE&#102;&#141;&#218;</u> - <u>f&#114;f&#87;f&#108;f&#88; f&#92;fSf...&#129;&#91;f&#86;f++f``</u><u> </u>S&#252;mfin special !</body>

which is the html entity version of what rtf2text spits out:
LOEfÚ - frfWflfX f\fSf...[fVf++f`` Sümfin special !

Re^3: RTF'ing unicode
by choroba (Cardinal) on May 04, 2010 at 23:45 UTC
    Oh, you are right. I see a slightly different output:
    <u>&#141;&#76;&#141;&#144;&#140;&#102;&#141;&#218;</u> - <u>&#131;&#114;&#131;&#87;&#131;&#108;&#131;&#88; &#131;&#92;&#131;&#138;&#131;&#133;&#129;&#91;&#131;&#86;&#131;&#135;&#131;&#147;</u><u> </u>S&#252;mfin special !
    but it is probably not UTF-8 either. Why does the RTF header state that it is cp1252?
      I think that may be the default encoding for the text, but it shouldn't matter, because the Unicode characters are declared with the various \u commands. That is the only way to include Unicode in RTF, AFAIK.
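
For what it's worth, here is a rough sketch of how those \u escapes decode. It is written in Python rather than Perl purely for illustration, and the name decode_rtf_unicode is made up for this example. It assumes the usual RTF convention that \uN carries a signed 16-bit decimal value and is followed by fallback characters for readers that do not understand Unicode; the number of fallback items comes from \ucN (1 if never declared), and here the caller simply passes it in. It is not an RTF parser: everything other than \uN is passed through untouched.

```python
import re

# \uN: a signed 16-bit decimal code unit, optionally followed by one
# delimiting space that belongs to the control word.
U_ESCAPE = re.compile(r"\\u(-?\d+) ?")
# One fallback item: either a \'hh hex escape or a single plain character.
FALLBACK = re.compile(r"\\'[0-9a-fA-F]{2}|[^\\{}]")


def decode_rtf_unicode(fragment, uc=1):
    # Hypothetical helper: expands \uN escapes and drops the fallback
    # items that follow each one. `uc` is assumed to be the value of the
    # most recent \ucN in the document (1 when none was given).
    buf = bytearray()  # UTF-16-LE bytes, so surrogate pairs join up
    pos = 0
    while pos < len(fragment):
        m = U_ESCAPE.match(fragment, pos)
        if m is None:
            # Not a \uN escape: copy the character through unchanged.
            buf += fragment[pos].encode("utf-16-le")
            pos += 1
            continue
        # Negative values stand for code units above 32767.
        unit = int(m.group(1)) % 65536
        buf += unit.to_bytes(2, "little")
        pos = m.end()
        # Skip the fallback representation (up to `uc` items).
        for _ in range(uc):
            fb = FALLBACK.match(fragment, pos)
            if fb is None:
                break
            pos = fb.end()
    # Lone surrogates, if any, become U+FFFD instead of raising.
    return buf.decode("utf-16-le", errors="replace")


print(decode_rtf_unicode(r"S\u252?mfin special !"))  # prints "Sümfin special !"
```

The point is that the code page in the header only matters for the fallback bytes, which a Unicode-aware reader throws away, so a converter that emits those bytes instead of the \u values would produce the kind of output shown above.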