Ok, I'm a bit confused here.

I am trying to create a feed for an iPhone app but I need utf8 values for the title field of the feed and html entities for the description field of the feed.

I am reading a few feeds and combining them into one larger feed that the app reads. The description is already encoded into its HTML entities, so that part is done.

However, the title needs to be converted to UTF-8, as there are many accented Latin characters (acute accents, etc.) in our Spanish content. I use the decode_entities function from the Entities.pm lib, but I'm noticing that the characters it produces are breaking my feed. What I mean is that the feed is not valid, even though there are CDATA tags around the title field. Upon further investigation, I've discovered that this lib converts each HTML entity to its corresponding Unicode value, but I think the output ends up as ISO-8859-1 instead of UTF-8.

What am I doing wrong? How do I convert from an HTML entity to UTF-8? I actually don't need the feed to be valid for my own use, since my Objective-C parser has no problem reading the feed as ASCII, etc., but I need others to use this feed, so it needs to be valid.
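
For reference, here is a minimal sketch of the conversion I am after. It assumes HTML::Entities and the core Encode module; the sample title is made up for illustration and is not from my real feeds. As far as I understand it, decode_entities returns a string of Perl characters rather than UTF-8 bytes, so printing that string without an explicit encoding step may emit Latin-1 bytes for characters below 256, which would match what I am seeing.

```perl
use strict;
use warnings;
use HTML::Entities qw(decode_entities);
use Encode qw(encode);

# Hypothetical title, as it might arrive from one of the source feeds.
my $title = 'Informaci&oacute;n y opini&oacute;n';

# decode_entities returns Unicode characters, not encoded bytes.
my $chars = decode_entities($title);

# Encode explicitly to UTF-8 bytes before writing them into the feed.
my $bytes = encode('UTF-8', $chars);

# Write raw bytes so no extra encoding layer touches the output.
binmode(STDOUT);
print "<title><![CDATA[$bytes]]></title>\n";
```

An alternative under the same assumptions would be to skip the encode call and instead put a ':encoding(UTF-8)' layer on the output filehandle with binmode, then print the character string directly. Either way, the feed's XML declaration should also say encoding="UTF-8".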


In reply to Confused about using Entities.pm by Anonymous Monk
