in reply to Spreadsheet::XLSX returning &lt; &gt; and &amp; instead of < > &

I would assume this has something to do with Office Open XML: http://en.wikipedia.org/wiki/Office_Open_XML

Microsoft states that there are 4 reserved characters: <, >, &, and % (http://msdn.microsoft.com/en-us/library/ms145315%28v=sql.90%29.aspx). I can't find a W3C document that calls this out specifically, but I deal with XML on a regular basis and use this list to sanitize user input. It works properly with XML::Twig.

Those 4 characters were probably transformed into the entities listed at the link above when they were saved into the document.
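
As a rough illustration of that transformation, here is a minimal sketch of escaping text before it is written into XML. The helper name escape_xml_text is hypothetical, and the sketch assumes only the three characters from the thread title (<, > and &) need handling; % is deliberately left alone here.

```perl
use strict;
use warnings;

# Literal character => entity (assumption: only these three are handled).
my %entity_for = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;');

# Hypothetical helper: escape &, < and > in a single pass so that the
# ampersands introduced by the replacement are never escaped twice.
sub escape_xml_text {
    my ($text) = @_;
    $text =~ s/([&<>])/$entity_for{$1}/g;
    return $text;
}

print escape_xml_text('R&D <draft>'), "\n";
```

A string that went through a step like this on save would come back with the entities still in place if the reader does not decode them.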

Re^2: Spreadsheet::XLSX returning &lt; &gt; and &amp; instead of < > &
by psynk (Initiate) on Mar 07, 2013 at 16:24 UTC
    Thanks for the references. I agree, the xlsx format stores its data as XML.

    I'm a bit of a newbie to Perl, so rather than try to figure out how to use XML::Twig, I'm just going to do a couple of regexp substitutions and call it a day. Thank you all for the quick replies.
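
A minimal sketch of that regexp approach, assuming only &lt;, &gt; and &amp; show up in the cell values. The helper name decode_entities_basic is hypothetical and not part of Spreadsheet::XLSX. Doing the substitution in one pass avoids the ordering problem where decoding &amp; first would turn "&amp;lt;" into "<" instead of "&lt;".

```perl
use strict;
use warnings;

# Entity name => literal character (assumption: only these three occur).
my %char_for = (lt => '<', gt => '>', amp => '&');

# Hypothetical helper: decode the three entities in a single pass.
sub decode_entities_basic {
    my ($text) = @_;
    return $text unless defined $text;    # pass empty cells through untouched
    $text =~ s/&(lt|gt|amp);/$char_for{$1}/g;
    return $text;
}

print decode_entities_basic('a &lt; b &amp;&amp; c &gt; d'), "\n";
```

Each cell value read from the sheet could be passed through this helper before use. If other entities such as &quot; or numeric references appear, the table would need to be extended or a real XML/HTML entity decoder used instead.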