in reply to Re: large hash of regex substitution strings
in thread large hash of regex substitution strings

Hmmmph. I hadn't looked at HTML::Entities before. I'm already used to using CGI (or CGI::Pretty) and its escapeHTML function, which seems to do pretty much the same thing: take a string and substitute escaped HTML for the nonstandard characters. Is there an advantage to using HTML::Entities? Or is it just that it's a smaller standalone module?

throop

Re^3: large hash of regex substitution strings
by ikegami (Patriarch) on Oct 06, 2007 at 04:54 UTC

    I never looked at CGI's escapeHTML, so I took a peek.

    escapeHTML/unescapeHTML only convert a few characters.
    That means you can't place Unicode characters in an iso-latin-1 document, only iso-latin-1 characters.
    It also means that all but a few entities won't be understood. For example, it's unable to unescape &eacute;, even if it maps to a character in the specified character set.

    HTML::Entities is familiar with all entities.
    HTML::Entities can numerically encode any range of characters.
    HTML::Entities can decode any range of characters.
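
    As a rough illustration of those three points, here is a minimal sketch using the functions HTML::Entities exports (encode_entities, decode_entities, and, on request, encode_entities_numeric). The sample strings and variable names are made up for the example.

    ```perl
    use strict;
    use warnings;
    use HTML::Entities qw(encode_entities decode_entities encode_entities_numeric);

    # Named entities beyond the basic few are understood when decoding.
    my $decoded = decode_entities('caf&eacute; &lt;b&gt;');   # "caf\xE9 <b>"

    # With no second argument, high-bit characters and markup characters
    # are both encoded, using named entities where one exists.
    my $encoded = encode_entities("caf\xE9 <b>");             # "caf&eacute; &lt;b&gt;"

    # The optional second argument is the body of a regex character class
    # listing what to encode; here only the markup characters.
    my $markup_only = encode_entities("caf\xE9 <b>", '<>&"');

    # Numeric encoding of a chosen range; here everything outside ASCII.
    my $numeric = encode_entities_numeric("\x{263A} caf\xE9", '^\x00-\x7F');
    # "&#x263A; caf&#xE9;"
    ```

    The numeric form is what lets a character with no iso-latin-1 equivalent (such as U+263A above) appear in an iso-latin-1 document.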

    escapeHTML has some workarounds for browser issues and for &quot; being accidentally omitted from HTML 3.2.