PerlMonks
Control Characters (\xNN) in HTML
by garliqua (Novice) on Oct 18, 2001 at 20:15 UTC ( [id://119722]=perlquestion )
garliqua has asked for the wisdom of the Perl Monks concerning the following question:

Wise monks, I've got one for you. I'm developing a content management system where the data are all stored in XML files. Everything is groovy with one exception: if a user tries to submit a web page with control characters in it (such as \x92 for a right single quote), then the XML parser (XML::Simple, which uses XML::Parser) coughs, sputters and dies.

So, what I'd like to do is have a single regexp go through and change all the \xNN characters to their XHTML entity equivalents. For instance, the single character \x92 would become the entity that renders as ’.

My problem is that I can't seem to get something along these lines to work:

    s/\x(\d+)/'&#' . hex($1) . ';'/ge

I think I know why this doesn't work: the \d+ is searching for multiple digit characters, whereas what I want is to find the single character specified by an expression like \x92 or \x93. If I can avoid it, I'd rather not do something like:
Perhaps there is a solution involving pack(), though it hasn't occurred to me yet. Any ideas? Thanks.
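
One possible direction, as a minimal sketch rather than a definitive answer: match each high-bit character with a character class and build the entity from ord() of the matched character, instead of trying to capture digits with \d+ and hex(). The helper name escape_high_chars and the range \x80-\xFF are illustrative assumptions, not something from the post, and the sketch assumes the input is a string of single-byte characters.

```perl
use strict;
use warnings;

# Hypothetical helper (name and range are assumptions for illustration).
sub escape_high_chars {
    my ($text) = @_;
    # The class matches one character in the range 0x80-0xFF.
    # ord() gives its numeric code, which becomes a decimal entity.
    $text =~ s/([\x80-\xFF])/'&#' . ord($1) . ';'/ge;
    return $text;
}

my $input = "It\x92s \x93quoted\x94";
print escape_high_chars($input), "\n";
# prints: It&#146;s &#147;quoted&#148;
```

No pack() is needed here, since ord() already turns the matched character into its number. One caveat worth checking for your data: if the bytes come from Windows-1252 text, \x92 is the typographic quote U+2019, so a small lookup table mapping such bytes to entities like &#8217; may be a better fit than the raw byte value.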