asifk1981 has asked for the wisdom of the Perl Monks concerning the following question:

I need a little help. I am parsing an XML file using XML::Parser, and I see patterns like &amp; in the parser output. Can somebody please tell me which characters (like the ampersand) I need to convert back to their original form? I don't want the output strings to contain any of these encoded patterns. I hope this is clear.

Replies are listed 'Best First'.
Re: xml formatting
by Aristotle (Chancellor) on Dec 26, 2003 at 12:14 UTC

    You mean your parser unescapes your input automatically. You probably want the characters escaped again for output. The list of characters is potentially very long, though, depending on how you output the data. For example, if you're outputting XML yourself, any character not included in the document's specified charset has to be escaped.
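
    To see the automatic unescaping, here is a minimal sketch, assuming XML::Parser is installed. The sample document and the variable names are made up for illustration; only the Char handler and the parse method come from the module itself.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical sample document; the element name is invented for this sketch.
my $xml = '<note>Tom &amp; Jerry &lt;cartoon&gt;</note>';

my $text = '';
my $parser = XML::Parser->new(
    Handlers => {
        # Expat may deliver character data in several chunks, so accumulate them.
        Char => sub {
            my ($expat, $chunk) = @_;
            $text .= $chunk;
        },
    },
);

# parse() dies if the document is not well-formed.
$parser->parse($xml);

# The handler received the entities already decoded.
print "$text\n";    # prints: Tom & Jerry <cartoon>
```

    The text handed to the Char handler already contains the literal characters, so no further decoding is needed on the parser's output.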

    How to do it depends on what output you produce. Most modules that build XML trees will automatically escape any text for you. For HTML, there's the HTML::Entities module. For other cases, you need to provide more info.
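
    If you do end up writing XML text by hand, a rough sketch of re-escaping might look like the following. It covers only the five entities that XML predefines; the names escape_xml and %XML_ESCAPE are hypothetical, and characters outside the output charset are not handled here (they would need numeric character references).

```perl
use strict;
use warnings;

# The five entities predefined by XML. Anything else (e.g. characters
# missing from the output charset) is deliberately left alone in this sketch.
my %XML_ESCAPE = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
    "'" => '&apos;',
);

# Hypothetical helper: escape markup-significant characters in one pass,
# so an ampersand produced by one replacement is never escaped a second time.
sub escape_xml {
    my ($text) = @_;
    $text =~ s/([&<>"'])/$XML_ESCAPE{$1}/g;
    return $text;
}

print escape_xml(q{Tom & Jerry <"cartoon">}), "\n";
# prints: Tom &amp; Jerry &lt;&quot;cartoon&quot;&gt;
```

    Doing all replacements in a single substitution is a deliberate choice: replacing the characters one after another would require handling the ampersand first to avoid double escaping.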

    Makeshifts last the longest.

Re: xml formatting
by asifk1981 (Novice) on Dec 27, 2003 at 09:49 UTC

    I have a problem with decode_entities now. I am passing the contents of an XML file through the decode_entities function before it is parsed with XML::Parser. This is meant to decode all HTML-encoded characters in the XML file (e.g. the XML file has "&amp;" for "&"). But when I parse the XML file after that, the parser throws an error saying "non_well formed at line ". Can someone tell me why this happens? I want the output generated by XML::Parser not to contain any HTML encodings. I hope this is clear. Thanks in advance!