in reply to (jeffa) Re: An ampersand is not well-formed XML data?
in thread An ampersand is not well-formed XML data?


Okay, right on. The problem now is where I should do the encoding. XML::Parser bombs out and dies as soon as it sees the ampersand, before it gets passed to the handler.

I want to be able to either scan the XML from a file or get it from a socket. Am I going to have to read the data from one of those two places first, do the encoding, then have XML::Parser parse the results? That seems hard, because I'd have to decide before parsing what should be encoded (I don't want to go replacing the quotes around XML attributes with &quot; - the XML parser wouldn't be able to parse that).

Is there some easier way to do the encoding? Is there any way at all I can keep XML::Parser from crapping out before I get a chance to replace the ampersand?

Thanks...
---
donfreenut
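
For the narrow case described above (stray ampersands in otherwise sound markup), one possible stopgap is to escape only the ampersands that do not already start an entity or character reference, and leave everything else alone. This is a rough sketch, not a fix for the underlying problem: the sub name `escape_bare_ampersands` and the sample `<node>` element are made up for illustration, and the regex will also rewrite ampersands inside CDATA sections, where they are legal as-is.

```perl
use strict;
use warnings;

# Escape only "bare" ampersands: those that do not already begin an
# entity reference (&name;) or a character reference (&#123; or &#x1F;).
# Angle brackets and attribute quotes are deliberately left untouched,
# so the existing markup survives.
# NOTE: hypothetical helper; it does not know about CDATA sections.
sub escape_bare_ampersands {
    my ($xml) = @_;
    $xml =~ s/&(?!(?:[A-Za-z_:][\w:.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)/&amp;/g;
    return $xml;
}

# Made-up sample input with two bare ampersands and two valid references.
my $raw = '<node title="Q&A">Tom & Jerry &lt;3 &#169;</node>';
print escape_bare_ampersands($raw), "\n";
```

The cleaned string can then be handed to XML::Parser's `parse` method, which accepts a string as well as a filehandle. Text that merely looks like a reference (for example `AT&T;`) would be left alone and may still fail to parse.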

Re: Re: (jeffa) Re: An ampersand is not well-formed XML data?
by merlyn (Sage) on Apr 30, 2001 at 21:14 UTC
    Okay, right on. The problem now is where I should do the encoding. XML::Parser bombs out and dies as soon as it sees the ampersand, before it gets passed to the handler.
    It needs to get done before it ends up as so-called XML. It's not XML if the encoding hasn't been done. Go upstream and fix the problem there. If you are getting files in that format, scream at the provider. For them to call it XML is doing a disservice to the meaning of what XML's about.

    -- Randal L. Schwartz, Perl hacker


      I will go scream at nate, but I don't think he'll listen to me :)

      Anyway, are you saying that the Everything Engine should be encoding the special characters into HTML entities before spitting them out as XML?

      ---
      donfreenut
        Anyway, are you saying that the Everything Engine should be encoding the special characters into HTML entities before spitting them out as XML?
        Precisely. It's not XML unless it's well-formed XML. That's the definition! If something is spitting out bad non-XML, then get them to fix it. The whole point of XML is that if everyone plays by the rules, it's all clean and easy.

        -- Randal L. Schwartz, Perl hacker
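
The upstream fix suggested in this thread amounts to escaping text on the producing side, before it is wrapped in tags. XML predefines five entities (`&amp;`, `&lt;`, `&gt;`, `&quot;`, `&apos;`), so a minimal sketch might look like the following. The sub name `xml_escape`, the `%XML_ENTITY` table, and the `<node>` element are assumptions for illustration only; nothing here is taken from the Everything Engine itself.

```perl
use strict;
use warnings;

# The five entities predefined by XML 1.0.
my %XML_ENTITY = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
    "'" => '&apos;',
);

# Escape character data in a single pass, so the ampersands introduced
# by the replacements are never escaped a second time.
# NOTE: hypothetical helper; call it on text and attribute values only,
# never on a string that already contains markup.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/([&<>"'])/$XML_ENTITY{$1}/g;
    return $text;
}

# Made-up sample data, escaped before it is placed inside the markup.
my $title = 'Q&A';
my $body  = 'Tom & Jerry said "x < y"';
printf qq{<node title="%s">%s</node>\n}, xml_escape($title), xml_escape($body);
```

Because the escaping happens before the tags and attribute quotes are added, there is no need to guess afterwards which quotes belong to the markup and which belong to the data.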