in reply to XML / regex - cleaning up attributes

The XML in question comes from my code, so complaints are typically ignored. The attribute in question comes from paths to Windows apps, so just about anything can (and does) appear. I run a series of regex substitutions on the XML before I pass it to the parser. This particular situation only popped up recently, though.

Re^2: XML / regex - cleaning up attributes
by halfcountplus (Hermit) on Oct 01, 2010 at 17:27 UTC
    >>The XML in question comes from my code so complaints are typically ignored.

    :LOL: Okay, so you are saying you produced the XML in the first place? Why don't you clean the path first then -- not only easier, but more efficient than getting some module to parse the bad xml afterward.

    If that field can really contain anything, you can't just swap ' for " as a delimiter, and CDATA has the same delimiter issue. That is the crux: you need a delimiter, either ' or " or CDATA. Choose one (IMO: stick with ') and replace that delimiter in the data before you create the XML.

    s/'/&apos;/g

      You also need to replace & with &amp;.
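
    To make the order of operations concrete, here is a minimal sketch of escaping the value before it goes into the XML. The helper name escape_attr, the sample path, and the <app path='...'/> element are made up for illustration; the only assumption carried over from above is that ' is the attribute delimiter. The < substitution is an addition beyond what was suggested, included because a literal < is not allowed in an attribute value.

```perl
use strict;
use warnings;

# Hypothetical helper: escape a raw string so it can sit inside a
# single-quoted XML attribute value.
sub escape_attr {
    my ($value) = @_;
    $value =~ s/&/&amp;/g;     # first, so the & of later entities is not re-escaped
    $value =~ s/</&lt;/g;      # a literal < is not allowed in attribute values
    $value =~ s/'/&apos;/g;    # the chosen delimiter
    return $value;
}

# Made-up sample path containing &, ' and "
my $path = q{C:\Program Files\Tom & Jerry's "App"\run.exe};
print "<app path='", escape_attr($path), "'/>\n";
# prints: <app path='C:\Program Files\Tom &amp; Jerry&apos;s "App"\run.exe'/>
```

    The & substitution has to run first; done the other way round, the & in each &apos; would itself be turned into &amp;apos;. Double quotes can stay as they are because the delimiter is '.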