in reply to Re: Preferred Methods (again)
in thread Preferred Methods (again)

<!-- notroot: <Root> -->
Or
<root><foo/><root><bar/></root></root> <!-- Yes, your regex will take the second <root> -->


If parsing XML data could be done with a simple regex, those modules would probably not exist.
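
To make the nested case concrete, here is a minimal sketch. It uses Python's standard library rather than the Perl modules discussed in this thread, purely so it runs without extra installs. The pattern `NAIVE_ROOT` is a hypothetical stand-in for "a simple regex", not the regex from the parent post, and the function names are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# The nested-<root> document from the post above.
DOC = "<root><foo/><root><bar/></root></root>"

# Hypothetical "simple regex" (an assumption, not the one from the thread):
# non-greedy match from the first <root> to the nearest </root>.
NAIVE_ROOT = re.compile(r"<root>(.*?)</root>", re.DOTALL)


def root_body_by_regex(text):
    """Return whatever the naive pattern captures, or None if no match."""
    match = NAIVE_ROOT.search(text)
    return match.group(1) if match else None


def root_children_by_parser(text):
    """Return the document element's tag and the tags of its direct children."""
    root = ET.fromstring(text)
    return root.tag, [child.tag for child in root]


if __name__ == "__main__":
    # The regex pairs the outer <root> with the inner </root>,
    # so the captured fragment is unbalanced.
    print(root_body_by_regex(DOC))       # <foo/><root><bar/>
    # The parser tracks nesting and reports the outer element's children.
    print(root_children_by_parser(DOC))  # ('root', ['foo', 'root'])
```

A greedy `.*` would happen to work on this one input, but it breaks as soon as two sibling elements share a name, which is the general problem with matching nested structures by pattern.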

2;0 juerd@ouranos:~$ perl -e'undef christmas' Segmentation fault 2;139 juerd@ouranos:~$

Re: Re: Re: Preferred Methods (again)
by perrin (Chancellor) on Jan 17, 2002 at 01:33 UTC
    Get off your high horse about XML compliance. He gave a sample input format and asked how to grab pieces of it. If he changes the input format or wants it to handle broken input, he has to change the way he parses. That's true with an XML parser too.
      seattlejohn already commented on XML compliance.

      With an XML parser you don't have to change your parsing for grabbing the root element when the input changes; with a regex you (probably) do.
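
A rough illustration of that claim, again in Python's standard library rather than a Perl module. The three input variants and the greedy pattern `NAIVE` are assumptions made up for this sketch; the second and third variants are the kind of harmless input change meant here (an added attribute, a comment before the root that mentions the tag).

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical variations of the same logical document.
VARIANTS = [
    "<root><foo/></root>",                          # baseline
    '<root id="1"><foo/></root>',                   # attribute added to root
    "<!-- notroot: <root> --><root><foo/></root>",  # comment mentions the tag
]

# Hypothetical simple regex (an assumption): greedy match between tags.
NAIVE = re.compile(r"<root>(.*)</root>", re.DOTALL)


def regex_root_body(text):
    """Return the text the naive pattern captures, or None if no match."""
    match = NAIVE.search(text)
    return match.group(1) if match else None


def parser_root_children(text):
    """Return the tags of the document element's direct children."""
    return [child.tag for child in ET.fromstring(text)]


if __name__ == "__main__":
    for doc in VARIANTS:
        # The parser line prints ['foo'] for every variant; the regex
        # result changes with each variant.
        print(repr(regex_root_body(doc)), parser_root_children(doc))
```

The parser code is identical for all three inputs. The regex misses the root once it carries an attribute and starts its match inside the comment in the third variant, so it would need rewriting for each change.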

        With an XML parser you don't have to change your parsing for grabbing the root element when the input changes

        Not when the content changes, but when the format changes you do. Your example was a format change: <root><foo/><root><bar/></root></root> <!-- Yes, your regex will take the second <root> --> If that's even legal, it would certainly require changes in your code to get the right part.

Re: Re: Re: Preferred Methods (again)
by BMaximus (Chaplain) on Jan 17, 2002 at 02:21 UTC
    If parsing XML data could be done with a simple regex, those modules would probably not exist.

    It's possible, just without any error checking.
    See: Parsing pseudo XML files

    BMaximus
      XML parsing is not possible with a *simple* regexp; it requires a proper parser. Having said that, it's possible with lots of complex regexps, or at least mostly possible. See Paul Kulchenko's XML::Parser::Lite for that.

      Now parsing a subset of XML, well that's a totally different matter. And entirely appropriate in certain situations. Yes, that is me saying that.

      Oh, and XML::SAX::PurePerl has plenty of error checking. But it's likely way too slow for the questioner's problem.