in reply to Preferred Methods (again)

Why loop?
# trim to just the Root node
$xmlIn =~ s/^.*(<Root.*?>).*$/$1/s;
# grab the key/value pairs
%rootAttr = ($xmlIn =~ m#(\S+?)="(\S+?)"#g);
(Untested. Not certain that second regex returns a list. Might need to be in a loop after all.)
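
On the open question above: in list context, a match with /g and capture groups returns every captured substring from every match, so the pairs can be assigned straight to a hash without a loop. A minimal sketch, assuming a made-up sample input (the attribute names and the variable names $xml_in, $root_tag and %root_attr are illustrative only):

```perl
use strict;
use warnings;

# Hypothetical sample input; the attribute names are invented for this sketch.
my $xml_in = '<Root id="42" name="demo"><Child/></Root>';

# Step 1: trim to just the Root start tag (same substitution as above,
# applied to a copy so the original string is left alone).
(my $root_tag = $xml_in) =~ s/^.*(<Root.*?>).*$/$1/s;

# Step 2: m//g in list context returns ($1, $2, $1, $2, ...) for all
# matches, which is exactly the key/value list a hash assignment expects.
my %root_attr = $root_tag =~ m#(\S+?)="(\S+?)"#g;

print "$_=$root_attr{$_}\n" for sort keys %root_attr;
```

Because the value pattern is \S+?, this only works for attribute values without whitespace; a value such as name="two words" would be missed.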

Re: Re: Preferred Methods (again)
by Juerd (Abbot) on Jan 17, 2002 at 01:25 UTC
    <!-- notroot: <Root> -->
    Or
    <root><foo/><root><bar/></root></root> <!-- Yes, your regex will take the second <root> -->
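
Both cases can be reproduced with a short script. This is only a sketch: the helper name trim_to_root is hypothetical, the id attributes are added so the two tags can be told apart, and the element name is capitalized as Root so that the case-sensitive pattern from the parent post applies.

```perl
use strict;
use warnings;

# The root-trimming substitution from the parent post, wrapped in a helper.
# (trim_to_root is a hypothetical name used only for this sketch.)
sub trim_to_root {
    my ($xml) = @_;
    $xml =~ s/^.*(<Root.*?>).*$/$1/s;
    return $xml;
}

# Case 1: a comment that happens to contain the text "<Root>".
my $with_comment = '<Root id="1"><!-- notroot: <Root> --></Root>';

# Case 2: a Root element nested inside the real root.
my $nested = '<Root id="1"><foo/><Root id="2"><bar/></Root></Root>';

# The leading .* is greedy, so the LAST "<Root" in the string wins.
print trim_to_root($with_comment), "\n";   # prints "<Root>"
print trim_to_root($nested), "\n";         # prints '<Root id="2">'
```

In both cases the attributes of the real root (id="1") are lost, because the greedy .* backtracks only as far as the final occurrence of <Root.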


    If parsing XML data could be done with a simple regex, those modules would probably not exist.

    2;0 juerd@ouranos:~$ perl -e'undef christmas' Segmentation fault 2;139 juerd@ouranos:~$

      Get off your high horse about XML compliance. He gave a sample input format and asked how to grab pieces of it. If he changes the input format or wants it to handle broken input, he has to change the way he parses. That's true with an XML parser too.
        seattlejohn already commented on XML compliance.

        With an XML parser you don't have to change your parsing code for grabbing the root element when the input changes; with a regex you (probably) do.
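
For comparison, a rough sketch of grabbing the root attributes with a parser. It assumes XML::Parser is installed from CPAN (it is not a core module); the sample input and the variable names are made up for illustration.

```perl
use strict;
use warnings;
use XML::Parser;   # assumption: installed from CPAN, not part of core Perl

# Hypothetical input: a misleading comment before the root, plus a
# nested Root element inside it.
my $xml = <<'XML';
<!-- notroot: <Root fake="yes"> -->
<Root id="42" name="demo"><Root inner="1"/></Root>
XML

my %root_attr;
my $seen_root = 0;

my $parser = XML::Parser->new(
    Handlers => {
        # The Start handler receives (expat object, element name, attributes).
        Start => sub {
            my ($expat, $element, %attr) = @_;
            return if $seen_root++;   # only the first start tag is the root
            %root_attr = %attr;
        },
    },
);
$parser->parse($xml);

print "$_=$root_attr{$_}\n" for sort keys %root_attr;
```

The comment and the nested element need no special handling here, and the same handler keeps working if attributes are added, reordered, or quoted with single quotes.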

      If parsing XML data could be done with a simple regex, those modules would probably not exist.

      It's possible; it just doesn't have any error checking.
      See: Parsing pseudo XML files

      BMaximus
        XML parsing is not possible with a *simple* regexp, it requires a proper parser. Having said that, it's possible with lots of complex regexps, or at least mostly possible. See Paul Kulchenko's XML::Parser::Lite for that.

        Now parsing a subset of XML, well that's a totally different matter. And entirely appropriate in certain situations. Yes, that is me saying that.

        Oh, and XML::SAX::PurePerl has plenty of error checking. But it's likely way too slow for the questioner's problem.