in reply to Re: Regex et XML
in thread Regex et XML

Technically, XML::Simple operates as a kind of DOM parser, if not actually one (evidently with some SAX or SAX-like options if you want those), but I agree with you. It's canon that parsing XML with regexes is a path to destruction.

Perhaps the OP has severe memory concerns (i.e. an overly bloated, giant XML file)? If so, writing a SAX parser only to remove pieces of XML would be painful, and DOM would not be a good choice because it holds the whole document in memory. Yet line-oriented parsing isn't going to work with XML anyway, so you end up slurping the file, hence memory issues again. Yep, it would be best to pick one or the other (DOM-ish or SAX-ish), despite the tradeoffs. Maintaining Yet-Another-XML-Manipulator would be quite painful. If the file is small by machine standards, absolutely, XML::Simple is the easiest way to go. Do it, and you can still think mostly in Perl!
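
To make the tradeoff concrete, here is a rough sketch of both styles of "remove pieces of XML". It is in Python rather than Perl purely because its standard library ships both a tree parser and a SAX parser, so it runs without installing anything. The sample document, the tag names, and the helper names (strip_dom, strip_streaming, DropTag) are all made up for illustration; this is not how XML::Simple itself works.

```python
import io
import xml.sax
import xml.etree.ElementTree as ET
from xml.sax.saxutils import XMLGenerator

# Hypothetical input; the element names are invented for this sketch.
SAMPLE = (
    "<log><entry id='1'>keep</entry>"
    "<debug>noise<detail>more noise</detail></debug>"
    "<entry id='2'>keep</entry></log>"
)


def strip_dom(xml_text, tag):
    """DOM-ish: build the whole tree in memory, then drop children named `tag`."""
    root = ET.fromstring(xml_text)
    for child in list(root):  # iterate over a copy, since root is mutated
        if child.tag == tag:
            root.remove(child)
    return ET.tostring(root, encoding="unicode")


class DropTag(XMLGenerator):
    """SAX-ish: re-emit events as they arrive, except inside elements named `tag`."""

    def __init__(self, out, tag):
        super().__init__(out, encoding="utf-8")
        self._tag = tag
        self._skip_depth = 0  # > 0 while inside an element being dropped

    def startElement(self, name, attrs):
        if self._skip_depth or name == self._tag:
            self._skip_depth += 1
            return
        super().startElement(name, attrs)

    def endElement(self, name):
        if self._skip_depth:
            self._skip_depth -= 1
            return
        super().endElement(name)

    def characters(self, content):
        if not self._skip_depth:
            super().characters(content)


def strip_streaming(xml_text, tag):
    """Run the SAX filter; a real job would stream file to file instead of strings."""
    out = io.StringIO()
    xml.sax.parseString(xml_text.encode("utf-8"), DropTag(out, tag))
    return out.getvalue()


if __name__ == "__main__":
    print(strip_dom(SAMPLE, "debug"))
    print(strip_streaming(SAMPLE, "debug"))
```

The tree version is a handful of lines but needs the whole document in memory. The streaming version keeps only a depth counter, at the cost of hand-written event bookkeeping, which is the "painful" part.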

Re: Re: Re: Regex et XML
by waswas-fng (Curate) on Feb 25, 2004 at 23:05 UTC
    I agree. I think most projects that start off with homebrewed regex XML parsers tend to work fine at first. As the project matures, you end up writing more and more conditionals into your parser. Over time you will find that you have written something that is not as pretty, not as flexible, and not as supportable as XML::Simple. For most projects it makes sense to bail out early and use one of the XML parsers/manipulators on CPAN. Even when I have a very strong feeling that my projects will not creep in terms of what I do with XML, I tend to use a CPAN module for the manipulation and parsing.


    -Waswas