in reply to Re^5: Removing XML comments with regex
in thread Removing XML comments with regex

So I am hearing that XML::Twig is a performance pig.

Re^7: Removing XML comments with regex
by runrig (Abbot) on Dec 28, 2007 at 22:16 UTC
    As Jenda mentions below, XML::Twig is the wrong solution for this problem (update: when exceptional performance is an issue -- or if you just mean memory consumption, that was my own fault, easily corrected by mirod below). I pretty much knew as much before I started, but it was an easily available solution from this thread, so I tried it.

      Maybe you and Jenda should avoid benchmarking tools that you don't really master.

      Your first attempt was perfectly valid as a solution when performance is not an issue. But if it becomes one, then loading the entire document into memory, when XML::Twig is specifically designed to avoid this, is kinda lame, don't you think?

      The code below is probably not faster than what you have, but at least it should not use too much memory.

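      #!/usr/bin/perl
      use strict;
      use warnings;
      use XML::Twig;

      # drop comments while parsing, and flush (print and release) the
      # parsed part of the document after every element
      my $t= XML::Twig->new( keep_spaces   => 1,
                             comments      => 'drop',
                             twig_handlers => { _all_ => sub { $_[0]->flush } },
                           )
                      ->parsefile( "test_comments.xml");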
        Thanks. I knew that there must be a way to print as you go, but, not ever needing that feature, couldn't pull it out of my hat (and I meant to mention that earlier, but oh well...and I think I have actually used flush() by accident when I was looking for purge() ). Ok, "wrong solution" was maybe too strong..."not the quickest" or "not if performance is a big concern" maybe would've been more appropriate. But I don't think not mastering something will ever keep me from attempting to benchmark it :-)

      I would rather say wrong tool than wrong solution. Which (as mirod's response shows) doesn't mean it's not possible to use XML::Twig effectively, but rather that it was designed for a different kind of task. Which definitely doesn't mean the module itself is bad. Far from it, and sorry if my comment sounded that way.