in reply to Re^6: Removing XML comments with regex
in thread Removing XML comments with regex

As Jenda mentions below, XML::Twig is the wrong solution for this problem (update: when exceptional performance is an issue -- or if you just mean memory consumption, that was my own fault, easily corrected by mirod below). I pretty much knew as much before I started, but it was an easily available solution from this thread, so I tried it.

Re^8: Removing XML comments with regex
by mirod (Canon) on Dec 29, 2007 at 03:51 UTC

    Maybe you and Jenda should avoid benchmarking tools that you don't really master.

    Your first attempt was perfectly valid as a solution when performance is not an issue. But if it becomes one, then loading the entire document into memory, when XML::Twig is specifically designed to avoid this, is kinda lame, don't you think?

    The code below is probably not faster than what you have, but at least it should not use too much memory.

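    #!/usr/bin/perl
    use strict;
    use warnings;
    use XML::Twig;

    my $t = XML::Twig->new(
        keep_spaces   => 1,
        comments      => 'drop',    # discard comments while parsing
        # flush the twig after every element, so output is printed as
        # we go and the whole document is never held in memory
        twig_handlers => { _all_ => sub { $_[0]->flush } },
    )->parsefile("test_comments.xml");
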
      Thanks. I knew there must be a way to print as you go but, never having needed that feature, I couldn't pull it out of my hat. (I meant to mention that earlier, but oh well... and I think I have actually used flush() by accident when I was looking for purge().) OK, "wrong solution" was maybe too strong; "not the quickest" or "not if performance is a big concern" would have been more appropriate. But I don't think not mastering something will ever keep me from attempting to benchmark it :-)
Re^8: Removing XML comments with regex
by Jenda (Abbot) on Jan 02, 2008 at 23:37 UTC

    I would rather say wrong tool than wrong solution. Which (as mirod's response shows) doesn't mean it's not possible to use XML::Twig effectively, but rather that it was designed for a different kind of task. Which definitely doesn't mean the module itself is bad. Far from it, and sorry if my comment sounded that way.