in reply to parsing xml

If you're open to another way of parsing XML, XML::Twig is a good choice. I added a top-level element to your XML snippet:
use warnings;
use strict;
use XML::Twig;

my $xmlstr = <<EOF;
<top>
  <COMPLETE>
    <T>test</T>
    <L>light</L>
    <INFO>information</INFO>
  </COMPLETE>
  <COMPLETE>
    <T>test</T>
    <L>light</L>
    <INFO>informa</INFO>
  </COMPLETE>
</top>
EOF

my $twig = XML::Twig->new(
    twig_handlers => { INFO => sub { $_->delete() } },
    pretty_print  => 'indented',
);
$twig->parse($xmlstr);
$twig->print();

__END__

<top>
  <COMPLETE>
    <T>test</T>
    <L>light</L>
  </COMPLETE>
  <COMPLETE>
    <T>test</T>
    <L>light</L>
  </COMPLETE>
</top>

Re^2: parsing xml
by mirod (Canon) on Apr 07, 2011 at 07:37 UTC

    To filter out parts of the XML, I usually use a combination of twig_roots on the bits I want to skip, and twig_print_outside_roots to output the rest of the input.

    The only problem with this is that if the XML is indented the way the example is, it leaves empty lines where the discarded part was. I'll have to figure something out to deal with this.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use XML::Twig;

    my $xmlstr = <<EOF;
    <top>
      <COMPLETE>
        <T>test</T>
        <L>light</L>
        <INFO>information</INFO>
      </COMPLETE>
      <COMPLETE>
        <T>test</T>
        <L>light</L>
        <INFO>informa</INFO>
      </COMPLETE>
    </top>
    EOF

    my $twig = XML::Twig->new(
        twig_roots               => { INFO => 1 },
        twig_print_outside_roots => 1,
    );
    $twig->parse($xmlstr);

    __END__

    <top>
      <COMPLETE>
        <T>test</T>
        <L>light</L>

      </COMPLETE>
      <COMPLETE>
        <T>test</T>
        <L>light</L>

      </COMPLETE>
    </top>
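
One possible way to deal with the leftover empty lines is sketched below. This is not the fix mirod had in mind, only an assumption about what could work: it passes an in-memory filehandle to twig_print_outside_roots (the option accepts a filehandle as well as a true value), collects the output in a string, and then strips whitespace-only lines. The variable names $filtered and $out are made up for the example, and the sample document is shortened to one COMPLETE element.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

my $xmlstr = <<'EOF';
<top>
  <COMPLETE>
    <T>test</T>
    <L>light</L>
    <INFO>information</INFO>
  </COMPLETE>
</top>
EOF

# Collect the filtered output in a string instead of sending it to STDOUT.
my $filtered = '';
open my $out, '>', \$filtered
    or die "Cannot open in-memory filehandle: $!";

my $twig = XML::Twig->new(
    twig_roots               => { INFO => 1 },  # INFO elements are built but never printed
    twig_print_outside_roots => $out,           # everything else is written to $out
);
$twig->parse($xmlstr);
close $out;

# Assumption: lines containing only spaces or tabs carry no meaning in this
# document, so they can be dropped wholesale.
$filtered =~ s/^[ \t]*\n//mg;

print $filtered;
```

The substitution removes every whitespace-only line, not just the ones left behind by a discarded element, so it is only appropriate when blank lines in the input are not significant (for example, not inside text content that must be preserved verbatim).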