pinnacle has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to remove the "INFO" tag and its content from the file and print the rest of the file, but I can't get the right result. I am not sure what is going wrong, please help!

<COMPLETE>
<T>test</T>
<L>light</L>
<INFO>information</INFO>
</COMPLETE>
<COMPLETE>
<T>test</T>
<L>light</L>
<INFO>informa</INFO>
</COMPLETE>

Above xml is in file 'test.xml'

open(OUT,"/home/test.xml");
while(<OUT>){
    $line = $_;
    if($line =~ m#<INFO>(.+?)</INFO>#ig) {
        next;
    }
    print "$line\n";
}

When I run the above code I only get:

<COMPLETE> <T>test</T> <L>light</L> <INFO>information</INFO> </COMPLETE>

Replies are listed 'Best First'.
Re: parsing xml
by toolic (Bishop) on Apr 06, 2011 at 18:39 UTC
    If you're open to another way of parsing XML, XML::Twig is a good choice. I added a top-level element to your XML snippet:
    use warnings;
    use strict;
    use XML::Twig;

    my $xmlstr = <<EOF;
    <top>
    <COMPLETE>
    <T>test</T>
    <L>light</L>
    <INFO>information</INFO>
    </COMPLETE>
    <COMPLETE>
    <T>test</T>
    <L>light</L>
    <INFO>informa</INFO>
    </COMPLETE>
    </top>
    EOF

    my $twig = XML::Twig->new(
        twig_handlers => {INFO => sub {$_->delete()}},
        pretty_print  => 'indented'
    );
    $twig->parse($xmlstr);
    $twig->print();

    __END__

    <top>
      <COMPLETE>
        <T>test</T>
        <L>light</L>
      </COMPLETE>
      <COMPLETE>
        <T>test</T>
        <L>light</L>
      </COMPLETE>
    </top>

      To filter out parts of the XML, I usually use a combination of twig_roots on the bits I want to skip, and twig_print_outside_roots to output the rest of the input.

      The only problem with this is that if the XML is indented the way the example is, it leaves empty lines where the discarded part was. I'll have to figure something out to deal with this.

      #!/usr/bin/perl
      use strict;
      use warnings;
      use XML::Twig;

      my $xmlstr = <<EOF;
      <top>
        <COMPLETE>
          <T>test</T>
          <L>light</L>
          <INFO>information</INFO>
        </COMPLETE>
        <COMPLETE>
          <T>test</T>
          <L>light</L>
          <INFO>informa</INFO>
        </COMPLETE>
      </top>
      EOF

      my $twig = XML::Twig->new(
          twig_roots               => {INFO => 1},
          twig_print_outside_roots => 1,
      );
      $twig->parse($xmlstr);

      __END__

      <top>
        <COMPLETE>
          <T>test</T>
          <L>light</L>

        </COMPLETE>
        <COMPLETE>
          <T>test</T>
          <L>light</L>

        </COMPLETE>
      </top>
Re: parsing xml
by wind (Priest) on Apr 06, 2011 at 18:35 UTC
    Take off the 'g' modifier. Otherwise, your code is fine:
    # open my $fh, '/home/test.xml' or die $!;
    my $fh = \*DATA;

    while (<$fh>) {
        next if m{<INFO>(.*?)</INFO>}i;
        print;
    }

    __DATA__
    <COMPLETE>
    <T>test</T>
    <L>light</L>
    <INFO>information</INFO>
    </COMPLETE>
    <COMPLETE>
    <T>test</T>
    <L>light</L>
    <INFO>informa</INFO>
    </COMPLETE>
    It would be better if you used an XML parser like XML::Twig, though.
Re: parsing xml
by Jenda (Abbot) on Apr 07, 2011 at 13:24 UTC

    And then ... two years from now ... someone puts two lines of text into the <INFO> ...

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.
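
The failure mode described above can be reproduced with a small sketch using core Perl only. The helper names `filter_lines` and `filter_slurped` are made up for this illustration and are not from any reply in the thread. The first applies the line-by-line filter discussed here; the second is one possible workaround that matches across the whole document with the `/s` modifier. It is still a regex, not a parser, so it would not cope with nested elements, comments, or CDATA.

```perl
use strict;
use warnings;

# Hypothetical helper: the line-by-line filter from this thread,
# applied to a string instead of a file handle.
sub filter_lines {
    my ($xml) = @_;
    my $out = '';
    for my $line (split /^/, $xml) {              # split into lines, newlines kept
        next if $line =~ m{<INFO>.*?</INFO>}i;    # matches only if both tags share a line
        $out .= $line;
    }
    return $out;
}

# Hypothetical helper: remove the element from the whole document at once.
# With /s, "." also matches newlines, so a multi-line <INFO> is caught.
sub filter_slurped {
    my ($xml) = @_;
    (my $out = $xml) =~ s{[ \t]*<INFO>.*?</INFO>[ \t]*\n?}{}gis;
    return $out;
}

# Sample input (an assumption): <INFO> content spread over two lines.
my $xml = <<'EOF';
<COMPLETE>
<T>test</T>
<INFO>first line
second line</INFO>
</COMPLETE>
EOF

print "line by line:\n", filter_lines($xml);
print "slurped:\n",      filter_slurped($xml);
```

With this input the line-by-line version leaves the `<INFO>` element in place, because neither line contains both the opening and the closing tag.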

Re: parsing xml
by locked_user sundialsvc4 (Abbot) on Apr 07, 2011 at 15:30 UTC

    What I would suggest is ... “if it is XML, then treat it stem-to-stern as XML.” Parse it using a tool like XML::Twig, and use XPath expressions to (effortlessly...) locate all of the <INFO> tags. Remove the nodes, then transform back into text for printing.

    Although this might sound like “extra work,” IMHO it really isn’t, because it thoroughly solves the problem, both in the short-run and in the future. And it does so by pushing the hard work onto the backs of CPAN modules.
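
A minimal sketch of the parse, locate, remove, and print sequence described above, assuming XML::Twig is installed from CPAN. The helper name `strip_info` and the sample input are made up for this illustration. XML::Twig implements only a subset of XPath, and `//INFO` is within that subset.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: parse a string, delete every <INFO> element,
# and return the remaining document as text.
sub strip_info {
    my ($xml) = @_;
    my $twig = XML::Twig->new( pretty_print => 'indented' );
    $twig->parse($xml);

    # findnodes takes an XPath-style expression and returns matching elements.
    $_->delete for $twig->findnodes('//INFO');

    return $twig->sprint;    # serialise the modified tree back to a string
}

# Sample input (an assumption), wrapped in a single root element
# so that it is well-formed XML.
print strip_info(<<'EOF');
<top>
<COMPLETE>
<T>test</T>
<L>light</L>
<INFO>information</INFO>
</COMPLETE>
</top>
EOF
```

Unlike the handler-based versions earlier in the thread, this one builds the whole tree in memory before removing anything, which is simple to read but may not suit very large files.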

Re: parsing xml
by perl_addict (Initiate) on Apr 07, 2011 at 05:46 UTC
    I just tried the following code and it works for me:
    open FH, "test.xml";
    while (<FH>) {
        next if ($_ =~ /\<INFO/ig);
        print $_;
    }