oilerfan21 has asked for the wisdom of the Perl Monks concerning the following question:

Hey there, I'm trying to parse an XML file that looks roughly like this:
    <level1>
      <level2>
        <level3>
          <name>blah</name>
          <rule>...</rule>
          <rule>...</rule>
        </level3>
        <level3>
          <name>blah2</name>
          <rule>...</rule>
          <rule>...</rule>
        </level3>
      </level2>
    </level1>
I suspect that what I'm trying to do should be easy, but I'm struggling a little with the docs and the examples I can find with Google. What I need is the full level3 'element' (I think that's the right term), but only if the 'name' of the element begins with a known string. I also need the output to stay in XML format. I've been playing with XML::Twig, and with twig_roots I can get all of the level3 elements, but how can I filter out the ones that aren't named according to my known pattern? Thanks in advance for your time! Oilerfan21

Re: XML::Twig parsing
by mirod (Canon) on Jun 12, 2008 at 15:41 UTC

    You can use ignore on an element in a handler to skip the element. The element needs to be the current one or one of its ancestors. That means that you can set a handler on level3/name, check whether you want to keep the element or not there, and skip the entire level3 element if you want to:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use XML::Twig;

    XML::Twig->new(
        twig_handlers => {
            # <name> closes before its parent <level3>, so this handler runs first
            'level3/name' => sub {
                if ( $_->text !~ m{^include} ) { $_->parent->ignore; }
            },
            # not called for a level3 that was ignored above
            level3 => sub { print "element processed:\n", $_->sprint, "\n"; },
        },
    )->parse( \*DATA );

    __END__
    <level1>
      <level2>
        <level3>
          <name>include</name>
          <rule>included rule 1</rule>
          <rule>included rule 2</rule>
        </level3>
        <level3>
          <name>exclude</name>
          <rule>excluded rule</rule>
          <rule>excluded rule</rule>
        </level3>
      </level2>
    </level1>
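
    If the known prefix has to vary, the same idea can be wrapped in a function. The sketch below is only one way to do it: filter_level3 and the <filtered> wrapper element are made-up names, not part of XML::Twig, and the prefix is compared literally rather than as a regex. The wrapper is there on the assumption that you want a single root element when more than one level3 is kept.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper (not an XML::Twig function): returns the serialized
# level3 elements whose <name> text starts with the literal string $prefix.
sub filter_level3 {
    my ( $xml, $prefix ) = @_;
    my @kept;

    XML::Twig->new(
        twig_handlers => {
            # <name> is complete before its parent <level3> closes,
            # so the decision to skip is made here.
            'level3/name' => sub {
                my ( $twig, $name ) = @_;
                $name->parent->ignore
                    unless index( $name->text, $prefix ) == 0;
            },
            # Only reached for level3 elements that were not ignored.
            level3 => sub {
                my ( $twig, $level3 ) = @_;
                push @kept, $level3->sprint;
            },
        },
    )->parse($xml);

    return @kept;
}

my $xml = <<'XML';
<level1>
  <level2>
    <level3>
      <name>include</name>
      <rule>included rule 1</rule>
      <rule>included rule 2</rule>
    </level3>
    <level3>
      <name>exclude</name>
      <rule>excluded rule</rule>
      <rule>excluded rule</rule>
    </level3>
  </level2>
</level1>
XML

# <filtered> is an assumed wrapper name; pick whatever root your consumer expects.
print "<filtered>\n";
print "$_\n" for filter_level3( $xml, 'include' );
print "</filtered>\n";
```

    Calling filter_level3( $xml, 'blah' ) on the file from the question would keep both blah and blah2, since both names start with that string.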

    Does this help?

      Perfect! Very helpful and timely! I've still got some tweaking to do, but your code has helped me tremendously. Thank you very much!