rjkoop has asked for the wisdom of the Perl Monks concerning the following question:

I'm a little confused about using the Perl XML::Twig module. What I'm trying to do is output a subset of the matched twig_roots entries in an XML file. Essentially, I call a handler for the 'definition' twig root, and I only want to output some of the definitions, based on a condition. Of course I also want the root of the XML document to be output, but it isn't. What am I doing wrong? I expect the <oval> tag and the selected <definition> and nothing else. I'm using the root <oval> tag.

XML

===

<?xml version="1.0" encoding="UTF-8"?>
<oval xsi:schemaLocation="http://oval.mitre.org/XMLSchema/oval#redhat redhat-schema.xsd http://oval.mitre.org/XMLSchema/oval#windows windows-schema.xsd http://oval.mitre.org/XMLSchema/oval#unix unix-schema.xsd http://oval.mitre.org/XMLSchema/oval#independent independent-schema.xsd http://oval.mitre.org/XMLSchema/oval#solaris solaris-schema.xsd http://oval.mitre.org/XMLSchema/oval oval-schema.xsd"
      xmlns:oval="http://oval.mitre.org/XMLSchema/oval"
      xmlns="http://oval.mitre.org/XMLSchema/oval"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:redhat="http://oval.mitre.org/XMLSchema/oval#redhat"
      xmlns:windows="http://oval.mitre.org/XMLSchema/oval#windows"
      xmlns:solaris="http://oval.mitre.org/XMLSchema/oval#solaris">
  <generator>
    <schema_version>4.2</schema_version>
    <timestamp>20051212102623</timestamp>
  </generator>
  <definitions>
    <definition id="OVAL2" class="vulnerability">
      <affected family="redhat">
        <redhat:platform>Red Hat Linux 9</redhat:platform>
        <product>Mutt</product>
      </affected>
    </definition>
    <definition id="OVAL3" class="vulnerability">
      <affected family="windows">
        <windows:platform>Microsoft Windows 2000</windows:platform>
        <windows:platform>Microsoft Windows Server 2003</windows:platform>
        <product>Microsoft Exchange Server 2003</product>
      </affected>
    </definition>
    <definition id="OVAL6" class="vulnerability">
    </definition>
  </definitions>
  <crap>This is some crap</crap>
</oval>

Perl code to extract a subset of the definitions

==================================

use XML::Twig;

my $doc=new XML::Twig ( twig_roots => { "definition" => \&parseEntry } );

sub parseEntry {
    my ($twig,$element)=@_;
    my $id=$element->att("id");
    if ($id eq "OVAL3") {
        $element->flush();
        return 1;
    }
    $twig->purge();
    return 0;
}

$doc->parsefile("test.xml");
$doc->flush();

The output

========

<definition class="vulnerability" id="OVAL3">
  <affected family="windows">
    <windows:platform>Microsoft Windows 2000</windows:platform>
    <windows:platform>Microsoft Windows Server 2003</windows:platform>
    <product>Microsoft Exchange Server 2003</product>
  </affected>
</definition>
</oval>


Re: XML::Twig question
by mirod (Canon) on Dec 20, 2005 at 15:21 UTC

    I think the $twig->purge is a little too strong for what you are doing. It seems to interfere with the way the twig is stored. Using $element->delete instead when the element is not one you want to output would work better.

    I will investigate some more to better understand what's going on, but in the meantime a handler like this seems to work:

    sub parseEntry {
        my ($twig,$element)=@_;
        my $id=$element->att("id");
        if ($id eq "OVAL3") {
            $twig->flush();
            return 1;
        }
        else {
            $element->delete;
        }
        return 0;
    }
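For anyone who wants to try the delete-instead-of-purge approach without the original file, here is a minimal self-contained sketch. It assumes a trimmed-down stand-in for the OVAL document (no namespaces), and `filter_definitions` is a hypothetical helper name, not part of XML::Twig. It collects the output in a string through an in-memory filehandle passed to `flush`, so the result can be inspected rather than only printed.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical, simplified stand-in for the OVAL file in the question.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<oval>
  <generator><schema_version>4.2</schema_version></generator>
  <definitions>
    <definition id="OVAL2" class="vulnerability"><product>Mutt</product></definition>
    <definition id="OVAL3" class="vulnerability"><product>Microsoft Exchange Server 2003</product></definition>
    <definition id="OVAL6" class="vulnerability"></definition>
  </definitions>
  <crap>This is some crap</crap>
</oval>
XML

# filter_definitions() is an illustrative helper, not an XML::Twig function.
# It returns the flushed output as a string.
sub filter_definitions {
    my ($xml_text, $wanted_id) = @_;

    # Collect everything flush() prints in $out via an in-memory filehandle.
    my $out = '';
    open my $fh, '>', \$out or die "cannot open in-memory handle: $!";

    my $twig = XML::Twig->new(
        twig_roots => {
            definition => sub {
                my ($t, $elt) = @_;
                if (($elt->att('id') // '') eq $wanted_id) {
                    # Wanted: print what has been built so far.
                    $t->flush($fh);
                }
                else {
                    # Not wanted: drop just this element, no purge.
                    $elt->delete;
                }
                return 1;
            },
        },
    );

    $twig->parse($xml_text);
    $twig->flush($fh);    # print whatever is left, including the closing root tag
    close $fh;
    return $out;
}

print filter_definitions($xml, 'OVAL3'), "\n";
```

The only difference from the handler in the question is that unwanted elements are removed with `delete` rather than discarding the whole twig with `purge`, so the root start tag is still there to be printed when the first wanted definition is flushed.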
      That worked perfectly. Thanks.