jdbaldwin23 has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to use XML::Twig to convert a policy (call it a firewall policy, but that's not strictly true) from one format to another. Call the existing file format "OLDFW" and the new format "NEWFW." I have boiled the problem down to bare-essentials XML below.

OLDFW has a root called "policy" which contains one element and one only: "policyrules". That element contains many "Rule" elements which in turn have sub-elements, etc.

NEWFW should be the same (actually there are changes, but I don't need help with that), except the <policyrules> element is eliminated -- that is, all the "Rule" elements are directly underneath "<policy>".

I know that if I simply ignore or delete <policyrules>, I lose all the children. I want to build a handler to skip over it, but that's the problem.

I have kludged around the problem by filtering out <policyrules> before XML-parsing the file, but I am now wondering whether there is a more elegant solution where I can simply tell XML::Twig (or whatever) to "promote" all the "Rule" elements to the same level as "policyrules" and then delete the latter.

Here's a sample OLDFW:

&lt;?xml version="1.0" encoding="UTF-8"?&gt; &lt;policy name="Cur_policy" version="3.2.1"&gt; &lt;policyrules&gt; &lt;Rule Action="allow" Enabled="true" RuleID="F68E32"&gt; &lt;source addr="10.4.5.3" srcport="any"/&gt; &lt;dest addr="199.5.4.3" destport="80"/&gt; &lt;/Rule&gt; &lt;Rule Action="allow" Enabled="true" RuleID="78E21D"&gt; &lt;source addr="10.4.0.0-10.4.255.255" srcport="any"/&gt; &lt;dest addr="any" destport="80"/&gt; &lt;/Rule&gt; &lt;/policyrules&gt; &lt;/policy&gt;

The NEWFW I'd like to get:

&lt;?xml version="1.0" encoding="UTF-8"?&gt; &lt;policy name="Cur_policy" version="3.2.1"&gt; &lt;Rule Action="allow" Enabled="true" RuleID="F68E32"&gt; &lt;source addr="10.4.5.3" srcport="any"/&gt; &lt;dest addr="199.5.4.3" destport="80"/&gt; &lt;/Rule&gt; &lt;Rule Action="allow" Enabled="true" RuleID="78E21D"&gt; &lt;source addr="10.4.0.0-10.4.255.255" srcport="any"/&gt; &lt;dest addr="any" destport="80"/&gt; &lt;/Rule&gt; &lt;/policy&gt;

The code I'm playing with -- what I need is "what goes into policyrules_handler?"

#!/usr/bin/perl use warnings; use strict; use XML::Twig; my ( $in_XML_file, $out_XML_file ) = @ARGV; # create objects my $ref_reader = XML::Twig-&gt;new( twig_handlers =&gt; { 'policyrules' =&gt; \&policyrules_ +handler, 'Rule' =&gt; \&rule_handler } ); $ref_reader-&gt;parsefile( $in_XML_file ) or die " $in_XML_file is not parsable\n"; open my $out_fh, '&gt;', $out_XML_file or die " $out_XML_file cannot be written: $!"; $ref_reader-&gt;print( $out_fh ); close $out_fh; sub policyrules_handler { my( $t, $elt)= @_; # arguments for all twig_handlers # $elt-&gt;delete; } sub rule_handler { my( $t, $elt)= @_; # arguments for all twig_handlers $elt-&gt;set_att( 'new_attr' =&gt; 'dummyValue' ); }

Thank you for any assistance you can provide here.

jd

Replies are listed 'Best First'.
Re: Remove level of elements (preserving their children) in XML::Twig?
by BrowserUk (Patriarch) on Mar 01, 2012 at 04:35 UTC

    Now the Simple way :)

    #! perl -slw use strict; use XML::Simple; my $xml = XMLin( \*DATA, KeepRoot => 1 ); $xml->{policy}{Rule} = $xml->{policy}{policyrules}{Rule}; delete $xml->{policy}{policyrules}; print XMLout( $xml, KeepRoot => 1 ); __END__ <?xml version="1.0" encoding="UTF-8"?> <policy name="Cur_policy" version="3.2.1"> <policyrules> <Rule Action="allow" Enabled="true" RuleID="F68E32"> <source addr="10.4.5.3" srcport="any"/> <dest addr="199.5.4.3" destport="80"/> </Rule> <Rule Action="allow" Enabled="true" RuleID="78E21D"> <source addr="10.4.0.0-10.4.255.255" srcport="any"/> <dest addr="any" destport="80"/> </Rule> </policyrules> </policy>

    Produces:

    C:\test>xmljunk.pl <policy name="Cur_policy" version="3.2.1"> <Rule Action="allow" Enabled="true" RuleID="F68E32"> <dest addr="199.5.4.3" destport="80" /> <source addr="10.4.5.3" srcport="any" /> </Rule> <Rule Action="allow" Enabled="true" RuleID="78E21D"> <dest addr="any" destport="80" /> <source addr="10.4.0.0-10.4.255.255" srcport="any" /> </Rule> </policy>

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    The start of some sanity?

Re: Remove level of elements (preserving their children) in XML::Twig?
by derby (Abbot) on Feb 29, 2012 at 22:57 UTC

    Not sure about XML::Twig but here's how I would do it with XML::LibXML:

    #!/usr/bin/env perl use warnings; use strict; use XML::LibXML; my $file = shift || die "usage $0 <xmlfile>\n"; my $parser = XML::LibXML->new(); my $dom = $parser->parse_file( $file ); my $doc = $dom->documentElement(); my @rules = $doc->findnodes( '/policy/policyrules/Rule' ); my $newdom = XML::LibXML::Document->new( '1.0', 'UTF-8' ); my $newroot = $dom->createElement( 'policy' ); $newroot->setAttribute( 'name', 'Cur_policy' ); $newroot->setAttribute( 'version', '3.2.1' ); $newdom->setDocumentElement( $newroot ); foreach my $rule ( @rules ) { $newroot->appendChild( $rule ); } print $newdom->toString(1);

    -derby

    update: Doh! ... no need for a new document:

    #!/usr/bin/env perl use warnings; use strict; use XML::LibXML; my $file = shift || die "usage $0 <xmlfile>\n"; my $parser = XML::LibXML->new(); my $dom = $parser->parse_file( $file ); my $doc = $dom->documentElement(); my @rules = $doc->findnodes( '/policy/policyrules/Rule' ); my $parent = $rules[0]->parentNode->parentNode; $parent->removeChildNodes(); $parent->appendChild( $_ ) foreach @rules; print $dom->toString(1);

Re: Remove level of elements (preserving their children) in XML::Twig?
by mirod (Canon) on Mar 01, 2012 at 03:58 UTC

    There is a method in XML::Twig that does just what you want: erase.

    Below is an example, with a couple of ways to solve the problem. I am sure there are others.

    #!/usr/bin/perl use strict; use warnings; use autodie qw(open); use Test::More; use XML::Twig; my( $in, $expected)= do { local $/="\n\n"; <DATA>; }; is( remove_policy_rule( $in), $expected, 'simplest code'); is( with_flush( $in), $expected, 'with flush'); done_testing(); sub remove_policy_rule { my( $in)= @_; my $t= XML::Twig->new( twig_handlers => { policyrules => sub { $_- +>erase; } }, pretty_print => 'indented', ) ->parse( $in); return $t->sprint; } sub with_flush { my( $in)= @_; my $out; open( my $fh, '>', \$out); my $t= XML::Twig->new( twig_handlers => { policyrules => sub { $_- +>erase; $_[ +0]->flush( $fh); } }, pretty_print => 'indented', ) ->parse( $in); return $out; } __DATA__ <?xml version="1.0" encoding="UTF-8"?> <policy name="Cur_policy" version="3.2.1"> <policyrules> <Rule Action="allow" Enabled="true" RuleID="F68E32"> <source addr="10.4.5.3" srcport="any"/> <dest addr="199.5.4.3" destport="80"/> </Rule> <Rule Action="allow" Enabled="true" RuleID="78E21D"> <source addr="10.4.0.0-10.4.255.255" srcport="any"/> <dest addr="any" destport="80"/> </Rule> </policyrules> </policy> <?xml version="1.0" encoding="UTF-8"?> <policy name="Cur_policy" version="3.2.1"> <Rule Action="allow" Enabled="true" RuleID="F68E32"> <source addr="10.4.5.3" srcport="any"/> <dest addr="199.5.4.3" destport="80"/> </Rule> <Rule Action="allow" Enabled="true" RuleID="78E21D"> <source addr="10.4.0.0-10.4.255.255" srcport="any"/> <dest addr="any" destport="80"/> </Rule> </policy>

      Gobsmacked at the awesomeness of all the replies. But the one with the erase method (used in a way I hadn't tried) is it and my preliminary result on the (very big and complex) real-world data is encouraging. Thank you all.

      jd

Re: Remove level of elements (preserving their children) in XML::Twig?
by tobyink (Canon) on Feb 29, 2012 at 23:32 UTC

    Here's how I'd do it with XML::LibXML:

    use XML::LibXML 1.93; use XML::LibXML::PrettyPrint qw/print_xml/; my $doc = XML::LibXML->load_xml(IO => \*DATA); $doc -> findnodes('//policyrules') -> foreach(sub { my $elem = shift; $elem->parentNode->appendChild($_) foreach $elem->childNod +es; $elem->parentNode->removeChild($elem); }); print_xml($doc); __DATA__ <?xml version="1.0" encoding="UTF-8"?> <policy name="Cur_policy" version="3.2.1"> <policyrules> <Rule Action="allow" Enabled="true" RuleID="F68E32"> <source addr="10.4.5.3" srcport="any"/> <dest addr="199.5.4.3" destport="80"/> </Rule> <Rule Action="allow" Enabled="true" RuleID="78E21D"> <source addr="10.4.0.0-10.4.255.255" srcport="any"/> <dest addr="any" destport="80"/> </Rule> </policyrules> </policy>

      With PerlX::MethodCallWithBlock it can get even prettier...

      use XML::LibXML 1.93; use XML::LibXML::PrettyPrint qw/print_xml/; use PerlX::MethodCallWithBlock; my $doc = XML::LibXML->load_xml(IO => \*DATA); $doc -> findnodes('//policyrules') -> foreach { my $elem = shift; $elem->parentNode->appendChild($_) foreach $elem->childNodes; $elem->parentNode->removeChild($elem); }; print_xml($doc);

      It almost makes me want to weep.

       

      I can get it down to a single chain, but then it starts looking a little obfuscated...

      use XML::LibXML 1.93; use XML::LibXML::PrettyPrint qw/print_xml/; use PerlX::MethodCallWithBlock; XML::LibXML -> load_xml(IO => \*DATA) -> findnodes('//policyrules') -> foreach { my $elem = shift; $elem->parentNode->appendChild($_) foreach $elem->childNodes; $elem->parentNode->removeChild($elem); } -> map { $_->ownerDocument } -> foreach { print_xml $_ and exit };
Re: Remove level of elements (preserving their children) in XML::Twig?
by choroba (Cardinal) on Mar 01, 2012 at 11:22 UTC