ateague has asked for the wisdom of the Perl Monks concerning the following question:
UPDATE:
This appears to have been fixed at or before XML::Twig version 3.52. Thank you Mirod for your work!
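For anyone wondering whether their installation already has the fix, here is a minimal sketch of a version check. It assumes only that XML::Twig is installed, and it relies on the standard `VERSION` method that every Perl package inherits; the variable names are my own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use XML::Twig;

# Called without arguments, VERSION returns $XML::Twig::VERSION.
my $installed = XML::Twig->VERSION;

# Called with an argument, VERSION dies if the installed version is lower,
# so the eval yields true only for 3.52 or later.
my $has_fix = eval { XML::Twig->VERSION(3.52); 1 } ? 1 : 0;

print "XML::Twig $installed: ",
    ( $has_fix ? "3.52 or later" : "older than 3.52" ), "\n";
```

If it reports an older version, upgrading is probably the simplest route.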
Good evening!
I am using XML::Twig to transform some simple XML data with ActivePerl 5.020 x64.
I am turning data like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <shipment>
      <box>
        <packing>
          <slip>potato</slip>
          <slip>pear</slip>
          <slip>peach</slip>
        </packing>
      </box>
      <box>
        <packing>
          <slip>apple</slip>
        </packing>
      </box>
    </shipment>

and "flattening" (I think that is the correct term) into something that looks like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <shipment>
      <slip _TYPE="produce">potato</slip>
      <slip _TYPE="produce">pear</slip>
      <slip _TYPE="produce">peach</slip>
      <slip _TYPE="produce">apple</slip>
    </shipment>
The issue I am coming up against is that the final "</shipment>" tag gets duplicated whenever I process a document that has a "<box>...</box>" section without any <slip> sections in it.
Thank you for your time.
Sample code and output:

    #!/usr/bin/perl
    use 5.020;
    use strict;
    use warnings;

    use XML::Twig;

    ##
    ## This works as expected
    ##
    my $working_xml = XML::Twig->new(
        twig_handlers => {
            '/shipment/box' => sub { _move(@_); 1; },
        },
        pretty_print => 'indented',
    )->safe_parse( <<"XML"
    <?xml version="1.0" encoding="UTF-8"?>
    <shipment>
      <box>
        <packing>
          <slip>potato</slip>
          <slip>pear</slip>
          <slip>peach</slip>
        </packing>
      </box>
      <box>
        <packing>
          <slip>apple</slip>
        </packing>
      </box>
    </shipment>
    XML
    );

    say "\n\n---------------------\n\n";

    ##
    ## This 'breaks', printing the final '</shipment>' tag twice
    ##
    my $broken_xml = XML::Twig->new(
        twig_handlers => {
            '/shipment/box' => sub { _move(@_); 1; },
        },
        pretty_print => 'indented',
    )->safe_parse( <<"XML"
    <?xml version="1.0" encoding="UTF-8"?>
    <shipment>
      <box>
        <packing>
          <slip>potato</slip>
          <slip>pear</slip>
          <slip>peach</slip>
        </packing>
      </box>
      <box>
        <packing>
          <note>nothing here!</note>
        </packing>
      </box>
    </shipment>
    XML
    );

    # ----------

    sub _move {
        foreach my $descendant ( $_[1]->descendants('slip') ) {
            $descendant->set_att('_TYPE' => 'produce');

            # Move the <slip> out of the <box> section
            #
            # This appears to be the problem, even though
            # this should never be reached in the second
            # document (no <slip> descendants)
            $descendant->move('before', $_[1]);
        }

        $_[1]->delete;
        $_[0]->flush();
        1;
    }

Output:

    <?xml version="1.0" encoding="UTF-8"?>
    <shipment>
      <slip _TYPE="produce">potato</slip>
      <slip _TYPE="produce">pear</slip>
      <slip _TYPE="produce">peach</slip>
      <slip _TYPE="produce">apple</slip>
    </shipment>

    ---------------------

    <?xml version="1.0" encoding="UTF-8"?>
    <shipment>
      <slip _TYPE="produce">potato</slip>
      <slip _TYPE="produce">pear</slip>
      <slip _TYPE="produce">peach</slip>
    </shipment>
    </shipment>
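For anyone stuck on an older XML::Twig, one thing that may be worth trying is to leave `flush()` out of the handler and serialize the tree once after parsing. The sketch below is not a fix for the underlying bug, and it assumes the whole document fits in memory, since nothing is released during the parse. The handler name `hoist_slips` and the variable names are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use XML::Twig;

# Same shape as the document that triggers the duplicated closing tag:
# the last <box> contains no <slip> elements.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<shipment>
  <box>
    <packing>
      <slip>potato</slip>
      <slip>pear</slip>
    </packing>
  </box>
  <box>
    <packing>
      <note>nothing here!</note>
    </packing>
  </box>
</shipment>
XML

my $twig = XML::Twig->new(
    twig_handlers => { '/shipment/box' => \&hoist_slips },
    pretty_print  => 'indented',
);
$twig->parse($xml);

# Serialize the modified tree once, after parsing has finished,
# instead of flushing after every <box>.
my $output = $twig->sprint;
print $output;

# Hypothetical handler name: tag each <slip>, move it in front of its
# <box>, then drop the now unneeded <box>. No flush() here.
sub hoist_slips {
    my ( $t, $box ) = @_;

    foreach my $slip ( $box->descendants('slip') ) {
        $slip->set_att( '_TYPE' => 'produce' );
        $slip->move( before => $box );
    }

    $box->delete;
    return 1;
}
```

The trade-off is memory: `flush()` exists so that large documents can be streamed, so this only suits inputs small enough to hold as a full tree.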