Re: Is it possible to parse an XML file recursively using XML::Twig?

"XML::Twig" is implicitly recursive, so you don't really need to do any more recursion yourself. But I also don't think you're doing anything anywhere near as complicated as that:

#!/usr/bin/env perl
use strict;
use warnings;

use XML::Twig;
use Data::Dumper;

sub print_book {
    my ( $twig, $book ) = @_; 
    my %this_book = map { $_ -> tag, $_ -> text } $book -> children;
    print Dumper \%this_book;
    
    $twig -> purge; 
}

my $twig = XML::Twig -> new ( 'twig_handlers' =>
                                 { 'Book' => \&print_book } );
   $twig ->  parsefile ( 'your_xml_file' );

This runs through your XML and turns each `Book` into a hash. (It then dumps the hash, but you can do something more useful with it.) Or have I missed something profound about what you're trying to accomplish? Output from the above looks a bit like:

$VAR1 = {
          'Title' => 'The age of rocks',
          'Author' => 'Mary',
          'Released' => '8/16/1944'
        };
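As one example of "something more useful", here is a minimal sketch that collects each `Book` hash into an array instead of dumping it. The inline XML is hypothetical sample data, shaped like the output above (an `ArrayOfBooks` root is assumed), and `collect_book` and `@books` are illustrative names, not part of the original code.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use XML::Twig;

# Hypothetical sample data; the real file layout is an assumption.
my $xml = <<'XML';
<ArrayOfBooks>
  <Book><Title>The rock of ages</Title><Author>Frank</Author><Released>5/21/2004</Released></Book>
  <Book><Title>The age of rocks</Title><Author>Mary</Author><Released>8/16/1944</Released></Book>
</ArrayOfBooks>
XML

my @books;

sub collect_book {
    my ( $twig, $book ) = @_;

    # One hash per Book, keyed by child tag name.
    my %this_book = map { ( $_->tag => $_->text ) } $book->children;
    push @books, \%this_book;

    # The data has been copied out, so the parsed Book can be released.
    $twig->purge;
}

XML::Twig->new( twig_handlers => { Book => \&collect_book } )->parse($xml);

printf "%s by %s (%s)\n", @{$_}{qw(Title Author Released)} for @books;
```

Parsing from a string with `parse` keeps the sketch self-contained; `parsefile` works the same way for a file on disk.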

Alternatively, you can simply use the "_all_" handler, and test each node for having children:

sub handle_node { 
    my ( $twig, $element ) = @_;
    unless ( $element -> has_children ) { 
        print "(", $element -> parent -> tag, ") ", 
                   $element -> tag, ": ", $element -> text,"\n";
    }
    $twig -> purge; 
}

my $twig = XML::Twig -> new ( 'twig_handlers' => 
                                { '_all_' => \&handle_node } );
   $twig  ->  parsefile ( 'yourfile');

This will traverse all the nodes, printing any that don't have children, and purging to free up memory. With your sample data, this prints:

(Book) Title: The book of books
(Book) Author: Sally
(Book) Released: 1/2/2008
(Book) Title: The page of pages
(Book) Author: Amanda
(Book) Released: 6/3/1998
(Book) Title: The book of pages
(Book) Author: John
(Book) Released: 6/22/1963
(Book) Title: The rock of ages
(Book) Author: Frank
(Book) Released: 5/21/2004
(Book) Title: The age of rocks
(Book) Author: Mary
(Book) Released: 8/16/1944

Re^2: Is it possible to parse an XML file recursively using XML::Twig? by mr_ron (Deacon) on Oct 24, 2015 at 16:33 UTC
I had trouble with your second solution, the one that involved "the '_all_' handler, and test each node for having children". When I ran it as written, I got output like:

(ArrayOfBooks) Book:
(ArrayOfBooks) Book:
(Book) Released:
(ArrayOfBooks) Book:
(ArrayOfBooks) Book:
(ArrayOfBooks) Book:
Can't call method "tag" on an undefined value at monk_twig_xml_leaf2.pl line 11.
 at monk_twig_xml_leaf2.pl line 19.
 at monk_twig_xml_leaf2.pl line 19.

I tried commenting out the "purge" call and got empty output with no errors, seemingly because `$element->has_children` was returning true for "#PCDATA" text nodes.

I am new to XML::Twig, but not so new to XML, and am starting to appreciate XML::Twig's potential for optimization. I did come up with some working code as well, but would first like to know what I might be doing wrong such that Preceptor's example wouldn't run.

Ron
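The `has_children` observation can be checked in isolation. This is a small sketch, assuming (as the behaviour described above suggests) that XML::Twig holds an element's text in a "#PCDATA" child node; the one-book document and the variable names are made up for illustration.

```perl
use strict;
use warnings;

use XML::Twig;

my $twig = XML::Twig->new;
$twig->parse('<Book><Title>The age of rocks</Title></Book>');

my $title = $twig->root->first_child('Title');

# The bare call counts the text node as a child; restricting the
# condition to '#ELT' counts only real element children.
my $any_child = $title->has_children         ? 1 : 0;
my $elt_child = $title->has_children('#ELT') ? 1 : 0;

print "has_children:         $any_child\n";
print "has_children('#ELT'): $elt_child\n";
```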
Re^3: Is it possible to parse an XML file recursively using XML::Twig? by Preceptor (Deacon) on Oct 24, 2015 at 23:23 UTC
Calling tag on an undefined value is probably the parent call. Adding a "defined" test there will probably do the trick. But I would suggest that the strength of the module is in using XPath, so you rarely need to do a traverse in the first place.
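A hedged sketch of both suggestions follows. The first handler guards the parent call with "defined" and skips the per-element purge, so parents stay attached. The second lets a path expression (`Book/Title`) pick the nodes instead of traversing. The inline XML, the `'-'` placeholder and the `@lines` / `@titles` names are assumptions for illustration only, and the `'#ELT'` condition is one way to ignore text nodes.

```perl
use strict;
use warnings;

use XML::Twig;

# Hypothetical sample data, assumed to match the thread's layout.
my $xml = '<ArrayOfBooks><Book><Title>The age of rocks</Title>'
        . '<Author>Mary</Author></Book></ArrayOfBooks>';

my @lines;

# Guarded traversal: an element without a parent (the root) would
# otherwise trigger "Can't call method tag on an undefined value".
sub handle_node {
    my ( $twig, $element ) = @_;
    return if $element->has_children('#ELT');    # leaves only

    my $parent = $element->parent;
    push @lines, sprintf "(%s) %s: %s",
        ( defined $parent ? $parent->tag : '-' ),
        $element->tag, $element->text;
}

XML::Twig->new( twig_handlers => { _all_ => \&handle_node } )->parse($xml);

# Path-based alternative: the handler expression selects the nodes.
my @titles;
XML::Twig->new(
    twig_handlers => { 'Book/Title' => sub { push @titles, $_->text } }
)->parse($xml);

print "$_\n"        for @lines;
print "Title: $_\n" for @titles;
```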
Re^4: Is it possible to parse an XML file recursively using XML::Twig? by mr_ron (Deacon) on Oct 26, 2015 at 14:56 UTC
The following more cautious code seemed to work for me and should purge memory regularly. I worry that calling purge on every element might purge something you still need.

use strict;
use warnings;

use XML::Twig;

$|++;

my $twig = XML::Twig->new(
    twig_handlers => {
        # as noted in the documentation for end_tag_handlers ...
        # "twig_handlers are called when an element is completely parsed"
        # so should be safe to purge here
        'Book' => sub {
            my ($twig, $el) = @_;
            # print "purging ...\n";
            $twig->purge;
        },
        'Book//*' => sub {
            # see http://search.cpan.org/~mirod/XML-Twig-3.49/Twig.pm#cond
            # for #ELT which is an element
            print $_->tag, ': ', $_->text, $/
                unless ($_->has_children('#ELT'));
        }
    }
);
$twig->parsefile('books.xml');

Ron
Re^5: Is it possible to parse an XML file recursively using XML::Twig? by Ppeoc (Beadle) on Oct 30, 2015 at 18:40 UTC