in reply to Is it possible to parse an XML file recursively using XML::Twig?

"XML::Twig" is implicitly recursive, so you don't really need to write any more recursion yourself. But I also don't think you're doing anything anywhere near as complicated as that:

#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
use Data::Dumper;

sub print_book {
    my ( $twig, $book ) = @_;
    my %this_book = map { $_ -> tag, $_ -> text } $book -> children;
    print Dumper \%this_book;
    $twig -> purge;
}

my $twig = XML::Twig -> new ( 'twig_handlers' => { 'Book' => \&print_book } );
$twig -> parsefile ( 'your_xml_file' );

This runs through your XML and turns each `Book` into a hash (and then dumps it, but you can do something more useful). Or have I missed something profound about what you're trying to accomplish? Output from the above looks a bit like:

$VAR1 = {
          'Title' => 'The age of rocks',
          'Author' => 'Mary',
          'Released' => '8/16/1944'
        };
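As one possible reading of "something more useful", here is a minimal sketch that collects each `Book` hash into an array instead of dumping it. The inline XML is an assumed sample (only the element names come from this thread), and `@books` is a name made up for illustration.

```perl
use strict;
use warnings;
use XML::Twig;

# Assumed sample input; only the element names are taken from the thread.
my $xml = <<'XML';
<ArrayOfBooks>
  <Book><Title>The rock of ages</Title><Author>Frank</Author><Released>5/21/2004</Released></Book>
  <Book><Title>The age of rocks</Title><Author>Mary</Author><Released>8/16/1944</Released></Book>
</ArrayOfBooks>
XML

my @books;    # hypothetical collector: one hashref per Book

my $twig = XML::Twig->new(
    twig_handlers => {
        'Book' => sub {
            my ( $twig, $book ) = @_;

            # Child tag => child text, as in print_book above.
            my %this_book = map { ( $_->tag => $_->text ) } $book->children;
            push @books, \%this_book;

            # The values are copied, so the parsed elements can be released.
            $twig->purge;
        },
    },
);
$twig->parse($xml);

# Use the collected data after parsing has finished.
printf "%s by %s (%s)\n", @{$_}{qw(Title Author Released)} for @books;
```

Collecting into an array trades some memory for convenience; for a very large file you would probably do the real work inside the handler instead.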

Alternatively, you can simply use the "_all_" handler, and test each node for having children:

sub handle_node {
    my ( $twig, $element ) = @_;
    unless ( $element -> has_children ) {
        print "(", $element -> parent -> tag, ") ", $element -> tag, ": ", $element -> text,"\n";
    }
    $twig -> purge;
}

my $twig = XML::Twig -> new ( 'twig_handlers' => { '_all_' => \&handle_node } );
$twig -> parsefile ( 'yourfile');

This will traverse all the nodes, printing any that don't have children, and purging to free up memory. With your sample data, this prints:

(Book) Title: The book of books
(Book) Author: Sally
(Book) Released: 1/2/2008
(Book) Title: The page of pages
(Book) Author: Amanda
(Book) Released: 6/3/1998
(Book) Title: The book of pages
(Book) Author: John
(Book) Released: 6/22/1963
(Book) Title: The rock of ages
(Book) Author: Frank
(Book) Released: 5/21/2004
(Book) Title: The age of rocks
(Book) Author: Mary
(Book) Released: 8/16/1944

Re^2: Is it possible to parse an XML file recursively using XML::Twig?
by mr_ron (Deacon) on Oct 24, 2015 at 16:33 UTC

    I had trouble with your second solution, the one that uses "the '_all_' handler, and test each node for having children". When I ran it as written, I got output like:

    (ArrayOfBooks) Book:
    (ArrayOfBooks) Book:
    (Book) Released:
    (ArrayOfBooks) Book:
    (ArrayOfBooks) Book:
    (ArrayOfBooks) Book:
    Can't call method "tag" on an undefined value at monk_twig_xml_leaf2.pl line 11.
     at monk_twig_xml_leaf2.pl line 19.
     at monk_twig_xml_leaf2.pl line 19.

    I tried commenting out the "purge" call and got empty output with no errors, seemingly because $element->has_children was returning true for "#PCDATA" text nodes. I am new to XML::Twig, but not so new to XML, and am starting to appreciate XML::Twig's potential for optimization. I did come up with some working code as well, but would first be interested in what I might be doing wrong such that Preceptor's example won't run.

    Ron

      Calling tag on an undefined value is probably the parent call. Adding a "defined" test there will probably do the trick. But I would suggest that the strength of the module is in using XPath, so you rarely need to traverse the tree in the first place.
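A minimal sketch of that "defined" test, assuming the undefined value really is the result of the parent call. It also passes '#ELT' to has_children, because of the observation above that plain has_children is true for "#PCDATA" text nodes, and it leaves the purge call out entirely. The inline XML, the `handle_leaf` name, the `@lines` array and the '(no parent)' label are all made up for illustration.

```perl
use strict;
use warnings;
use XML::Twig;

# Assumed sample input; only the element names are taken from the thread.
my $xml = <<'XML';
<ArrayOfBooks>
  <Book><Title>The age of rocks</Title><Author>Mary</Author></Book>
</ArrayOfBooks>
XML

my @lines;    # hypothetical collector for the formatted output

sub handle_leaf {
    my ( $twig, $element ) = @_;

    # Skip anything that contains child elements; text-only elements pass.
    return 1 if $element->has_children('#ELT');

    # Guard the parent call instead of assuming a parent exists.
    my $parent     = $element->parent;
    my $parent_tag = defined $parent ? $parent->tag : '(no parent)';

    push @lines, sprintf "(%s) %s: %s", $parent_tag, $element->tag, $element->text;
    return 1;
}

XML::Twig->new( twig_handlers => { '_all_' => \&handle_leaf } )->parse($xml);

print "$_\n" for @lines;
```

The guard only stops the crash; where to call purge safely is a separate question.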

        The following more cautious code seemed to work for me and should purge memory regularly. I worry that calling purge on every element might purge something you still need.

        use strict;
        use warnings;
        use XML::Twig;

        $|++;

        my $twig = XML::Twig->new(
            twig_handlers => {
                # as noted in the documentation for end_tag_handlers ...
                # "twig_handlers are called when an element is completely parsed"
                # so should be safe to purge here
                'Book' => sub {
                    my ($twig, $el) = @_;
                    # print "purging ...\n";
                    $twig->purge;
                },
                'Book//*' => sub {
                    # see http://search.cpan.org/~mirod/XML-Twig-3.49/Twig.pm#cond
                    # for #ELT which is an element
                    print $_->tag, ': ', $_->text, $/
                        unless ($_->has_children('#ELT'));
                }
            }
        );
        $twig->parsefile('books.xml');
        Ron