Hello, if I understood your desired output correctly, you can simply walk down the tree with successive first_child calls, like this:

    # inside the twig_handler sub for '/DatatoParse/elt'
    $_[1]->first_child('Nest1')->first_child('elt')->first_child('Junk1')->text;
This soon becomes verbose and repetitive. In fact you only need an XPath expression, so for Junk1 you can also write:
    my @junk1 = $_[1]->get_xpath('./Nest1/elt/Junk1');
    print $junk1[0]->text;
So when you have many XPath expressions to process in the same way, you can compact the code a lot, ending up with the following twig_handler:
    sub elt_map {
        my $elt = $_[1];
        print join ',', map {
            my @cur = $elt->get_xpath($_);
            $cur[0]->text;
        } (qw(
            d1
            d2
            ./Nest1/elt/Junk1
            ./Nest1/elt/Junk2
            ./Nest1/elt/Nest2/elt/d5/X
            ./Nest1/elt/Nest2/elt/d5/Y
            ./Nest1/elt/Nest2/elt/d6/X
            ./Nest1/elt/Nest2/elt/d6/Y
            ./Nest1/elt/Nest2/elt/Nest3/Nest4/d7/d9/d10/d11
        ));
        print "\n";
    }

The whole code will be:

    use strict;
    use warnings;
    use XML::Twig;

    my $field = "Nest1";
    my $twig  = XML::Twig->new(
        twig_handlers => { '/DatatoParse/elt' => \&elt_map, }
    );

    $/ = '';    # paragraph mode: DATA is read as one chunk as long as it has no blank lines
    $twig->parse(<DATA>);

    sub elt_map {
        my $elt = $_[1];
        print join ',', map {
            my @cur = $elt->get_xpath($_);
            $cur[0]->text;
        } (qw(
            d1
            d2
            ./Nest1/elt/Junk1
            ./Nest1/elt/Junk2
            ./Nest1/elt/Nest2/elt/d5/X
            ./Nest1/elt/Nest2/elt/d5/Y
            ./Nest1/elt/Nest2/elt/d6/X
            ./Nest1/elt/Nest2/elt/d6/Y
            ./Nest1/elt/Nest2/elt/Nest3/Nest4/d7/d9/d10/d11
        ));
        print "\n";
    }

    __DATA__
    <DatatoParse>
    <elt>
    <d1>TV show 1</d1>
    ....

with the following output

    TV show 1,Heroes,FULL,Page 65,-2,-3,5,8,yipppeee
    TV show 2,Prison Break,FULL,Page 65,-2,-3,5,8,yipppeee
    TV show 4,Alias,FULL,Page 65,-2,-3,5,8,yipppeee
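One thing to watch in the handler above: `$cur[0]->text` dies with "Can't call method "text" on an undefined value" whenever a path matches nothing in a given elt. A minimal sketch of a more tolerant variant follows. The shortened path list, the `elt_to_line` helper name and the tiny sample XML are made up for illustration; they are not your real data.

```perl
use strict;
use warnings;
use XML::Twig;

# Shortened, illustrative path list (relative to each /DatatoParse/elt).
my @paths = qw( ./d1 ./d2 ./Nest1/elt/Junk1 );

# Hypothetical helper: build one CSV line, using '' when a path has no match.
sub elt_to_line {
    my ($elt) = @_;
    return join ',', map {
        my ($node) = $elt->get_xpath($_);    # first match, or undef if none
        defined $node ? $node->text : '';
    } @paths;
}

# Made-up sample: the second elt has no Nest1 branch at all.
my $xml = <<'XML';
<DatatoParse>
  <elt><d1>TV show 1</d1><d2>Heroes</d2><Nest1><elt><Junk1>FULL</Junk1></elt></Nest1></elt>
  <elt><d1>TV show 2</d1><d2>Prison Break</d2></elt>
</DatatoParse>
XML

my @lines;
my $twig = XML::Twig->new(
    twig_handlers => {
        '/DatatoParse/elt' => sub { push @lines, elt_to_line($_[1]) },
    },
);
$twig->parse($xml);

print "$_\n" for @lines;
```

With this sample the second line ends in an empty field instead of killing the script, which is usually easier to spot and fix afterwards than a crash halfway through a big file.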

In addition, when you always need to write to a destination file, you can take advantage of select $filehandle; it is also very useful because while debugging you can comment it out to see the output on screen.
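A small sketch of the select idea, using only core Perl. The file name `output.csv` and the printed line are placeholders; in the real script the print statements are the ones inside the twig_handler.

```perl
use strict;
use warnings;

# 'output.csv' is a hypothetical destination file name.
my $outfile = 'output.csv';
open my $out, '>', $outfile or die "Cannot open $outfile: $!";

my $previous = select();    # remember the current default handle (normally STDOUT)
select $out;                # comment this line out to see the output on screen

# A bare print now goes to the selected handle, so the handler code
# does not need to mention any filehandle at all.
print "TV show 1,Heroes\n";

select $previous;           # restore the previous default handle
close $out or die "Cannot close $outfile: $!";
```

The point of the design is that none of the print statements change: only the single select line decides whether the output goes to the file or to the terminal.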

HtH

L*

There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

In reply to Re: Parsing a highly nested XML file correctly and efficiently -- XML::Twig by Discipulus
in thread Parsing a highly nested XML file correctly and efficiently by Ppeoc
