Hi folks,
I've used XML::Simple for all my XML parsing needs until now, but I need to start parsing complicated XML files over 50 MB in size, and, well, I run out of system memory before I get anywhere.

So I've been looking into XML::Twig, but I ran out of brain memory before I got anywhere.

Can someone please explain how to extract all <node> elements from this XML, given that a <node id="xx"> can contain any number of <node> children, nested recursively?
I basically need to get to the first <node>, extract all the data from its elements, and when I reach a child <node>, do the same. Rinse and repeat.


E.g., a snippet from my XML:
<data>
  <geography>
    <node id="0" hidden="0">
      <description>By Geography</description>
      <ticketsavailable>1</ticketsavailable>
      <parent_id>-1</parent_id>
      <parent_path>,0</parent_path>
      <node id="709" hidden="0">
        <description>Alabama</description>
        <title>Tickets for Events at Alabama Venues</title>
        <meta_description>Tickets for Events at Alabama Venues</meta_description>
        <seo_title>Tickets for Events at Alabama Venues</seo_title>
        <seo_description>Tickets for Events at Alabama Venues</seo_description>
        <ticketsavailable>1</ticketsavailable>
        <parent_id>0</parent_id>
        <parent_path>,0,709</parent_path>
        <node id="4945" hidden="0">
          <description>Birmingham</description>
          <title>Birmingham Tickets</title>
          <meta_description>Birmingham Tickets</meta_description>
          <keywords>Birmingham tickets, Alabama tickets, Birmingham events, tickets for events in Birmingham and Alabama</keywords>
          <seo_title>Birmingham Tickets</seo_title>
          <seo_description>Birmingham Tickets</seo_description>
          <name_primary>Birmingham</name_primary>
          <ticketsavailable>1</ticketsavailable>
          <parent_id>709</parent_id>
          <parent_path>,0,709,4945</parent_path>
          <node id="7983" hidden="0">
            <description>Alabama Theatre</description>
            <title>Alabama Theatre Tickets at StubHub!</title>
            <meta_description>Alabama Theatre Tickets - Buy and sell tickets to events at Alabama Theatre in Birmingham, AL at StubHub!</meta_description>
            <keywords>Alabama Theatre seating chart,AlabamaTheatre, Alabama Theatre tickets</keywords>
            <seo_title>Alabama Theatre Tickets</seo_title>
            <name_primary>Alabama Theatre</name_primary>
            <ticketsavailable>1</ticketsavailable>
            <parent_id>4945</parent_id>
            <parent_path>,0,709,4945,7983</parent_path>
            <venue_addr1>1811 3rd Ave. North</venue_addr1>
            <venue_city>Birmingham</venue_city>
            <venue_state>AL</venue_state>
            <venue_zip>35201</venue_zip>
            <venue_phone>2052522862</venue_phone>
            <venue_genre genre_id="1632">
              <venue_genre_event event_id="337496"/>
            </venue_genre>
            <venue_genre genre_id="75215">
              <venue_genre_event event_id="331248"/>
            </venue_genre>
            <venue_genre genre_id="99217">
              <venue_genre_event event_id="330232"/>
            </venue_genre>
            <venue_genre genre_id="130035">
              <venue_genre_event event_id="342530"/>
            </venue_genre>
          </node>
          ...etc
        </node>
      </node>
    </node>
  </geography>
</data>

As I don't know how deeply the <node> elements are nested, I am completely lost.

Any help very much appreciated!
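
For reference, here is a minimal sketch of one way the "rinse and repeat" walk described above might be done with XML::Twig. It assumes a handler registered on the `node` tag, which XML::Twig calls each time a closing </node> is reached at any depth, so nesting depth never needs to be known in advance. The helper name `walk_nodes` and its callback interface are hypothetical, made up for illustration. Because the handler fires on the closing tag, nested nodes are reported before their parents. Deleting each node once it has been handled should keep memory use tied to nesting depth rather than file size, though that has not been measured on a 50 MB file.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: parse $file and call $callback->($id, \%fields)
# once for every <node> element, however deeply it is nested.
sub walk_nodes {
    my ($file, $callback) = @_;

    my $twig = XML::Twig->new(
        twig_handlers => {
            # Fires when a </node> closing tag is parsed.
            node => sub {
                my ($t, $node) = @_;

                # Nested <node> children have already been handled and
                # deleted, so only this node's own data elements remain.
                my %field;
                for my $child ($node->children) {
                    next unless $child->is_elt;    # skip stray text such as "...etc"
                    # Assumption: one value per tag is enough. Repeated tags
                    # (e.g. venue_genre) overwrite each other here.
                    $field{ $child->tag } = $child->text;
                }

                $callback->($node->att('id'), \%field);

                # Remove the finished node from the tree to release its memory.
                $node->delete;
            },
        },
    );

    $twig->parsefile($file);
    return;
}

# Example use: perl script.pl yourfile.xml
if (@ARGV) {
    walk_nodes($ARGV[0], sub {
        my ($id, $field) = @_;
        my $desc = defined $field->{description} ? $field->{description} : '';
        print "$id: $desc\n";
    });
}
```

Elements that carry their data in attributes, such as venue_genre and venue_genre_event, would need extra handling inside the loop (for example via `$child->att('genre_id')`), since their text is empty.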

Edited by planetscape - added readmore tags

In reply to Recursive XML navigation XML::Twig help! by inputsprocket
