slugger415 has asked for the wisdom of the Perl Monks concerning the following question:
Hi monks, I'm using XML::Twig to parse an XML document, with dates and events, that is not as nicely structured as I'd like. Here's a pseudo-code example:
<div id="calendar">
    <h3 class="current-day">Wednesday, February 1</h3>
    <div class="event">Event 1</div>
    <div class="event">Event 2</div>
    <div class="event">Event 3</div>
    <h3 class="current-day">Thursday, February 2</h3>
    <div class="event">Event 1</div>
    <div class="event">Event 2</div>
    <h3 class="current-day">Friday, February 3</h3>
    <div class="event">Event 1</div>
    <div class="event">Event 2</div>
</div>
The problem is the div-events are not contained within the h3 elements, so I can't figure out how to associate the events with each date. I can get all the h3 children and all the div-event children with a div event handler at the top level:
my($twig, $div) = @_;
if($div->att('id') eq 'calendar'){
    my(@dates) = $div->children('h3');
    my(@events) = $div->children('div');
}
But obviously that just gives me two unconnected lists. Is there some clever way I can associate these elements, perhaps in the order they appear? Doesn't seem to be a "next_child" function in XML::Twig.
Thanks for any advice.
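One possible way to keep the association, sketched here rather than taken from the thread: walk all children of the calendar div in document order and attach each event div to the most recently seen h3. The `@schedule` structure and the inline sample document are illustrative assumptions; the sketch uses only `children`, `tag`, and `text` from XML::Twig and is not necessarily what the replies propose.

```perl
use strict;
use warnings;
use XML::Twig;

# Sample input mirroring the structure in the question (assumed for illustration).
my $xml = <<'XML';
<div id="calendar">
  <h3 class="current-day">Wednesday, February 1</h3>
  <div class="event">Event 1</div>
  <div class="event">Event 2</div>
  <div class="event">Event 3</div>
  <h3 class="current-day">Thursday, February 2</h3>
  <div class="event">Event 1</div>
  <div class="event">Event 2</div>
  <h3 class="current-day">Friday, February 3</h3>
  <div class="event">Event 1</div>
  <div class="event">Event 2</div>
</div>
XML

# Hypothetical result structure: a list of [ date, [ events ] ] in document order.
my @schedule;

my $twig = XML::Twig->new(
    twig_handlers => {
        # Fires once the calendar div has been completely parsed.
        'div[@id="calendar"]' => sub {
            my ($twig, $div) = @_;
            # children() with no argument returns every child in document order.
            for my $child ($div->children) {
                if ($child->tag eq 'h3') {
                    # A new date heading starts a new group.
                    push @schedule, [ $child->text, [] ];
                }
                elsif ($child->tag eq 'div' && @schedule) {
                    # An event belongs to the most recent heading.
                    push @{ $schedule[-1][1] }, $child->text;
                }
            }
        },
    },
);
$twig->parse($xml);

# Print each date followed by its events.
for my $day (@schedule) {
    my ($date, $events) = @$day;
    print "$date: ", join(', ', @$events), "\n";
}
```

XML::Twig elements also provide a `next_sibling` method, so an alternative would be to start from each h3 and step through its siblings until the next h3 is reached; the single ordered pass above just avoids the nested loop.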
Re: XML::Twig parsing poorly structured content
by choroba (Cardinal) on Jan 24, 2017 at 16:38 UTC
by slugger415 (Monk) on Jan 25, 2017 at 00:36 UTC
by choroba (Cardinal) on Jan 25, 2017 at 08:26 UTC
by kcott (Archbishop) on Jan 25, 2017 at 08:21 UTC
by slugger415 (Monk) on Jan 25, 2017 at 16:08 UTC