I find that when the data is unpredictable (as it is here, where you don't know in advance which nodes might turn up), it's better to parse it from scratch.

I might be totally wrong, but I am not sure the XML you have described is well-formed, as it is required to be.

Either way, here is a hopefully readable, though not the most efficient, piece of code that does what you are aiming for:

use strict;
use warnings;
use Data::Dump qw( dump );

# Slurp everything after __DATA__ into one string.
my $data = do { local $/; <DATA> };

my @people_educations;
while ( $data =~ m/<person>(.*?)<\/person>/gis ) {
    my $one_persons_info = $1;
    while ( $one_persons_info =~ m/<education>(.*?)<\/education>/gis ) {
        my $this_guys_education_details_string = $1;
        my %this_guys_education_details_hash;
        my @this_guys_education_sections;

        # Collect the name of every opening tag inside this <Education> block.
        while ( $this_guys_education_details_string =~ /<([^\/]*?)>/gis ) {
            push @this_guys_education_sections, $1;
        }

        # Fetch the content of each tag. No /g here, so every lookup starts
        # at the beginning of the string; \Q...\E quotes the tag name in
        # case it contains regex metacharacters.
        foreach my $single_section ( @this_guys_education_sections ) {
            if ( $this_guys_education_details_string =~ /<\Q$single_section\E>(.*?)<\/\Q$single_section\E>/is ) {
                $this_guys_education_details_hash{ $single_section } = $1;
            }
        }
        push @people_educations, \%this_guys_education_details_hash;
    }
}

dump( \@people_educations );

__DATA__
<Person>
  <Address>
    <name>xxx</name>
    <mobile>xxx</mobile>
  </Address>
  <Education>
    <Degree>xxx</Degree>
    <Major>xxx</Major>
  </Education>
</Person>
<Person>
  <Address>
    <name>xxx</name>
    <mobile>xxx</mobile>
  </Address>
  <Education>
    <Degree>xxx</Degree>
    <Minor>xxx</Minor>
    <Grade> ggg </Grade>
  </Education>
</Person>
<Person>
  <Address>
    <name>xxx</name>
    <mobile>xxx</mobile>
  </Address>
  <Education>
    <Degree>xxx</Degree>
  </Education>
</Person>


This will produce the following output:

[
  { Degree => "xxx", Major => "xxx" },
  { Degree => "xxx", Grade => " ggg ", Minor => "xxx" },
  { Degree => "xxx" },
]
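
If it helps, the two inner loops can probably be folded into a single pass by letting a backreference tie each closing tag to its opening tag. This is only a sketch and rests on some assumptions: the children of each Education block are flat (no nesting), carry no attributes, and have plain word-character names. The helper name extract_educations and the sample values are made up for illustration and are not from the thread, and for brevity it skips the outer Person loop.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns a reference to
# an array holding one hash ref per <Education> block, keyed by child tag.
sub extract_educations {
    my ($xml) = @_;
    my @educations;

    # Find each <Education> block, ignoring case as the code above does.
    while ( $xml =~ m{<education>(.*?)</education>}gis ) {
        my $block = $1;
        my %details;

        # \1 requires the closing tag to repeat the opening tag's name,
        # so the name and its content are captured in one match.
        while ( $block =~ m{<(\w+)>(.*?)</\1>}gs ) {
            $details{$1} = $2;
        }
        push @educations, \%details;
    }
    return \@educations;
}

# Made-up sample data, only to exercise the helper.
my $sample =
      '<Person><Education><Degree>BSc</Degree><Minor>Math</Minor></Education></Person>'
    . '<Person><Education><Degree>MSc</Degree></Education></Person>';

my $educations = extract_educations($sample);
for my $edu (@$educations) {
    print join( ', ', map { "$_=$edu->{$_}" } sort keys %$edu ), "\n";
}
```

Like the longer version, this will get confused by nested or repeated child tags, so treat it as a starting point rather than a general XML parser.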





In reply to Re: Regarding XML::DOM::Parser by tmharish
in thread Regarding XML::DOM::Parser by mecrazycoder
