dwhitney has asked for the wisdom of the Perl Monks concerning the following question:

Hello all, I have a problem I'm hoping to get some help with.
I have a script that will create an N-tire hash, i.e. a hash of hashes { of hashes of hashes, you get the idea } after parsing an XML file and grabbing the attributes which become the keys of the hash.
An example XML file:
<?xml version="1.0" standalone="no"?> <!DOCTYPE datacapture SYSTEM "datacapture5.0.dtd"> <data-capture-requirements type="content"> <ruleset name="Foo"> <item name="A"/> <item name="B"/> <item name="C"/> <item name="D"> <replicant min="0" max="15"> <item name="1"/> </replicant> </item> <container name="E"> <item name="1"/> <item name="2"/> <item name="3"/> <item name="4"/> </container> </ruleset> </data-capture-requirements>
Would create the hash:
$VAR1 = { 'Foo' => { 'D' => { '1' => {} }, 'A' => {}, 'E' => { '1' => {}, '2' => {}, '3' => {}, '4' => {}, }, 'B' => {}, 'C' => {}, } };
Now, my problem comes along when I try to traverse this hash.
I thought that if I did use Tie::IxHash package and then tied it via: tie %HASH, "Tie::IxHash"; that I would be ok.
Well, I was wrong.
So then I tried to make each sub hash tied to IxHash, which also failed in a big way.
Either way I do this, the order gets all blown up and becomes useless.
I've looked around here for something, anything but I can't find any love there. So, what can I do? I'm going crazy tring to figure this out!!!!
Thanks all for your help.

<update>Well, after hearing from St. Randal, I'm gonna try for the arrays of arrays method and report back what I came up with.
Any hints?</update>

<update>Thanks to all for your hints, for I think I have solved my problem using a mismash of the ideas. This code will return an array of hashes ( of arrays of hashes...)
Check out the following code:

sub returnRecordHeadings { my ( $self, $xml ) = @_; return undef if ( not $xml ); undef $HEADER; # Reset the global HEADER var my @header; # Declare the header object my $rootnode; # Declare the starting point of t +he XML doc my $parser = new XML::DOM::Parser; # Create new DOM Pars +er object # Parse the file or the string representing it $parser = ( -e "$xml" ) ? $parser->parsefile("$xml") : $parser->pa +rse("$xml"); ### Traverse, Breath first ### my $rootnodes = $parser->getElementsByTagName ( "ruleset" ); my $rn = $rootnodes->getLength; for ( my $i = 0; $i < $rn; $i++ ) { my $root = $rootnodes->item ($i); # Gather data for this +root node my $rootname = $root->getAttributeNode ("name")->getValu +e; # Get the root element name printf "%s %s name=>%s is of type=>%s\n", $root->getNodeN +ame(), $i, $rootname, $root->getNodeType if ( $DEBUG ); if ( $root->getNodeType =~ m/1|ELEMENT_NODE/i ) { # + Check for correct elemet type before parsing push @{ $HEADER }, { "$rootname" => [] } if ( $rootname ); + # Save rootname in HEADER if I have a rootname my $rootref = \$HEADER->[$i]; # update +array ref print "---RootRef Dump---\n", Dumper( $rootref ), "\n" if +( $DEBUG ); ### iterate over the result nodes ### for my $elem ( $root->getElementsByTagName( "*" ) ) { if ( ( scalar( @{ $elem->getChildNodes() } ) > 1 ) and + ( $elem->getParentNode()->getAttribute("name") eq "$rootname" ) ) { if ( $DEBUG ) { printf "Element=>%s\n", $elem; printf "Has kids=>%s\tHow many=>%s\n", ( $elem +->hasChildNodes() ) ? "Yes" : "No", scalar( @{ $elem->getChildNodes() + } ); printf "\t%s ::> %s\n", $root->getChildIndex( +$elem ), $elem->getNodeName(); printf "\t\tname='%s'\n", $elem->getAttribute( +"name") || "No Child"; } &_traverse_element( $elem, $rootref, 2 ); print '*' x 5 . "\n" if ( $DEBUG ); } else { next } } } print '~*' x 10, "\n" if ( $DEBUG ); } # Avoid memory leaks - cleanup circular references for garbage col +lection $parser->dispose; return $HEADER; sub _traverse_element { my( $elem, $rootref, $count ) = @_; my $rootname = ( keys( %{ $$rootref } ) )[0] || "NO ROOT NA +ME!"; my $parent = $elem->getParentNode()->getAttribute("name"); my $grandparent = $elem->getParentNode()->getParentNode()-> +getAttribute("name"); my $current = $elem->getAttribute("name"); if ( $DEBUG ) { printf "\t" x $count . "Current node=>%s\n", $elem->get +NodeName(); printf "\t" x $count . "Root Name=>%s\n", $rootname +|| "No Root Name Value"; printf "\t" x $count . "Current Parent=>%s\n", $parent + || $grandparent || "No Parent"; printf "\t" x $count . "Current GrandParent=>%s\n", $grand +parent|| "No Grandparent"; printf "\t" x $count . "Current Node=>%s\n", $current + || "No Current node name"; } # If my parent is my root if ( $parent eq $rootname ) { printf "\t" x ++$count . "--->'%s' child element of '%s' w +hich is a %s\n", $current, $rootname, $elem->getNodeName() if ( $DEBUG +); ### Decide where to place new element ### ### If current element has a name attribute ### ### Add current element to the array ref ### if ( $current ) { push @{ $$rootref->{"$parent"} }, { $cur +rent => [] } } printf "\t" x $count . "RootRef defined?=>%s\n", ( defined + $$rootref->{"$parent"}->[$#{ $$rootref->{"$parent"} }] ) ? "Yes" : " +No" if ( $DEBUG ); ### Update array ref to new location ### if ( ( defined $$rootref->{"$parent"}->[$#{ $$rootref->{"$ +parent"} }] ) and ( $current ) ) { $rootref = \$$rootref->{"$parent"}->[$#{ $$rootref->{" +$parent"} }]; print "\t" x $count . "---Updated RootRef---\n" if ( $ +DEBUG ); } # If this element has kids, traverse them $count++; printf "\t" x $count . "Has kids=>%s\tHow many=>%s\n", ( $ +elem->hasChildNodes() ) ? "Yes" : "No", scalar @{ $elem->getChildNode +s() } if ( $DEBUG ); for my $child ( $elem->getChildNodes() ) { if ( ( scalar @{ $child->getChildNodes() } > 1 ) and ( + $child->getNodeName() !~ m/^radio|select|checkbox/i ) ) { ### Add current element to the array ref here, if +not current ### if ( not $current ) { push @{ $$rootref->{"$parent +"} }, { $child->getAttribute("name") => [] } } if ( $DEBUG ) { printf "\t" x $count . "%s :::> %s name='%s'\n +", $elem->getChildIndex( $child ), $child->getNodeName(), $child->get +Attribute("name") || "No Name"; printf "\t" x $count . "Has kids=>%s\tHow many +=>%s\n", ( $child->hasChildNodes() ) ? "Yes" : "No", scalar @{ $child +->getChildNodes() }; } # Update rootref, and pass &_traverse_element( $child, $rootref, $count ); print "\t" x $count . '*-' x 10, "\n" if ( $DEBUG +); } } } } } # End Sub

Replies are listed 'Best First'.
•Re: Maintaing the insertion order of an N-tier hash
by merlyn (Sage) on Feb 28, 2004 at 02:26 UTC
      If you really need a hash, you could always keep the order information in arrays in the hash like this:
      $VAR1 = { 'order' => [ 'Foo' ], 'data' => { 'Foo' => { 'order' => [ 'D', 'A', 'E', 'B', 'C' ], 'data' => { 'D' => { 'order' => [ '1' ], 'data' => { '1' => {} }, 'A' => {}, 'E' => { 'order' => [ '1', '2', '3', '4' ], 'data' => { '1' => {}, '2' => {}, '3' => {}, '4' => {} } } ...etc.
      Of course, maintaining that structure could be a chore. Java has what you're looking for - a hash that maintains order - it's called a LinkedHashMap. Basically it's a hash with a linked list running through it. <handwaving>I leave the Perl implementation as an exercise for the reader...</handwaving>

      Update

      Ok, I feel like an idiot. Ignore the Java ramblings - IxHash is the same as a LinkedHashMap. The data structure works, however (even if it is a pain).
      What I did in Excel::Template was to create a hash as the root node. It has an array (called children) of children nodes, each represented as a hash. And, so forth. Some intelligent coding in the baseclass and everything came together.

      ------
      We are the carpenters and bricklayers of the Information Age.

      Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

Re: Maintaing the insertion order of an N-tier hash
by matija (Priest) on Feb 28, 2004 at 08:08 UTC
    I think you should look at XML::Smart. Any point in the tree parsed by XML::Smart can be accessed as an array, or as a tree - whichever is convenient for you at that point.

    In addition, XML::Smart already has methods for writing out the tree (in the original order), and for parsing the XML (with pluggable parsers), so it would seem you would be able to concentrate on the important stuff (like what to do with the data once you have it parsed).

      Thanks for all of your help.
      Take a look at the code I came up with. I know it's not quite what I said, but it gives me what I need!
      Thanks all!!!!!