Maintaing the insertion order of an N-tier hash

dwhitney has asked for the wisdom of the Perl Monks concerning the following question:

Hello all, I have a problem I'm hoping to get some help with.
I have a script that will create an N-tire hash, i.e. a hash of hashes { of hashes of hashes, you get the idea } after parsing an XML file and grabbing the attributes which become the keys of the hash.
An example XML file:

<?xml version="1.0" standalone="no"?>
<!DOCTYPE datacapture SYSTEM "datacapture5.0.dtd">
<data-capture-requirements type="content">
    <ruleset name="Foo">

        <item name="A"/>
        <item name="B"/>
        <item name="C"/>

        <item name="D">
            <replicant min="0" max="15">
                <item name="1"/>
                        </replicant>
                </item>

        <container name="E">
                <item name="1"/>
                <item name="2"/>
                    <item name="3"/>
            <item name="4"/>
        </container>

    </ruleset>
</data-capture-requirements>
[download]

Would create the hash:

$VAR1 = {
    'Foo' => {
        'D' => {
            '1' => {}
        },
        'A' => {},
        'E' => {
            '1' => {},
            '2' => {},
            '3' => {},
            '4' => {},
        },
        'B' => {},
        'C' => {},
    }
};
[download]

Now, my problem comes along when I try to traverse this hash.
I thought that if I did use Tie::IxHash package and then tied it via: tie %HASH, "Tie::IxHash"; that I would be ok.
Well, I was wrong.
So then I tried to make each sub hash tied to IxHash, which also failed in a big way.
Either way I do this, the order gets all blown up and becomes useless.
I've looked around here for something, anything but I can't find any love there. So, what can I do? I'm going crazy tring to figure this out!!!!
Thanks all for your help.

<update>Well, after hearing from St. Randal, I'm gonna try for the arrays of arrays method and report back what I came up with.
Any hints?</update>

<update>Thanks to all for your hints, for I think I have solved my problem using a mismash of the ideas. This code will return an array of hashes ( of arrays of hashes...)
Check out the following code:

sub returnRecordHeadings {
    my ( $self, $xml ) = @_;
    return undef if ( not $xml );

    undef $HEADER;                    # Reset the global HEADER var
    my @header;                    # Declare the header object

    my $rootnode;                    # Declare the starting point of t
+he XML doc
    my $parser        = new XML::DOM::Parser;    # Create new DOM Pars
+er object

    # Parse the file or the string representing it
    $parser = ( -e "$xml" ) ? $parser->parsefile("$xml") : $parser->pa
+rse("$xml");

    ### Traverse, Breath first ###
     my $rootnodes = $parser->getElementsByTagName ( "ruleset" );
     my $rn = $rootnodes->getLength;

    for ( my $i = 0; $i < $rn; $i++ ) {

             my $root = $rootnodes->item ($i); # Gather data for this 
+root node
             my $rootname = $root->getAttributeNode  ("name")->getValu
+e; # Get the root element name
             printf "%s %s name=>%s is of type=>%s\n", $root->getNodeN
+ame(), $i, $rootname, $root->getNodeType if ( $DEBUG );

        if ( $root->getNodeType =~ m/1|ELEMENT_NODE/i ) {            #
+ Check for correct elemet type before parsing
            push @{ $HEADER }, { "$rootname" => [] } if ( $rootname );
+    # Save rootname in HEADER if I have a rootname
            my $rootref = \$HEADER->[$i];                    # update 
+array ref

            print "---RootRef Dump---\n", Dumper( $rootref ), "\n" if 
+( $DEBUG );

            ### iterate over the result nodes ###
            for my $elem ( $root->getElementsByTagName( "*" ) ) {

                if ( ( scalar( @{ $elem->getChildNodes() } ) > 1 ) and
+ ( $elem->getParentNode()->getAttribute("name") eq "$rootname" ) ) {

                    if ( $DEBUG ) {
                        printf "Element=>%s\n", $elem;
                        printf "Has kids=>%s\tHow many=>%s\n", ( $elem
+->hasChildNodes() ) ? "Yes" : "No", scalar( @{ $elem->getChildNodes()
+ } );
                        printf "\t%s ::> %s\n", $root->getChildIndex( 
+$elem ), $elem->getNodeName();
                        printf "\t\tname='%s'\n", $elem->getAttribute(
+"name") || "No Child";
                    }

                    &_traverse_element( $elem, $rootref, 2 );

                    print '*' x 5 . "\n" if ( $DEBUG );

                } else { next }

            }

        }

        print '~*' x 10, "\n" if ( $DEBUG );

     }

    # Avoid memory leaks - cleanup circular references for garbage col
+lection
    $parser->dispose;

    return $HEADER;

    sub _traverse_element {
        my( $elem, $rootref, $count ) = @_;
        my $rootname    = ( keys( %{ $$rootref } ) )[0] || "NO ROOT NA
+ME!";
        my $parent    = $elem->getParentNode()->getAttribute("name");
        my $grandparent    = $elem->getParentNode()->getParentNode()->
+getAttribute("name");
        my $current    = $elem->getAttribute("name");

        if ( $DEBUG ) {
            printf "\t" x $count . "Current node=>%s\n",    $elem->get
+NodeName();
            printf "\t" x $count . "Root Name=>%s\n",    $rootname    
+|| "No Root Name Value";
            printf "\t" x $count . "Current Parent=>%s\n",    $parent 
+     || $grandparent    || "No Parent";
            printf "\t" x $count . "Current GrandParent=>%s\n", $grand
+parent|| "No Grandparent";
            printf "\t" x $count . "Current Node=>%s\n",    $current  
+  || "No Current node name";
        }

        # If my parent is my root
        if ( $parent eq $rootname ) {
            printf "\t" x ++$count . "--->'%s' child element of '%s' w
+hich is a %s\n",
                $current, $rootname, $elem->getNodeName() if ( $DEBUG 
+);

            ### Decide where to place new element ###
            ### If current element has a name attribute ###
            ### Add current element to the array ref ###
            if ( $current ) { push @{ $$rootref->{"$parent"} }, { $cur
+rent => [] } }
            printf "\t" x $count . "RootRef defined?=>%s\n", ( defined
+ $$rootref->{"$parent"}->[$#{ $$rootref->{"$parent"} }] ) ? "Yes" : "
+No" if ( $DEBUG );

            ### Update array ref to new location ###
            if ( ( defined $$rootref->{"$parent"}->[$#{ $$rootref->{"$
+parent"} }] ) and ( $current ) ) {
                $rootref = \$$rootref->{"$parent"}->[$#{ $$rootref->{"
+$parent"} }];
                print "\t" x $count . "---Updated RootRef---\n" if ( $
+DEBUG );
            }

            # If this element has kids, traverse them
            $count++;
            printf "\t" x $count . "Has kids=>%s\tHow many=>%s\n", ( $
+elem->hasChildNodes() ) ? "Yes" : "No", scalar @{ $elem->getChildNode
+s() } if ( $DEBUG );
            for my $child ( $elem->getChildNodes() ) {

                if ( ( scalar @{ $child->getChildNodes() } > 1 ) and (
+ $child->getNodeName() !~ m/^radio|select|checkbox/i ) ) {

                    ### Add current element to the array ref here, if 
+not current ###
                    if ( not $current ) { push @{ $$rootref->{"$parent
+"} }, { $child->getAttribute("name") => [] } }
                    if ( $DEBUG ) {
                        printf "\t" x $count . "%s :::> %s name='%s'\n
+", $elem->getChildIndex( $child ), $child->getNodeName(), $child->get
+Attribute("name") || "No Name";
                        printf "\t" x $count . "Has kids=>%s\tHow many
+=>%s\n", ( $child->hasChildNodes() ) ? "Yes" : "No", scalar @{ $child
+->getChildNodes() };
                    }

                    # Update rootref, and pass
                    &_traverse_element( $child, $rootref, $count );
                    print "\t" x $count . '*-' x 10, "\n" if ( $DEBUG 
+);
                }
            }
        }
    }

} # End Sub
[download]

Comment on Maintaing the insertion order of an N-tier hash Select or Download Code

Replies are listed 'Best First'.
•Re: Maintaing the insertion order of an N-tier hash by merlyn (Sage) on Feb 28, 2004 at 02:26 UTC
If order matters, you have an array, not a hash. And since order matters in XML, you'll want arrays of arrays of arrays, not hashes of hashes of hashes. -- Randal L. Schwartz, Perl hacker Be sure to read my standard disclaimer if this is a reply.	[reply]
Re: •Re: Maintaing the insertion order of an N-tier hash by bean (Monk) on Feb 28, 2004 at 04:11 UTC
If you really need a hash, you could always keep the order information in arrays in the hash like this: `$VAR1 = { 'order' => [ 'Foo' ], 'data' => { 'Foo' => { 'order' => [ 'D', 'A', 'E', 'B', 'C' ], 'data' => { 'D' => { 'order' => [ '1' ], 'data' => { '1' => {} }, 'A' => {}, 'E' => { 'order' => [ '1', '2', '3', '4' ], 'data' => { '1' => {}, '2' => {}, '3' => {}, '4' => {} } } ...etc.` [download] Of course, maintaining that structure could be a chore. Java has what you're looking for - a hash that maintains order - it's called a LinkedHashMap. Basically it's a hash with a linked list running through it. <handwaving>I leave the Perl implementation as an exercise for the reader...</handwaving> Update Ok, I feel like an idiot. Ignore the Java ramblings - IxHash is the same as a LinkedHashMap. The data structure works, however (even if it is a pain).	[reply] [d/l]
Re: •Re: Maintaing the insertion order of an N-tier hash by dragonchild (Archbishop) on Feb 28, 2004 at 18:52 UTC
What I did in Excel::Template was to create a hash as the root node. It has an array (called children) of children nodes, each represented as a hash. And, so forth. Some intelligent coding in the baseclass and everything came together. ------ We are the carpenters and bricklayers of the Information Age. Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.	[reply]
Re: Maintaing the insertion order of an N-tier hash by matija (Priest) on Feb 28, 2004 at 08:08 UTC
I think you should look at `XML::Smart`. Any point in the tree parsed by `XML::Smart` can be accessed as an array, or as a tree - whichever is convenient for you at that point. In addition, `XML::Smart` already has methods for writing out the tree (in the original order), and for parsing the XML (with pluggable parsers), so it would seem you would be able to concentrate on the important stuff (like what to do with the data once you have it parsed).	[reply] [d/l] [select]
Re: Re: Maintaing the insertion order of an N-tier hash by dwhitney (Beadle) on Mar 05, 2004 at 23:32 UTC
Thanks for all of your help. Take a look at the code I came up with. I know it's not quite what I said, but it gives me what I need! Thanks all!!!!!	[reply]