tomdbs98 has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks,

I appear to have no knack for using hashes. I have a nested hash structure called $ligHash that points to this:

$VAR1 = {
    '34k' => { 'R1' => { ... }, 'R2' => { ... } ... },
    '34n' => { 'R1' => { ... }, etc...
};
Now I pass each sub hash (i.e. '34k') to a function that dumps it:
foreach my $lig (%{$ligHash}) {
    DoDump($lig);
}

sub DoDump {
    my $ligand = shift;
    open TEMP, ">>", "ligandDump.txt";
    print TEMP Dumper($ligand);
    close TEMP;
}
I end up getting instead this:
$VAR1 = '34k';
$VAR1 = { 'R1' => { ... } etc.. }
$VAR1 = '34n';
$VAR1 = { 'R1' => { ... } etc.. }
Does anyone know why this is happening?

And as a side question, when I dump the entire structure to a .txt file it ends up being ~6.5 MB. Is this excessively large to hold in memory or anything like that?

Thanks ahead of time!

-Thomas

Re: Problem cycling through top level of nested hashes
by almut (Canon) on Jun 10, 2010 at 16:12 UTC

    Use keys (or values, in case you don't need to know which keys the hash refs are associated with):

    foreach my $lig (keys %{$ligHash}) { DoDump($ligHash->{$lig}); }

    (Without keys or values, the hash is simply flattened into a list, so you get all its elements, i.e. alternating keys and values.)
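The flattening described above can be seen in a minimal sketch. The data here is a made-up stand-in for $ligHash (the real sub-hashes are elided in the question), and the variable names are only for illustration:

```perl
use strict;
use warnings;

# Hypothetical stand-in for $ligHash; the real contents are not shown in the thread.
my $ligHash = {
    '34k' => { R1 => {}, R2 => {} },
    '34n' => { R1 => {} },
};

# A hash in list context yields its keys and values interleaved.
my @flat = %{$ligHash};

# keys and values each yield only one half of those pairs.
my @names = sort keys %{$ligHash};
my @subs  = values %{$ligHash};

print "flattened list has ", scalar(@flat), " items\n";   # 2 keys + 2 values
print "keys: @names\n";
print "values that are hash refs: ",
      scalar(grep { ref($_) eq 'HASH' } @subs), "\n";
```

Looping over @flat is what the original foreach did, which is why the dump alternated between a plain string and a hash ref.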

Re: Problem cycling through top level of nested hashes
by kennethk (Abbot) on Jun 10, 2010 at 16:14 UTC
    The issue you are having is in how you are cycling over your hash. foreach my $lig (%{$ligHash}) converts the hash into a list of alternating keys and values. I believe what you mean to use is:

    foreach my $lig (values %{$ligHash}) { DoDump($lig); }

    This will feed only the sub-hashes into your subroutine (see values).

    Regarding memory usage: Today, 6.5 MB is not all that much, unless you are using legacy hardware or developing toward mobile apps. The definition of "a lot" depends strongly on application.

    Update: After reading almut's post above, I have to ask: what is your expected output? Please read "How do I post a question effectively?" Clearly two monks read your post and drew opposite conclusions as to your intention.

      Actually, I get the exact same output from both solutions, which works just great for me :) So thank you both.

      I end up with:

      $VAR1 = { 'R1' => { }, 'R2' => { } ... };
      $VAR1 = { 'R1' => { }, 'R2' => { } ... };
      ...

      However, I will be using:

      foreach my $lig (keys %{$ligHash}) {
          findIntersection($ligHash->{$lig}, $lig);
      }

      because it gives me the option to easily pass the hash name (i.e. '34k') as a string.

      Did either of you intend a different output? If so, I would be interested in that as well.
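As a rough sketch of the keys-based loop settled on above: findIntersection is not shown anywhere in the thread, so describe_ligand below is a hypothetical stand-in that takes the same two arguments (sub-hash reference, then its key), and the data is invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $ligHash.
my $ligHash = {
    '34k' => { R1 => {}, R2 => {} },
    '34n' => { R1 => {} },
};

# Hypothetical replacement for findIntersection: receives the sub-hash
# reference and the top-level key it was stored under.
sub describe_ligand {
    my ($ligand, $name) = @_;
    my $count = scalar keys %{$ligand};
    return "$name: $count top-level keys";
}

my @lines;
# sort is used only to make the order predictable; hash order is not.
foreach my $lig (sort keys %{$ligHash}) {
    push @lines, describe_ligand($ligHash->{$lig}, $lig);
}

print "$_\n" for @lines;
```

Iterating over values alone would hand the sub the same hash refs, but the name ('34k', '34n') would no longer be available inside the loop.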

Re: Problem cycling through top level of nested hashes
by skywalker (Beadle) on Jun 10, 2010 at 19:09 UTC

    Just a quick one regarding your side question. As far as I know, there is no limit on the size of a hash; it's simply a question of RAM.

    I've read and deduplicated a UK MPS file (4.5 million records) with no problem on a machine with 2 GB of RAM.

    skywalker