genecutl has asked for the wisdom of the Perl Monks concerning the following question:

I'm working with a complex data structure that looks like this:
$data[1]->{'477'}->[5] = [ 'value', 'value', 0 , undef, 5, 'value' ];
The code was working quite nicely until I started working with very large data sets that caused my machine to run out of memory. To get around this problem, I'm trying to use a disk-based tied data structure with MLDBM. Unfortunately, I can't get it to work. Here is some simplified code:
use MLDBM qw(DB_File Storable);

# ... stuff ....

foreach $gid (@group_ids) {
    my %temp_hash;
    unlink "$DBFilePath/$$.$gid" if -e "$DBFilePath/$$.$gid";
    tie %temp_hash, 'MLDBM', "$DBFilePath/$$.$gid", O_RDWR|O_CREAT, 0640
        or die "Can't tie temp_array to $DBFilePath/$$.$gid: $!";
    $data[$gid] = \%temp_hash;
    foreach $sample_id (@{$groups[$gid]}) {
        foreach $element_num ( 0 .. $element_count ) {
            $element = [];
            # ... do a bunch of stuff here ...
            $data[$gid]->{$sample_id}->[$element_num] = $element;
        }
    }
}
Unfortunately, that assignment at the end doesn't take, and $data[$id] remains empty. I've tried things like this:
$tmp = $data[$gid]->{$sample_id};
$tmp->[$element_num] = $element;
But that still doesn't work. The only way I've found to add anything to $data[$id] is to assign an array ref in its entirety:
$tmp = $data[$gid];
$tmp->{$sample_id} = [ $element1, $element2, ... ];
That is obviously not a workable solution, since I need to add elements one by one to get around the memory issues. Am I asking for too much or am I just doing this wrong? Thanks!

Re: Need help with complex tied data structure
by perrin (Chancellor) on Nov 20, 2003 at 22:22 UTC
    Your code sample looks incomplete. You assign a ref to your tied hash to $data[$id] and then never do anything with it. Can you show a more complete example?
      $data[$gid] is a hash ref, so $data[$gid]->{$sample_id} is an element in that hash. To make it more explicit:
      foreach $sample_id (@{$groups[$gid]}) {
          $data[$gid]->{$sample_id} = [];   # assign anon array ref to tied hash
          foreach $element_num ( 0 .. $element_count ) {
              $element = [];
              # ... do a bunch of stuff here ...
              $data[$gid]->{$sample_id}->[$element_num] = $element;
          }
      }
      Does that make more sense?
        Is that a typo in your original code then? You never put the tied hash into $data[$gid], you put it in $data[$id] instead.
Re: Need help with complex tied data structure
by Abigail-II (Bishop) on Nov 20, 2003 at 23:04 UTC
    Have you paid attention to this part of the manual page:
    BUGS
      1. Adding or altering substructures to a hash value is not entirely transparent in current perl. If you want to store a reference or modify an existing reference value in the DBM, it must first be retrieved and stored in a temporary variable for further modifications. In particular, something like this will NOT work properly:

             $mldb{key}{subkey}[3] = 'stuff';   # won't work

         Instead, that must be written as:

             $tmp = $mldb{key};                 # retrieve value
             $tmp->{subkey}[3] = 'stuff';
             $mldb{key} = $tmp;                 # store value

         This limitation exists because the perl TIEHASH interface currently has no support for multidimensional ties.

    Abigail
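
To make the quoted limitation concrete, here is a minimal sketch that runs with core Perl only. `FrozenHash` is a hypothetical stand-in for MLDBM, not its actual implementation: it only assumes what the BUGS note implies, namely that each top-level value is serialized on store and comes back as a fresh copy on fetch.

```perl
use strict;
use warnings;

# Hypothetical stand-in for MLDBM (an assumption for illustration):
# every value is kept as a Storable-frozen string, the way a DBM file
# can only hold flat strings.
package FrozenHash;
use Storable qw(freeze thaw);

sub TIEHASH { my ($class) = @_; return bless {}, $class }
sub STORE   { my ($self, $key, $value) = @_; $self->{$key} = freeze($value); return }
sub EXISTS  { my ($self, $key) = @_; return exists $self->{$key} }
sub FETCH   {
    my ($self, $key) = @_;
    # Each fetch thaws a brand-new copy of the stored structure
    return exists $self->{$key} ? thaw($self->{$key}) : undef;
}

package main;

tie my %db, 'FrozenHash';
$db{sample} = ['first'];

# Nested write: FETCH hands back a temporary copy, the copy is
# modified, and nothing ever calls STORE, so the change is dropped.
$db{sample}[1] = 'lost';
my $count_after_nested = scalar @{ $db{sample} };

# Fetch, modify, store back: the pattern from the manual page.
my $tmp = $db{sample};    # retrieve a copy
$tmp->[1] = 'kept';       # modify the copy
$db{sample} = $tmp;       # store the whole value again
my $count_after_fix = scalar @{ $db{sample} };

print "after nested write: $count_after_nested element(s)\n";
print "after store-back:   $count_after_fix element(s)\n";
```

The point of the sketch is the missing third step: any number of temporary variables only ever change a private copy until something is assigned back to a top-level key of the tied hash.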

      Yes, I saw that. That's why I show the code at the end of my question with the $tmp variable. I tried pointing $tmp to different depths in the data structure, as well as having multiple temporary variables like:
      $tmp = $data[$gid];
      $tmp2 = $tmp->{$id};
      $tmp2->[$element_num] = $element;
      But none of that seemed to work.
        The fragment I posted also stresses the importance of storing the modified value back in the top-level hash. Your code doesn't do that.

        Abigail
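
Applied to the loop from the question, the store-back step could be wrapped in a small helper. This is only a sketch: `set_element` is a hypothetical name, and a plain hash stands in for the tied MLDBM hash so that the example runs without a DBM file. In the original setting `$db` would be `$data[$gid]`.

```perl
use strict;
use warnings;

# Hypothetical helper: set one slot of the array stored under $key.
# $db is assumed to be a reference to the (tied) hash, e.g. $data[$gid].
sub set_element {
    my ($db, $key, $index, $value) = @_;
    my $row = $db->{$key} || [];    # fetch a copy, or start a new array
    $row->[$index] = $value;        # change the copy
    $db->{$key} = $row;             # store the whole array back
    return;
}

# Plain hash used as a stand-in for the tied hash (an assumption)
my %store;
my @data = (\%store);
my $sample_id = '477';

for my $element_num (0 .. 2) {
    my $element = [ "value$element_num", 0, undef ];
    # ... do a bunch of stuff here ...
    set_element($data[0], $sample_id, $element_num, $element);
}

print scalar(@{ $store{$sample_id} }), " elements stored\n";
```

With a serializing tie, every call reads and rewrites the entire array for that sample, so the cost grows with the array. If one sample's elements fit in memory at a time, collecting them in a lexical array and assigning it once per sample would likely mean less I/O while still keeping only one sample in memory.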