traceyfreitas has asked for the wisdom of the Perl Monks concerning the following question:

I have large datastructures (hashes of hashes of ...) containing data that have been computed over quite some time and I normally just use:

use Storable; ... store \%hash1, $filename;

and

my $href2 = retrieve $filename;

to dump and reload from disk quickly, and between runs of the same program.

The problem is now that I'm using threads and threads::shared, my shared hashes do not store to disk. Actually they do, but they're basically empty. I've read posts similar to this issue, but not regarding the use of "store" itself.

Anyone know how to coax Storable to "store" a shared hash (of a hash of a hash...)? My current (inefficient) workaround is to create the HoHoH as a normal (unshared) variable, "store" it, then create a shared version of it, destroy the unshared version, and proceed with the shared version.

Thanks in advance!

Re: unable to store shared hash with Storable
by ikegami (Patriarch) on Aug 10, 2011 at 18:31 UTC
    You could use JSON or YAML instead. I hear they're even faster since they don't try to be as precise as Storable, which is also the very reason why they'll solve your problem too.

      JSON's encode() function would create a UTF-8 text string out of the shared hash, so I just used decode() to turn it back into an (unshared) HREF, and then I stored that.

      my $encoded = JSON::XS->new->utf8->encode( $HREF );
      my $decoded = JSON::XS->new->utf8->decode( $encoded );
      store $decoded, $filename1;
      or I could just slap it all into one statement:
      store(JSON::XS->new->utf8->decode( JSON::XS->new->utf8->encode( $HREF ) ), $filename1);
      and all is well in the universe again!
        Well, I was suggesting you could use JSON *instead* of Storable — they're both tools to serialise data structures — but that works too :)
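The "JSON instead of Storable" idea can be sketched roughly as follows: write the JSON text straight to a file and skip store() altogether. This is only a minimal sketch under some assumptions: it uses the core JSON::PP module rather than JSON::XS, the sample data and the temporary file are made up for illustration, and it needs a perl built with thread support.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use JSON::PP;
use File::Temp qw(tempfile);

# Hypothetical sample data: a shared hash of shared hashes.
my %hash :shared = (
    letters => shared_clone( { a => 'b', c => 'd' } ),
    numbers => shared_clone( { 1 => 2,   3 => 4 } ),
);

# canonical() sorts keys so the file content is stable between runs.
my $json = JSON::PP->new->utf8->canonical;

# Serialise the shared structure directly to disk as UTF-8 encoded JSON.
my ( $fh, $filename ) = tempfile( UNLINK => 1 );
binmode $fh;
print {$fh} $json->encode( \%hash );
close $fh or die "close failed: $!";

# Read it back; decode() returns an ordinary unshared structure,
# so shared_clone() is used to get a shared copy again.
open my $in, '<:raw', $filename or die "open failed: $!";
my $text = do { local $/; <$in> };
close $in;

my $restored = shared_clone( $json->decode($text) );
print "letters/a = $restored->{letters}{a}\n";
```

The trade-off is that JSON only carries plain hashes, arrays, strings and numbers, so anything Storable would preserve beyond that (blessed objects, for instance) is lost in the round trip.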

      I've read that YAML was the way to go, but the Dump() interface would just push out ASCII text. I did another check and it appears that YAML::XS might generate the binary for me. Thanks for the suggestions! I'll report back after trying it out.

Re: unable to store shared hash with Storable
by zentara (Cardinal) on Aug 10, 2011 at 16:40 UTC
    The problem is now that I'm using threads and threads::shared, my shared hashes do not store to disk. Actually they do, but they're basically empty

    Not much help, but I know from experimenting with shared hashes, the share method only shares the first level keys. Maybe you could use the freeze and thaw methods? See the MEMORY STORE section of perldoc Storable.
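The point about only the first level being shared can be seen in a small sketch. It is illustrative only (the hash and key names are invented) and assumes a perl built with thread support: a shared hash refuses a plain nested reference, while shared_clone() produces a deep shared copy that can be stored in it.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %shared :shared;

# Storing a plain (unshared) reference into a shared hash is rejected
# at run time, so the assignment is wrapped in eval to capture the error.
my $stored_plain = eval { $shared{inner} = { a => 1 }; 1 };
my $error = $@;
print "plain ref rejected: $error" unless $stored_plain;

# shared_clone() makes a deep shared copy, including nested levels.
$shared{inner} = shared_clone( { a => 1, deeper => { b => 2 } } );
print "nested value: $shared{inner}{deeper}{b}\n";
```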


Re: unable to store shared hash with Storable
by BrowserUk (Patriarch) on Aug 11, 2011 at 04:59 UTC

    This is one of those occasions when the intelligence in one module (Storable) combines with the incompleteness of another (threads::shared), to produce a mess.

    Your JSON cloning mechanism is overkill though. A simple dumb clone() does the trick:

    #! perl -slw
    use strict;
    use Data::Dump qw[ pp ];
    $Data::Dump::WIDTH = 1000;

    use threads;
    use threads::shared;
    use Storable qw[ store retrieve ];

    # Recursively copy a (possibly shared) structure into plain, unshared data.
    sub clone {
        my $ref = shift;
        ref( $ref ) or return $ref;
        ref( $ref ) eq 'SCALAR' and return \(''.$$ref);
        ref( $ref ) eq 'HASH'   and return { map+( $_, clone( $ref->{$_} )), keys %{ $ref } };
        ref( $ref ) eq 'ARRAY'  and return [ map clone( $ref->[$_] ), 0 .. $#{ $ref } ];
        die;    # any other reference type is not handled
    }

    my %hash : shared = (
        letters => shared_clone( { 'a'..'z' } ),
        numbers => shared_clone( { 1 .. 100 } ),
    );

    # Store an unshared copy, then reload it and share it again.
    store clone( \%hash ), "$0.bin";

    my $retrieved :shared = shared_clone( retrieve "$0.bin" );
    pp $retrieved;


      Awesome. I benchmarked both the JSON and your clone() sub on one of my actual datasets -- with store() -- and both obviously worked really fast:

      CLONE() TIME: 0 wallclock secs ( 0.18 usr + 0.00 sys = 0.18 CPU)
      JSON TIME:    0 wallclock secs ( 0.16 usr + 0.00 sys = 0.16 CPU)

      There are only very slight size differences in the binary files produced by store():

      624924 Aug 11 12:39 sharedhash.json
      622808 Aug 11 12:39 sharedhash.clone

      but they still contain the same data when "retrieved". Thanks, BrowserUk.