flexvault has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks

The attached program compares 3 ways I could replicate a multi-dimensional hash between child servers. I currently use the 2nd method. The master hash lives in the parent's global memory, and the children manage their sub-hash by copying between global and local memory. The times are already very fast, but because the global memory must be share-locked on copy and exclusive-locked on update, I want the copy to be as fast as possible.
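
To make the locking pattern above concrete, here is a minimal sketch of "copy under a shared lock, write back under an exclusive lock". The post does not say how the global memory is locked, so the lock file, the `flock` calls, and the helper names `fetch_subhash` and `store_subhash` are all assumptions made for illustration, not the poster's actual mechanism.

```perl
use strict;
use warnings;
use Fcntl qw( :flock );
use File::Temp qw( tempfile );

# Hypothetical stand-ins: a lock file and a small master hash.
my ( $lock_fh, $lock_path ) = tempfile( UNLINK => 1 );
my %Account = ( 1 => { a => 'x', b => 'y' }, 2 => { c => 'z' } );

# Copy one sub-hash while holding a shared lock, then release it at once.
sub fetch_subhash {
    my ($id) = @_;
    flock( $lock_fh, LOCK_SH ) or die "shared lock failed: $!";
    my %local = %{ $Account{$id} };    # flat sub-hash, so a list copy suffices
    flock( $lock_fh, LOCK_UN ) or die "unlock failed: $!";
    return \%local;
}

# Write the local copy back while holding an exclusive lock.
sub store_subhash {
    my ( $id, $local ) = @_;
    flock( $lock_fh, LOCK_EX ) or die "exclusive lock failed: $!";
    $Account{$id} = { %$local };
    flock( $lock_fh, LOCK_UN ) or die "unlock failed: $!";
    return;
}

my $mine = fetch_subhash(1);
$mine->{a} = 'changed';                # the real work happens outside any lock
store_subhash( 1, $mine );
print "$Account{1}{a}\n";
```

The point of the sketch is only that the lock is held for the duration of the copy and nothing else, which is why the copy time matters.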

What's surprising to me is that copying the entire hash is much faster than copying the sub-hash. We don't want to do that, because in production the master hash can get quite large and each child should not touch anything but its own copy of the sub-hash. For example, in the last line of the output, the global hash is 16,000K bytes and the sub-hash is 65K bytes.

While the program is bulky, the 3 different techniques sit between the "Lock/Unlock" comments. I ran this program with several different versions of perl, on several different machines and architectures. The relative timings in the sample output held on all of them, except that with perl5.8.8 "Replicate sub-Hash" took about 1/2 the time of "Loop Copy", but still more than "Replicate Entire".

So the question is, am I using the best method, or are there other better ways to replicate/copy a sub-hash?

#!/usr/local/bin/perl5.12.1
use strict;
use Time::HiRes qw( gettimeofday );

print "\t Replicate Entire\tLoop Copy\tReplicate sub-Hash\n","-" x 70,"\n";

my $Lcnt = 1_000;
for (my $i=30; $i<=256; $i+=32 )
{
    ## Populate the Hash
    my %Account = ();
    keys( %{ $Account{$i} } ) = $i;
    for my $cnt ( 0 .. $i )
    {
        for my $id ( 0 .. $i )
        {
            $Account{"$cnt"}{"$id"}="x" x (int(rand($i))+6);
        }
    }

    ## Lock the global hash %Account and Replicate the entire hash
    my $tm = gettimeofday;
    for ( 0 .. $Lcnt )       ## Just to get a big enough number to compare
    {
        my %TAccount = %Account;
    }
    $tm = sprintf("%.7f",(gettimeofday-$tm)/$Lcnt);
    ## Unlock the global hash %Account

    ## Lock the global hash %Account and loop copy the local hash from the global hash of hashes
    my $tm1 = gettimeofday;
    for ( 0 .. $Lcnt )       ## Just to get a big enough number to compare
    {
        my %TAccount = ();
        keys( %TAccount ) = $i;
        foreach my $key ( keys %{ $Account{$i} } )
        {
            $TAccount{$key} = $Account{$i}{$key};
        }
    }
    $tm1 = sprintf("%.7f",(gettimeofday-$tm1)/$Lcnt);
    ## Unlock the global hash %Account

    ## Lock the global hash %Account and replicate the local hash from the global hash of hashes
    my $tm2 = gettimeofday;
    for ( 0 .. $Lcnt )       ## Just to get a big enough number to compare
    {
        my %TAccount = %{ $Account{"$i"} };
    }
    $tm2 = sprintf("%.7f",(gettimeofday-$tm2)/$Lcnt);
    ## Unlock the global hash %Account

    print $i, ":", scalar %Account, "\t$tm\t$tm1\t$tm2\n";
}
1;

            Replicate Entire    Loop Copy    Replicate sub-Hash
--------------------------------------------------------------------
30:23/32        0.0000673       0.0001909       0.0001010
62:44/64        0.0001366       0.0003928       0.0002065
94:70/128       0.0002118       0.0005497       0.0003087
126:86/128      0.0002938       0.0008210       0.0004337
158:119/256     0.0004036       0.0010650       0.0005504
190:134/256     0.0005288       0.0011924       0.0006958
222:148/256     0.0005881       0.0014451       0.0007889
254:163/256     0.0006681       0.0017491       0.0009370

Thank you

"Well done is better than well said." - Benjamin Franklin

Replies are listed 'Best First'.
Re: How best to replicate/copy a hash of hashes
by zentara (Cardinal) on Oct 04, 2010 at 16:34 UTC

      Thanks for the answer. I looked at both Storable and Clone, and ruled Storable out because of performance. I got Clone from CPAN, installed it, then ran it in the test. It looked okay, but on checking the output I found that it put a '\n' between the key and data in the cloned hash. Maybe there is an undocumented parameter to fix this. Either way, I'm still looking, but thanks for the info.

      Thank you

      "Well done is better than well said." - Benjamin Franklin

        I found that it put a '\n' between the key and data in the cloned hash

        That can only be an artifact of the way you are inspecting the cloned hash, because Clone::clone() adds nothing:

        use Clone qw[ clone ];;
        my $n = 0; my %hash = map{ $_, ++$n} 'a' .. 'e';;
        pp \%hash;;
        { a => 1, b => 2, c => 3, d => 4, e => 5 }
        my $copy = clone \%hash;;
        pp $copy;;
        { a => 1, b => 2, c => 3, d => 4, e => 5 }
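
One way to rule out an inspection artifact is to compare the copy against the original programmatically instead of reading printed output. The sketch below is an illustration only: it uses the core Storable `dclone` as a stand-in so that it runs without installing anything (Clone's `clone` could be substituted), and it uses Data::Dumper with sorted keys so that the two dumps can be compared as strings.

```perl
use strict;
use warnings;
use Data::Dumper;
use Storable qw( dclone );

$Data::Dumper::Sortkeys = 1;    # stable key order so dumps can be compared
$Data::Dumper::Indent   = 1;

my $n    = 0;
my %hash = map { ( $_ => ++$n ) } 'a' .. 'e';
my $copy = dclone( \%hash );    # stand-in for Clone::clone( \%hash )

# Any newlines in this output come from Dumper's layout, not from the data.
print Dumper($copy);

my $same = Dumper( \%hash ) eq Dumper($copy) ? 'identical' : 'different';
print "dumps are $same\n";

# Check the stored values directly for stray newlines.
my @bad = grep { $copy->{$_} =~ /\n/ } keys %$copy;
print "values containing a newline: ", scalar(@bad), "\n";
```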

        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: How best to replicate/copy a hash of hashes
by jethro (Monsignor) on Oct 04, 2010 at 17:09 UTC
    Is { my %TAccount = %Account; } your method to copy a hash of hashes? What you copied is only the top-level hash; all the sub-hashes are still the original ones. Try this:
    #!/usr/bin/perl
    my %hash= ( 'a' => [ 1=>2, 3=>4 ], 'b' => [ 5=>6, 7=>8 ] );
    print \%hash," ", $hash{'a'}," ", $hash{'b'},"\n";
    my %hash2= %hash;
    print \%hash2," ", $hash2{'a'}," ", $hash2{'b'},"\n";

    #prints
    # HASH(0x8daf70) ARRAY(0x8aee28) ARRAY(0x8cfc10)
    # HASH(0x8db120) ARRAY(0x8aee28) ARRAY(0x8cfc10)

    As you can see, %hash2 is at a different memory location, but its entries point to the same sub-structures (array refs in this example; hash refs behave the same way).
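
The practical consequence of sharing the inner references is that a write through the copy shows up in the original. A minimal sketch, with made-up keys and values, contrasting a plain list assignment with two ways of copying the inner level as well (the core Storable `dclone`, and a hand-rolled copy that assumes the structure is exactly two levels deep):

```perl
use strict;
use warnings;
use Storable qw( dclone );

my %orig = ( a => { x => 1 }, b => { y => 2 } );

# Shallow copy: new top-level hash, but the values are the same references.
my %shallow = %orig;
$shallow{a}{x} = 99;
print "after shallow edit: $orig{a}{x}\n";      # prints 99, the original changed too

# Deep copy with the core Storable module: inner hashes are duplicated.
my %deep = %{ dclone( \%orig ) };
$deep{b}{y} = 77;
print "after deep edit: $orig{b}{y}\n";         # prints 2, the original is untouched

# Hand-rolled copy for a structure known to be exactly two levels deep.
my %two_level = map { ( $_ => +{ %{ $orig{$_} } } ) } keys %orig;
$two_level{a}{x} = 5;
print "after two-level edit: $orig{a}{x}\n";    # still 99
```

Which of the two deeper copies is faster depends on the data, so it is worth timing both on a realistic hash.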

      Thanks, I took your suggestion, put the prints into the program, ran it only once, and printed the \hash and the data for 3 \keys. The output is below. 'B:' is before the replicate/copy and 'A:' is after. As you can see, it looks like the data has been copied to a different location. I have no idea why 'B:' says 'REF' and 'A:' says 'SCALAR', since they are the exact same print statement, only adding the T to Account to show the two different hashes.

      print "\nB:Acct: ",\%Account,"\t",\$Account{"0"},"\t",\$Account{"1"},"\t",\$Account{"10"},"\n";
      {
          my %TAccount = ();
          keys( %TAccount ) = $i;
          foreach my $key ( keys %{ $Account{$i} } )
          {
              $TAccount{$key} = $Account{$i}{$key};
          }
          print "A:TAcct1: ",\%TAccount,"\t",\$TAccount{"0"},"\t",\$TAccount{"1"},"\t",\$TAccount{"10"},"\n";

      print "\nB:Acct: ",\%Account,"\t",\$Account{"0"},"\t",\$Account{"1"},"\t",\$Account{"10"},"\n";
      {
          my %TAccount = %{ $Account{"$i"} };
          print "A:TAcct2: ",\%TAccount,"\t",\$TAccount{"0"},"\t",\$TAccount{"1"},"\t",\$TAccount{"10"},"\n";

      B:Acct:    HASH(0x300c0120)     REF(0x300e72d8)         REF(0x3002fd8c)         REF(0x300f4c00)
      A:TAcct1:  HASH(0x300e70a4)     SCALAR(0x301413b0)      SCALAR(0x30141284)      SCALAR(0x30141344)

      B:Acct:    HASH(0x300c0120)     REF(0x300e72d8)         REF(0x3002fd8c)         REF(0x300f4c00)
      A:TAcct2:  HASH(0x300e7260)     SCALAR(0x30141974)      SCALAR(0x30141848)      SCALAR(0x30141908)
      

      Thank you

      "Well done is better than well said." - Benjamin Franklin

        I was talking previously about your first method, the one you called "Replicate Entire". The one that was so fast. As far as I can see, it is not working (unless you don't mind that the copies all still access the same data in the source HoH).

        You have now shown results from the third method instead, where you copy a subhash. That one basically works correctly, as there is no problem in copying a simple hash.

        It's actually hard for me to understand how this code prints anything sensible at all. I just added white-space and comments. Something is a bit bizarre here.

        Update: Anyway $Account{"0"} is a reference to a hash. The purpose of this is to create a new hash and "take out one level of hash key" which is a reference. $Account{$i}{$key} should be a scalar and likewise $TAccount{$key} should be a scalar - the {$i} keys are "gone". I still think something is a bit weird with the code below, but I think I grok the idea of what is supposed to happen.

        print "\nB:Acct: ",\%Account,"\t",\$Account{"0"},"\t",
              \$Account{"1"},"\t",\$Account{"10"},"\n";
        #######
        {   #I'm guessing this is a typo ????
        #######
            my %TAccount = ();
            keys( %TAccount ) = $i;
            foreach my $key ( keys %{ $Account{$i} } )
            {
                $TAccount{$key} = $Account{$i}{$key};
            }
            print "A:TAcct1: ",\%TAccount,"\t",\$TAccount{"0"},
                  "\t",\$TAccount{"1"},
                  "\t",\$TAccount{"10"},"\n";

        print "\nB:Acct: ",\%Account,"\t",\$Account{"0"},
              "\t",\$Account{"1"},
              "\t",\$Account{"10"},"\n";
        {   #### this does nothing!!! #####
            #### except hide the previous %TAccount
            my %TAccount = %{ $Account{"$i"} };
            print "A:TAcct2: ",\%TAccount,"\t",\$TAccount{"0"},
                  "\t",\$TAccount{"1"},
                  "\t",\$TAccount{"10"},"\n";
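
The REF versus SCALAR difference in the output follows from what each hash slot holds, and can be reproduced in isolation. In this small sketch (the keys and values are invented), `\$Account{0}` is a reference to a slot containing a hash reference, while `\$TAccount{0}` is a reference to a slot containing a plain string:

```perl
use strict;
use warnings;

my %Account  = ( 0 => { 0 => 'xxxxxx', 1 => 'yyyyyy' } );    # hash of hashes
my %TAccount = %{ $Account{0} };                              # flat copy of one sub-hash

# A reference to a slot that itself holds a reference reports as REF.
print ref( \$Account{0} ), "\n";     # REF

# A reference to a slot that holds a plain string reports as SCALAR.
print ref( \$TAccount{0} ), "\n";    # SCALAR

# The value stored in the master hash slot is a HASH reference.
print ref( $Account{0} ), "\n";      # HASH
```

So the two print statements look alike but are not inspecting the same level of the structure.
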
Re: How best to replicate/copy a hash of hashes
by Anonymous Monk on Oct 04, 2010 at 15:41 UTC
    What do those numbers even mean? :p Use Benchmark; it's core.
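
A minimal sketch of what that could look like with the core Benchmark module's `cmpthese`, comparing ways of copying one sub-hash. The hash is populated as in the original program for a single size (`$i = 30`); the labels `loop_copy`, `list_copy`, and `slice_copy` are names chosen here, and the slice variant is an extra candidate that does not appear in the original post.

```perl
use strict;
use warnings;
use Benchmark qw( cmpthese );

my $i = 30;
my %Account;
for my $cnt ( 0 .. $i ) {
    for my $id ( 0 .. $i ) {
        $Account{$cnt}{$id} = 'x' x ( int( rand($i) ) + 6 );
    }
}

# A negative count asks Benchmark to run each sub for at least that many CPU seconds.
cmpthese( -1, {
    loop_copy => sub {
        my %t;
        $t{$_} = $Account{$i}{$_} for keys %{ $Account{$i} };
        return;
    },
    list_copy => sub {
        my %t = %{ $Account{$i} };
        return;
    },
    slice_copy => sub {
        my %t;
        my @k = keys %{ $Account{$i} };
        @t{@k} = @{ $Account{$i} }{@k};
        return;
    },
} );
```

`cmpthese` prints a table of rates and relative percentages, which is easier to interpret than hand-computed per-iteration times.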