Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have a question about Perl's efficiency when it comes to memory usage. Which of the following two situations uses less memory?

1) I have 30 scalar variables (I have good reasons, just assume I need them all...)...

2) I have one hash with 30 keys...

Which one is more memory efficient? (Assume I don't care about sorting or anything...) Since I need to reference the values by a 'name', these are my two options (unless someone knows something I don't ;) So which is the most efficient?

Re: Hashes/Scalars and Memory Usage
by Fastolfe (Vicar) on Feb 09, 2001 at 00:12 UTC
    The short answer: Don't worry about it. Use whatever is more readable and whatever makes sense in your application. If you're talking about 30 arbitrary variables built using soft references, for example, I would vote 100% for going with a hash.

    The long answer: Any scalar variable is stored in an SV internally. A hash is just a container around a bunch of SVs. I don't know if either method will necessarily be "less efficient" (are you talking about memory or execution time?). On one hand you have the hash lookup to find the value you're looking for in a hash, and on the other hand you have a symbol table lookup for individual scalars (though I guess different variable scoping conventions could make that work differently -- someone else may be able to provide more information).
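    To make the "container around a bunch of SVs" point concrete, here is a minimal sketch using only core Perl. The variable names are made up for illustration; the idea is that a reference to a hash element is an ordinary scalar reference, just like a reference to a named scalar.

```perl
use strict;
use warnings;

# A plain scalar and a hash element: both hold ordinary scalar values.
# (Names and values here are hypothetical, for illustration only.)
my $company_name = 'Acme';
my %field        = (company_name => 'Acme');

# Taking a reference to either one yields a SCALAR reference.
my $ref_to_scalar  = \$company_name;
my $ref_to_element = \$field{company_name};

print ref($ref_to_scalar),  "\n";   # SCALAR
print ref($ref_to_element), "\n";   # SCALAR

# Writing through the reference changes the value stored in the hash.
$$ref_to_element = 'Acme Ltd';
print $field{company_name}, "\n";   # Acme Ltd
```

    This says nothing about the exact byte counts either way; it only shows that the values themselves are the same kind of thing in both cases.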

    I honestly don't think there's any serious gain one way or the other, and as a Perl developer you shouldn't have to worry about such a thing, especially with only 30 variables. Use what makes the most sense in your code, but avoid playing tricks with soft references that could come around and bite you in the ass later on. Hope this helps.
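    As a rough illustration of why soft (symbolic) references can bite, here is a small sketch. The names var_1 .. var_3 are invented for the example. With "use strict" in effect, using a string as a variable name is refused at run time, while the hash version just works.

```perl
use strict;
use warnings;

# Hash approach: the "variable names" are just keys in one lexical hash.
my %value;
$value{"var_$_"} = 'x' for 1 .. 3;

# Symbolic ("soft") reference approach: with "use strict" in effect,
# dereferencing a plain string as a variable name dies at run time.
my $name  = 'var_1';
my $error = '';
eval { my $v = ${$name}; 1 } or $error = $@;

print "hash keys: ", join(', ', sort keys %value), "\n";   # hash keys: var_1, var_2, var_3
print "symbolic reference refused: ", ($error ? 'yes' : 'no'), "\n";   # yes
```

    Turning off strict refs would make the second form run, but then a typo in a name silently creates a new global instead of raising an error.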

      I'm really astounded by both the depth and number of replies :) To fill in the situation a little more, what's going on is I have a screen where I do something like the following x number of times (basically for every field I have ...):
      COMPANYNAME: {
          my $company_name = $cgi->textfield(
              -name    => 'company_name',
              -default => $tabdata->{CompanyInfo}->{company_name},
              -size    => 30,
          );
          $screen->AppendToSection('body', <<"EOF");
      <tr>
        <td class="fieldname">Company Name</td>
        <td class="fieldvalue">$company_name</td>
      </tr>
      EOF
      }
      I ended up settling on the above. Each field that's on the form (and these are very complex forms at times...) gets its own little section like that for clarity. By placing it in the code block, the variable goes out of scope as soon as the block's done (since I no longer need the variable). Is this the Wrong Thing to Do(TM)?
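      For reference, a tiny sketch of the scoping behaviour described above, using a plain string instead of the CGI field (the value and the @rows array are placeholders). The lexical declared inside the labelled block is not visible after the block ends, which the string eval below checks under strict.

```perl
use strict;
use warnings;

my @rows;

COMPANYNAME: {
    my $company_name = 'Acme';              # visible only inside this block
    push @rows, "<td>$company_name</td>";
}

# Outside the block the name no longer exists, so code that mentions it
# fails to compile under "use strict".
my $ok  = eval q{ my $copy = $company_name; 1 };
my $err = $@;

print "rows built: ", scalar(@rows), "\n";                       # rows built: 1
print "still visible after block: ", ($ok ? 'yes' : 'no'), "\n"; # no
```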
        In this case I would certainly use a hash. You say you have the above bit of code duplicated several times in your script? This is immediately a sign that your code can probably be re-worked to eliminate the duplication.
        my @FIELDS = (
            company_name => 'Company Name',
            company_addr => 'Company Address',
            ...
        );

        for (my $i = 0; $i < @FIELDS; $i += 2) {
            my $field = $cgi->textfield(
                -name    => $FIELDS[$i],
                -default => $tabdata->{CompanyInfo}->{$FIELDS[$i]},
                -size    => 30,
            );
            $screen->AppendToSection('body', <<"EOF");
        <tr>
          <td class='fieldname'>$FIELDS[$i+1]</td>
          <td class='fieldvalue'>$field</td>
        </tr>
        EOF
        }
        If your code varies in more than this respect (e.g. size, the use of another hash key besides CompanyInfo, etc), you could re-structure the @FIELDS array something like this:
        my @ARRAY = (
            'Company Name'    => [ 'CompanyInfo' => 'company_name', 30 ],
            'Company Address' => [ 'CompanyInfo' => 'company_addr', 60 ],
            ...
        );
        Then adapt your code to follow suit.
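        One way the adapted loop might look, as a self-contained sketch. The $tabdata contents are invented, and plain strings stand in for the $cgi->textfield and $screen->AppendToSection calls so the example runs without CGI; treat it as an outline, not the actual script.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the form data in the thread.
my $tabdata = {
    CompanyInfo => {
        company_name => 'Acme',
        company_addr => '1 Main St',
    },
};

# Label => [ section, key, size ], kept as a flat list so order is preserved.
my @FIELDS = (
    'Company Name'    => [ 'CompanyInfo' => 'company_name', 30 ],
    'Company Address' => [ 'CompanyInfo' => 'company_addr', 60 ],
);

my @rows;
for (my $i = 0; $i < @FIELDS; $i += 2) {
    my $label = $FIELDS[$i];
    my ($section, $key, $size) = @{ $FIELDS[$i + 1] };
    my $default = $tabdata->{$section}{$key};

    # In the real script this would be $cgi->textfield(...) followed by
    # $screen->AppendToSection(...); a plain string stands in for both.
    push @rows, qq(<tr><td class="fieldname">$label</td>)
              . qq(<td class="fieldvalue"><input name="$key" size="$size" value="$default"></td></tr>);
}

print "$_\n" for @rows;
```

        A flat list of pairs is used rather than a hash so the fields come out in the order they were written.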
      This is what I guessed the situation to be... (btw, I'm the anonymous poster, I just forgot to log in that time...)
      I ended up going with the following:
      sub DisplayFields {
          my ($screen, $screenname, $tabdata, $fields) = @_;

          # $fields is a flat list of name => [ label, type, extra... ] pairs.
          # $screen and $tabdata arrive as references to the caller's variables.
          for (my $i = 0; $i < @$fields; $i += 2) {
              my $name = $fields->[$i];
              my $spec = $fields->[$i+1];
              my ($field, $style) = ($spec->[1], '');

              if ($field eq 'textfield') {
                  $field = $cgi->textfield(
                      -name    => $name,
                      -size    => $spec->[2],
                      -default => $$tabdata->{$screenname}->{$name},
                  );
              } elsif ($field eq 'picklist') {
                  $field = $cgi->popup_menu(
                      -name    => $name,
                      -values  => $spec->[2],
                      -labels  => $spec->[3],
                      -default => $$tabdata->{$screenname}->{$name},
                  );
              } elsif ($field eq 'checkbox') {
                  $field = $cgi->checkbox(
                      -name    => $name,
                      -value   => 1,
                      -label   => '',
                      -checked => $$tabdata->{$screenname}->{$name},
                  );
              } elsif ($field eq 'textarea') {
                  $style = 'vertical-align: top';
                  $field = $cgi->textarea(
                      -name    => $name,
                      -rows    => $spec->[2],
                      -columns => $spec->[3],
                      # $$tabdata, as in the other branches: $tabdata is a reference to the hashref
                      -default => $$tabdata->{$screenname}->{$name},
                  );
              }

              if ($field eq 'begin_table') {
                  $$screen->AppendToSection('body', qq(<table class="$spec->[2]"><tr><td>&nbsp;</td></tr>));
              } elsif ($field eq 'end_table') {
                  $$screen->AppendToSection('body', '</table>');
              } else {
                  $$screen->AppendToSection('body', qq(<tr><td class="fieldname" style="$style">$spec->[0]</td><td class="fieldvalue">$field</td></tr>));
              }
          }
      }
      So now in my 'tabs' (this is through a web browser, don't ask how :p) I can do neat things like:
      sub CoreModuleInfoTabEdit {
          my ($module, $tabdata, $screen) = @_;
          SetupStyles(\$screen);
          DisplayFields(\$screen, 'CoreModuleInfo', \$tabdata, [
              begin_table         => ['', 'begin_table', 'config'],
              exit_url            => ['Exit URL', 'textfield', 50],
              session_expire_time => ['Session Expire Time (minutes)', 'textfield', 4],
              session_max_time    => ['Session Max Time (hours)', 'textfield', 4],
              allow_guest_login   => ['Allow Guest Logins', 'checkbox', ''],
              end_table           => ['', 'end_table', ''],
          ]);
      }
      (In the app I'm working on, each subroutine is a different tab, and there's a complex management system that works completely behind the scenes to manage the user's navigation through the tabs.) This works great for me: before, I had two screens that totaled 415 lines; after this, both screens plus the display routine totaled 123 lines :) Mmmmm, Diet Code(TM)...

      Thanks Fastolfe!
Re: Hashes/Scalars and Memory Usage
by InfiniteSilence (Curate) on Feb 09, 2001 at 00:32 UTC
    Here are three benchmark runs and some cheesy benchmarking code for a hash, an array, and a variable. The access times are not very different. Variables are the fastest across the board, but if you are using variables to 'improve' your code, I think your time is better spent looking at other things, like regexes, rather than trying to use only vars. Also, your code, with so many variables, may become unreadable.
    Benchmark: timing 1000000 iterations of hsh, reg, var...
           hsh:  0 wallclock secs ( 1.25 usr +  0.00 sys =  1.25 CPU) @ 798084.60/s (n=1000000)
           reg:  2 wallclock secs ( 0.85 usr +  0.00 sys =  0.85 CPU) @ 1175088.13/s (n=1000000)
           var:  1 wallclock secs ( 0.77 usr +  0.00 sys =  0.77 CPU) @ 1297016.86/s (n=1000000)
    Benchmark: timing 1000000 iterations of hsh, reg, var...
           hsh:  2 wallclock secs ( 1.17 usr +  0.00 sys =  1.17 CPU) @ 852514.92/s (n=1000000)
           reg:  1 wallclock secs ( 0.85 usr +  0.01 sys =  0.86 CPU) @ 1161440.19/s (n=1000000)
           var:  0 wallclock secs ( 0.75 usr +  0.00 sys =  0.75 CPU) @ 1331557.92/s (n=1000000)
    Benchmark: timing 1000000 iterations of hsh, reg, var...
           hsh:  2 wallclock secs ( 1.18 usr +  0.00 sys =  1.18 CPU) @ 845308.54/s (n=1000000)
           reg:  1 wallclock secs ( 0.85 usr +  0.00 sys =  0.85 CPU) @ 1175088.13/s (n=1000000)
           var:  1 wallclock secs ( 0.77 usr +  0.00 sys =  0.77 CPU) @ 1297016.86/s (n=1000000)

    #!/usr/bin/perl -w
    use strict;
    use Benchmark;

    # Package variables ("our"), not "my": Benchmark compiles the code strings
    # below in its own scope, where file-scoped lexicals are not visible.
    our @foo = (1 .. 30);
    our %foo;
    our $foo = 10000;

    for (1 .. 30) {
        $foo{$_} = $_;
    }

    timethese(1000000, {
        var => q( my $v = $foo; ),
        reg => q( my $v = $foo[29]; ),    # last element; indices run 0..29
        hsh => q( my $v = $foo{'30'}; ),
    });
    1;

    Celebrate Intellectual Diversity

      Perl is doing an unseen optimization here. Since you're working with a variable whose name is spelled out in the code, Perl knows immediately which scalar you want to access, without having to do much in the way of symbol table lookups. I think the original poster wanted to know the difference between setting and accessing a scalar in a hash versus a soft reference to a variable, the name of which might not be known in advance.

      Here's some Benchmark code that takes that into consideration. These results are closer to what I expect, but I'm actually surprised that soft references performed this poorly. Note that I may easily be mistaken about the original poster's requirements here (named variable versus a soft reference), but I think with both of our data sets, there's enough information to see which is more efficient. Regardless, we're talking about an operation that takes a minuscule amount of time. Saving a few microseconds of execution time per access doesn't add up to anything significant.
      Benchmark: running array, hash, softref, each for at least 10 CPU seconds...
           array: 11 wallclock secs (10.50 usr +  0.00 sys = 10.50 CPU) @ 32119.33/s (n=337253)
            hash: 11 wallclock secs (10.30 usr +  0.00 sys = 10.30 CPU) @ 22763.50/s (n=234464)
         softref: 10 wallclock secs (10.39 usr +  0.00 sys = 10.39 CPU) @ 17753.61/s (n=184460)

      use Benchmark;

      foreach (1..10) {
          $hash{$_}   = 'x';
          $array[$_]  = 'x';
          ${"var_$_"} = 'x';
      }

      sub hash    { $hash{$_}   = 'hash!'  foreach 1..10 }
      sub array   { $array[$_]  = 'array!' foreach 1..10 }
      sub softref { ${"var_$_"} = 'ref!'   foreach 1..10 }

      timethese(-10, {
          hash    => \&hash,
          array   => \&array,
          softref => \&softref,
      });
Re: Hashes/Scalars and Memory Usage
by kschwab (Vicar) on Feb 09, 2001 at 00:19 UTC
    Testing on a Solaris box, with the following two scripts:

    #script one
    $f1="X" x 200000;
    $f2=$f1;
    $f3=$f1;
    ...
    $f30=$f1;
    while (1) {sleep 1}

    #script two
    for (1..30) {
        $f{$_}="X" x 200000;
    }
    while (1) {sleep 1}
    They seem to be identical in memory usage.

    Process 1:
    Text  VSS: 1.9mb   Data  VSS: 8.1mb   Stack VSS:  16kb
    Shmem VSS:   0kb   Other VSS:   0kb   Total VSS:  10mb
    
    Process 2:
    Text  VSS: 1.9mb   Data  VSS: 8.1mb   Stack VSS:  16kb
    Shmem VSS:   0kb   Other VSS:   0kb   Total VSS:  10mb
    
    I would suspect how you end up populating/using the scalars and/or hashes would have a greater effect.
    
Re: Hashes/Scalars and Memory Usage
by AgentM (Curate) on Feb 09, 2001 at 00:28 UTC
      use Data::Dumper;

      foreach (qw/ one two three /) {
          $hash{$_}  = "in hash: $_";
          $array[$_] = "in array: $_";
      }
      print Dumper(\%hash, \@array);

      $VAR1 = {
                'one' => 'in hash: one',
                'three' => 'in hash: three',
                'two' => 'in hash: two'
              };
      $VAR2 = [
                'in array: three'
              ];
      The string in the array assignment is being interpreted in a numeric context, which makes it 0. Thus you are getting data from and assigning to index 0 of the array every time. The hash will grow, while the array will only contain the last item you save.
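      The same effect can be checked without Data::Dumper. This is just a small sketch of the behaviour described above; the 'numeric' warning is switched off only to keep the output quiet.

```perl
use strict;
use warnings;

my (%hash, @array);

foreach my $name (qw/ one two three /) {
    $hash{$name} = "in hash: $name";

    # A non-numeric string used as an array index is converted to 0
    # (with an "isn't numeric" warning when warnings are enabled).
    no warnings 'numeric';
    $array[$name] = "in array: $name";
}

print scalar(keys %hash), " hash entries\n";   # 3 hash entries
print scalar(@array), " array element\n";      # 1 array element
print "$array[0]\n";                           # in array: three
```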