Hashes/Scalars and Memory Usage

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Re: Hashes/Scalars and Memory Usage by Fastolfe (Vicar) on Feb 09, 2001 at 00:12 UTC
The short answer: Don't worry about it. Use whatever is more readable and whatever makes sense in your application. If you're talking about 30 arbitrary variables built using soft references, for example, I would vote 100% for going with a hash.

The long answer: Any scalar variable is stored in an SV internally. A hash is just a container around a bunch of SVs. I don't know that either method is necessarily "less efficient" (are you talking about memory or execution time?). On one hand you have the hash lookup to find the value you're looking for in a hash, and on the other hand you have a symbol table lookup for individual scalars (though I guess different variable scoping conventions could make that work differently -- someone else may be able to provide more information). I honestly don't think there's any serious gain one way or the other, and as a Perl developer you shouldn't have to worry about such a thing, especially with only 30 variables. Use what makes the most sense in your code, but avoid playing tricks with soft references that could come around and bite you in the ass later on. Hope this helps.
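To make the soft-reference warning concrete, here is a minimal sketch contrasting the two approaches. The field names and values are invented for illustration and do not come from the original poster's code.

```perl
use strict;
use warnings;

# Hypothetical field names, made up for this sketch.
my @names = qw(company_name company_addr company_phone);

# Soft (symbolic) references: each name becomes a package variable.
# 'use strict' forbids this, so it has to be switched off explicitly.
{
    no strict 'refs';
    ${"main::$_"} = "value of $_" for @names;
}

# The hash version: one lexical container, no symbol table involved.
my %field;
$field{$_} = "value of $_" for @names;

# Reading back: the hash lookup works under strict; the symbolic
# lookup needs strict 'refs' disabled again.
my $from_hash = $field{company_addr};
my $from_soft = do { no strict 'refs'; ${"main::company_addr"} };

print "hash:    $from_hash\n";
print "softref: $from_soft\n";
```

Both lookups return the same value; the difference is that the hash keeps the data in one place that can be iterated, passed around, and scoped, while the symbolic version scatters it across package variables.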
Re: Re: Hashes/Scalars and Memory Usage by EvilTypeGuy (Initiate) on Feb 09, 2001 at 01:06 UTC
I'm really astounded by both the depth and number of replies :) To fill in the situation a little more: I have a screen where I do something like the following x number of times (basically once for every field I have ...):

```perl
COMPANYNAME: {
    my $company_name = $cgi->textfield(
        -name    => 'company_name',
        -default => $tabdata->{CompanyInfo}->{company_name},
        -size    => 30
    );
    $screen->AppendToSection('body', <<"EOF");
<tr>
 <td class="fieldname">Company Name</td>
 <td class="fieldvalue">$company_name</td>
</tr>
EOF
}
```

I ended up settling on the above. Each field that's on the form (and these are very complex forms at times..) gets its own little section like that for clarity. By placing it in the code block, the variable goes out of scope as soon as the block's done (since I no longer need the variable). Is this the Wrong Thing to Do(TM)?
Re: Re: Re: Hashes/Scalars and Memory Usage by Fastolfe (Vicar) on Feb 09, 2001 at 02:44 UTC
In this case I would certainly use a hash. You say you have the above bit of code duplicated several times in your script? This is immediately a sign that your code can probably be re-worked to eliminate the duplication.

```perl
my @FIELDS = (
    company_name => 'Company Name',
    company_addr => 'Company Address',
    # ...
);

for (my $i = 0; $i < @FIELDS; $i += 2) {
    my $field = $cgi->textfield(
        -name    => $FIELDS[$i],
        -default => $tabdata->{CompanyInfo}->{$FIELDS[$i]},
        -size    => 30,
    );
    $screen->AppendToSection('body', <<"EOF");
<tr>
 <td class='fieldname'>$FIELDS[$i+1]</td>
 <td class='fieldvalue'>$field</td>
</tr>
EOF
}
```

If your code varies in more than this respect (e.g. size, the use of another hash key besides CompanyInfo, etc), you could re-structure the `@FIELDS` array something like this:

```perl
my @FIELDS = (
    'Company Name'    => [ 'CompanyInfo' => 'company_name', 30 ],
    'Company Address' => [ 'CompanyInfo' => 'company_addr', 60 ],
    # ...
);
```

Then adapt your code to follow suit.
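A sketch of what "adapting the code to follow suit" might look like for the restructured list. To keep it runnable on its own, `textfield` here is a hypothetical stand-in for `$cgi->textfield`, the `$tabdata` contents are made-up sample data, and the rows are collected in a string instead of being passed to `$screen->AppendToSection`.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $cgi->textfield so the sketch runs without CGI.pm.
# It does no HTML escaping; real code should.
sub textfield {
    my (%arg) = @_;
    return sprintf '<input type="text" name="%s" value="%s" size="%d">',
        $arg{-name}, $arg{-default}, $arg{-size};
}

# Made-up sample data in the shape the thread implies.
my $tabdata = {
    CompanyInfo => {
        company_name => 'Acme',
        company_addr => '1 Main St',
    },
};

# label => [ section, key, size ]
my @FIELDS = (
    'Company Name'    => [ 'CompanyInfo', 'company_name', 30 ],
    'Company Address' => [ 'CompanyInfo', 'company_addr', 60 ],
);

my $body = '';
for (my $i = 0; $i < @FIELDS; $i += 2) {
    my $label = $FIELDS[$i];
    my ($section, $key, $size) = @{ $FIELDS[$i + 1] };

    my $field = textfield(
        -name    => $key,
        -default => $tabdata->{$section}{$key},
        -size    => $size,
    );

    # One table row per field, same markup as the loop above.
    $body .= <<"EOF";
<tr>
 <td class='fieldname'>$label</td>
 <td class='fieldvalue'>$field</td>
</tr>
EOF
}

print $body;
```

Unpacking the array reference into named variables at the top of the loop keeps the index arithmetic in one place, so adding another per-field setting only touches the `@FIELDS` entries and that one line.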
Re: Re: Re: Re: Hashes/Scalars and Memory Usage by EvilTypeGuy (Initiate) on Feb 10, 2001 at 02:14 UTC
Re: Re: Hashes/Scalars and Memory Usage by EvilTypeGuy (Initiate) on Feb 09, 2001 at 01:01 UTC
This is what I guessed the situation to be... (btw, I'm the anonymous poster, I just forgot to log in that time...)
Re: Re: Hashes/Scalars and Memory Usage by EvilTypeGuy (Initiate) on Feb 13, 2001 at 03:59 UTC
I ended up going with the following:

```perl
sub DisplayFields {
    my ($screen, $screenname, $tabdata, $fields) = @_;

    # $fields is a flat list of name => [ label, type, extra... ] pairs.
    # $screen and $tabdata arrive as references to references.
    for (my $i = 0; $i < @$fields; $i += 2) {
        my $name = $fields->[$i];
        my $spec = $fields->[$i + 1];
        my ($field, $style) = ($spec->[1], '');

        if ($field eq 'textfield') {
            $field = $cgi->textfield(
                -name    => $name,
                -size    => $spec->[2],
                -default => $$tabdata->{$screenname}->{$name},
            );
        } elsif ($field eq 'picklist') {
            $field = $cgi->popup_menu(
                -name    => $name,
                -values  => $spec->[2],
                -labels  => $spec->[3],
                -default => $$tabdata->{$screenname}->{$name},
            );
        } elsif ($field eq 'checkbox') {
            $field = $cgi->checkbox(
                -name    => $name,
                -value   => 1,
                -label   => '',
                -checked => $$tabdata->{$screenname}->{$name},
            );
        } elsif ($field eq 'textarea') {
            $style = 'vertical-align: top';
            $field = $cgi->textarea(
                -name    => $name,
                -rows    => $spec->[2],
                -columns => $spec->[3],
                -default => $$tabdata->{$screenname}->{$name},
            );
        }

        # Table markers pass through the chain above unchanged.
        if ($field eq 'begin_table') {
            $$screen->AppendToSection('body', qq(<table class="$spec->[2]"><tr><td> </td></tr>));
        } elsif ($field eq 'end_table') {
            $$screen->AppendToSection('body', '</table>');
        } else {
            $$screen->AppendToSection('body', qq(<tr><td class="fieldname" style="$style">$spec->[0]</td><td class="fieldvalue">$field</td></tr>));
        }
    }
}
```

So now in my 'tabs' (this is through a web browser, don't ask how :p) I can do neat things like:

```perl
sub CoreModuleInfoTabEdit {
    my ($module, $tabdata, $screen) = @_;
    SetupStyles(\$screen);
    DisplayFields(\$screen, 'CoreModuleInfo', \$tabdata, [
        begin_table         => ['', 'begin_table', 'config'],
        exit_url            => ['Exit URL', 'textfield', 50],
        session_expire_time => ['Session Expire Time (minutes)', 'textfield', 4],
        session_max_time    => ['Session Max Time (hours)', 'textfield', 4],
        allow_guest_login   => ['Allow Guest Logins', 'checkbox', ''],
        end_table           => ['', 'end_table', ''],
    ]);
}
```

(In the app I'm working on, each subroutine is a different tab, and there's a complex management system that works completely behind the scenes to manage the user's navigation through the tabs.) This works great for me: before, I had two screens that totaled 415 lines; after this, both screens plus the display routine total 123 lines :) Mmmmm, Diet Code(TM)... Thanks Fastolfe!
Re: Hashes/Scalars and Memory Usage by InfiniteSilence (Curate) on Feb 09, 2001 at 00:32 UTC
Here are three benchmarks and some cheesy benchmarking code for a hash, an array, and a variable. The access times are hardly very different. Variables are the fastest across the board, but if you are using variables to 'improve' your code I think your time is better spent looking at other things, like regexes, rather than trying to use only vars. Also, your code, with so many variables, may appear unreadable.

```
Benchmark: timing 1000000 iterations of hsh, reg, var...
       hsh:  0 wallclock secs ( 1.25 usr +  0.00 sys =  1.25 CPU) @ 798084.60/s (n=1000000)
       reg:  2 wallclock secs ( 0.85 usr +  0.00 sys =  0.85 CPU) @ 1175088.13/s (n=1000000)
       var:  1 wallclock secs ( 0.77 usr +  0.00 sys =  0.77 CPU) @ 1297016.86/s (n=1000000)

Benchmark: timing 1000000 iterations of hsh, reg, var...
       hsh:  2 wallclock secs ( 1.17 usr +  0.00 sys =  1.17 CPU) @ 852514.92/s (n=1000000)
       reg:  1 wallclock secs ( 0.85 usr +  0.01 sys =  0.86 CPU) @ 1161440.19/s (n=1000000)
       var:  0 wallclock secs ( 0.75 usr +  0.00 sys =  0.75 CPU) @ 1331557.92/s (n=1000000)

Benchmark: timing 1000000 iterations of hsh, reg, var...
       hsh:  2 wallclock secs ( 1.18 usr +  0.00 sys =  1.18 CPU) @ 845308.54/s (n=1000000)
       reg:  1 wallclock secs ( 0.85 usr +  0.00 sys =  0.85 CPU) @ 1175088.13/s (n=1000000)
       var:  1 wallclock secs ( 0.77 usr +  0.00 sys =  0.77 CPU) @ 1297016.86/s (n=1000000)
```

```perl
#!/usr/bin/perl -w
use strict;
use Benchmark;

my @foo = (1 .. 30);
my %foo;
my $foo = 10000;

# Populate the hash with keys 1 through 30.
for (1 .. 30) {
    $foo{$_} = $_;
}

timethese(1000000, {
    var => q( my $v = $foo; ),
    reg => q( my $v = $foo[29]; ),    # last element; indices run 0..29
    hsh => q( my $v = $foo{'30'}; ),
});
1;
```

Celebrate Intellectual Diversity
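One thing worth checking in a benchmark like this: Benchmark evaluates string snippets inside its own module, where file-scoped `my` variables from the calling script are not visible, so `$foo`, `@foo` and `%foo` inside the strings may resolve to empty package variables rather than the populated lexicals. A hedged variant using code references, which do close over the lexicals, might look like the sketch below. The iteration count is arbitrary, and the timings will not match the figures above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(timethese);

my @foo = (1 .. 30);
my %foo = map { $_ => $_ } 1 .. 30;
my $foo = 10000;

# Code references close over the lexicals declared above,
# so each snippet really reads the populated variable.
my $results = timethese(200_000, {
    var => sub { my $v = $foo },
    reg => sub { my $v = $foo[29] },
    hsh => sub { my $v = $foo{'30'} },
});
```

Each measurement now includes the cost of a subroutine call, which is the same for all three cases, so the relative ordering is still meaningful even though the absolute rates are lower.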
Re: Re: Hashes/Scalars and Memory Usage by Fastolfe (Vicar) on Feb 09, 2001 at 00:52 UTC
Perl is doing an unseen optimization here. Since you're working with a variable with its name spelled out in the code, Perl knows immediately what scalar you are wanting to access, without having to do much in the way of symbol table lookups. I think the original poster was wanting to know the difference between setting and accessing a scalar in a hash versus a soft reference to a variable, the name of which might not be known.

Here's some Benchmark code that takes that into consideration. These results are closer to what I expect, but I'm actually surprised that soft references performed this poorly. Note that I may easily be mistaken about the original poster's requirements here (named variable versus a soft reference), but I think with both of our data, there's enough information to see which is more efficient.

Regardless, we're talking about an operation that takes a minuscule amount of time. If we save a few microseconds of execution time, that doesn't add up to anything significant, ever.

```
Benchmark: running array, hash, softref, each for at least 10 CPU seconds...
     array: 11 wallclock secs (10.50 usr +  0.00 sys = 10.50 CPU) @ 32119.33/s (n=337253)
      hash: 11 wallclock secs (10.30 usr +  0.00 sys = 10.30 CPU) @ 22763.50/s (n=234464)
   softref: 10 wallclock secs (10.39 usr +  0.00 sys = 10.39 CPU) @ 17753.61/s (n=184460)
```

```perl
use Benchmark;

foreach (1..10) {
    $hash{$_}   = 'x';
    $array[$_]  = 'x';
    ${"var_$_"} = 'x';
}

sub hash    { $hash{$_}   = 'hash!'  foreach 1..10 }
sub array   { $array[$_]  = 'array!' foreach 1..10 }
sub softref { ${"var_$_"} = 'ref!'   foreach 1..10 }

timethese(-10, {
    hash    => \&hash,
    array   => \&array,
    softref => \&softref,
});
```
Re: Hashes/Scalars and Memory Usage by kschwab (Vicar) on Feb 09, 2001 at 00:19 UTC
Testing on a Solaris box, with the following two scripts:

```perl
# script one
$f1 = "X" x 200000;
$f2 = $f1;
$f3 = $f1;
# ...
$f30 = $f1;
while (1) { sleep 1 }

# script two
for (1..30) {
    $f{$_} = "X" x 200000;
}
while (1) { sleep 1 }
```

They seem to be identical in memory usage.

```
             Process 1   Process 2
Text VSS     1.9 MB      1.9 MB
Data VSS     8.1 MB      8.1 MB
Stack VSS    16 KB       16 KB
Shmem VSS    0 KB        0 KB
Other VSS    0 KB        0 KB
Total VSS    10 MB       10 MB
```

I would suspect how you end up populating/using the scalars and/or hashes would have a greater effect.
Re: Hashes/Scalars and Memory Usage by AgentM (Curate) on Feb 09, 2001 at 00:28 UTC
AgentM was just blowing gas here. Never mind. Thanks to Fastolfe, chipmunk, myocom for the explanations and for catching my mistake. AgentM Systems nor Nasca Enterprises nor Bone::Easy nor Macperl is responsible for the comments made by AgentM. Remember, you can build any logical system with NOR.
Re: Re: Hashes/Scalars and Memory Usage by Fastolfe (Vicar) on Feb 09, 2001 at 00:32 UTC
```perl
use Data::Dumper;

foreach (qw/ one two three /) {
    $hash{$_}  = "in hash: $_";
    $array[$_] = "in array: $_";
}
print Dumper(\%hash, \@array);
```

```
$VAR1 = {
          'one' => 'in hash: one',
          'three' => 'in hash: three',
          'two' => 'in hash: two'
        };
$VAR2 = [
          'in array: three'
        ];
```

The string used as the array index is being interpreted in a numeric context, which makes it 0. Thus you are reading from and assigning to index 0 of the array every time. The hash will grow, while the array will only contain the last item you save.
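With warnings enabled, Perl flags this kind of mistake. The following minimal sketch repeats the array half of the example under `use warnings` and captures the warnings through a `$SIG{__WARN__}` handler, so they can be counted instead of being printed to STDERR.

```perl
use strict;
use warnings;

my @array;
my @warnings;

{
    # Collect warnings instead of printing them, for inspection below.
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };

    for my $key (qw/ one two three /) {
        # A non-numeric string used as an index numifies to 0.
        $array[$key] = "in array: $key";
    }
}

print "elements: ", scalar(@array), "\n";
print "index 0:  $array[0]\n";
print "warnings: ", scalar(@warnings), "\n";
```

The array ends up with a single element because every assignment lands on index 0, and the captured messages point at the non-numeric strings that caused it.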
