Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw
 
PerlMonks  

How to access a static hash.

by gam3 (Curate)
on Mar 17, 2007 at 17:54 UTC ( [id://605287]=perlquestion: print w/replies, xml ) Need Help??

gam3 has asked for the wisdom of the Perl Monks concerning the following question:

Once not that long ago someone told me that
my $hash = { a => 'b' }; $hash->{$x}
is faster than
{a => 'b'}->{$x}
I finally got around to testing this statment. I found that it was completely wrong. I was also suprised at the difference.
our $hash = { 'a' => 'A', 'b' => 'B'}; our $y = 'bob'; cmpthese(-10, { 'hash' => sub { for my $x (qw(a b a b a b)) { my $y = { 'a' => 'A' +, 'b' => 'B'}->{$x}; }}, 'ref' => sub { for my $x (qw(a b a b a b)) { my $y = $hash->{$x}; +}}, });
          Rate   hash    ref
hash   11956/s     --   -73%
ref    43773/s   266%     --
Update: Well as Rosan Rosanadana used to say: Never mind!
Removed unused code.
-- gam3
A picture is worth a thousand words, but takes 200K.

Replies are listed 'Best First'.
Re: How to access a static hash.
by Joost (Canon) on Mar 17, 2007 at 18:16 UTC
    You're not testing what you claim you're testing: you're testing creating and accessing an an anonymous hash vs only accessing a named hash. You're also reading the result the wrong way 'round.

    use strict; use warnings; use Benchmark qw(cmpthese); cmpthese(-10, { 'anon' => sub { for my $x (qw(a b a b a b)) { my $y = { 'a' => 'A' +, 'b' => 'B'}->{$x}; }}, 'named' => sub { for my $x (qw(a b a b a b)) { my $h = { 'a' => 'A +', 'b' => 'B'}; my $y = $h->{$x}; }}, });
    output:
    Rate named anon named 78120/s -- -8% anon 84539/s 8% --
    Meaning that using an anonymous (not stored) hash is just a bit faster. Not very interesting, considering you almost always want to reuse a hash you've just created.

      I was testing what I thought I was testing. What I don't understand is why Perl can't or doesn't optimize this case.
      Now if only I can learn to read the output of Benchmark.
      -- gam3
      A picture is worth a thousand words, but takes 200K.
        Ok, then I misunderstood your first code snippet.

        Note that the hash creation/deletion can only be optimized away if the content of the hash is constant (ie it would require extra analyzing code). I would guess it's possible to optimize it, but I don't think your construct is widely used (it's also fairly limited, since you can only access the hash once before it goes out of scope), so it probably won't really solve any "real-world" performance issues.

        As for the Benchmark output, I always find the "Rate - XXXX/s" results the easiest to interpret: those are just the number of calls completed per second, so higher is faster.

Re: How to access a static hash.
by ferreira (Chaplain) on Mar 17, 2007 at 18:19 UTC

    I think you are making a mistake. The surprising hypothesis you are testing is whether the access to a literal hash ref

    { a => 'b' }->{$k};
    is slower than saving the hash ref in a var and accessing
    my $hash_ref = { a => 'b' }; # and then $hash_ref->{$k}
    And your benchmark suggested it is so indeed. I guess you're wondering: but there should be an overhead to access the variable, so that using a var should be slower. But the big problem with your benchmark code is that the literal hash ref is being constructed inside the loop over and over, and that causes an overhead greater than the one of acessing the variable.

    I think the hypothesis "access to a literal hash ref is slower than access via a hash ref saved in a variable" is not true. The following code adds a more fair alternative with a constant to save the literal (so that is not built over and over inside the loop) and you'll see that this is the fastest option.

    Another alternative using a lexical variable (my) instead of package variable (our) was introduced as well. But there is no visible performance difference in this case.

    use strict; use warnings; use Benchmark qw(cmpthese); our $global_hash = { 'a' => 'A', 'b' => 'B'}; my $lexical_hash = { 'a' => 'A', 'b' => 'B'}; use constant CONST_HASH => { 'a' => 'A', 'b' => 'B'}; cmpthese(-2, { 'hash' => sub { for my $x (qw(a b a b a b)) { my $y = { 'a' => 'A' +, 'b' => 'B'}->{$x}; }}, 'global_ref' => sub { for my $x (qw(a b a b a b)) { my $y = $globa +l_hash->{$x}; }}, 'lexical_ref' => sub { for my $x (qw(a b a b a b)) { my $y = $lexi +cal_hash->{$x}; }}, 'const' => sub { for my $x (qw(a b a b a b)) { my $y = CONST_HASH( +)->{$x}; }}, });
    and a sample result:
    Rate hash global_ref lexical_ref const hash 27571/s -- -73% -74% -76% global_ref 102819/s 273% -- -3% -10% lexical_ref 105720/s 283% 3% -- -8% const 114361/s 315% 11% 8% --
Re: How to access a static hash.
by graff (Chancellor) on Mar 17, 2007 at 18:29 UTC
    Maybe someone really meant to say that
    my $hash = { a => 'b' }; $hash->{$x}
    is just a lot more legible, intelligible, and therefor more maintainable than
    {a => 'b'}->{$x}
    and in terms of programmer time (as opposed to execution time), the former could be a time saver overall. That's probably a matter of personal judgment and taste, but I'd be inclined to agree with it. Especially as you get into larger numbers of hash elements, IMO, keeping the initialization and the fetching of a value as separate statements seems easier and more intuitive, somehow.

    Still, now that I've seen the latter idiom (no, I hadn't seen that sort of usage before), I can appreciate its attraction -- it's sort of like a clever substitute for the (chain of) ternary operator(s). Instead of this:

    sub some_function { my ( $x ) = @_; return ( $x eq 'foo' ) ? "results for foo" : ( $x eq 'bar' ) ? "results for bar" : ( $x eq 'qaz' ) ? "results for qaz" : " ... and so on, ad nauseum"; }
    there's this:
    sub some_function { my ( $x ) = @_; { foo => "results for foo", bar => "results for bar", qaz => "results for qaz", }->{$x} || "... and so on, ad nauseum"; }
    It might be interesting to benchmark those alternatives, since they seem more "equivalent" (in terms of purpose) than the two you tested.

    (updated to fix a typo, and to add the following benchmark script:)

    and the results:
    Rate anonhash ternary anonhash 64869/s -- -72% ternary 232475/s 258% --

    Well, so much for being clever, I guess. ;)

    One last update: I was still curious about "scalability": what happens as more distinct cases are needed? I adjusted my benchmark script by adding seven more hash keys to the anonhash function, and an equivalent set of seven more chained conditionals in the ternary function. Results:

    Rate anonhash ternary anonhash 26893/s -- -87% ternary 204817/s 662% --
      What you did not test is the global hash:
                    Rate   anonhash globalhash    ternary
      anonhash   32323/s         --       -59%       -63%
      globalhash 79642/s       146%         --        -8%
      ternary    86652/s       168%         9%         --
      
      With only a few more entries the global hash would be faster. However you have to name each of these global hashes (and I am bad with names).
      #!/usr/bin/perl use strict; use Benchmark qw(:all); our $hash = { foo => "results for foo", bar => "results for bar", qaz => "results for qaz", }; sub globalhash { my ( $x ) = @_; $hash->{$x} || "... and so on, ad nauseum"; } sub anonhash { my ( $x ) = @_; { foo => "results for foo", bar => "results for bar", qaz => "results for qaz", }->{$x} || "... and so on, ad nauseum"; } sub ternary { my ( $x ) = @_; return ( $x eq 'foo' ) ? "results for foo" : ( $x eq 'bar' ) ? "results for bar" : ( $x eq 'qaz' ) ? "results for qaz" : " ... and so on, ad nauseum"; } my @strings = qw/foo bar baz qaz/; my $i = 0; cmpthese( -5, { anonhash => sub { anonhash( $strings[$i++] ); $i = 0 if ( $i == @strings ) }, ternary => sub { ternary( $strings[$i++] ); $i = 0 if ( $i == @strings ) }, globalhash => sub { globalhash( $strings[$i++] ); $i = 0 if ( $i == @strings ) }, } );
      -- gam3
      A picture is worth a thousand words, but takes 200K.
Re: How to access a static hash.
by Sidhekin (Priest) on Mar 17, 2007 at 18:04 UTC

              Rate   hash    ref
    hash   11956/s     --   -73%
    ref    43773/s   266%     --

    You are misreading the results. That says using ref is faster by 266%, not slower.

    Update: ... or perhaps you have just forgotten how you named your two codes. "hash" and "ref"?! They are both using hashrefs!

    The one with the pre-created hashref is the faster one though. Just as common wisdom would have it.

    print "Just another Perl ${\(trickster and hacker)},"
    The Sidhekin proves Sidhe did it!

Re: How to access a static hash.
by ikegami (Patriarch) on Mar 17, 2007 at 23:13 UTC
    my $hash = { a => 'b' }; $hash->{$x}

    would be less confusing (no needless use of references) and less complicated (two fewer operations and one less variable) as

    my %hash = ( a => 'b' ); $hash{$x}

    It's also much faster.

    Results:
    hash vs hash ref: hash is 30-40% faster.
    inline vs hash: hash is 30-40% faster.
    global vs local hash: global is 600-700% faster. (Would be even faster if the hash had more elements.)

    Benchmark code and numbers:

Re: How to access a static hash.
by Anno (Deacon) on Mar 17, 2007 at 18:20 UTC
    The code concerning $hash2 and $y doesn't have a function and isn't strict-safe.

    Otherwise, your benchmarks apppear to confirm the claim you think they are refuting. Your first code example builds the hash(ref) once and only accesses it at run time. That's what "ref" does in your benchmark. The second code example builds the hashref every time it is run. That's what "hash" does in the benchmarks. It costs extra time and the results show it does.

    Anno

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://605287]
Approved by graff
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others cooling their heels in the Monastery: (3)
As of 2024-04-25 19:00 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found