Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask

Re^2: Using 'keys' on a list

by swl (Vicar)
on Jul 07, 2021 at 00:04 UTC ( #11134741=note: print w/replies, xml ) Need Help??

in reply to Re: Using 'keys' on a list
in thread Using 'keys' on a list

I'm surprised that returning a list of keys is faster than returning a hashref and then getting its keys. A dereference requires more operations, but should it be that much?

A good chunk of the time is spent in building the hash, so I added a baseline sub to measure it. I also reduced the number of iterations so it would run in reasonable time on my machine. And some minor formatting of results.

This is perl 5, version 28, subversion 0 (v5.28.0) built for MSWin32-x +64-multi-thread
# Adapted from use strict; use warnings; use Benchmark; use Data::Dumper; # Note: You can increment a string in Perl! WOW! # As long as you don't use the string in a numeric context! # $string++ is quite different than $string +=1 # # Below, this is used to make a bunch of unique hash keys # I don't think that the fact that they are sequential in # an alphabetic sense makes much difference in the generated # hash table because of the way that the Perl hash algorithm # works. sub baseline { # create a large hash but return nothing my $string = "ABCDEFGHIJ"; my %hash = map{$string++ => 1}(1..100000); return; } sub return_hash { # create a large hash and return that entire hash as list my $string = "ABCDEFGHIJ"; my %hash = map{$string++ => 1}(1..100000); return %hash; } sub return_hash_ref { # create a large hash and return a ref to that hash my $string = "ABCDEFGHIJ"; my %hash = map{$string++ => 1}(1..100000); return \%hash; } sub return_just_keys { # create a large hash and return just the keys of that hash my $string = "ABCDEFGHIJ"; my %hash = map{$string++ => 1}(1..100000); return keys %hash; } timethese(200, { '1)Keys of Hash via list ' => 'my @keys = keys %{{return_hash()}} +', '2)Keys of Local Hash copy' => 'my %hash2 = return_hash(); my @key +s = keys %hash2;', '3)Keys of local Hash Ref ' => 'my $href = return_hash_ref(); my @ +keys = keys %$href;', '4)Just the returned keys ' => 'my @keys = return_just_keys()', '5)Baseline ' => 'my $res = baseline()', }); __END__ Benchmark: timing 200 iterations of 1)Keys of Hash via list , 2)Keys +of Local Hash copy, 3)Keys of local Hash Ref , 4)Just the returned ke +ys , 5)Baseline ... 1)Keys of Hash via list : 50 wallclock secs (49.44 usr + 1.52 sys = +50.95 CPU) @ 3.93/s (n=200) 2)Keys of Local Hash copy: 50 wallclock secs (49.98 usr + 0.38 sys = +50.36 CPU) @ 3.97/s (n=200) 3)Keys of local Hash Ref : 35 wallclock secs (34.75 usr + 0.34 sys = +35.09 CPU) @ 5.70/s (n=200) 4)Just the returned keys : 33 wallclock secs (32.51 usr + 0.16 sys = +32.67 CPU) @ 6.12/s (n=200) 5)Baseline : 25 wallclock secs (24.86 usr + 0.20 sys = +25.06 CPU) @ 7.98/s (n=200)

Replies are listed 'Best First'.
Re^3: Using 'keys' on a list
by Marshall (Canon) on Jul 07, 2021 at 00:42 UTC
    I liked your idea of adding a "baseline"!
    I also really don't know either why generating the list of keys in the sub appears to be faster then giving the caller the ref and having him do it? I would have expected that difference to be smaller. Hopefully some other Monk knows?

    However, the main point remains: If the caller needs the whole hash, give him a ref to a hash. This is much faster than passing the entire hash back as a list. Of course there are memory allocation issues with that because Perl will keep the memory for that hash allocated as long as there is reference to it.

    Added: Except as a part of an object method, I don't know why a sub() in general would create a hash, only to just pass back just the keys? Seems a bit weird, but I'd also like to know why this appears to be somewhat faster.

      "I don't know why a sub() in general would create a hash, only to just pass back just the keys?"

      There are likely to be lots of reasons, but a simple one is to generate a list of unique results from some process.

      Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
        Yes, although standard Perl idioms can do that without a named sub (i.e., get just the unique hash key names). I could envision an OO object that returns just the keys of a hash via a method call that represents the "persistent data" related to that object. In a non-OO program, the normal thing would be to return a reference to the created hash (i.e. the values are significant and needed for further processing). I don't see any disagreement here.
        I just tested my benchmark code to see if it is doing what I think that is doing.

        My eyes can certainly be deceived, but why does generating the list of keys in the sub and passing copy of that list to the caller appear to be faster than passing a ref to the caller and having him generate this list his own?

Log In?

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://11134741]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others avoiding work at the Monastery: (2)
As of 2022-08-16 21:35 GMT
Find Nodes?
    Voting Booth?

    No recent polls found