in reply to the if statement

The comments from FunkyMonk and toolic bring up a (possibly) subtle difference in the "pattern" that I have always puzzled over.

--FunkyMonk
1 hash access + possibly 1 scalar assignment

if (my $ans = $hash{$input}) { print "$input => $ans\n"; }

--toolic
1 hash access + possibly 1 more hash access
Is the first result 'cached', so that the second access reuses it?

if (exists $hash{$input}) { print $hash{$input}, "\n"; }

Re^2: the if statement
by chrism01 (Friar) on Sep 29, 2008 at 08:17 UTC
    In this particular case, the desired value is always 'true', but FunkyMonk's solution would fail if the value were 'false', e.g. 0 (zero) or undef.
    toolic's solution tells you there's a hash entry there, even if the associated value is 'false' or undef.
    :)
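    A minimal sketch of that difference, with a made-up %hash (not from the thread):

    use strict;
    use warnings;

    my %hash = ( pages => 0, author => undef, title => 'perlfaq' );   # made-up data

    for my $input (qw(pages author title missing)) {
        # FunkyMonk's pattern: skips 'pages' and 'author', whose values are false
        if ( my $ans = $hash{$input} ) { print "assign: $input => $ans\n" }

        # toolic's pattern: reports every key that exists, whatever its value
        if ( exists $hash{$input} ) { print "exists: $input is present\n" }
    }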
Re^2: the if statement
by blazar (Canon) on Sep 29, 2008 at 13:46 UTC
    • Is there any efficiency advantage in either approach?
    • Does the hash size make a difference?
    • How about when done 5000 times for different $input values?

    I personally believe there's no significant advantage of one approach over the other, except possibly in terms of personal taste. As far as efficiency is concerned, what do you mean? Speed of execution? If so, I wouldn't worry, since the difference is so tiny; but you can answer your question(s) yourself with Benchmark.pm!

    I notoriously suck at benchmarks, constantly making mistakes (which are generally pointed out by others...), but here's my try anyway:
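    The benchmark code itself isn't reproduced in this extract; a rough, hypothetical sketch of what such a benchmark could look like uses Benchmark's cmpthese with placeholder data standing in for the genkeys()-built %hash and @test described further down, and 'assign'/'double' as the labels that appear in the output below:

    use strict;
    use warnings;
    use Benchmark 'cmpthese';

    # Placeholder data: stands in for the genkeys()-built %hash and @test
    # discussed later in the thread; only hit/miss behaviour matters here.
    my %hash;
    $hash{"key$_"} = 1 for 1 .. 5000;
    my @test = keys %hash;
    push @test, "miss$_" for 1 .. 5000;    # ~half of the lookups will fail

    cmpthese -3, {
        assign => sub {                    # 1 hash access + possibly 1 assignment
            for my $input (@test) {
                if ( my $ans = $hash{$input} ) { my $line = "$input => $ans\n" }
            }
        },
        double => sub {                    # exists + possibly a second hash access
            for my $input (@test) {
                if ( exists $hash{$input} ) { my $line = $hash{$input} . "\n" }
            }
        },
    };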

    As I expected, as it is it doesn't show any noticeable difference.

    kirk:~ [15:16:04]$ ./bm.pl
             Rate double assign
    double 29.4/s     --    -1%
    assign 29.7/s     1%     --

    Indeed, I generally resist the temptation to run benchmarks "like this" when someone suggests them, and even tend to gently bash those who do. This time I was curious to see whether at least a tiny systematic difference would show up, but that doesn't seem to be the case. Feel free to modify the code however you like, though!

    --
    If you can't understand the incipit, then please check the IPB Campaign.
      I love your answer. Below I have put my result of running your benchmark in a Fedora9 VM on a Lenovo laptop.

      But what really intrigues me is how you built the test hash and the keys to test with... 2 'map's in 2 lines of code. Figuring out genkeys(), %hash, and the @test values will take me the rest of the afternoon; thanks.

      ### using 5.010
      [~]# time perl hash_test_benchmark.pl
               Rate assign double
      assign  119/s     --    -2%
      double  122/s     2%     --

      real    2m30.601s
      user    2m17.789s
      sys     0m6.317s

      ### without 5.010
      [~]# time perl hash_test_benchmark.pl
               Rate double assign
      double  121/s     --    -0%
      assign  122/s     0%     --

      real    2m32.598s
      user    2m19.745s
      sys     0m6.498s
      [~]#
        I love your answer. Below I have put my result of running your benchmark in a Fedora9 VM on a Lenovo laptop.

        I personally believe this just shows that the benchmark itself is not significant; or rather, that it is significant only in showing that there's no significant difference between the two "techniques", and thus also serves as a reminder not to bother in the future: only benchmark when you have actually different algorithms to start with...

        You may find much more interesting benchmarks in another recent thread...

        But what really intrigues me is how you built the test hash and the keys to test with... 2 'map's in 2 lines of code. Figuring out genkeys(), %hash, and the @test values will take me the rest of the afternoon; thanks.

        What's so difficult to understand? I hope I can help clarify: %hash and @test are a plain hash and a plain array, respectively. Since they're lexical variables, the subs used in the benchmark are closures over them.
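        In other words, something like this (hypothetical names, just to illustrate the closure point):

        my %hash = ( foo => 1 );
        my $lookup = sub { exists $hash{ $_[0] } };   # closes over the lexical %hash
        print $lookup->('foo') ? "yes\n" : "no\n";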

        genkeys() takes a whole number $n and returns that many random strings, each with a (hardcoded) length between 5 and 14. Since genkeys() makes no attempt to remove duplicate entries from its return list, %hash has at most 5000 keys, but it may have fewer. @test has all of those keys, plus another 5000 strings, and it may contain duplicates. I wanted a test array of "input" values such that about half of the lookups would succeed and about half would fail.

        As for genkeys(), analyze it top-down: it's simply of the form

        sub genkeys { map { CODE } 1..shift; }

        with CODE being:

        join '' => map $chr[rand @chr], (1) x (5 + rand 10);

        The former takes a list whose length is the supplied argument and applies CODE to each of its elements. Since $_ is not actually used in CODE, the values of the elements don't matter, only the length of the list; it could just as well have been e.g. (1) x shift. The latter, similarly, builds a list of arbitrary thingies whose length is between 5 and 14; map then turns that into a list of the same length of random characters taken from the @chr array, and join... err... well, joins them into a string of that length. As you can see, it's not that esoteric after all...
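        Put back together (with @chr assumed to be a pool of candidate characters, say letters and digits, and with assumed hash values; neither is shown in this extract), the whole setup could look something like:

        my @chr = ( 'a' .. 'z', 'A' .. 'Z', '0' .. '9' );   # assumed character pool

        sub genkeys {
            # shift() random strings, each between 5 and 14 characters long
            map { join '' => map $chr[ rand @chr ], (1) x ( 5 + rand 10 ) } 1 .. shift;
        }

        my @keys = genkeys 5000;                    # may contain duplicates
        my %hash = map { ( $_ => 1 ) } @keys;       # at most 5000 keys; values assumed
        my @test = ( @keys, genkeys 5000 );         # those keys plus ~5000 that will mostly miss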

        --
        If you can't understand the incipit, then please check the IPB Campaign.