In this particular case the desired value is always true, but Funkymonk's solution would fail if it were false, e.g. 0 (zero) or undef.
toolic's solution tells you there's a hash entry there, even if the associated value is false or undef.
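For example, with a made-up %hash whose 'baz' entry has a deliberately false value:

my %hash = ( foo => 1, bar => 1, baz => 0 );

print "truthy\n" if $hash{baz};          # never prints: the value 0 is false
print "exists\n" if exists $hash{baz};   # prints: the entry is there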
:)
- Is there any efficiency advantage in either approach?
- Does the hash size make a difference?
- How about when done 5000 times for different $input values?
I personally believe that there's no significant advantage of one approach over the other, except possibly in terms of personal taste. As far as efficiency is concerned, what do you mean? Speed of execution? If so, then I wouldn't worry, since it's such a tiny difference, but you can answer your question(s) yourself with Benchmark.pm!
I notoriously suck at benchmarks, making mistake after mistake (which are generally pointed out by others...), but here's my try anyway:
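Something along these lines (a sketch only: the sub names match the labels in the output below, their bodies are assumed to be the two lookup styles discussed above, and this simple setup stands in for the random one described further down the thread):

use strict;
use warnings;
use Benchmark qw(cmpthese);

my %hash = map { $_ => 1 } 'aa' .. 'zz';            # 676 keys, all true values
my @test = ( keys %hash, map { "x$_" } 1 .. 676 );  # half hits, half misses

cmpthese( -10, {
    # which label maps to which lookup style is an assumption
    assign => sub { my $n = 0; $hash{$_}        and $n++ for @test; $n },
    double => sub { my $n = 0; exists $hash{$_} and $n++ for @test; $n },
} );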
As I expected, as it stands it doesn't show any noticeable difference.
kirk:~ [15:16:04]$ ./bm.pl
Rate double assign
double 29.4/s -- -1%
assign 29.7/s 1% --
Indeed, I generally refrain from the temptation to do benchmarks "like this" when someone suggests them, and even tend to slightly bash those who do. This time I was curious to see whether at least a tiny systematic difference would emerge, but that doesn't seem to be the case. Feel free to modify it the way you like most, though!
### using 5.010
[~]# time perl hash_test_benchmark.pl
Rate assign double
assign 119/s -- -2%
double 122/s 2% --
real 2m30.601s
user 2m17.789s
sys 0m6.317s
### without 5.010
[~]# time perl hash_test_benchmark.pl
Rate double assign
double 121/s -- -0%
assign 122/s 0% --
real 2m32.598s
user 2m19.745s
sys 0m6.498s
[~]#
I love your answer. Below I have put my result of running your benchmark in a Fedora9 VM on a Lenovo laptop.
I personally believe this just shows that the benchmark itself is not significant, or that it is significant only in showing that there's no meaningful difference between the two "techniques", and thus also as a reminder not to even bother in the future: only benchmark when you have actually different algorithms to start with...
You may find much more interesting benchmarks in another recent thread...
But what really intrigues me is how you built the test hash and the keys to test with... 2 map's in 2 lines of code. Figuring out genkeys(), the %hash, and the @test values will take me the rest of the afternoon; thanks!
What's so difficult to understand? I hope I can help clarify: %hash and @test are a plain regular hash and array, respectively. Since they're lexical variables, the subs used in the benchmark are closures over them.
genkeys() takes a whole number $n and returns that many random strings, each with a hardcoded length between 5 and 14 characters. Since genkeys() makes no attempt at removing duplicate entries from its return list, %hash has at most 5000 keys, but it may have fewer. @test has all of these keys, plus another 5000 strings, and it may contain duplicates. I wanted a test array of "input" values such that about half of the lookups would succeed and about half would fail.
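In code, that amounts to something like this (the value 1 is arbitrary; only the key counts matter):

my %hash = map { $_ => 1 } genkeys(5000);   # at most 5000 distinct keys
my @test = ( keys %hash, genkeys(5000) );   # every key, plus ~5000 strings that mostly miss

The benchmarked subs then simply look up each element of @test in %hash, closing over both lexicals.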
Coming to genkeys(), analyze it top-down; it's simply of the form
sub genkeys {
    map { CODE } 1..shift;
}
with CODE being:
join '' => map $chr[rand @chr], (1) x (5 + rand 10);
The outer map takes a list whose length is the supplied argument and applies CODE to each of its elements. Since $_ is not actually used in CODE, the actual values of the elements don't matter, only the length of the list; it may well have been e.g. (1) x shift. The inner one, similarly, builds a list of arbitrary thingies of length between 5 and 14. Then map turns that into a list of between 5 and 14 random characters taken from the @chr array, and join... err... well, joins them into a string of that length. As you can see, it's not that esoteric after all...
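Put back together, with a hypothetical @chr since the original character set isn't shown:

my @chr = ( 'a' .. 'z', 'A' .. 'Z', '0' .. '9' );   # assumed alphabet

sub genkeys {
    # one random 5-to-14 character string per element of 1..$n
    map { join '' => map $chr[rand @chr], (1) x (5 + rand 10) } 1 .. shift;
}

my @keys = genkeys(5000);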