Re: Benchmark of hash emptiness test

perldoc perldata says:

If you evaluate a hash in scalar context, it returns false if the hash is empty. If there are any key/value pairs, it returns true; more precisely, the value returned is a string consisting of the number of used buckets and the number of allocated buckets, separated by a slash. This is pretty much useful only to find out whether Perl's internal hashing algorithm is performing poorly on your data set. For example, you stick 10,000 things in a hash, but evaluating %HASH in scalar context reveals "1/16", which means only one out of sixteen buckets has been touched, and presumably contains all 10,000 of your items. This isn't supposed to happen.

Re^2: Benchmark of hash emptiness test by aufflick (Deacon) on Nov 10, 2006 at 00:46 UTC
So presumably the logic required to determine these stats is what takes the time (making it slower than the keys function). Is there a case for saying that a hash in scalar context should do the simplest (or rather, quickest) operation possible to determine whether it is empty? There could always be a builtin to return these stats if you really wanted them. This is, after all, a very common operation, quite often used within a tight loop.
Re^3: Benchmark of hash emptiness test by demerphq (Chancellor) on Nov 10, 2006 at 10:34 UTC
The reason for the speed difference is that `if (%hash) {}` ends up being implemented internally as

    sv = sv_newmortal();
    if (HvFILL((HV*)hv))
        Perl_sv_setpvf(aTHX_ sv, "%ld/%ld", (long)HvFILL(hv), (long)HvMAX(hv) + 1);
    else
        sv_setiv(sv, 0);

So what happens is that a new SV is created and then populated with a string using something like sprintf. Compare this with the `keys %hash` option, where an extra opcode is executed, BUT that opcode only involves creating an SvIV and therefore requires no memory allocation, no conversion of longs to strings, etc.

So the bottom line is that if you are concerned about speed, use the keys form. In Perl 5.10 we will try to make this an internal optimisation (internally using `if (keys %foo)` when the user typed `if (%foo)`), but it's not exactly a priority, at least not for me :-).

Update: Well, I gave it a try just to see what was involved, and before I knew it I was sending off patches. So there's a half-decent chance this will be fixed in perl 5.10.

--- $world=~s/war/peace/g
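As a rough illustration of the two code paths compared in this post, here is a minimal C sketch. It does not use the real Perl API: `toy_hash`, `toy_hash_scalar`, and `toy_hash_keys` are hypothetical names invented for this example, and the struct is only a stand-in for the fields that `HvFILL` and `HvMAX` read. One simplification: the empty case writes the string "0", whereas the real code stores the integer 0 with `sv_setiv`.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for a Perl hash; NOT the real HV struct. */
typedef struct {
    long filled_buckets;    /* buckets holding at least one entry (cf. HvFILL) */
    long max_bucket_index;  /* allocated buckets minus one (cf. HvMAX) */
    long key_count;         /* number of keys stored */
} toy_hash;

/* Roughly what `if (%hash)` pays for: formatting "used/allocated"
 * into a string buffer. Returns the value snprintf returns. */
int toy_hash_scalar(const toy_hash *h, char *buf, size_t buflen)
{
    if (h->filled_buckets)
        return snprintf(buf, buflen, "%ld/%ld",
                        h->filled_buckets, h->max_bucket_index + 1);
    /* Simplification: the real code sets an integer 0 here. */
    return snprintf(buf, buflen, "0");
}

/* Roughly what `if (keys %hash)` pays for: handing back an integer,
 * with no string formatting involved. */
long toy_hash_keys(const toy_hash *h)
{
    return h->key_count;
}
```

Calling `toy_hash_scalar` on a hash with one filled bucket out of sixteen writes "1/16" into the buffer, matching the perldata example quoted at the top of the thread, while `toy_hash_keys` only reads a field. The sketch shows where the extra work comes from; it says nothing about the actual timings.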
Re^4: Benchmark of hash emptiness test by Hofmator (Curate) on Nov 10, 2006 at 10:55 UTC
Pardon my ignorance (I don't know much about the perl internals), but I was under the impression that there was something like boolean context. So shouldn't the hash in `if (%hash) {}` be able to tell that it is being used in boolean context and react accordingly? This should then be an easy patch: just define a different behaviour for `%hash` in boolean (as opposed to scalar) context.

-- Hofmator
Code written by Hofmator and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.
Re^5: Benchmark of hash emptiness test by demerphq (Chancellor) on Nov 10, 2006 at 13:02 UTC
Re^4: Benchmark of hash emptiness test by aufflick (Deacon) on Nov 12, 2006 at 12:12 UTC
Sweet! I really should get familiar with the perl source. I've read through perlguts illustrated a few times and cracked out a little XS code, but maybe it's time to start getting interested in the perl core. Being able to roll an optimisation into the perl codebase, now that's cool!