in reply to Re: What is the most efficient way to see if a hash is empty?
in thread What is the most efficient way to see if a hash is empty?

From perldata: "If you evaluate a hash in scalar context, it returns false if the hash is empty. If there are any key/value pairs, it returns true; more precisely, the value returned is a string consisting of the number of used buckets and the number of allocated buckets, separated by a slash."..

Am I right to assume that if the hash is non-empty if (%hash) and scalar(%hash) are simply returning a stored number of buckets (no loops to count buckets)? If so, that would be exactly want I wanted - O(1). I missed that paragraph. Thanks.

Best, beth

Update: just to be clear - this is meant to be an internals question, not a "trust the docs" question. perlguts doesn't go that deep into the internals of hash implementation.

Replies are listed 'Best First'.
Re^3: What is the most efficient way to see if a hash is empty?
by moritz (Cardinal) on Apr 28, 2009 at 08:35 UTC
    That's what I expect from common sense, yes.

    A quick benchmark seems to confirm this; asking an empty hash is much faster than asking a filled one, but the number of keys in the full hash seems to have no significant influence on the speed:

    use strict; use warnings; use Benchmark qw(cmpthese); for my $len (2..6) { our (%full, %empty); my $a = 'A'; $full{$a++} = $a while length($a) < $len; print "\nLength $len with ", scalar(%full), "items\n"; cmpthese -1, { full => sub {die unless %full}, empty => sub {die if %empty}, } } __END__ Length 2 with 22/32items Rate full empty full 1286220/s -- -77% empty 5481192/s 326% -- Length 3 with 510/1024items Rate full empty full 1052183/s -- -82% empty 5748771/s 446% -- Length 4 with 13961/32768items Rate full empty full 998734/s -- -84% empty 6056132/s 506% -- Length 5 with 310721/524288items Rate full empty full 963764/s -- -83% empty 5681139/s 489% -- Length 6 with 8655839/16777216items Rate full empty full 849541/s -- -86% empty 6124859/s 621% --

    Note that the number of hash items grows exponentially, while the number of iterations decreases linearly at best.

Re^3: What is the most efficient way to see if a hash is empty?
by Anonymous Monk on Apr 28, 2009 at 09:31 UTC
    Devel::Peek and a quick perusal of the sources seems back up the idea that its a lookup
    C:\> perl -MDevel::Peek -e"Dump { } SV = RV(0x183cfc8) at 0x226074 REFCNT = 1 FLAGS = (TEMP,ROK) RV = 0x225f84 SV = PVHV(0x22b504) at 0x225f84 REFCNT = 1 FLAGS = (SHAREKEYS) IV = 0 <<<------------------------------------------ NV = 0 ARRAY = 0x0 KEYS = 0 <<<------------------------------------------ FILL = 0 MAX = 7 RITER = -1 EITER = 0x0 C:\>perl -MDevel::Peek -e"Dump { 1..10 } SV = RV(0x183cfc8) at 0x182f4f4 REFCNT = 1 FLAGS = (TEMP,ROK) RV = 0x225f94 SV = PVHV(0x22b514) at 0x225f94 REFCNT = 1 FLAGS = (SHAREKEYS) IV = 5 <<<------------------------------------------ NV = 0 ARRAY = 0x1825554 (0:3, 1:5) hash quality = 150.0% KEYS = 5 <<<------------------------------------------ FILL = 5 MAX = 7 RITER = -1 EITER = 0x0 Elt "1" HASH = 0x806b80c9 SV = IV(0x1828ce0) at 0x226084 REFCNT = 1 FLAGS = (IOK,pIOK) IV = 2 Elt "3" HASH = 0xa400c7f3 SV = IV(0x1828d04) at 0x2260b4 REFCNT = 1 FLAGS = (IOK,pIOK) IV = 4 Elt "7" HASH = 0xecc9d984 SV = IV(0x1828d14) at 0x2260f0 REFCNT = 1 FLAGS = (IOK,pIOK) IV = 8

      Thanks for the links to Perl source code and the dump.

      Key counts being stored would be consistent with BrowserUk's observations that there seems to be little difference between scalar keys on an empty and full hash. But scalar(%h) is returning counts of used and allocated buckets, not key counts.

      Given moritz's observations of a significant difference in empty and non-empty hashes for %h and the lack of an obvious dump field to store bucket counts in Devel::Peek output, I would think that %h is doing some sort of looping and is O(number of buckets).

      Best, beth

        I don't understand how you got to that conclusion. My benchmark (when modified to use scalar %hash shows that there is no O(number of buckets) operation. All it does is factor of 4 or 5 difference between empty and non-empty hashes. For me that doesn't look like a loop, but rather like a good optimization for the empty case.

        (For the empty hash the format of scalar %hash differs from the "normal" case. Instead of "$count/$buckets" it's simply 0).

        The number of buckets seems to be stored in the MAX field.

        hv.h
        281 /* the number of keys (including any placeholers) */ 282 #define XHvTOTALKEYS(xhv) ((xhv)->xhv_keys) 289 #define HvKEYS(hv) HvUSEDKEYS(hv) 290 #define HvUSEDKEYS(hv) (HvTOTALKEYS(hv) - HvPLACEHOLDERS_ +get(hv)) 293 #define HvPLACEHOLDERS_get(hv) (SvMAGIC(hv) ? Perl_hv_placeholder +s_get(aTHX_ (const HV *)hv) : 0)
        hv.c
        2536 I32 2537 Perl_hv_placeholders_get(pTHX_ const HV *hv) 2538 { 2539 dVAR; 2540 MAGIC * const mg = mg_find((const SV *)hv, PERL_MAGIC_rhash); 2541 2542 PERL_ARGS_ASSERT_HV_PLACEHOLDERS_GET; 2543 2544 return mg ? mg->mg_len : 0; 2545 }
        mg.c
        411 MAGIC* 412 Perl_mg_find(pTHX_ const SV *sv, int type) 413 { 414 PERL_UNUSED_CONTEXT; 415 if (sv) { 416 MAGIC *mg; 417 for (mg = SvMAGIC(sv); mg; mg = mg->mg_moremagic) { 418 if (mg->mg_type == type) 419 return mg; 420 } 421 } 422 return NULL; 423 }
Re^3: What is the most efficient way to see if a hash is empty?
by Anonymous Monk on Apr 28, 2009 at 08:25 UTC
    No, the documentation is all lies :)
    C:\>perl -e"die scalar %hash 0 at -e line 1. C:\>perl -e"die scalar %ENV 26/64 at -e line 1.