bowei_99 has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I'm troubleshooting a hash, and I'm trying to figure out what the scalar function does when applied to a hash directly (without keys).

For instance, the following code

    #!/usr/bin/perl
    use Data::Dumper;

    %h = (
        1 => "one",
        2 => "two",
        3 => "",
        4 => "four",
    #   5 => "five",
    );

    print "h is " . scalar %h . "\n";
    print "Dump is " . Dumper(\%h);
yields
    # perl test.pl
    h is 4/8
    Dump is $VAR1 = {
              '1' => 'one',
              '2' => 'two',
              '3' => '',
              '4' => 'four'
            };
So, from experimenting with this code by commenting out elements in %h, I figure the first value is the number of keys, but I'm not sure what the second value is. It stays at 8 no matter what I change. I remember vaguely it has something to do with how densely the hash is populated.... Can anybody tell me what it is?

-- Burvil

Re: Effect of scalar function on hash
by friedo (Prior) on Mar 22, 2006 at 21:39 UTC
    Congratulations. You have stumbled on the least useful feature of Perl hashes. :)

    Per perldata:

    If you evaluate a hash in scalar context, it returns false if the hash is empty. If there are any key/value pairs, it returns true; more precisely, the value returned is a string consisting of the number of used buckets and the number of allocated buckets, separated by a slash. This is pretty much useful only to find out whether Perl's internal hashing algorithm is performing poorly on your data set.
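For everyday code, the two portable idioms are testing the hash itself for emptiness and counting keys with scalar(keys %h). A minimal sketch, with hash contents made up for illustration. Note that since Perl 5.26, scalar(%h) returns the key count instead of the used/allocated bucket string, so the "4/8" style output in this thread applies to older perls.

```perl
use strict;
use warnings;

# Illustrative data only.
my %h = (1 => "one", 2 => "two", 3 => "", 4 => "four");
my %empty;

# A hash in boolean context is true if and only if it has at least one key.
my $h_has_keys     = %h     ? 1 : 0;
my $empty_has_keys = %empty ? 1 : 0;

# keys in scalar context gives the number of keys on every perl version.
my $count = scalar(keys %h);

print "h has keys: $h_has_keys\n";
print "empty has keys: $empty_has_keys\n";
print "h holds $count keys\n";
```

Neither idiom depends on the bucket layout, so both keep working regardless of how perl's hashing behaves on a given data set.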
Re: Effect of scalar function on hash
by ikegami (Patriarch) on Mar 22, 2006 at 21:53 UTC
    To make it change, force perl to make more buckets by adding more items to the hash.
    >perl -le "for ('a'..'z') { $h{$_}=1; print scalar %h }"
    1/8
    2/8
    3/8
    4/8
    5/8
    6/8
    7/8
    8/16
    9/16
    10/16
    11/16
    12/16
    13/16
    14/16
    15/16
    16/32
    17/32
    18/32
    19/32
    20/32
    21/32
    22/32
    23/32
    24/32
    25/32
    26/32
    Note that sometimes, buckets will be reused:
    >perl -le "for ('a'..'z') { $h{$_ x 2}=1; print scalar %h }"
    1/8
    2/8
    3/8
    4/8
    5/8
    6/8
    7/8
    8/16
    9/16
    10/16
    11/16
    12/16
    13/16
    14/16
    15/16  <- before adding 'pp'
    15/16  <- after adding 'pp'
    15/16
    15/16
    15/16
    15/16
    21/32
    21/32
    21/32
    21/32
    21/32
    21/32
    Note the duplicate '15/16's. 'pp' was added to a bucket that was already in use, because its hash value mapped to the same bucket as an earlier key's, so the count of used buckets did not change.
      To make it change, force perl to make more buckets by adding more items to the hash.
      Or tell perl that you are going to be adding more items to the hash.
      perl -wle'%h=%ENV; print ~~%h; keys %h = 1000; print ~~%h'
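The presizing trick above can be sketched as a small script. This is only an illustration: it assumes a perl whose Hash::Util provides num_buckets() (believed to be 5.26 and later), looks the function up at run time, and falls back to a message when it is missing.

```perl
use strict;
use warnings;

# Hash::Util ships with perl. num_buckets() is looked up at run time
# because older releases of the module may not provide it (assumption).
require Hash::Util;
my $num_buckets = Hash::Util->can('num_buckets');

my %h;
my $before = $num_buckets ? $num_buckets->(\%h) : undef;

# Assigning to keys() asks perl to preallocate at least this many buckets.
keys(%h) = 1000;
my $after = $num_buckets ? $num_buckets->(\%h) : undef;

if (defined $after) {
    print "buckets before: $before, after: $after\n";
}
else {
    print "num_buckets() is not available on this perl\n";
}
```

Going through the code reference sidesteps the function's prototype, which is why the hash is passed as \%h rather than %h.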
Re: Effect of scalar function on hash
by ysth (Canon) on Mar 23, 2006 at 11:12 UTC
    Note that scalar(%h) can be unreliable for tied hashes, especially prior to 5.8.3. Before then, the result usually depended on the content of the hash before it was tied (which means it would typically be false). Beginning in 5.8.3, if a SCALAR method is supplied, the result will be whatever that returns. Without SCALAR, perl makes a guess about whether the hash has content and returns a simple true or false, with the idea that at least if (%h) should work where possible. (The only case where this guess is wrong is if you've iterated through the hash with each() and deleted every returned key, but not yet gotten an undef from each().)
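To illustrate the SCALAR hook mentioned above, here is a minimal sketch of a tied hash that reports its key count in scalar context. The class name KeyCountHash is hypothetical, the storage comes from the core Tie::StdHash, and it assumes perl 5.8.3 or later so that SCALAR is actually consulted.

```perl
use strict;
use warnings;

# Hypothetical tie class for illustration. Tie::StdHash (from the core
# Tie::Hash module) keeps the data in a plain hash behind the tie.
package KeyCountHash;
require Tie::Hash;
our @ISA = ('Tie::StdHash');

# Called when the tied hash is evaluated in scalar (or boolean) context.
sub SCALAR {
    my ($self) = @_;
    return scalar(keys %$self);
}

package main;

tie my %h, 'KeyCountHash';
my $empty_result = scalar(%h);     # result of SCALAR with no keys stored

$h{$_} = 1 for qw(a b c);
my $filled_result = scalar(%h);    # result of SCALAR with three keys stored

print "empty: $empty_result, filled: $filled_result\n";
```

Returning the key count keeps if (%h) meaningful for the tied hash without relying on perl's guess.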