in reply to Best Hash Practices?

[ Some of this has been said already, but it's mostly to lead to the stuff that hasn't. ]

The Perlish way to test if a scalar has been set is to do something like: if ($foo){...}

You can't test whether a scalar has been set. It's a good thing it's rarely useful to know that.

"if ($foo)" is even a poor check for checking if $foo contains a number or a string. One of each will be interpreted incorrectly. "if (defined($foo))" is much more useful.

so ideally the perlish way of testing a value in a hash might be to do something like this:

Well, it was if (defined($foo)) for scalars, so is it if (defined($hash{foo})) for hashes? Indeed it is. Very rarely do you need to know whether the key exists or not. defined is quite often sufficient.

In fact, a simple truth test is usually sufficient because hashes and arrays often contain objects or references to other hashes and arrays.
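
To see how the three tests relate on a single-level hash, here is a small sketch. The keys and values are made up for the example:

```perl
use strict;
use warnings;

my %hash = (
    zero  => 0,          # defined but false
    undef => undef,      # key exists, value is undefined
    list  => [ 1, 2 ],   # a reference is always true
);

for my $key (qw(zero undef list missing)) {
    printf "%-8s exists=%d defined=%d true=%d\n",
        $key,
        (exists  $hash{$key} ? 1 : 0),
        (defined $hash{$key} ? 1 : 0),
        ($hash{$key}         ? 1 : 0);
}

# None of the three tests added the 'missing' key.
my @keys_after = sort keys %hash;
print "keys: @keys_after\n";
```

When the values can only be references or undef, as with the 'list' entry, all three tests give the same answer, which is why the plain truth test is usually enough.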

But the reality is that [something like if ($hash{foo})] will not work because an entry for the key automatically gets created in the hash if we try to do that

That's not true. You need to use the hash value as an lvalue for it to get created, and even that's not enough in some cases.

my %hash;
1 if $hash{t1};
1 for $hash{t2};
\$hash{t3};
sub { }->( $hash{t4} );
sub { ++$_[0] }->( $hash{t5} );
print "$_\n" for sort keys %hash;

t2
t3
t5

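Sub args are special lvalues: the element is created only if the sub actually modifies $_[0], which is why t5 shows up and t4 doesn't.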


So why do you think "if ($hash{foo})" creates $hash{foo}?

Maybe you're thinking of multi-level structures.
if ($hash{foo}{bar}),
if (defined($hash{foo}{bar})) and
if (exists($hash{foo}{bar}))
all populate $hash{foo} with a hash ref if it didn't exist or wasn't defined. This is called autovivification, and it's a feature of dereferencing.

Remember that
$hash{foo}{bar}
is short for
$hash{foo}->{bar}
and that -> is the dereferencing operator. It needs a reference to act upon. Since its LHS is undefined, it creates the necessary reference rather than crapping out. It can be annoying to debug, but it's a very convenient shortcut at times.
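
A short sketch of that side effect, reusing the foo and bar keys from above. The variable names @before, @after and $found are only for the illustration:

```perl
use strict;
use warnings;

my %hash;
my @before = sort keys %hash;

# Looks like a read-only test on the second level...
my $found = exists $hash{foo}{bar} ? 1 : 0;

# ...but the dereference has populated the first level.
my @after = sort keys %hash;

print "found bar:   $found\n";
print "keys before: (@before)\n";
print "keys after:  (@after)\n";
print "foo now holds a ", ref($hash{foo}), " reference\n";
```

Only the level being dereferenced is created. The bar key itself is never added.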

If you want to grab the element of a multi-level structure without autovivifying the lower levels if they don't exist, you need to check each level.

if ($hash{foo}{bar})
would be changed to
if ($hash{foo} && $hash{foo}{bar})

Notice I didn't use defined or exists for $hash{foo}. If $hash{foo} can contain a reference, it's usually the case that it can't contain anything but undef or a reference, so it's sufficient to test for truthfulness. This goes back to what I said earlier on (4th paragraph).
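
The guarded test can be sketched as follows. The dig helper and the %config data are hypothetical, not from the original post. The helper applies the same per-level guard to any depth, using exists at each level so that a false leaf value is still returned:

```perl
use strict;
use warnings;

my %hash;

# Guarded test from the text: the second level is only looked at
# when the first level already holds something true (a reference).
my $hit = ($hash{foo} && $hash{foo}{bar}) ? 1 : 0;
print "hit: $hit, foo ", (exists $hash{foo} ? 'exists' : 'is still absent'), "\n";

# Hypothetical helper: walk the keys one level at a time and stop at the
# first level that is not a hash ref or lacks the key, so nothing is created.
sub dig {
    my ($node, @keys) = @_;
    for my $key (@keys) {
        return undef unless ref($node) eq 'HASH' && exists $node->{$key};
        $node = $node->{$key};
    }
    return $node;
}

my %config = ( db => { host => 'localhost' } );
print "host: ", dig(\%config, qw(db host)), "\n";
print "cache ttl: ", (defined(dig(\%config, qw(cache ttl))) ? 'found' : 'not found'), "\n";
print "config keys: @{[ sort keys %config ]}\n";
```

For one or two levels the inline && form is probably clearer. A helper along these lines only starts to pay off for deeper structures.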

Re^2: Best Hash Practices?
by AnomalousMonk (Archbishop) on Oct 09, 2009 at 06:56 UTC
    Maybe you're thinking of multi-level structures.
    Here is an example for the OPer to play with to gain definition concerning the existence of the truth of all this:
    >perl -wMstrict -le
    "my %hash = qw(a 1 b 2);
     print 'true 1st level' if $hash{c};
     print exists $hash{c} ? '' : 'NOT ', 'exists 1 c';
     print 'true 2nd level' if $hash{c}{d};
     print exists $hash{c} ? '' : 'NOT ', 'exists 2 c';
     print exists $hash{c}{d} ? '' : 'NOT ', 'exists 2 d';
    "
    NOT exists 1 c
    exists 2 c
    NOT exists 2 d
    And substitute something like
        ... if $hash{c} == 42;
    for
        print 'true 1st level' if $hash{c};
    to see the effects of an actual comparison versus a simple truth test.
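
As a rough sketch of what that substitution shows, the following captures warnings in a handler instead of printing them. The handler and the @warnings array are additions for the example, not part of the one-liner above:

```perl
use strict;
use warnings;

my %hash = qw(a 1 b 2);

# Collect warnings instead of letting them go to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my $truth       = $hash{c} ? 1 : 0;         # truth test on a missing key
my $after_truth = scalar @warnings;

my $equal       = $hash{c} == 42 ? 1 : 0;   # numeric comparison on a missing key
my $after_equal = scalar @warnings;

print "warnings after truth test: $after_truth\n";
print "warnings after comparison: $after_equal\n";
print "c ", (exists $hash{c} ? 'exists' : 'does NOT exist'), "\n";
```

Neither test creates the key. The difference is that the comparison uses the undefined value as a number, which triggers an "uninitialized value" warning when warnings are enabled.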
Re^2: Best Hash Practices?
by ssandv (Hermit) on Oct 09, 2009 at 22:45 UTC
    if ($hash{foo}{bar}) bit me really badly a while back. Yet another good reason to use the -> instead of leaving it out.
      I wish autovivi was controllable by pragma. Maybe one day I'll be inspired to write it.


        Vincent Pit saved you/us the trouble :)
