in reply to Re: Best Multidimensional Hash Practices?
in thread Best Multidimensional Hash Practices?

if key1 didn't exist - it pops into existence. This is a process called autovivification

I always wondered what the design decision behind this autovivification behaviour with mere existence testing had been...  I mean why does Perl not simply do a short-circuit evaluation from left to right, stopping as soon as a hash key does not exist?  In the example, key2 can't possibly exist if there is no hash referenced via key1 at all, because there is not even a key1. So why proceed any further?
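For anyone who wants to see the effect in isolation, here is a minimal sketch. The hash names and keys are made up for illustration; the second half shows the manual left-to-right check that avoids the side effect.

```perl
use strict;
use warnings;

my %hash;    # completely empty

# Testing the inner key dereferences $hash{key1} first, which creates it.
my $found = exists $hash{key1}{key2};

print "key2 found: ", ($found ? "yes" : "no"), "\n";
print "key1 exists afterwards: ", (exists $hash{key1} ? "yes" : "no"), "\n";

# Manual left-to-right short-circuit: test each level before descending.
my %clean;
my $found_clean = exists $clean{key1} && exists $clean{key1}{key2};

print "key1 exists in the second hash: ", (exists $clean{key1} ? "yes" : "no"), "\n";
```

The first hash ends up with an empty hash reference stored under key1, while the second one stays empty because the && stops before the inner dereference is ever evaluated.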

Re^3: Best Multidimensional Hash Practices?
by CountZero (Bishop) on Oct 12, 2009 at 21:10 UTC
    If you do not want autovivification to happen with exists, try the autovivification pragma from CPAN: no autovivification 'exists';

    You can even restrict its effects within a lexical scope!

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      IMHO, this should eventually become a core module/pragma...

        Perl 6 goes as far as allowing autovivification only on write accesses, so %hash<k1><k2><k3> shouldn't autovivify anything, but %hash<k1><k2><k3> = 1; should.
Re^3: Best Multidimensional Hash Practices?
by muba (Priest) on Oct 13, 2009 at 00:08 UTC

    Because it isn't exists that triggers autovivification. It is the (implied or explicitly written) -> operator, as was pointed out in the previous thread by DamianKaelGreen.
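    A small sketch of that claim, with no exists anywhere in sight. The key names are made up for illustration; a plain read through two levels is enough to create the intermediate hash:

```perl
use strict;
use warnings;

my $hashRef = {};    # an empty hash, nothing inside

# A plain read of a nested element; exists is not involved at all.
my $value = $hashRef->{key1}->{key2};

print "value is ", (defined $value ? "defined" : "undefined"), "\n";
print "keys now present: ", join(", ", sort keys %$hashRef), "\n";
print "key1 holds a ", ref($hashRef->{key1}), " reference\n";
print "key2 was created: ", (exists $hashRef->{key1}{key2} ? "yes" : "no"), "\n";
```

    Only the intermediate level (key1) springs into existence. The final key is never created by a read.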

    Autovivification is exactly the kind of thing that makes this code work:

    use strict;
    use warnings;

    my $hashRef;    # note how it is undefined at this moment!
    $hashRef->{cogito} = "ergo sum";
    print ref $hashRef, "\n";
    # Output: HASH
    # Hey... aint that cool?

    # But we can go even further!
    $hashRef->{"somewhere"}->{"deep down"}->{"this"}->{"data structure"} = "I think, so I exist.";

    Now, that last line isn't a true beauty, but there are cases where you need deep structures like that. And it really wouldn't be Perlish if you had to make each part of it come into existence manually.

      I never said it's the exists itself that autovivifies...  sure it's the dereferencing. And the behaviour is all nice and dandy, if it does make sense to actually access the last element in the chain (and create intermediate structures as required), as is the case with your example where you're assigning a value.  It just doesn't make much sense when there's no real need for complete dereferencing, such as when testing for existence or definedness.

      In other words, why not take the shortcut and just not dereference the entire structure when it's clear right in the beginning that this isn't going to lead to any new conclusions with respect to the existence of the final key?  I mean some special handling could be done for the sake of DWIM, just like it does happen elsewhere in Perl.
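      Until something like that exists in the language, the shortcut can be done by hand. This is only a sketch: exists_deep is a hypothetical helper, not a builtin, and the sample data is made up. It walks the keys from left to right and gives up at the first missing level, so nothing is ever created:

```perl
use strict;
use warnings;

# Hypothetical helper (not a Perl builtin): walks the keys from left to right
# and stops at the first level that is missing or is not a hash reference.
sub exists_deep {
    my ($node, @keys) = @_;
    for my $key (@keys) {
        return 0 unless ref($node) eq 'HASH' && exists $node->{$key};
        $node = $node->{$key};
    }
    return 1;
}

my %hash = ( a => { b => { c => 1 } } );

for my $path ([qw(a b c)], [qw(x y z)]) {
    my $status = exists_deep(\%hash, @$path) ? "found" : "missing";
    print join("->", @$path), ": $status\n";
}

print "x was created: ", (exists $hash{x} ? "yes" : "no"), "\n";
```

      The helper only ever calls exists on a level it has already confirmed to be a hash, so there is no dereference of a missing element to trigger the side effect.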

        I'd like to start this reply with saying that I might be completely wrong. I merely try to state why I think it works the way it works.

        when it's clear right in the beginning that this isn't going to lead to any new conclusions with respect to the existence of the final key

        Is it, though? If you see operators as functions with funny syntax (which is what operators are), you could say that the underlying function for -> is deref(HASHREF, KEY). Having established that, it is important to note that the normal rules of precedence apply.

        So the simple case of exists $hashRef->{key1}->{key2} boils down to

        exists( deref( deref( $hashRef, "key1" ), "key2" ) )

        Considering the order of precedence, the very first call that is made is the innermost one: deref($hashRef, "key1"). It would be weird if, at this point, perl broke in and said, "Wait! Before we do anything, let's find out why we're doing it anyway!" and then analyzed the whole statement to see whether it is doing an assignment or simply an exists/defined check.

        Is it really worth the hassle? Or is it simpler to just learn the side effect of the -> operator?