Re^5: Best Multidimensional Hash Practices?

I'd like to start this reply by saying that I might be completely wrong. I'm merely trying to explain why I think it works the way it does.

when it's clear right in the beginning that this isn't going to lead to any new conclusions with respect to the existence of the final key

Is it, though? If you see operators as functions with funny syntax (which is what operators are), you could say that the underlying function for -> is deref(HASHREF, KEY). Having established that, it is important to note that the normal rules of precedence apply.

So the simple case of exists $hashRef->{key1}->{key2} boils down to

exists(
  deref(
    deref(
      $hashRef,
      "key1"
    ),
    "key2"
  )
)

Considering the order of precedence, the very first call that is made is the innermost one: deref($hashRef, "key1"). It would be weird if, at this point, perl broke in and said, "Wait! Before we do anything, let's find out why we're doing it!" and then analyzed the whole statement to see whether it's doing an assignment or simply an exists/defined check.
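The effect of that innermost call can be seen in a minimal sketch. It assumes an empty hash behind $hashRef and default settings (no autovivification-suppressing pragma loaded); the variable names $found and $key1_created are only for illustration.

```perl
use strict;
use warnings;

my $hashRef = {};    # empty: neither key1 nor key2 exists yet

# Only a check, no assignment anywhere in the statement.
my $found = exists $hashRef->{key1}->{key2} ? 1 : 0;

# The inner dereference had to produce a hash to look key2 up in,
# so key1 has been autovivified as an empty hash reference.
my $key1_created = exists $hashRef->{key1} ? 1 : 0;

print "key2 found:   $found\n";
print "key1 created: $key1_created\n";
```

The check itself reports that key2 is missing, yet key1 now exists: the inner dereference ran without knowing that it was only serving an exists.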

But okay, Perl has some DWIM on other occasions too, so why not here? Then what happens if you roll your own functions which operate on the hash refs somehow?

$state = 1;

sub retrieve_a_value {
  # a subroutine that just... does some things
  # and one of those things involves a state change:
  $success = Resource::External::retrieve();
  $state = 2 if $success;
  return $success;
}    

sub mess_it_all_up {
  if ( $state == 1 && retrieve_a_value() ) {
    $_[0] = 1;
  }
  else {
    return $_[0];
  }
}

if ( mess_it_all_up($hashRef->{key1}->{key2}) ) {
  ...
}
else {
  die "Whoah. I couldn't mess it all up at this point even though I should be able to.";
}

Perl would need to break into the innermost deref call to see why that call is actually being made. Let's call this 'checktime', as opposed to runtime or compile time. Ah, it turns out that the multi-dereferenced value is going to be passed to mess_it_all_up. Now there are two options:

1. Go on with dereferencing (and therefore autovivifying), because it isn't known what mess_it_all_up is going to do: assign or check?
2. Check just that, now that we're breaking into the path of execution anyway.

Option 1 is pointless. Why would you need perl to activate checktime if it's going to decide to autovivify in most cases anyway?

Option 2 is even less desirable. If we don't execute retrieve_a_value, we still won't know what mess_it_all_up is going to do. So hey, let's dive in, activate 'checktime', trigger some side effects such as changing the state variable (while we're only trying to see whether we should autovivify), and completely mess up the rest of the program. Let's assume that retrieve_a_value is successful. Perl notices that the actual content of the data structure is going to be altered and gives the green light for the dereferencing. Perl switches back to runtime, makes two calls to deref() and finally another call to mess_it_all_up. Notice how, because of the state change, mess_it_all_up is not going to call retrieve_a_value or change the data structure in any way at this point, so the program dies instead.

Of course, you could prevent that from happening by making sure that runtime and checktime maintain their own values for anything that happens, so that the state change in checktime doesn't affect the state of runtime. The only problem is that this requires Resource::External::retrieve() to be called twice. How efficient is that going to be? What if the external resource is a database table with millions of records?

So then you'd have to work around that problem by making sure Resource::External::retrieve() is never executed during checktime, by making sure it isn't part of a path of execution that happens to trigger checktime.
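For comparison, here is a small sketch of what perl does today when a nested element is passed to a subroutine, with no 'checktime' involved. The subs only_reads and assigns are hypothetical stand-ins for the two branches of mess_it_all_up, and the result variables are named only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers standing in for the two branches of mess_it_all_up.
sub only_reads { my $value = $_[0]; return defined $value ? 1 : 0 }
sub assigns    { $_[0] = 1; return 1 }    # writes through the alias in @_

my $read_only = {};
only_reads( $read_only->{key1}->{key2} );

my $written = {};
assigns( $written->{key1}->{key2} );

# The intermediate level is created either way; the final key only
# comes into existence when the sub actually assigns to $_[0].
my $read_key1  = exists $read_only->{key1}         ? 1 : 0;
my $read_key2  = exists $read_only->{key1}->{key2} ? 1 : 0;
my $write_key2 = exists $written->{key1}->{key2}   ? 1 : 0;

print "read-only call: key1=$read_key1 key2=$read_key2\n";
print "assigning call: key2=$write_key2\n";
```

So perl always performs the inner dereference, and leaves only the creation of the last element to whatever the subroutine ends up doing at runtime.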

Is it really worth the hassle? Or is it simpler to just learn the side effect of the -> operator?
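Once you know about the side effect, avoiding it is mostly a matter of checking one level at a time. A minimal sketch, assuming a hypothetical helper named exists_path (not a built-in or a module function):

```perl
use strict;
use warnings;

# Hypothetical helper: walk the keys one level at a time and stop at the
# first missing one, so no intermediate hash is ever created.
sub exists_path {
    my ( $node, @keys ) = @_;
    for my $key (@keys) {
        return 0 unless ref($node) eq 'HASH' && exists $node->{$key};
        $node = $node->{$key};
    }
    return 1;
}

my $hashRef = { key1 => { key2 => 'value' } };

my $present = exists_path( $hashRef, 'key1', 'key2' );    # both levels exist
my $missing = exists_path( $hashRef, 'nope', 'key2' );    # stops at 'nope'

# The failed lookup left the structure untouched.
my $still_clean = exists $hashRef->{nope} ? 0 : 1;

print "present=$present missing=$missing still_clean=$still_clean\n";
```

For a fixed depth, the same thing can be written inline as exists $hashRef->{key1} && exists $hashRef->{key1}->{key2}.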
