mak007 has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I am encountering a very strange issue where hash keys are created by just being referenced in the condition statement of an if-statement. Dumper($hash_ref), returns the following:

$VAR1 = { 'block1' => { 'lib1' => { 'cell_1' => 1, 'cell_2' => 1, }, }, 'block2' => { 'lib1' => { 'cell_3' => 1, 'cell_4' => 1, }, }, };

After running the following code where $block="block3" & $libName = "lib1" :

if(!defined $hash_ref->{$block}->{$libName}) { print "This is a test\n"; }

Dumper($hash_ref), returns the following:

$VAR1 = { 'block1' => { 'lib1' => { 'cell_1' => 1, 'cell_2' => 1, }, }, 'block2' => { 'lib1' => { 'cell_3' => 1, 'cell_4' => 1, }, }, 'block3' => {}, };

Any ideas why this is happening? The same script used to work OK, but it suddenly stopped working. I have tried different Perl installations, but the issue persists!

Re: Strange Hash related bug, keys are created by themselves!
by marioroy (Prior) on Nov 03, 2016 at 17:46 UTC

    The following is another way when Autovivification is not desired.

    if (exists $hash_ref->{$block} && !defined $hash_ref->{$block}->{$libName}) { print "This is a test\n"; }
Re: Strange Hash related bug, keys are created by themselves!
by BrowserUk (Patriarch) on Nov 03, 2016 at 17:38 UTC
    Any ideas why this is happening.

    It's not a bug, it is called Autovivification (See section 3 of "Using references").

    If you really can't learn to work with it -- you should try, it is incredibly useful -- then you can disable it with this module: no autovivification.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Thanks!
      "no autovivification" fixed the issue.
      I wonder how autovivification was enabled recently on our servers without my team knowing about it.

        As BrowserUk said, autovivification is very useful most of the time. Thanks to it, you can directly create a hash item such as:
        $time{2016}{November}{'03'}{sales} = 2500;
        even if $time{2016} (and, consequently, also $time{2016}{November}, and so on) does not exist yet.

        Disabling autovivification might introduce bugs if your code creates nested HoH items on the fly.

        In some cases, of course, autovivification creates unwanted elements in a nested data structure, most notably when you check for the existence of a deeply nested element as in your example.

        I would suggest that it is probably better to leave autovivification enabled and to perform your checks step by step, i.e. to change:

        if(!defined $hash_ref->{$block}->{$libName}) { print "This is a test\n"; }
        to something like:
        if(exists $hash_ref->{$block} and !defined $hash_ref->{$block}{$libName}) { print "This is a test\n"; }
        This will prevent $hash_ref->{$block} from springing into existence due to autovivification when this is unwanted, as in the case of an existence test such as the one you're doing.

        But you'll keep autovivification enabled when it is useful (i.e. in most cases).

        To sum it up, it's not a bug, it's a feature. Although, to tell the truth, there could be a better middle way where autovivification would be enabled only when the nested reference appears in a Lvalue assignment statement. I think that the Camel book says something about it to the effect that it might be fixed one day but that it's not a priority; I was not able to find where, though, in the limited time I was ready to devote to this.

        I wonder how autovivification was enabled recently on our servers without my team knowing about it.

        You don't enable autovivification, it is and has been an integral part of the language since forever.


Re: Strange Hash related bug, keys are created by themselves!
by davido (Cardinal) on Nov 03, 2016 at 23:24 UTC

    Autovivification is a default and standard behavior with Perl. Here is a simple, contrived example:

    perl -MData::Dumper -e 'my %h; print "yes\n" if exists $h{d}{j}; print Dumper \%h;'
    $VAR1 = { 'd' => {} };

    What just happened here?

    • We created a hash named %h. It is empty.
    • We asked if $h{d}{j} exists. So really we're asking this: Does the anonymous hash stored in $h{d} contain an element named j?
    • For Perl to answer the question of whether %{$h{d}} contains an element named j, the $h{d} element must contain a hashref. In this case, and because $h{d} didn't contain anything already, Perl happily spawns a hashref where there was previously nothing.
    • Now that Perl has put a hashref into $h{d}, it can answer the question of whether a j key exists in %{$h{d}}. (The answer is no, by the way).

    So when we're done, we have an anonymous hash stored in an element that never even existed before we asked. The best way to prevent this is to break your test into smaller chunks:

    if(defined $h{d} && exists $h{d}{j}) {...}

    If $h{d} is empty or non-existent then the Boolean of defined($h{d}) is false. The logical short circuit && operator stops dead in its tracks when the left-hand argument is false. This has the desirable effect of never getting to the code that tests for the existence of j if the test on the lefthand side of the && operator already returned false. Thus, we never force Perl to put a hashref where nothing existed before.
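The walkthrough above can be checked with a small script. This is only a sketch using core Perl; the variable names other than %h are made up for illustration. It runs the unguarded test and the guarded test on a fresh hash and records whether the d element was created each time.

```perl
use strict;
use warnings;

my %h;    # starts out empty

# Unguarded: to evaluate $h{d}{j}, Perl first creates $h{d} = {}.
my $found_unguarded = exists $h{d}{j} ? 1 : 0;
my $d_after_unguarded = exists $h{d} ? 1 : 0;

# Start over, then test one level at a time.
%h = ();
my $found_guarded = (defined $h{d} && exists $h{d}{j}) ? 1 : 0;
my $d_after_guarded = exists $h{d} ? 1 : 0;

print "unguarded: found=$found_unguarded, d created=$d_after_unguarded\n";
print "guarded:   found=$found_guarded, d created=$d_after_guarded\n";
```

Both tests report that j was not found, but only the unguarded one leaves a d element behind.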

    Autovivification is useful because it allows one to do this:

    my %hash; $hash{d}{j} = 'foo';

    Even though d never existed before, Perl is willing to oblige because we are treating it as a reference to an anonymous hash. It allows us to avoid doing this:

    my %hash; $hash{d} ||= {}; $hash{d}{j} = 'foo';
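As a slightly larger illustration of why this is convenient, here is a sketch that groups records into nested structures without ever initialising the intermediate levels. The record list and the %cells and %count hashes are invented for the example and simply reuse the block, library and cell names from the question.

```perl
use strict;
use warnings;

# Illustrative records: [block, library, cell].
my @records = (
    [ 'block1', 'lib1', 'cell_1' ],
    [ 'block1', 'lib1', 'cell_2' ],
    [ 'block2', 'lib1', 'cell_3' ],
);

my (%cells, %count);
for my $rec (@records) {
    my ($block, $lib, $cell) = @$rec;
    push @{ $cells{$block}{$lib} }, $cell;    # inner hash and array are created as needed
    $count{$block}{$lib}++;                   # so is the counter's parent hash
}

print "block1/lib1: $count{block1}{lib1} cells (@{ $cells{block1}{lib1} })\n";
print "block2/lib1: $count{block2}{lib1} cells (@{ $cells{block2}{lib1} })\n";
```

Without autovivification, each pass through the loop would need an explicit check and initialisation for every level before the push and the increment.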

    You could look at it another way. Have you ever tried creating a path ~/foo/bar/baz? First you try mkdir foo/bar/baz, and you see "mkdir: cannot create directory 'foo/bar/baz': No such file or directory". So then you man mkdir and remember the -p flag. -p is autovivification for file paths. mkdir -p foo/bar/baz works similarly to $hash{foo}{bar}{baz} = '...'.

    By setting no autovivification; you are preventing Perl from being itself. That's fine, I suppose, until the next time you actually want autovivification, or the next time you forget to include that boilerplate at the top of the lexical scope where you need it. It's probably safer to just keep track of what can trigger autovivification.


    Dave

Re: Strange Hash related bug, keys are created by themselves!
by hippo (Archbishop) on Nov 04, 2016 at 10:09 UTC
    The same script used to work OK, but it suddenly stopped working.

    If that's true then the prime suspect is that a different data set or structure is being fed to the script.

    In addition to the other excellent responses already given in this thread it might be beneficial to point out the FAQ How can I check if a key exists in a multilevel hash? which covers this topic.

Re: Strange Hash related bug, keys are created by themselves!
by dsheroh (Monsignor) on Nov 04, 2016 at 07:56 UTC
    As thoroughly explained by previous answers, not strange, not an issue, not a bug. Autovivification is a well-known and documented feature of Perl.

    If you want to check for the existence of keys without autovivifying, you can avoid writing long chains of if (exists $h{a} && exists $h{a}{b} && exists $h{a}{b}{c} &&... by using the Dive function from Data::Diver. This is, IMO, highly preferable to using no autovivification, as it does not break standard Perl behavior.
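For readers who would rather not install a module, the same idea can be sketched in a few lines of core Perl. Note that nested_exists below is a hypothetical hand-rolled helper written for this illustration; it is not part of Data::Diver and does not claim to match Dive's behaviour. It walks the keys one level at a time and gives up as soon as a level is missing, so nothing is autovivified.

```perl
use strict;
use warnings;

# nested_exists is a hypothetical helper, not part of Data::Diver.
# It returns 1 if every key in @keys is present along the path, else 0,
# and never creates intermediate hashes.
sub nested_exists {
    my ($node, @keys) = @_;
    for my $key (@keys) {
        return 0 unless ref $node eq 'HASH' && exists $node->{$key};
        $node = $node->{$key};
    }
    return 1;
}

# Same structure as in the original question.
my $hash_ref = {
    block1 => { lib1 => { cell_1 => 1, cell_2 => 1 } },
    block2 => { lib1 => { cell_3 => 1, cell_4 => 1 } },
};

print "block1/lib1 present\n" if nested_exists($hash_ref, 'block1', 'lib1');
print "block3/lib1 missing\n" unless nested_exists($hash_ref, 'block3', 'lib1');
print "top-level keys: ", join(', ', sort keys %$hash_ref), "\n";
```

After both checks the top-level keys are still only block1 and block2, so the lookup for block3 has not added anything to the structure.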