milanpwc has asked for the wisdom of the Perl Monks concerning the following question:

Hello. I am having a hard time understanding the difference between "exists" and "defined". Given the following piece of code:
my @array;
$array[3] = 2;
for my $i (0..5) {
    say $i . ": " . (exists $array[$i]);
}
I am expecting that, after the assignment in the second line, the array should consist of four elements, #0-3, and that "exists" should return true for all of them. However, "exists" only returns true for element #3. Why? Have I missed something?

Replies are listed 'Best First'.
Re: Difference between exists and defined
by hippo (Archbishop) on Apr 16, 2019 at 09:36 UTC

    From the documentation:

    WARNING: Calling exists on array values is strongly discouraged. The notion of deleting or checking the existence of Perl array elements is not conceptually coherent, and can lead to surprising behavior.

    I would not use exists with arrays, only with hashes. YMMV.
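For comparison, here is a minimal sketch of how exists and defined behave on a hash, where the notion of "existence" is well defined. The hash name %seen and its keys are made up for illustration and do not come from the original question.

```perl
use strict;
use warnings;

# Hypothetical data: a hash keyed by integer positions.
my %seen;
$seen{3} = 2;        # key exists, value is defined
$seen{0} = undef;    # key exists, value is undefined

for my $key (0 .. 3) {
    printf "%d: exists=%d defined=%d\n",
        $key,
        (exists  $seen{$key} ? 1 : 0),
        (defined $seen{$key} ? 1 : 0);
}

# Testing a key with exists or defined does not create it,
# so only the two assigned keys are stored.
print "keys stored: ", scalar(keys %seen), "\n";
```

Keys 0 and 3 exist, only key 3 is defined, and the untouched keys 1 and 2 are neither.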

Re: Difference between exists and defined
by LanX (Saint) on Apr 16, 2019 at 10:48 UTC
    The internal implementation of arrays is very space efficient.

    All used slots point to values (scalar, constant ...), but unused slots demand little or no space.

    • exists tells you if a slot is used
    • assigning undef will fill a slot with the value undef
    • defined will also return false if a slot isn't used.
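The three cases in the list above can be sketched as follows. This is only an illustration with made-up values; since the documentation discourages exists on arrays, treat the exists column as a description of current behaviour rather than something to build on.

```perl
use strict;
use warnings;

my @array;
$array[0] = undef;   # slot is used and holds the value undef
$array[3] = 2;       # slot is used and holds a defined value
# Indices 1 and 2 are never assigned; index 4 lies past the end.

for my $i (0 .. 4) {
    printf "%d: exists=%-3s defined=%s\n",
        $i,
        (exists  $array[$i] ? 'yes' : 'no'),
        (defined $array[$i] ? 'yes' : 'no');
}

# Merely testing index 4 does not extend the array.
print "elements: ", scalar(@array), "\n";
```

Only index 3 is both existing and defined, index 0 exists without being defined, and the remaining indices are neither.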

    There are only a few use cases where using exists on arrays helps.

    You don't want to meddle with such internals.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery
    FootballPerl is like chess, only without the dice

Re: Difference between exists and defined
by thanos1983 (Parson) on Apr 16, 2019 at 09:37 UTC

    Hello milanpwc,

    From the documentation exists:

    A hash or array element can be true only if it's defined and defined only if it exists, but the reverse doesn't necessarily hold true.

    So why are you expecting exists to return true for elements that are not defined (and do not exist)?

    Is it more clear or still confusing?

    Also take note of this from the documentation:

    exists may also be called on array elements, but its behavior is much less obvious and is strongly tied to the use of delete on arrays.

    WARNING: Calling exists on array values is strongly discouraged. The notion of deleting or checking the existence of Perl array elements is not conceptually coherent, and can lead to surprising behavior.

    Update: I think I understand why you are confused. You are not assigning the value 2 to every element of the array, only to one element. See below:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;

    my @array;
    $array[3] = 2;
    print Dumper \@array;

    __END__

    $ perl test.pl
    $VAR1 = [
              undef,
              undef,
              undef,
              2
            ];

    BR / Thanos

    Seeking for Perl wisdom...on the process of learning...not there...yet!

      Your code example, and the way you use Data::Dumper to show the contents of the array, leaves room for misinterpretation of what is really happening here. I think the following is a better example:

      use strict;
      use warnings;
      use Data::Dumper;

      my @ar;
      $ar[0] = undef;
      $ar[2] = 3;

      if ( exists $ar[0] ) { print "0 exists\n"; }
      if ( exists $ar[1] ) { print "1 exists\n"; }
      if ( exists $ar[2] ) { print "2 exists\n"; }
      if ( exists $ar[3] ) { print "3 exists\n"; }

      print Dumper(\@ar);

      This prints:

      0 exists
      2 exists

      As you can see, element 0 DOES exist when it is explicitly set to undef. Data::Dumper, however, shows both element 0 and the never-assigned element 1 as undef:

      $VAR1 = [
                undef,
                undef,
                3
              ];

        Hello Veltro,

        You are right, my example was not clear without the following part:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use Data::Dumper;
        use feature 'say';

        my @array;
        $array[0] = undef; # Assigns the value 'undef' to $array[0]
        $array[3] = 2;

        # print Dumper \@array;

        # Loop over the indices so the message names the element, not its value
        for my $i (0 .. $#array) {
            say "Defined: \$array[$i]" if defined $array[$i];
        }

        __END__

        $ perl test.pl
        Defined: $array[3]

        The reason that this is happening is explained in the documentation exists:

        Given an expression that specifies an element of a hash, returns true if the specified element in the hash has ever been initialized, even if the corresponding value is undefined.

        So if you check an array element with exists after you have explicitly assigned undef to it, exists will return true. :)

        Thanks for pointing it out. Hopefully it will avoid confusion for future reference. :)

        BR / Thanos

Re: Difference between exists and defined
by dsheroh (Monsignor) on Apr 17, 2019 at 07:20 UTC
    I am expecting that, after the assignment in the second line, the array should consist of four elements, #0-3
    Why would you expect that? $array[3] is the only element you've assigned a value to, or even mentioned, prior to the loop, so there's no reason for Perl to have allocated storage space for any other elements.

    What you have missed is that Perl arrays are sparse data structures, designed to allow you to assign to $array[8675309] without consuming an unreasonable amount of memory to store the 8675309 unused elements which precede it. They are explicitly not C-style indexes into a region of contiguous memory.

    (Also, as previous replies have mentioned, the docs warn against using exists on arrays, so this behavior should be considered implementation-dependent and other versions of the perl binary may potentially behave differently. I kind of doubt that they actually would behave differently in this case, but you still shouldn't rely on it in any code that you care about.)

      Update: The ideas I've expressed in this post are apparently neither entirely correct nor entirely incorrect! Please see the posts of LanX here, haukex here and dsheroh here.

      $array[3] is the only element you've assigned a value to ... so there's no reason for Perl to have allocated storage space for any other elements. ... Perl arrays are sparse data structures ... you can assign to $array[8675309] without consuming ... memory to store ... unused elements ...

      I think these statements are incorrect regarding Perl positional (if that's the correct term) arrays. (Perl associative arrays are sparse.) Using Windows Task Manager to graph memory usage in real time (Windoze gotta be good for something) when the following code is executed, one can see that assignment to an array element causes contiguous allocation of enough memory to "grow" the array sufficiently to include the assigned element.

      c:\@Work\Perl\monks>perl -wMstrict -le
      "my @ra;
       print 'array declared';
       sleep 5;
       ;;
       $ra[ 100_000_000 ] = 42;
       print '1st array assignment';
       sleep 5;
       ;;
       $ra[ 200_000_000 ] = 137;
       print '2nd array assignment';
       sleep 5;
       ;;
       print 'byebye';
      "
      array declared
      1st array assignment
      2nd array assignment
      byebye
      The same effect is seen with assignment to array length rather than to any element:
          $#ra = 100_000_000;
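A small sketch of what assigning to the array length looks like from inside the program, using a much shorter length so it runs instantly. The length of 10 is an arbitrary choice for illustration. It says nothing about memory use; it only suggests that the element count changes while no element comes to "exist", which is the behaviour I would expect on current perls given the documentation's caveats.

```perl
use strict;
use warnings;

my @ra;
$#ra = 9;    # set the last index; the array now reports 10 elements

my $length   = scalar @ra;
my @existing = grep { exists $ra[$_] } 0 .. $#ra;

print "length: $length\n";
print "slots that exist: ", scalar(@existing), "\n";
```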

      It's a question of what to do with the allocated memory. Perl arrays are arrays of scalars, and a scalar is constructed by default in the very well-defined state of un-defined-ness; an "undefined" scalar is a completely specified C/C++ object. So how do you initialize the space for 100,000,000 scalars allocated in the example above? The specific way this question is answered from one CPU/OS/Perl implementation to another is the basis of the ambiguity surrounding the use of exists on allocated but never-accessed array elements.

      My fuzzy understanding of the Perl guts is that to save time (not space!), array elements in the situation described above are quickly created in a state of quasi-existence: the memory is not left as random garbage, but neither is it a sequence of fully-fledged, default-initialized scalars. Hence the advice regarding use of exists with array elements: Don't Do That!™

      Perhaps others more familiar with the details of this question can comment on specifics.


      Give a man a fish:  <%-{-{-{-<

        I don't remember where I read it (probably in the Panther book) but Perl arrays are designed to easily compete with linked lists, while keeping the benefits of indexed access.

        That is, they allow efficient dynamic growth at both ends.

        An array has an internal index for the first and last element and allocates twice as much space as reserve for push or unshift.

        Basically only the range between the first and last existing element needs to be stored, plus the reserve mentioned above.

        The existing elements are kind of pointers to scalars which are allocated separately.

        Allocation of new space is only needed once the reserve elements are filled. Since this happens in exponential steps of doubling*, it is statistically very efficient.

        Shrinking the array happens just by adjusting the indices for the first and last element.
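A minimal sketch of the usage pattern this design is meant to serve: an array used as a double-ended queue. The variable names are made up, and the example only shows the visible behaviour of push, unshift, shift and pop, not what happens internally.

```perl
use strict;
use warnings;

my @deque = (2, 3);

push    @deque, 4;          # add at the end
unshift @deque, 1;          # add at the front

my $front = shift @deque;   # remove from the front
my $back  = pop   @deque;   # remove from the end

print "front=$front back=$back rest=@deque\n";
```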

        HTH

        Cheers Rolf

        update

        see here Shift, Pop, Unshift and Push with Impunity!

        *) not sure anymore about the doubling, maybe confusing that part with hashes.

        I think these statements are incorrect regarding Perl positional (if that's the correct term) arrays. (Perl associative arrays are sparse.)

        Just wanted to confirm that you're correct that Perl's arrays are not sparse. I haven't yet found a reference in the official docs that says so explicitly, but I'm sure it's somewhere.

        use Devel::Size 'total_size';
        my @foo;
        print total_size(\@foo), "\n"; # prints 64
        $foo[100_000_000] = 'x';
        print total_size(\@foo), "\n"; # prints 800000114
        $foo[200_000_000] = 'x';
        print total_size(\@foo), "\n"; # prints 1760000156
        I stand corrected. At some point I probably read the linked-list comparison that LanX mentioned and have since misremembered it.

        To convince myself, I threw together:

        #!/usr/bin/env perl
        use strict;
        use warnings;
        use 5.010;

        use Memory::Usage;

        my @array1;
        my @array2;

        my $mu = Memory::Usage->new();
        $mu->record('ready to go');

        $array1[5268] = 1;
        $mu->record('array1 has an element');

        $array2[8675309] = 1;
        $mu->record('array2 has an element');

        $mu->dump();
        Running this on a Debian 8.11 machine with perl 5.20.2, I get the result:
          time    vsz (  diff)    rss (  diff) shared (  diff)   code (  diff)   data (  diff)
             0  20824 ( 20824)   2568 (  2568)   1916 (  1916)      8 (     8)    920 (   920) ready to go
             0  20824 (     0)   2568 (     0)   1916 (     0)      8 (     0)    920 (     0) array1 has an element
             0  88600 ( 67776)  70416 ( 67848)   2048 (   132)      8 (     0)  68696 ( 67776) array2 has an element
        The array index 5268 that I used for array1 is a magic number, apparently corresponding to the minimum size that my perl allocates for an array when it's initially declared. If I increase the index to 5269, it shows an additional 132k (all the numbers are in kilobytes) allocated when array1 is assigned to.