Difference between exists and defined

milanpwc has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Difference between exists and defined by hippo (Archbishop) on Apr 16, 2019 at 09:36 UTC
From the documentation:

"WARNING: Calling exists on array values is strongly discouraged. The notion of deleting or checking the existence of Perl array elements is not conceptually coherent, and can lead to surprising behavior."

I would not use exists with arrays, only with hashes. YMMV.
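For hashes, where `exists` is well defined, the three levels (exists, defined, true) can be told apart as in the following minimal sketch. The helper name `describe_key` is hypothetical and only serves the illustration.

```perl
use strict;
use warnings;

# describe_key is a hypothetical helper (not part of Perl): it classifies
# a hash entry as 'missing', 'undef', 'false' or 'true'.
sub describe_key {
    my ($hash, $key) = @_;
    return 'missing' unless exists $hash->{$key};    # key was never stored
    return 'undef'   unless defined $hash->{$key};   # key stored, value is undef
    return $hash->{$key} ? 'true' : 'false';         # defined value, test truthiness
}

my %h = (a => 1, b => 0, c => undef);
for my $key (qw(a b c d)) {
    print "$key: ", describe_key(\%h, $key), "\n";
}
# a: true
# b: false
# c: undef
# d: missing
```

Each test is only meaningful if the previous one passed, which is why the checks run in the order exists, defined, true.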
Re: Difference between exists and defined by LanX (Saint) on Apr 16, 2019 at 10:48 UTC
The internal implementation of arrays is very space efficient. All used slots point to values (scalar, constant ...), but unused slots need little or even no space.

- `exists` tells you whether a slot is used
- assigning `undef` fills a slot with the value `undef`
- `defined` also returns false if a slot isn't used

There are only a few use cases where using `exists` on arrays can be of help. You don't want to meddle with such internals.

Cheers Rolf
(addicted to the Perl Programming Language :)
Re: Difference between exists and defined by thanos1983 (Parson) on Apr 16, 2019 at 09:37 UTC
Hello milanpwc,

From the documentation of exists:

`A hash or array element can be true only if it's defined and defined only if it exists, but the reverse doesn't necessarily hold true.`

So why are you expecting a true result for elements that are not defined (and do not exist)? Is it clearer now, or still confusing? Also take note of this from the documentation:

`exists may also be called on array elements, but its behavior is much less obvious and is strongly tied to the use of delete on arrays. WARNING: Calling exists on array values is strongly discouraged. The notion of deleting or checking the existence of Perl array elements is not conceptually coherent, and can lead to surprising behavior.`

Update: I think I understand why you are confused. You are not assigning the value 2 to all elements in the array, only to one element. See below:

`#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

my @array;
$array[3] = 2;
print Dumper \@array;

__END__
$ perl test.pl
$VAR1 = [
          undef,
          undef,
          undef,
          2
        ];`

BR / Thanos

Seeking for Perl wisdom...on the process of learning...not there...yet!
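The link between `exists` and `delete` on arrays that the documentation mentions can be seen in a small sketch like the one below. It is for illustration only, since the documentation discourages both operations on arrays; the helper name `slot_states` is hypothetical.

```perl
use strict;
use warnings;

# slot_states is a hypothetical helper: for each index 0..$#array it reports
# 'unused' (exists is false), 'undef' (slot used, value undef) or 'defined'.
sub slot_states {
    my ($array) = @_;
    my @states;
    for my $i (0 .. $#$array) {
        if    (!exists  $array->[$i]) { push @states, 'unused'  }
        elsif (!defined $array->[$i]) { push @states, 'undef'   }
        else                          { push @states, 'defined' }
    }
    return @states;
}

my @array = (10, 20, 30);

delete $array[1];                              # middle slot: indices stay the same
print join(',', slot_states(\@array)), "\n";   # defined,unused,defined

delete $array[2];                              # last slot: the array shrinks
print scalar(@array), "\n";                    # 1
```

After the second `delete`, the array shrinks back to the highest slot for which `exists` is still true, which is why the length drops to 1 rather than 2.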
Re^2: Difference between exists and defined by Veltro (Hermit) on Apr 16, 2019 at 10:13 UTC
Your code example, and the way you use Data::Dumper to show the contents of the array, leaves room for misinterpreting what is really happening here. I feel that a better code example is the following:

`use strict ;
use warnings ;
use Data::Dumper ;

my @ar ;
$ar[0] = undef ;
$ar[2] = 3 ;

if ( exists $ar[0] ) { print "0 exists\n" ; }
if ( exists $ar[1] ) { print "1 exists\n" ; }
if ( exists $ar[2] ) { print "2 exists\n" ; }
if ( exists $ar[3] ) { print "3 exists\n" ; }

print Dumper(\@ar) ;`

This prints:

`0 exists
2 exists`

As you can see, element 0 DOES exist when it is explicitly set to undef. Data::Dumper then prints:

`$VAR1 = [
          undef,
          undef,
          3
        ];`
Re^3: Difference between exists and defined by thanos1983 (Parson) on Apr 16, 2019 at 10:52 UTC
Hello Veltro,

You are right, my example was not clear without the following part:

`#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use feature 'say';

my @array;
$array[0] = undef; # Assigns the value 'undef' to $array[0]
$array[3] = 2;

# print Dumper \@array;

for my $value (@array) {
    # $value holds the element's value, so the 2 printed below is the
    # value stored at index 3, not an index
    say "Defined: \$array[$value]" if defined $value;
}

__END__
$ perl test.pl
Defined: $array[2]`

The reason this happens is explained in the documentation of exists:

`Given an expression that specifies an element of a hash, returns true if the specified element in the hash has ever been initialized, even if the corresponding value is undefined.`

So if you check an array element with exists after you have explicitly assigned undef to it, exists will return true. :)

Thanks for pointing it out. Hopefully it will avoid confusion for future reference. :)

BR / Thanos

Seeking for Perl wisdom...on the process of learning...not there...yet!
Re: Difference between exists and defined by dsheroh (Monsignor) on Apr 17, 2019 at 07:20 UTC
"I am expecting after the assignment in the second line, the array should consists of four elements #0-3"

Why would you expect that? `$array[3]` is the only element you've assigned a value to, or even mentioned, prior to the loop, so there's no reason for Perl to have allocated storage space for any other elements.

What you have missed is that Perl arrays are ~~sparse data structures, so that you can~~ designed to allow you to assign to `$array[8675309]` without consuming an unreasonable amount of memory to store the 8675309 unused elements which precede it. They are explicitly not C-style indexes into a region of contiguous memory.

(Also, as previous replies have mentioned, the docs warn against using `exists` on arrays, so this behavior should be considered implementation-dependent and other versions of the `perl` binary may potentially behave differently. I kind of doubt that they actually would behave differently in this case, but you still shouldn't rely on it in any code that you care about.)
Re^2: Difference between exists and defined (updated) by AnomalousMonk (Archbishop) on Apr 17, 2019 at 21:31 UTC
Update: The ideas I've expressed in this post are apparently neither entirely correct nor entirely incorrect! Please see the replies by LanX, haukex and dsheroh below.

"`$array[3]` is the only element you've assigned a value to ... so there's no reason for Perl to have allocated storage space for any other elements. ... Perl arrays are sparse data structures ... you can assign to `$array[8675309]` without consuming ... memory to store ... unused elements ..."

I think these statements are incorrect regarding Perl positional (if that's the correct term) arrays. (Perl associative arrays are sparse.) Using Windows Task Manager to graph memory usage in real time (Windoze gotta be good for something) while the following code is executed, one can see that assignment to an array element causes contiguous allocation of enough memory to "grow" the array sufficiently to include the assigned element.

`c:\@Work\Perl\monks>perl -wMstrict -le
"my @ra;
 print 'array declared'; sleep 5;
 ;;
 $ra[ 100_000_000 ] = 42;
 print '1st array assignment'; sleep 5;
 ;;
 $ra[ 200_000_000 ] = 137;
 print '2nd array assignment'; sleep 5;
 ;;
 print 'byebye';
"
array declared
1st array assignment
2nd array assignment
byebye`

The same effect is seen with assignment to the array length rather than to any element: `$#ra = 100_000_000;`

It's a question of what to do with the allocated memory. Perl arrays are arrays of scalars, and a scalar is constructed by default in the very well-defined state of un-defined-ness; an "undefined" scalar is a completely specified C/C++ object. So how do you initialize the space for 100,000,000 scalars allocated in the example above? The specific way this question is answered from one CPU/OS/Perl implementation to another is the basis of the ambiguity surrounding the use of exists on allocated but never-accessed array elements.

My fuzzy understanding of the Perl guts is that to save time (not space!), array elements in the situation described above are quickly created in a state of quasi-existence: the memory is not left as random garbage, but neither is it a sequence of fully-fledged, default-initialized scalars. Hence the advice regarding use of exists with array elements: Don't Do That!™ Perhaps others more familiar with the details of this question can comment on specifics.

Give a man a fish: `<%-{-{-{-<`
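The contrast between positional arrays and associative arrays drawn in this post can be sketched as follows. This is only an illustration of the index and key counts (it does not measure memory), and the index values are arbitrary.

```perl
use strict;
use warnings;

# Positional array: assigning to a high index extends the array up to it.
my @dense;
$dense[1_000] = 'x';
print scalar(@dense), "\n";         # 1001 (indices 0..1000)

# Hash keyed by the index: only the entries actually assigned are stored.
my %sparse;
my $index = 8_675_309;              # arbitrary large index for illustration
$sparse{$index} = 'x';
print scalar(keys %sparse), "\n";   # 1

# exists on a hash key is well defined, unlike exists on array elements.
my $state = exists $sparse{42} ? 'used' : 'unused';
print "$state\n";                   # unused
```

When the indices are far apart, a hash like this is a common way to get sparse behavior, at the cost of losing the natural ordering of an array.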
Re^3: Difference between exists and defined by LanX (Saint) on Apr 17, 2019 at 23:35 UTC
I don't remember where I read it (probably in the Panther book), but Perl arrays are designed to easily compete with linked lists while keeping the benefits of indexed access. That is, they allow growth on both ends in a very dynamic way.

An array has an internal index for the first and last element and allocates twice as much space as a reserve for push or unshift. Basically only the range between the first and last existing element needs to be stored, plus the mentioned reserve. The existing elements are kind of pointers to scalars, which are allocated separately.

Allocation of new space is only needed once the reserve elements are filled; since this happens in exponential steps of doubling*, it's statistically very efficient. Shrinking the array happens just by adjusting the indices for the first and last element.

HTH

Cheers Rolf
(addicted to the Perl Programming Language :)

Update: see "Shift, Pop, Unshift and Push with Impunity!"

*) not sure anymore about the doubling, maybe confusing that part with hashes.
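As a small illustration of growing and shrinking an array at both ends, the four built-ins mentioned above can be combined as in this sketch. It shows only the visible behavior and says nothing about how the reserve space is managed internally.

```perl
use strict;
use warnings;

my @deque = (2, 3);

unshift @deque, 1;           # grow at the front: (1, 2, 3)
push    @deque, 4;           # grow at the back:  (1, 2, 3, 4)

my $first = shift @deque;    # remove from the front, returns 1
my $last  = pop   @deque;    # remove from the back, returns 4

print "$first $last @deque\n";   # 1 4 2 3
```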
Re^3: Difference between exists and defined by haukex (Archbishop) on Apr 18, 2019 at 07:23 UTC
"I think these statements are incorrect regarding Perl positional (if that's the correct term) arrays. (Perl associative arrays are sparse.)"

Just wanted to confirm that you're correct that Perl's arrays are not sparse. I haven't yet found a reference in the official docs that says so explicitly, but I'm sure it's somewhere.

`use Devel::Size 'total_size';
my @foo;
print total_size(\@foo), "\n";  # prints 64
$foo[100_000_000] = 'x';
print total_size(\@foo), "\n";  # prints 800000114
$foo[200_000_000] = 'x';
print total_size(\@foo), "\n";  # prints 1760000156`
Re^3: Difference between exists and defined by dsheroh (Monsignor) on Apr 18, 2019 at 08:06 UTC
I stand corrected. At some point, I probably read the linked list thing that LanX mentioned and now misremembered it.

To convince myself, I threw together:

`#!/usr/bin/env perl
use strict;
use warnings;
use 5.010;

use Memory::Usage;

my @array1;
my @array2;

my $mu = Memory::Usage->new();
$mu->record('ready to go');

$array1[5268] = 1;
$mu->record('array1 has an element');

$array2[8675309] = 1;
$mu->record('array2 has an element');

$mu->dump();`

Running this on a Debian 8.11 machine with perl 5.20.2, I get the result:

`  time    vsz (  diff)    rss (  diff) shared (  diff)   code (  diff)   data (  diff)
     0  20824 ( 20824)   2568 (  2568)   1916 (  1916)      8 (     8)    920 (   920) ready to go
     0  20824 (     0)   2568 (     0)   1916 (     0)      8 (     0)    920 (     0) array1 has an element
     0  88600 ( 67776)  70416 ( 67848)   2048 (   132)      8 (     0)  68696 ( 67776) array2 has an element`

The array index 5268 that I used for array1 is a magic number, apparently corresponding to the minimum size that my `perl` allocates for an array when it's initially declared. If I increase the index to 5269, it shows an additional 132k (all the numbers are in kilobytes) allocated when array1 is assigned to.
