Array Slice Referencing

PerlPhi has asked for the wisdom of the Perl Monks concerning the following question:


Re: Array Slice Referencing
by shmem (Chancellor) on Jun 02, 2007 at 10:41 UTC

QUESTION 1: How can referencing an array slice generate scalar references when referencing an array generates an array reference?

See perlref:

Taking a reference to an enumerated list is not the same as using square brackets--instead it's the same as creating a list of references!
@list = (\$a, \@b, \%c);
@list = \($a, @b, %c);      # same thing!

That is consistent with referencing an array slice, since slicing an array yields a list.
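
A quick way to see the difference for yourself is to ask ref() what each form returns. This is only a minimal sketch; the variable names are made up for illustration:

```perl
use strict;
use warnings;

my @array = ("my", "list", "here");

# Backslash on a whole array: one reference, to the array itself.
my $aref = \@array;

# Backslash on a slice: one reference per selected element.
my @refs = \@array[2, 0];

print ref($aref), "\n";                          # ARRAY
print scalar(@refs), "\n";                       # 2
print join(" ", map { ref($_) } @refs), "\n";    # SCALAR SCALAR
print ${ $refs[0] }, "\n";                       # here
```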

QUESTION 2: Is there any other way to generate an array reference (not scalar references) by picking selected elements of an array? (When @_ is replaced by a scalar variable, it only gets the first index from the array slice.)

To generate a new array reference from an array slice, you would use the anonymous array constructor [] and place the array slice (which is a list) into that:

$list = [ @array[2,0] ];

If you assign an array slice to a scalar, you get the same behaviour as with assigning a list to a scalar.

$\ = "\n";
my @array = ("my","list","here");
$_ = @array[0,1,2];
print;           # prints "here" - last element of slice
$_ = ("my","list","here");
print;           # also prints "here"

In scalar context, the comma operator just discards its left argument and returns the right argument (after evaluating it if it's an expression). So the expression in parens yields its last argument. See perlop. Again, an array slice is a list.
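
If what you actually want from the slice is its first element or its element count, you have to force list context on the assignment. A small sketch of the usual idioms (the variable names are only illustrative):

```perl
use strict;
use warnings;

my @array = ("my", "list", "here");

my $last    = @array[0, 1, 2];        # scalar context: last element of the slice
my ($first) = @array[0, 1, 2];        # list assignment: first element, rest dropped
my $count   = () = @array[0, 1, 2];   # list assignment in scalar context: element count

print "$last $first $count\n";        # here my 3
```

The parentheses around $first and the empty list in `() =` are what switch the right-hand side from scalar to list context.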

well, for my second question the following might be a solution (inefficient enough):

#!perl/bin/perl
use strict;
my @array = ("my","list","here");
my @ref = ([@array[2,0]],[@array[1]]);
$ref[0][0] = "changed";
print "$ref[0][0] @array";
yes, it is inefficient because it doesn't change the values of my original array, only the values held by my array reference. That's very weird, because it means the reference consumes new space in memory where it keeps its own data, separate from my array.

If you want the elements of your newly created array of anonymous arrays to refer to the elements in your original array, you must take references to the elements of the original array instead of populating your newly created array references with copies of your original array elements:

my @array = ("my","list","here");
my @ref = ([\@array[2,0]],[\@array[1]]);

${$ref[0][0]} = "changed";
print "${$ref[0][0]} @array";
__END__
changed my list changed

QUESTION 3: How come creating a new reference also allocates new space in memory in that program?

That's because the following statements are equivalent:

my @ref = (\@array[2,0],\@array[1]);
my @ref = \(@array[2,0],@array[1]);
my @ref = (\$array[2],\$array[0],\$array[1]);
my @ref = \($array[2],$array[0],$array[1]);
my @ref = \(@array[2,0,1]);

You don't get two references, you get three references. See above.
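
You can check the equivalence by comparing the references numerically, since a reference used as a number gives the address of its target. A sketch only; @r1, @r2 and @r3 are names I made up, and I wrote \$array[1] instead of the one-element slice to keep warnings quiet:

```perl
use strict;
use warnings;

my @array = ("my", "list", "here");

my @r1 = (\@array[2, 0], \$array[1]);
my @r2 = \(@array[2, 0, 1]);
my @r3 = (\$array[2], \$array[0], \$array[1]);

print scalar(@r1), "\n";    # 3

# Count positions where the three lists point at different scalars.
my $mismatches = grep { $r1[$_] != $r2[$_] || $r1[$_] != $r3[$_] } 0 .. 2;
print $mismatches == 0 ? "same targets\n" : "different targets\n";    # same targets
```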

update: minor wording change

--shmem

_($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                              /\_¯/(q    /
----------------------------  \__(m.====·.(_("always off the crowd"))."·
");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}


Re^2: Array Slice Referencing

by PerlPhi (Sexton) on Jun 02, 2007 at 11:35 UTC

thanks for the solution... it solves my dilemma about array slice referencing. but there is something I want to clarify about your code (as follows):

my @ref = ([\@array[2,0]],[\@array[1]]);

does this mean (logic flow):

(1) create references for the array slice (2) assign those references to the newly created anonymous array (3) assign the reference to the anonymous array to the list values of @ref

is this right? so it means we have to create one extra reference (memory address) for every list value of @ref?

I'm just curious, what is the size (in bytes) of a reference? since a reference consumes memory space in which the memory address of another variable is its value...


Re: Array Slice Referencing
by FunkyMonk (Bishop) on Jun 02, 2007 at 10:36 UTC

QUESTION 1: How can referencing an array slice generate scalar references when referencing an array generates an array reference?

That's because making a reference of a list generates a list of references:

print Dumper \( 1, 2, 3 );

#output:
$VAR1 = \1;
$VAR2 = \2;
$VAR3 = \3;

QUESTION 2: Is there any other way to generate an array reference (not scalar references) by picking selected elements of an array? (When @_ is replaced by a scalar variable, it only gets the first index from the array slice.)

I don't think you can make a reference to an array slice, but I'd be happy to be enlightened.

QUESTION 3: How come creating a new reference also allocates new space in memory in that program?

Sorry, I don't understand your question.


Re: Array Slice Referencing
by BrowserUk (Patriarch) on Jun 02, 2007 at 17:50 UTC

In a nutshell, Perl does not have a mechanism for taking a reference to a slice of an existing array. (Actually, it seems it does, though you won't find it documented anywhere to my knowledge. See below).

A slice is just a list of values; copies of the values from whichever data structure they were drawn. They are no longer associated with that originating data structure. As soon as you 'take a slice of an array', you have copied the values into new memory--a list--and all association with the original array is lost.

Once you've taken the slice, you have a list (and the new memory allocated to hold it), and you cannot take a reference to a list. You can assign it to a new array and take a reference to that, either by using a formally named new array and a standard reference to it, or indirectly by assigning the list to an anonymous array, which gives you that reference directly.
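
Both of those routes can be sketched like this (the names @copy, $named_ref and $anon_ref are made up for the example). Either way the new array holds copies, so writing through the reference leaves the original untouched:

```perl
use strict;
use warnings;

my @array = (1 .. 5);

# Route 1: a named array holding the slice, then an ordinary reference to it.
my @copy      = @array[1, 3];
my $named_ref = \@copy;

# Route 2: the anonymous array constructor, which hands back the reference directly.
my $anon_ref = [ @array[1, 3] ];

$named_ref->[0] = 'x';
$anon_ref->[1]  = 'y';

print "@array\n";                      # 1 2 3 4 5
print "@$named_ref | @$anon_ref\n";    # x 4 | 2 y
```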

That said, there is a way to do what you are trying to do, though it is not formally described (other than in this old post by Juerd) and will doubtless be frowned upon as a nasty obscure hack. However, it utilises a standard and frequently used mechanism of Perl, and so far I have seen nothing to persuade me that it is not a useful and valid tactic for certain kinds of operation.

#! perl -slw
use strict;

sub refSlice{ return \@_ } ## return a reference to an array of aliases!

my @a = 1 .. 10;
print "Original array:\n@a";

my $refSliceA  = refSlice @a[ 2 .. 8 ];
print "\nsliceA:\n@$refSliceA";

my $refSliceB = refSlice @a[ 1, 3, 5, 7, 9 ];
print "\nsliceB:\n@$refSliceB";

$_ **= 2 for @$refSliceA;
print "\nsliceA modified:\n@$refSliceA";

$_ /= 2 for @$refSliceB;
print "\nsliceB modified:\n@$refSliceB";

print "\nArray after slice operations:\n@a";

__END__
C:\test>refslice.pl
Original array:
1 2 3 4 5 6 7 8 9 10

sliceA:
3 4 5 6 7 8 9

sliceB:
2 4 6 8 10

sliceA modified:
9 16 25 36 49 64 81

sliceB modified:
1 8 18 32 5

Array after slice operations:
1 1 9 8 25 18 49 32 81 5

So, you can generate 'a reference to a slice of an array'. And you can hold multiple slice references to an array. And mutating operations on those slices will mutate the original array.

However, that may not be as useful to you as you might hope, because this kind of 'slice reference' isn't a ~~huge~~* total memory saver!

Update: It turns out (see below), that aliases are significantly lighter than copies or references. On the basis of purely experimental evidence, it seems that it costs 4 bytes per alias, rather than at least 12 bytes per copy. Well worth having in memory constrained situations.

The reference you obtain this way is actually a reference to an anonymous array of references*, to the scalars in the original array. And as a reference to a scalar is itself a scalar, it takes almost as much memory to have an array of references as it does to have an array of copies of the scalars to which they point. Of course, references are fixed in size whereas the original scalars could hold large strings, in which case there may be some memory saving.

*Actually aliases, but that's just semantics :)

It does however, allow you to perform operations on subsets, and multiple subsets and overlapping multiple subsets of an array without having to destroy the original ordering or having to try and re-create that original ordering by piecing the subsets back together.

For some applications where this is required, it is a very effective technique.
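
To make the overlapping case concrete, here is a small sketch using the same \@_ trick. The sub name ref_slice and the variables are mine, chosen for illustration; the two slices share elements 2 and 3 of the array:

```perl
use strict;
use warnings;

# Same trick as refSlice above: @_ aliases the arguments, and \@_ keeps those aliases.
sub ref_slice { return \@_ }

my @a = 1 .. 6;

my $left  = ref_slice(@a[0 .. 3]);    # aliases to elements 0..3
my $right = ref_slice(@a[2 .. 5]);    # aliases to elements 2..5, overlapping $left

$_ *= 10 for @$left;     # @a is now 10 20 30 40 5 6
$_ += 1  for @$right;    # @a is now 10 20 31 41 6 7

print "@a\n";        # 10 20 31 41 6 7
print "@$left\n";    # 10 20 31 41
```

The change made through $right shows up when reading through $left, because both hold aliases to the same underlying elements.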

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.

"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."


Re^2: Array Slice Referencing

by TimToady (Parson) on Jun 02, 2007 at 20:42 UTC

PDL


Re^2: Array Slice Referencing

by Jenda (Abbot) on Jun 02, 2007 at 21:52 UTC

A slice is just a list of values; copies of the values from whichever data structure they were drawn. They are no longer associated with that originating data structure. As soon as you 'take a slice of an array', you have copied the values into new memory--a list--and all association with the original array is lost.

No. This is wrong. A slice is a list of aliases to the array elements:

my @a = (1,2,3,4,5);
for (@a[1,3,4]) {$_+=10};
print join( ',', @a), "\n";

# or

my @a = (1,2,3,4,5);
sub foo { for(@_) {$_+=10} }
foo( @a[1,3,4]);
print join( ',', @a), "\n";

Jenda
Support Denmark!
Defend the free world!


Re^3: Array Slice Referencing

by BrowserUk (Patriarch) on Jun 02, 2007 at 23:19 UTC

Indeed, you are correct. I initially thought that this was a semantic detail, as your demonstrations both use constructs, for and @_, that are already known to produce aliases, whether you are performing a slice operation or not. You could also use map or grep or List::Util::reduce and its meta-ops etc.

But I thought that, outside of those aliasing operators, your argument was semantics, because there was no way to get a handle on or utilise the aliasing of a slice operation, and that using the aliasing meant having to re-slice the array and re-create the aliases each time you wanted to use them.

I was wrong! In attempting to prove my point, I came up with something I've never seen described or used before.

It's a little discussed and little used, and slightly aberrant and confusing, fact that in Perl \( 1, 2, 3 ) doesn't give you a reference to the list, but rather a list of references to the items in the list.

Unlike \@a which gives a reference to the array.

Attempting to use that to discount your semantic argument, I tried the following:

@a = 1 .. 5;

@b = \@a[ 0, 2, 4 ]; 
print "@b"; ## An array of references to aliases or copies?
SCALAR(0x18eece0) SCALAR(0x18eecf8) SCALAR(0x18eed10)

$$_ += 10 for @b;
print $$_ for @b;
11
13
15

print "@a"; 
11 2 13 4 15 ## It was mutated!

Which means that it isn't necessary to use the sub trick to get a reference to an array slice, provided you don't mind dereferencing the references.

That came as a big surprise to me. I think it may come as a surprise to a few others also.

So, you are correct. And it's not just a semantic detail :)

I could see it being classed as a bug by some, but since it has (probably) been around for a long time, it's unlikely to go away now.

However, there is one significant difference between the two methods:

@a = 1 .. 1e6;;
print $$;;
2156
!tasklist /fi "pid eq 2156";;
perl.exe                    2156                         0     63,320 K

sub refSlice{ \@_ };;
$b = refSlice @a[ 1 .. $#a ];;
!tasklist /fi "pid eq 2156";;

perl.exe                    2156                         0     71,144 K

Which shows that taking a reference to a slice of 999,999 aliases (using the sub method), requires an extra 7.824 MB. A significant saving over the 60 MB required to make a copy of the slice of the array.

Using the other method:

@a = 1 .. 1e6;;
print $$;;
4108
!tasklist /fi "pid eq 4108";;
perl.exe                    4108                         0     63,296 K

@b = \@a[ 1 .. $#a ];;
!tasklist /fi "pid eq 4108";;
perl.exe                    4108                         0     91,392 K

Which shows this method requires 28.096 MB. Roughly 3.6 times as much. Understandable, because we created an array of references to a list (or array?) of aliases to the original data.

It seems (at least in the latest versions of perl) that aliases are significantly lighter than copies of an array. I'll modify my post above to reflect this.

I'm sure the memory saving of aliases was far less significant the last time I looked at this, back when Juerd originally posted about it.

Thanks for forcing me to re-evaluate old knowledge :)
