in reply to Re^2: Selecting Ranges of 2-Dimensional Data
in thread Selecting Ranges of 2-Dimensional Data

This is all very interesting to read. I've put some say statements in the new getsubset to figure out aspects of the parameter array and matrix manipulations. I'll put abridged output and (unabridged) source between readmore tags and pull out the bits I want to ask about after. All of the useful source in this has been listed upthread, so I'd probably skip to the code niblets...

  ...
  inside first anonymous block
  parameter array at top is ARRAY(0x5600d58ded50) R3
  default is R3
  data is4
  parameter array at bottom is ARRAY(0x5600d58ded50) R3
  data is
  1 2 3 4 5 6 7 8 9 10
  a b c d e f g h i j
  11 12 13 14 15 16 17 18 19 20
  51 52 53 54 55 56 57 58 59 60
  leaving getsubset
  11 12 13 14 15 16 17 18 19 20
  ok 19
  11 12 13 14 15 16 17 18 19 20
  11 12 13 14 15 66 17 18 19 20
  exiting first anonymous block
  ----------
  1 2 3 4 5 6 7 8 9 10
  a b c d e f g h i j
  11 12 13 14 15 66 17 18 19 20
  51 52 53 54 55 56 57 58 59 60
  ----------
  parameter array at top is ARRAY(0x5600d58ded50) C2
  default is C2
  data is4
  parameter array at bottom is ARRAY(0x5600d58ded50) C2
  data is
  1 2 3 4 5 6 7 8 9 10
  a b c d e f g h i j
  11 12 13 14 15 66 17 18 19 20
  51 52 53 54 55 56 57 58 59 60
  leaving getsubset
  2
  b
  12
  52
  ok 20
  2
  b
  21
  52
  exit 2nd
  ----------
  1 2 3 4 5 6 7 8 9 10
  a b c d e f g h i j
  11 21 13 14 15 66 17 18 19 20
  51 52 53 54 55 56 57 58 59 60
  ----------
  parameter array at top is ARRAY(0x5600d58ded50) R4C2
  default is R4C2
  data is4
  parameter array at bottom is ARRAY(0x5600d58ded50) R4C2
  data is
  1 2 3 4 5 6 7 8 9 10
  a b c d e f g h i j
  11 21 13 14 15 66 17 18 19 20
  51 52 53 54 55 56 57 58 59 60
  leaving getsubset
  52
  ok 21
  end 3rd
  ----------
  1 2 3 4 5 6 7 8 9 10
  a b c d e f g h i j
  11 21 13 14 15 66 17 18 19 20
  51 42 53 54 55 56 57 58 59 60
  ----------
  parameter array at top is ARRAY(0x5600d58ded50) R2C5:R4C8
  default is R2C5:R4C8
  data is4
  parameter array at bottom is ARRAY(0x5600d58ded50) R2C5:R4C8
  data is
  1 2 3 4 5 6 7 8 9 10
  a b c d e f g h i j
  11 21 13 14 15 66 17 18 19 20
  51 42 53 54 55 56 57 58 59 60
  leaving getsubset
  e f g h
  15 66 17 18
  55 56 57 58
  ok 22
  added 5 to a value
  ----------
  1 2 3 4 5 6 7 8 9 10
  a b c d e f g h i j
  11 21 13 14 15 66 22 18 19 20
  51 42 53 54 55 56 57 58 59 60
  ----------
  parameter array at top is ARRAY(0x5600d58ded50) C8:Cn
  default is C8:Cn
  data is4
  parameter array at bottom is ARRAY(0x5600d58ded50) C8:Cn
  data is
  1 2 3 4 5 6 7 8 9 10
  a b c d e f g h i j
  11 21 13 14 15 66 22 18 19 20
  51 42 53 54 55 56 57 58 59 60
  leaving getsubset
  8 9 10
  h i j
  18 19 20
  58 59 60
  ok 23
  substitutes X
  ----------
  1 2 3 4 5 6 7 8 9 10
  a b c d e f g h i X
  11 21 13 14 15 66 22 18 19 20
  51 42 53 54 55 56 57 58 59 60
  ----------
  parameter array at top is ARRAY(0x5600d58ded50) R4
  default is R4
  data is4
  parameter array at bottom is ARRAY(0x5600d58ded50) R4
  data is
  1 2 3 4 5 6 7 8 9 10
  a b c d e f g h i X
  11 21 13 14 15 66 22 18 19 20
  51 42 53 54 55 56 57 58 59 60
  leaving getsubset
  51 42 53 54 55 56 57 58 59 60
  ok 24
  ----------
  1 2 3 4 5 6 7 8 9 10
  a b c d e f g h i X
  11 21 13 14 15 66 22 18 19 20
  51 42 53 54 55 56 57 58 59 60
  ----------
  ok 25
  ## end abridged output begin source
  $ cat 2.da.pl
  #!/usr/bin/perl -w
  use 5.011;
  use Carp;
  use Data::Alias 'alias';
  use Data::Dumper;

  sub print_aoa {
      use warnings;
      use 5.011;
      my $a     = shift;
      my @array = @$a;
      for my $row (@array) {
          print join( " ", @{$row} ), "\n";
      }
      return $a;
  }

  sub rangeparse {
      local $_ = shift;
      say "default is $_";
      my @o;    # [ row1,col1, row2,col2 ] (-1 = last row/col)
      if ( @o = /\AR([0-9]+|n)C([0-9]+|n):R([0-9]+|n)C([0-9]+|n)\z/ ) { }
      elsif (/\AR([0-9]+|n):R([0-9]+|n)\z/) { @o = ( $1, 1, $2, -1 ) }
      elsif (/\AC([0-9]+|n):C([0-9]+|n)\z/) { @o = ( 1, $1, -1, $2 ) }
      elsif (/\AR([0-9]+|n)C([0-9]+|n)\z/)  { @o = ( $1, $2, $1, $2 ) }
      elsif (/\AR([0-9]+|n)\z/)             { @o = ( $1, 1, $1, -1 ) }
      elsif (/\AC([0-9]+|n)\z/)             { @o = ( 1, $1, -1, $1 ) }
      else { croak "failed to parse '$_'" }
      $_ eq 'n' and $_ = -1 for @o;
      return \@o;
  }

  sub getsubset {
      my ( $data, $range ) = @_;
      say "parameter array at top is @_";
      my $cols = @{ $$data[0] };
      @$_ == $cols or croak "data not rectangular" for @$data;
      $range = rangeparse($range) unless ref $range eq 'ARRAY';
      @$range == 4 or croak "bad size of range";
      say "data is", 0 + @$data;
      my @max = ( 0 + @$data, $cols ) x 2;

      # say "max is @max"; max is 4 10 4 10
      for my $i ( 0 .. 3 ) {
          $$range[$i] = $max[$i] if $$range[$i] < 0;
          croak "index $i out of range"
            if $$range[$i] < 1 || $$range[$i] > $max[$i];
      }
      croak "bad rows $$range[0]-$$range[2]" if $$range[0] > $$range[2];
      croak "bad cols $$range[1]-$$range[3]" if $$range[1] > $$range[3];
      my @cis = $$range[1] - 1 .. $$range[3] - 1;
      say "parameter array at bottom is @_";
      say "data is";
      print_aoa($data);
      say "leaving getsubset";
      return [ map { sub { \@_ } ->( @{ $$data[$_] }[@cis] ) }
            $$range[0] - 1 .. $$range[2] - 1 ];
  }

  use Test::More tests => 25;

  is_deeply rangeparse("R1"),         [ 1,  1,  1,  -1 ];
  is_deeply rangeparse("C1"),         [ 1,  1,  -1, 1 ];
  is_deeply rangeparse("Rn"),         [ -1, 1,  -1, -1 ];
  is_deeply rangeparse("Cn"),         [ 1,  -1, -1, -1 ];
  is_deeply rangeparse("R4C5"),       [ 4,  5,  4,  5 ];
  is_deeply rangeparse("RnCn"),       [ -1, -1, -1, -1 ];
  is_deeply rangeparse("R2:R3"),      [ 2,  1,  3,  -1 ];
  is_deeply rangeparse("C2:C3"),      [ 1,  2,  -1, 3 ];
  is_deeply rangeparse("R4:Rn"),      [ 4,  1,  -1, -1 ];
  is_deeply rangeparse("C5:Cn"),      [ 1,  5,  -1, -1 ];
  is_deeply rangeparse("R2C3:R4C5"),  [ 2,  3,  4,  5 ];
  is_deeply rangeparse("R4C3:R4C3"),  [ 4,  3,  4,  3 ];
  is_deeply rangeparse("R5C1:R5C9"),  [ 5,  1,  5,  9 ];
  is_deeply rangeparse("R2C6:R11C6"), [ 2,  6,  11, 6 ];
  is_deeply rangeparse("R3C1:RnC2"),  [ 3,  1,  -1, 2 ];
  is_deeply rangeparse("R5C4:R5Cn"),  [ 5,  4,  5,  -1 ];
  is_deeply rangeparse("RnC2:RnC5"),  [ -1, 2,  -1, 5 ];
  is_deeply rangeparse("R3C2:RnCn"),  [ 3,  2,  -1, -1 ];

  my $data = [ [ 1 .. 10 ], [ 'a' .. 'j' ], [ 11 .. 20 ], [ 51 .. 60 ] ];

  {
      say "inside first anonymous block";
      my $subset = getsubset( $data, "R3" );
      print_aoa $subset;
      is_deeply $subset, [ [ 11 .. 20 ] ];
      print_aoa $subset;
      $subset->[0][5] = 66;
      print_aoa $subset;
      say "exiting first anonymous block";
  }
  say "----------";
  print_aoa $data;
  say "----------";
  {
      my $subset = getsubset( $data, "C2" );
      print_aoa $subset;
      is_deeply $subset, [ [2], ['b'], [12], [52] ];
      $subset->[2][0] = 21;
      print_aoa $subset;
      say "exit 2nd";
  }
  say "----------";
  print_aoa $data;
  say "----------";
  {
      my $subset = getsubset( $data, "R4C2" );
      print_aoa $subset;
      is_deeply $subset, [ [52] ];
      $subset->[0][0] = 42;
      say "end 3rd";
  }
  say "----------";
  print_aoa $data;
  say "----------";
  {
      my $subset = getsubset( $data, "R2C5:R4C8" );
      print_aoa $subset;
      is_deeply $subset,
        [ [ 'e' .. 'h' ], [ 15, 66, 17, 18 ], [ 55, 56, 57, 58 ] ];
      $subset->[1][2] += 5;
      say "added 5 to a value";
  }
  say "----------";
  print_aoa $data;
  say "----------";
  {
      my $subset = getsubset( $data, "C8:Cn" );
      print_aoa $subset;
      is_deeply $subset,
        [ [ 8 .. 10 ], [ 'h' .. 'j' ], [ 18 .. 20 ], [ 58 .. 60 ] ];
      $subset->[1][2] = 'X';
      say "substitutes X";
  }
  say "----------";
  print_aoa $data;
  say "----------";
  {
      my $subset = getsubset( $data, "R4" );
      print_aoa $subset;
      is_deeply $subset, [ [ 51, 42, 53, 54, 55, 56, 57, 58, 59, 60 ] ];

      #$subset->[0][0] = 'M';
      #say "substitutes M";
  }
  say "----------";
  print_aoa $data;
  say "----------";
  is_deeply $data,
    [
      [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ],
      [qw/a b c d e f g h i X/],
      [ 11, 21, 13, 14, 15, 66, 22, 18, 19, 20 ],
      [ 51, 42, 53, 54, 55, 56, 57, 58, 59, 60 ]
    ];
  $

I have not seen this syntax before, and just to be sure, I thumbed through _Learning Perl_ and did not find it in chapter 4, Lists and Arrays. Searching Google for "perl arrays x 2" does not turn up anything useful either.

  my @max = ( 0 + @$data, $cols ) x 2;

I'm just looking for a reference to read up on that. The second thing I wanted to bring up was about the parameter array. Is it the case that @_ does not change over the life of the function? Does it have intrinsic aliasing?

Finally, after days of tinkering with it, I'm still baffled by the return from getsubset. We know what it's to be because we print it out when it gets returned. Lo and behold, it is a reference to an array. I can't see how the sausage gets made here:

  return [ map { sub { \@_ } ->( @{ $$data[$_] }[@cis] ) }
        $$range[0] - 1 .. $$range[2] - 1 ];

You use the range operator once. LanX (upthread for the curious) used it twice:

  return [
      map {
          arr_alias @$_[ $cols->[0] .. $cols->[1] ]    # x-slice
      } @$data[ $rows->[0] .. $rows->[1] ]             # y-slice
  ];

Are they logically equivalent? Thanks for your comment and raising a topic I haven't looked at in perl very far. (Scientific computing in my day was fortran.) Wouldn't it be relatively easy to display these values using Tk::TableMatrix?

Re^4: Selecting Ranges of 2-Dimensional Data
by haukex (Archbishop) on Oct 30, 2018 at 22:54 UTC
    my @max = ( 0 + @$data, $cols ) x 2;

    I'm just looking for a reference to read up on that.

    See the x operator under Multiplicative Operators. I'm just using it as a shorthand for my @max = ( 0 + @$data, $cols, 0 + @$data, $cols );.
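
    As a small illustration of the repetition operator, here is a minimal sketch. The names @dims and $rule are made up for this example and the counts are hard-coded rather than taken from real data.

```perl
use strict;
use warnings;

# With a parenthesized list on the left and list context,
# x repeats the whole list.
my @dims = ( 4, 10 );        # hypothetical row and column counts
my @max  = ( @dims ) x 2;    # same as ( 4, 10, 4, 10 )

# With a plain scalar on the left, x repeats a string instead.
my $rule = '-' x 10;

print "@max\n";     # prints "4 10 4 10"
print "$rule\n";    # prints "----------"
```

    The parentheses matter: without them the left operand is evaluated in scalar context and the result is a repeated string rather than a repeated list.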

    The second thing I wanted to bring up was about the parameter array. Is it the case that @_ does not change over the life of the function? Does it have intrinsic aliasing?

    @_ is described in perlsub: "The array @_ is a local array, but its elements are aliases for the actual scalar parameters. In particular, if an element $_[0] is updated, the corresponding argument is updated ... Assigning to the whole array @_ removes that aliasing, and does not update any arguments." As to whether it doesn't change, that depends on what the sub does - it is not read-only. For example, shift and pop can modify @_, and in Perl versions 5.10 and older, split could clobber @_. And there are some other potentially tricky issues with @_, for example, if a sub is called with an & and no argument list, "no @_ array is set up for the subroutine: the @_ array at the time of the call is visible to subroutine instead" (also perlsub).
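
    The perlsub wording can be checked with a short experiment. This is only a sketch; the sub names bump_first, bump_copy and reassign are invented for the example.

```perl
use strict;
use warnings;

# Modifying an element of @_ writes through to the caller's variable.
sub bump_first { $_[0]++; return }

# Copying the argument into a lexical first breaks the link.
sub bump_copy { my ($n) = @_; $n++; return $n }

# Assigning to the whole array @_ removes the aliasing,
# so the later increment touches only the new element.
sub reassign { @_ = (99); $_[0]++; return $_[0] }

my $x = 1;
bump_first($x);    # $x is now 2
bump_copy($x);     # $x is still 2
reassign($x);      # $x is still 2

print "$x\n";      # prints 2
```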

    You use the range operator once. LanX (upthread for the curious) used it twice:

    Well, actually I used it twice, note how I set up the @cis array. And yes, those two snippets of code from LanX and myself are basically equivalent. One difference is that I use 1-based indexing in the indices stored in the @$range array.
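
    To see that the two styles select the same cells, here is a hedged side-by-side sketch. LanX's arr_alias helper is not shown in this node, so the one-line definition below is an assumption standing in for it, and the 0-based bounds are hard-coded for the range R2C5:R4C8.

```perl
use strict;
use warnings;

my $data = [ [ 1 .. 10 ], [ 'a' .. 'j' ], [ 11 .. 20 ], [ 51 .. 60 ] ];

# 0-based bounds for the range R2C5:R4C8
my ( $row1, $col1, $rowN, $colN ) = ( 1, 4, 3, 7 );

# Assumed stand-in for LanX's arr_alias helper (not the original definition).
sub arr_alias { \@_ }

# Style 1: map over row indices, slice columns with a precomputed index list.
my @cis      = $col1 .. $colN;
my $by_index = [ map { sub { \@_ }->( @{ $$data[$_] }[@cis] ) } $row1 .. $rowN ];

# Style 2: slice the rows first, then slice the columns of each row.
my $by_slice = [ map { arr_alias( @$_[ $col1 .. $colN ] ) } @$data[ $row1 .. $rowN ] ];

print join( " ", @$_ ), "\n" for @$by_index;
print join( " ", @$_ ), "\n" for @$by_slice;
```

    Both loops print the same three rows, so the difference is only in whether the outer map iterates over indices or over the row references themselves.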

    I can't see how the sausage gets made here

    Ok, so here's my original code:

    my @cis = $$range[1]-1 .. $$range[3]-1;
    return [ map { sub{\@_}->(@{$$data[$_]}[@cis]) }
        $$range[0]-1 .. $$range[2]-1 ]

    First, let's reformat that, and instead of @$range, let me use four lexical variables corresponding to the elements of the array ($row1, $col1, $rowN, $colN), and make them 0-based instead of 1-based.

    my @column_indices = $col1 .. $colN;
    return [ map {
            sub{ \@_ }->( @{ $$data[$_] }[@column_indices] )
        } $row1 .. $rowN ]

    Now, map can be translated into a for with push (I hope that transformation is clear?). I've also pulled out various bits of expressions into lexical variables.

    my @row_indices    = $row1 .. $rowN;
    my @column_indices = $col1 .. $colN;
    my @row_subset;
    for my $row_idx (@row_indices) {
        my $row = $$data[$row_idx];  # deref $data and get row
        my $column_subset_aliases = sub{ \@_ }->(
            @$row[@column_indices]   # deref $row and get array slice
        );
        push @row_subset, $column_subset_aliases;
    }
    return \@row_subset;

    Now the last bit of trickery here is sub{\@_}->(...). sub {...} constructs an anonymous subroutine, which is then immediately called (via ->(...)) and with the arguments (...). The body of the sub is just \@_, which means "return a reference to @_". Because the elements of @_ are aliases to the original arguments, what we get back from the whole expression is an arrayref whose elements are aliases to the arguments. In the above code, those arguments are the elements of the array referred to by $row, which were selected by the array slice.
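
    The difference between copying a slice and aliasing it can be seen in isolation. In this minimal sketch the row contents and the names $copy and $alias are made up for illustration.

```perl
use strict;
use warnings;

my @row = ( 11 .. 20 );    # hypothetical data row

# A plain anonymous array copies the slice values.
my $copy = [ @row[ 4 .. 7 ] ];

# sub { \@_ } returns a reference to @_, whose elements
# are aliases of the slice elements passed in.
my $alias = sub { \@_ }->( @row[ 4 .. 7 ] );

$copy->[1]  = 'changed';    # @row is untouched
$alias->[1] = 66;           # writes through to $row[5]

print "@row\n";             # prints "11 12 13 14 15 66 17 18 19 20"
```

    This mirrors the first test block in the parent node, where assigning to $subset->[0][5] changed the third row of $data.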

    I hope it's at least a little bit more clear now?

      I hope it's at least a little bit more clear now?

      It is, thank you. I struggle with the map {sub{\@_}->(...)} syntax, but can imitate it and know where to find it now. Indeed, I wasn't really grasping it until happening on the thread where davido, you, and others comment on the parameter array referring to other source: Re: passing a hash ref in a sub

      I like how the monastery can solve one's questions with deeper reading and cross-referencing nodes that carry a topic forward instead of beating it to death in one thread with one example....