reaper9187 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks. Is there any way I can use grep to go through a 2D array and extract the rows for which particular elements are greater than a specific value? Consider the following example:
use strict;
use warnings;
use Data::Dumper;

my @arr;
@arr = (
    [ 'A', 1, 0, 5, 6, 2 ],
    [ 'B', 3, 4, 5, 6, 7 ],
    [ 'C', 2, 4, 3, 5 ],
    [ 'D', 6, 7, 8, 8 ],
    [ 'E', 2, 5, 4, 5 ],
    [ 'F', 4, 3, 2, 2 ],
    [ 'G', 1, 2, 4, 5 ],
    [ 'H', 1, 4, 5, 6 ]
);

my @rows = grep { grep {$_ > 6} @$_ } @arr;
print Dumper(\@rows);
The above code checks each 'row' of the 2D array and extracts those rows (1D arrays) in which any column entry is greater than the threshold (in this case, 6). I would like to be able to set multiple checks across multiple columns, e.g. extract rows for which the second column entry > 4 and the 4th column entry > 3, and so on.

Any idea how to do this? Thanks in advance.

Replies are listed 'Best First'.
Re: Simplest way to match/filter 2d array of values to search in perl
by kennethk (Abbot) on Jul 03, 2014 at 20:33 UTC
    If you need to test if the 2nd column is greater than 4, then you'll have a term in your grep that looks like
    $_->[1] > 4
    and for the 4th entry greater than 3
    $_->[3] > 3
    What stumbling blocks have you hit? The structural difference from your posted code is you will only require one grep.
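A minimal sketch of that single-grep form, assuming the @arr from the original post and the same 0-based indices as the two terms above (the row label sits at index 0, so $_->[1] is the 2nd column and $_->[3] the 4th):

```perl
use strict;
use warnings;

my @arr = (
    [ 'A', 1, 0, 5, 6, 2 ],
    [ 'B', 3, 4, 5, 6, 7 ],
    [ 'C', 2, 4, 3, 5 ],
    [ 'D', 6, 7, 8, 8 ],
    [ 'E', 2, 5, 4, 5 ],
    [ 'F', 4, 3, 2, 2 ],
    [ 'G', 1, 2, 4, 5 ],
    [ 'H', 1, 4, 5, 6 ],
);

# One grep: a row is kept only when both column tests are true.
my @rows = grep { $_->[1] > 4 && $_->[3] > 3 } @arr;

print join(',', @$_), "\n" for @rows;   # prints: D,6,7,8,8
```

Every row in this data has at least four elements, so neither test touches an undefined entry; shorter rows would need a length check first.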

    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

      Thank you!! That did the trick. I feel stupid now that I didn't think of it earlier.
Re: Simplest way to match/filter 2d array of values to search in perl
by AnomalousMonk (Archbishop) on Jul 03, 2014 at 22:45 UTC

    I would include an index range test to avoid a possible slew of warnings such as those generated by the OPed example code (numeric comparisons of non-numerics). (The pair of tests, of index and element value, could easily be encapsulated in a function.)

    c:\@Work\Perl>perl -wMstrict -MData::Dump -le
    "use Data::Dumper;
     ;;
     my @arr = (
       [ 'A', 1, 0, 5, 6, 2 ],
       [ 'B', 3, 4, 5, 6, 7 ],
       [ 'C', 2, 4, 3, 5 ],
       [ 'D', 6, 7, 8, 8 ],
       [ 'E', 2, 5, 4, 5 ],
       [ 'F', 4, 3, 2, 2 ],
       [ 'G', 1, 2, 4, 5 ],
       [ 'H', 1, 4, 5, 6 ]
     );
     ;;
     my @rows =
       grep { (@$_ > 2 && $_->[2] > 4) && (@$_ > 4 && $_->[4] > 3) }
       @arr
       ;
     dd \@rows;
    "
    [["D", 6, 7, 8, 8], ["E", 2, 5, 4, 5]]
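One way the paired index/value test might be wrapped in a function, as suggested above. This is only a sketch: the name col_gt and its argument order are made up for illustration and do not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): true only if the row actually
# has an element at index $idx AND that element is numerically greater than $min.
sub col_gt {
    my ($row, $idx, $min) = @_;
    return $idx < @$row && $row->[$idx] > $min;
}

my @arr = (
    [ 'A', 1, 0, 5, 6, 2 ],
    [ 'B', 3, 4, 5, 6, 7 ],
    [ 'C', 2, 4, 3, 5 ],
    [ 'D', 6, 7, 8, 8 ],
    [ 'E', 2, 5, 4, 5 ],
    [ 'F', 4, 3, 2, 2 ],
    [ 'G', 1, 2, 4, 5 ],
    [ 'H', 1, 4, 5, 6 ],
);

# Same tests as the one-liner above: index 2 greater than 4, index 4 greater than 3.
my @rows = grep { col_gt($_, 2, 4) && col_gt($_, 4, 3) } @arr;

print join(',', @$_), "\n" for @rows;
```

With this data the result is the same two rows, D and E. The helper only guards against indices past the end of a row; negative indices and non-numeric values (such as the label at index 0) are not handled here.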
Re: Simplest way to match/filter 2d array of values to search in perl
by reaper9187 (Scribe) on Jul 04, 2014 at 09:51 UTC
    Thank you all for your comments. As an extension to this problem, I would also like to implement the following:

    In the grep search, select only those 'rows' in which the number of elements that exceed a particular value reaches a defined count. For example, consider:
    @arr = (
        [ 'A', 1, 0, 5, 6 ],
        [ 'B', 3, 4, 5, 6 ],
        [ 'C', 2, 4, 3, 5 ],
        [ 'D', 6, 7, 8, 8 ],
        [ 'E', 2, 5, 4, 5 ],
        [ 'F', 4, 8, 8, 8 ],
        [ 'G', 1, 2, 4, 5 ],
        [ 'H', 1, 4, 5, 6 ]
    );
    Now, I define the number of matches to be '2' and the threshold value to be '8'. The output should only include the rows where the value '8' occurs twice or more, i.e.
    __OUTPUT__
    'D' , 6 , 7 , 8 ,8
    'F' , 4 , 8 , 8 ,8
    Any idea on how to do this? Thank you very much!!!

      Straightforward solution:

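      use feature 'say';    # needed for say()

      my $matches   = 2;
      my $threshold = 8;
      my @results   = ();

      foreach my $row (@arr) {
          my $count = 0;    # start at 0 so the comparison below never sees undef
          # Index 0 holds the row label, so start counting at index 1.
          for (1..$#$row) {
              $count++ if($row->[$_] >= $threshold);
          }
          push @results, $row if($count >= $matches);
      }

      foreach(@results) {
          say join ",", @$_;
      }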

      Shorter/idiomatic/obfuscated (take your pick) solution:

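      use feature 'say';    # needed for say()

      my $matches   = 2;
      my $threshold = 8;

      # The inner grep counts the values (label at index 0 skipped) at or above
      # the threshold; the outer grep keeps rows with at least $matches of them.
      my @results = grep {
          $matches <= scalar grep { $_ >= $threshold } @$_[1..$#$_];
      } @arr;

      foreach(@results) {
          say join ",", @$_;
      }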

      Both of these output:

      D,6,7,8,8
      F,4,8,8,8

      BTW, I'm assuming here that you're actually interested in whether there are at least $matches items equal to or greater than $threshold, since that's what "threshold" implies. If you're only interested in exactly that value, change >= $threshold to == $threshold in either code snippet above.
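To make the difference between the >= and == readings concrete, here is a rough sketch that passes the per-value test in as a code reference. The helper name rows_with_hits is hypothetical, and the threshold of 5 is chosen only because it makes the two readings give different results on this data (with 8, both select D and F).

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the rows in which at
# least $min_hits of the values after the leading label satisfy $test.
sub rows_with_hits {
    my ($rows, $min_hits, $test) = @_;
    return grep {
        my @values = @{$_}[ 1 .. $#$_ ];    # skip the label at index 0
        # grep in scalar context returns the number of matching values
        $min_hits <= scalar( grep { $test->($_) } @values );
    } @$rows;
}

my @arr = (
    [ 'A', 1, 0, 5, 6 ],
    [ 'B', 3, 4, 5, 6 ],
    [ 'C', 2, 4, 3, 5 ],
    [ 'D', 6, 7, 8, 8 ],
    [ 'E', 2, 5, 4, 5 ],
    [ 'F', 4, 8, 8, 8 ],
    [ 'G', 1, 2, 4, 5 ],
    [ 'H', 1, 4, 5, 6 ],
);

my $matches   = 2;
my $threshold = 5;    # illustrative value, not the 8 used in the thread

my @at_least = rows_with_hits(\@arr, $matches, sub { $_[0] >= $threshold });
my @exactly  = rows_with_hits(\@arr, $matches, sub { $_[0] == $threshold });

print "at least: ", join(' ', map { $_->[0] } @at_least), "\n";  # at least: A B D E F H
print "exactly:  ", join(' ', map { $_->[0] } @exactly),  "\n";  # exactly:  E
```

Passing the test as a code reference keeps the counting logic in one place, so switching between the two readings only changes the anonymous sub.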