coldy has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I want to print the start and stop positions of the runs in a numerical array where each element's value is greater than 0.5, e.g.
my @probs =(0,0.1,0.53,0.51,0.59,0.67,0.2,0.04,0.05,0.56,0.89,0.75);
gives a file with start and end positions:
2 5
9 11
I've tried using a loop, and I can get the sequence of positions using:
my $positions = undef;
my @segment = ();
foreach (0..$#probs) {
    if ($probs[$_] > 0.5 && $probs[$_+1] > 0.5) {
        $positions = $positions . ',' . $_;
    }
    else {
        push @segment, $positions if ($positions);
        $positions = undef;
    }
}
I was happy with that, and I just filter it again for the first and last number on each line.

Any ideas on a better way to do this?

Thanks again!

Re: printing array positions that match a condition
by graff (Chancellor) on Apr 22, 2009 at 05:56 UTC
    You didn't mention what you want when your input includes a sequence like this:
    ..., 0.3, 0.58, 0.2, ...
    Supposing the "0.58" was the 20th element in the list, should that produce a "segment" like this?
    20
    or like this?
    20 20
    Anyway, the fact that your solution works fine (once you add some very simple post-processing on @segment) seems good enough to me. But in case you want to generalize it to different conditions without having to rewrite it (and the post-processing) every time, here's one approach for a general-purpose subroutine, which would be easy to pop into a module, if you like:
    #!/usr/bin/perl
    use strict;
    use warnings;

    my @probs;
    {
        local $/;    # slurp all of DATA in one read
        @probs = split " ", <DATA>;
    }
    my $segs = get_ranges( sub { return ( $_[0] > 0.5 ) }, \@probs );
    print "$_\n" for ( @$segs );

    sub get_ranges {
        my ( $cmp, $list ) = @_;
        my @ranges = ();
        my @current_range = ();
        my $endpoint = 0;    # slot 0 holds the start, slot 1 the latest end
        for ( 0 .. $#$list ) {
            if ( $cmp->( $$list[$_] )) {
                $current_range[$endpoint] = $_;
                $endpoint = 1;
            }
            elsif ( @current_range ) {
                push @ranges, "@current_range";
                @current_range = ();
                $endpoint = 0;
            }
        }
        # Close a range that runs to the end of the list.
        push @ranges, "@current_range" if @current_range;
        return \@ranges;
    }
    __DATA__
    0 0.1 0.53 0.51 0.59 0.67 0.2 0.04 0.05 0.56 0.89 0.75 0.3 0.58 0.25
    The idea is that from one application to the next, you might need to change your conditions for determining what sort of range you want to identify, so you simply supply a suitable subroutine along with the input list.

    (Updated to simplify the anon.sub that is passed to get_ranges; $cmp->($item) just needs to return true when $item is in the targeted range.)
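
    As a rough sketch of that idea, the same range finder can be reused with two different conditions. The helper name find_ranges and the second condition (values below 0.1) are made up for illustration and are not part of the code above; this variant also returns [start, end] pairs rather than strings, and reports a single-element run as a pair with equal endpoints.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns [start, end] index
# pairs for every run of consecutive elements accepted by $keep.
sub find_ranges {
    my ( $keep, $list ) = @_;
    my @ranges;
    my $start;    # undef while we are outside a run
    for my $i ( 0 .. $#$list ) {
        if ( $keep->( $list->[$i] ) ) {
            $start = $i unless defined $start;    # a run begins here
        }
        elsif ( defined $start ) {
            push @ranges, [ $start, $i - 1 ];     # the run ended at $i - 1
            undef $start;
        }
    }
    # A run that reaches the last element is closed here.
    push @ranges, [ $start, $#$list ] if defined $start;
    return \@ranges;
}

my @probs = ( 0, 0.1, 0.53, 0.51, 0.59, 0.67, 0.2, 0.04, 0.05, 0.56, 0.89, 0.75 );

# Same routine, two different conditions (the second one is illustrative).
my $high = find_ranges( sub { $_[0] > 0.5 }, \@probs );
my $low  = find_ranges( sub { $_[0] < 0.1 }, \@probs );

print "high: @$_\n" for @$high;
print "low:  @$_\n" for @$low;
```

    Only the anonymous sub changes between the two calls; the loop that tracks where a run starts and stops stays the same.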

Re: printing array positions that match a condition
by NetWallah (Canon) on Apr 22, 2009 at 06:27 UTC
    Using flip-flops:
    my @probs =(0,0.1,0.53,0.51,0.59,0.67,0.2,0.04,0.05,0.56,0.89,0.75);
    my $t;
    print $_ . (",","\n")[$t=1-$t]
        for grep {($probs[$_] >=.5 .. $probs[$_]> .5) and $t=!$t } 0..$#probs
    Output:
    2,4
    9,11
    Update: graff has pointed out that this code is buggy, because it produces (2,4) instead of (2,5).
    Suggestions to fix it are invited. The fixes I can think of involve several lines of code. Can anyone do it in 2?

         ..to maintain is to slowly feel your soul, sanity and sentience ebb away as you become one with the Evil.

      Add an extra datum to @probs and the output layout goes a bit ga ga as well.

      $ cat spw759180_NetWallah
      #!/usr/bin/perl
      #
      use strict;
      use warnings;

      my @probs =(0,0.1,0.53,0.51,0.59,0.67,0.2,0.04,0.05,0.56,0.89,0.75,0.6);
      my $t;
      print $_ . (",","\n")[$t=1-$t]
          for grep {($probs[$_] >=.5 .. $probs[$_]> .5) and $t=!$t } 0..$#probs
      $ ./spw759180_NetWallah
      2
      4,9
      11,$

      I think you have to separate the task of finding the in-range values from that of determining the range boundaries as you have no idea what the next datum will be when processing the current one. Also, I think the start and stop conditions of the flip-flop should be consistent (and both > 0.5 since that is the requirement in the OP). In the following code I've added some data to show how single-element ranges are treated.

      use strict;
      use warnings;

      my @probs = (
          0, 0.1, 0.53, 0.51, 0.59, 0.67, 0.2, 0.04, 0.05,
          0.56, 0.89, 0.75, 0.1, 0.51, 0.6, 0.2, 0.7,
      );

      my @inRange =
          grep { $probs[ $_ ] > 0.5 .. $probs[ $_ ] > 0.5 }
          0 .. $#probs;
      my @starts =
          map  { $inRange[ $_ ] }
          grep { $_ == 0 || $inRange[ $_ ] - $inRange[ $_ - 1 ] != 1 }
          0 .. $#inRange;
      my @ends =
          map  { $inRange[ $_ ] }
          grep { $_ == $#inRange || $inRange[ $_ + 1 ] - $inRange[ $_ ] != 1 }
          0 .. $#inRange;

      printf qq{%3d%3d\n}, $starts[ $_ ], $ends[ $_ ] for 0 .. $#starts;

      The output.

        2  5
        9 11
       13 14
       16 16

      I hope this is of interest.

      Cheers,

      JohnGG

Re: printing array positions that match a condition
by moritz (Cardinal) on Apr 22, 2009 at 06:13 UTC
    If you want the start and end of each sequence directly, you can write something like this:
    use strict;
    use warnings;

    my @probs =(0,0.1,0.53,0.51,0.59,0.67,0.2,0.04,0.05,0.56,0.89,0.75);
    my @pos;
    my @segment;
    foreach (0..$#probs){
        no warnings 'uninitialized';
        my $start;
        if ($probs[$_] > 0.5 && $probs[$_+1] <= 0.5){
            push @pos, $_;
        }
        elsif ($probs[$_] <= 0.5 && $probs[$_+1] > 0.5) {
            push @pos, $_ + 1;
        }
    }
    use Data::Dumper;
    print Dumper \@pos;

Re: printing array positions that match a condition
by przemo (Scribe) on Apr 22, 2009 at 10:12 UTC

    You may use existing solutions, like Bit::Vector:

    use Bit::Vector;

    my @probs =(0,0.1,0.53,0.51,0.59,0.67,0.2,0.04,0.05,0.56,0.89,0.75);
    my $v = Bit::Vector->new(int @probs);
    $v->Index_List_Store(grep { $probs[$_] > .5 } 0..$#probs);
    for (my ($start, $min, $max) = 0; $start < $v->Size(); $start = $max + 2) {
        last unless ($min, $max) = $v->Interval_Scan_inc($start);
        print "$min $max\n";
    }

    Or you can even use $v->to_Enum instead of the for loop above, which gives:

    2-5,9-11
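
    If you need numbers rather than a string, an enumeration like the one above can be split back into start/end pairs with plain Perl. This is only a sketch: the string is hard-coded as a stand-in for the to_Enum result, and it assumes a single-element run shows up as a bare index with no dash.

```perl
use strict;
use warnings;

# Stand-in for the string returned by $v->to_Enum in the example above.
my $enum = '2-5,9-11';

my @pairs;
for my $chunk ( split /,/, $enum ) {
    my ( $min, $max ) = split /-/, $chunk;
    $max = $min unless defined $max;    # assumed: a lone index has no dash
    push @pairs, [ $min, $max ];
}

print "@$_\n" for @pairs;
```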

    As a marginal note, you use indexes out of bounds of the array (in $probs[$_+1], when $_=$#probs), which is an error-prone practice.
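
    One way to avoid the out-of-bounds lookup, sketched here as an illustration rather than as a fix to anyone's posted code, is to read a neighbouring element only after checking that it exists. An index is then a start if the previous element is not in range, and an end if the next one is not.

```perl
use strict;
use warnings;

my @probs = ( 0, 0.1, 0.53, 0.51, 0.59, 0.67, 0.2, 0.04, 0.05, 0.56, 0.89, 0.75 );

my ( @starts, @ends );
for my $i ( 0 .. $#probs ) {
    next unless $probs[$i] > 0.5;

    # Neighbours are only read when they exist, so no index is out of bounds.
    my $prev_in = $i > 0       && $probs[ $i - 1 ] > 0.5;
    my $next_in = $i < $#probs && $probs[ $i + 1 ] > 0.5;

    push @starts, $i unless $prev_in;
    push @ends,   $i unless $next_in;
}

print "$starts[$_] $ends[$_]\n" for 0 .. $#starts;
```

    Because the checks short-circuit, this also runs cleanly under use warnings without having to silence 'uninitialized'.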