ginni has asked for the wisdom of the Perl Monks concerning the following question:

This node falls below the community's threshold of quality. You may see it by logging in.
  • Comment on getting elements from an array based on a value not in the array

Replies are listed 'Best First'.
Re: getting elements from an array based on a value not in the array
by BrowserUk (Patriarch) on Aug 11, 2006 at 19:27 UTC

    Assuming that you mean that you want $size elements either side of where the $target would be in the sorted array, then using a binary search (binary chop) to locate the appropriate position will be about your quickest method. Something along the lines of this:

    Updated: Slightly better test data generation.

    #! perl -slw use strict; our $TARGET ||= 4; our $SIZE ||= 2; our $N ||= 10; sub binaryChop{ my( $aref, $target ) = @_; my( $lo, $hi, $mid ) = ( 0, $#$aref ); while( $lo < $hi ) { $mid = ( $hi + $lo ) / 2; if( $aref->[ $mid ] > $target ) { $hi = $mid - 1; } elsif( $aref->[ $mid ] < $target ) { $lo = $mid + 1; } else { return $mid; } } return $lo; } my @data = grep{ $_ != $TARGET } 1 .. $N; my $targetIndex = binaryChop( \@data, $TARGET ); print "@data"; print join ', ', @data[ $targetIndex - $SIZE .. $targetIndex + $SIZE - + 1 ]; __END__ C:\test>566908 -N=20 -TARGET=8 -SIZE=2 1 2 3 4 5 6 7 9 10 11 12 13 14 15 16 17 18 19 20 6, 7, 9, 10 C:\test>566908 -N=20 -TARGET=8 -SIZE=2 1 2 3 4 5 6 7 9 10 11 12 13 14 15 16 17 18 19 20 6, 7, 9, 10 C:\test>566908 -N=20 -TARGET=8 -SIZE=2 1 2 3 4 5 6 7 9 10 11 12 13 14 15 16 17 18 19 20 6, 7, 9, 10 C:\test>566908 -N=20 -TARGET=8 -SIZE=3 1 2 3 4 5 6 7 9 10 11 12 13 14 15 16 17 18 19 20 5, 6, 7, 9, 10, 11 C:\test>566908 -N=20 -TARGET=8 -SIZE=4 1 2 3 4 5 6 7 9 10 11 12 13 14 15 16 17 18 19 20 4, 5, 6, 7, 9, 10, 11, 12 C:\test>566908 -N=20 -TARGET=11 -SIZE=4 1 2 3 4 5 6 7 8 9 10 12 13 14 15 16 17 18 19 20 7, 8, 9, 10, 12, 13, 14, 15 C:\test>566908 -N=20 -TARGET=11 -SIZE=1 1 2 3 4 5 6 7 8 9 10 12 13 14 15 16 17 18 19 20 10, 12 C:\test>566908 -N=2000 -TARGET=1111 -SIZE=3 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 ... 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 ... 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 ... 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 ... 22 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 10 .. 23 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 12 .. 23 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 14 .. 23 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 16 .. 23 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 18 .. 1107, 1108, 1109, 1110, 1112, 1113

    Note that my test data generation is simplistic (allows duplicates; allows the target value), and the math selecting the appropriate slice of the data to pick may need adjusting to suit your requirements.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Thank you very much, this puts me on the right track.
Re: getting elements from an array based on a value not in the array
by lima1 (Curate) on Aug 11, 2006 at 18:42 UTC
    you mean report the first n ($size) elements bigger and smaller than $target in @test? is @test always sorted?

    because you perform this multiple times, a lookup table for $target could be a good solution (depends on the possible range of values for $target). if this is possible and you have no idea how to implement this, ask again...

      lima1,
      If we assume that one and only one output exists for a given $target then @test must always be sorted. Assuming that as a given, the best approach is a modified binary search.

      The only way caching seems to make sense to me is if the list of $target(s) can contain duplicates. I can think of some other scenarios but they are highly dependent on details we don't have.

      For instance, if $target will only ever be integers we could fill a hash with the integers between numbers in @test with a value of the @test index (N.5 since $target is missing). Then we need only lookup up $target in our hash, get the index and return a slice with $size elements on either side. The entire hash could be built all at once up front or it could be built "as you go" which would be a win if the list of $target(s) were clustered.

      The trouble is that this is all highly speculative. More information and cleare explanation is required for a decent solution in my opinion.

      Cheers - L~R

        yes, exactly. I would have implemented the cache as an hash with key $target and value the offset of splice.

        but you are right. binary search is the best solution if we understood the OP right.

Re: getting elements from an array based on a value not in the array
by Ieronim (Friar) on Aug 11, 2006 at 19:42 UTC
    if lima1 is right, you need to specify what you want to get if $target is in the @test and if $size exceeds the boundaries of the array. Next problem is processing duplicates.

    I.e., what do you want to get if $target = 5 or if $target = 0?

    After you define this, all you need is to implement binary search like BrowserUK's.

    NOTE: his solution does not handle the case when $target is outside the array at all.


         s;;Just-me-not-h-Ni-m-P-Ni-lm-I-ar-O-Ni;;tr?IerONim-?HAcker ?d;print
Re: getting elements from an array based on a value not in the array
by eric256 (Parson) on Aug 11, 2006 at 18:34 UTC

    I'm a bit confused. You returned 4 elements but you said it should be 2, and that 4 should "fall between" which i dont understand at all. Maybe another example of the output would help?


    ___________
    Eric Hodges
A reply falls below the community's threshold of quality. You may see it by logging in.