in reply to Re^2: Modified Binary Search
in thread Modified Binary Search

This was a 1 time activity...

Seems a simple linear search would have been sufficient then.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"I'd rather go naked than blow up my ass"

Replies are listed 'Best First'.
Re^4: Modified Binary Search
by Limbic~Region (Chancellor) on Jan 14, 2010 at 23:32 UTC
    BrowserUk,
    That's the way I had it coded originally but when it wasn't complete when I came back from lunch - I wasn't happy to continue waiting. Performing a linear search through 5 million records 600,000 times still takes longer than I wanted.

    Cheers - L~R

      Hm. You have a big list and a small list of dates, both sorted (or readily sortable).

      And for each date in the small list, you want to find the subset of the big list that are 1 year either side.

      And you search anew for each item in the small list?

      Here's a one-pass algorithm:

      my @bigSorted = ...; my @smallSorted = ...; my %ranges; my( $lo, $hi ) = ( 0, 0 ); for ( @smallSorted ) { my( $early, $late ) = calcBoundaryDates( $_ ); ++$lo while $bigList[ $lo ] lt $early; $hi //= $lo; ++$hi while $bigList[ $hi ] le $late; $ranges{ $_ } = $hi - $lo; }

      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        That works fine if you're doing it as a one-time deal for all elements of Tbl_B. LR's problem description seems to say that the search needs to be done for the modified elements each time Tbl_B is modified. Since the above technique does linear search, it's not going to be optimal.