in reply to Re^3: Can I speed this up?
in thread Can I speed this up? (repetitively scanning ranges in a large array)

Correct me if I'm wrong, but wouldn't all local maxima have to be at the center of at least one range? If not, you'd be able to move one place towards the midpoint and get a higher value. (Since you can't use two ranges at the same time to make a bigger range)

Rather than testing 4M points, just grab the 25k ranges, delete the ones that are subsets of another range, and then the midpoints of what is left are your local maxima.
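A rough sketch of that filtering step, in Python for brevity. It assumes ranges are inclusive (start, end) pairs that do not wrap around the end of the sequence; the names drop_contained and peak_positions are made up for illustration and are not from the code posted earlier in the thread.

```python
def drop_contained(ranges):
    """Return the ranges that are not subsets of another range, sorted by start."""
    kept = []
    max_end = None
    # Sort by start ascending, then end descending, so that a containing
    # range always comes before anything it contains.
    for start, end in sorted(ranges, key=lambda r: (r[0], -r[1])):
        if max_end is not None and end <= max_end:
            continue  # lies inside an earlier range (or duplicates it)
        kept.append((start, end))
        max_end = end
    return kept


def peak_positions(ranges):
    """Midpoints of the surviving ranges: the candidate local maxima."""
    # Integer division: for an even-length range the true center falls
    # between two positions, and this picks the lower one.
    return [(start + end) // 2 for start, end in drop_contained(ranges)]


ranges = [(1, 10), (3, 6), (5, 14), (20, 30), (22, 30)]
print(drop_contained(ranges))
print(peak_positions(ranges))
```

After the sort, one pass is enough: a range is a subset of an earlier one exactly when its end does not exceed the largest end seen so far.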

Conversely, local minima would be found at the midpoints between the midpoints of overlapping ranges. You would still have to test three values to ensure it is a true minimum and not a flat spot.


In this case, you might get to use the buckets technique from back up the thread. 25k+1 midpoints between adjacent ranges * 3 sampled values * ~30 ranges in the bucket = 2.3M compares instead of 120M

Re^5: Can I speed this up?
by daverave (Scribe) on Nov 02, 2010 at 06:47 UTC
    Oh, I was too tired last night. I'm looking for local minima. Sorry for the error. I don't follow your idea re. local minima. Could you please elaborate?

      Do you have a larger test set--say max=100 ranges=100 rangesize=10--plus results?

        This is one of the smallest real examples I have: example.corrected.tar.gz.

        A few notes:

        1. Remember coordinates start from 1, not zero.

        2. Max length = 87688.

        3. Results are given in half sizes (e.g., if the minimal uncovered window centered at i is of size 3, the result will be 1, if it's 5 the result will be 2, etc.).

        UPDATED link with a corrected version of the ranges. Previously wrapped ranges span out of max length, now they are in the correct form.

      The potential maxima are at the center of your ranges. Since the peaks are all the same size (the ranges all being 5k wide) and have the same slope (+/-1 per unit distance), the minima will be at the halfway point between two maxima.

      4k |-   / \  / \    / \ / \       /
      2k |-  /   \/   \  /       \     /
      0k |- /          \/         \___/
      

      Since you know where the peaks are (start+2.5k), and you can sort the ranges by start position, you can trivially identify the neighboring ranges. Halfway between the peak of range N and range N+1, there might be a local minimum, or a flat spot, as in the picture.

      The value at the minimum is easily calculated once it is located: it is the distance from that point to the edge of either of the two ranges causing it.
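A minimal sketch of that idea, in Python for brevity, under the assumption stated above that all ranges have the same width. The height function is my reading of the picture (0 outside every range, rising by 1 per position from a range's edge to its center); height and candidate_minima are hypothetical names, not code from this thread.

```python
def height(x, ranges):
    """Height at x: distance to the nearest edge of the best range covering x, else 0."""
    return max((min(x - s, e - x) for s, e in ranges if s <= x <= e), default=0)


def candidate_minima(ranges):
    """Check the halfway point between the peaks of each pair of neighbouring ranges."""
    ordered = sorted(ranges)  # neighbours = consecutive ranges by start position
    found = []
    for (s1, e1), (s2, e2) in zip(ordered, ordered[1:]):
        x = ((s1 + e1) // 2 + (s2 + e2) // 2) // 2  # halfway between the two peaks
        # Sample three values to tell a true minimum from a flat spot.
        left, here, right = (height(x + d, ordered) for d in (-1, 0, 1))
        kind = "minimum" if left > here < right else "flat or other"
        found.append((x, here, kind))
    return found


# Three ranges of equal width; coordinates start from 1.
print(candidate_minima([(1, 11), (7, 17), (31, 41)]))
```

The brute-force height scans every range on each call; the buckets technique from further up the thread would limit that to the few ranges near x.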

        I'm still thinking about this, but one thing I should note right away is that ranges are not all of the same width. 5k is some average, actual lengths are different and typically range between 500 - 20k.

        This also raises the question of what are neighboring ranges? Those with nearest centers? nearest edges?

        If you could give the basic loop your idea refers to, I could combine it with the code previously published and see if it makes sense.