in reply to Points on a line and associated intervals

Use a heap structure, where the "largest" item is the item with the shortest interval. For however many intervals you want to remove, then, all you have to do is take the top interval from the heap, modify the interval on the side you're removing the point from (you'll have links to adjacent intervals for each interval), and reheap it to its new, higher position. The interval on the other side will just change its link to the interval on this side, since the interval is now adjacent to it.

Each interval removal should take at worst (and probably not even close to this) O(lg n), so the total complexity is O(s lg n) where s is the set number of intervals to remove, plus of course O(n lg n) to build the original heap. Easily manageable for millions of intervals.

Or you could calculate where each remaining point needs to be on your line, given its total length and the number of points you want, then choose the point closest to that position (or weight the points nearest to it and come up with an average) and use that instead.

  • Comment on Re: Points on a line and associated intervals