in reply to [OT] A measure of 'sortedness'?
Does the sequential pass of the buffers have to be done first?
Given that the merge sort runs through each list fully in the process of merging, could you not make the results of the scan a side effect of the merge?
I suppose you might be able to determine an upper limit during your sequential scans, to see how much buffer space would be required to do the merge without swapping.
On the other hand, even if you have to do some swapping in order to keep the buffer to a fixed value, you're still going to get linear worst case (starting with list A before list B, if all the Bs sort to the front, then you'll have to swap each A into the vacated B positions before moving them once more to their final position.
--AAAAABBBBB #B's sort before A's, - is empty buffer B-AAAAA-BBBB BBAAAAA--BBB BB-AAAAa-BBB #A moved to a to make room BBBAAAAa--BB BBB-AAAaa-BB BBBBAAAaa--B BBBB-AAaaa-B BBBBBAAaaa-- BBBBB-Aaaaa- BBBBBaA-aaa- #Each A ends up moving twice; O(1.5N), linear. BBBBBa-aaaa- BBBBBaa-aaa- BBBBBaaa-aa- BBBBBaaaa-a- BBBBBaaaaa-- # Plus 1.0N if you want to shift the set # back to the original address range: --BBBBBaaaaa
Perhaps the more important question is in the nature of the items. How much work does it take to determine their value/sort order, and how much work does it take to move them?
If they are complex objects with an expensive sort function, you could make the initial pass pre-calculate that. If the values of sequential items are correlated, you could possibly save time and/or brainpower doing it up front.
If the moves are particularly expensive, then the pre-calculations getting a measure of sortedness could provide an improved time to completion estimate by guessing how many items need to be swapped twice to do it in-place.
PseudoEdit:
sauoc's idea sounds like a way to optimize down the number random disk accesses by doing bunches of objects at a time. I got the impression your data was all in memory however.
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re^2: [OT] A measure of 'sortedness'?
by BrowserUk (Patriarch) on Mar 18, 2015 at 23:02 UTC | |
by RonW (Parson) on Mar 19, 2015 at 18:17 UTC | |
by BrowserUk (Patriarch) on Mar 19, 2015 at 18:27 UTC | |
by RonW (Parson) on Mar 19, 2015 at 19:45 UTC | |
by BrowserUk (Patriarch) on Mar 19, 2015 at 20:44 UTC |