in reply to Re^3: [OT] A measure of 'sortedness'?
in thread [OT] A measure of 'sortedness'?

why not smaller chunks?

Because eventually you need to merge the smaller chunks into bigger ones, including ones bigger than memory. But by splitting into half-memory-sized chunks, you can merge them in pairs:

A      B      C      D       Each 2GB

A&B    B&C    C&D            The largest have migrated from A to D, no need to revisit
B&C    A&B                   The smallest have migrated from D to A, no need to revisit
B&C                          And final pass ensures everything is in place.

Using smaller buffers only delays the inevitable and increases the number of passes.
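For illustration only, a toy sketch of that pass schedule (my own, not production code): the "chunks" here are small in-memory arrays standing in for the 2GB sorted runs on disk, and merge_pair() stands in for "load two chunks, merge, write the smaller half back to the left chunk and the larger half to the right one".

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Stand-in for the real operation: merge two half-memory chunks and
    # split the result back in place. (A real external merge would stream
    # the two sorted runs rather than re-sort them.)
    sub merge_pair {
        my( $lo, $hi ) = @_;
        my @all = sort { $a <=> $b } @$lo, @$hi;
        my $mid = @all / 2;
        @$lo = @all[ 0 .. $mid - 1 ];    # smaller half stays left
        @$hi = @all[ $mid .. $#all ];    # larger half moves right
    }

    # Four individually sorted chunks; each is ~half of memory in the real case.
    my @chunk = map { [ sort { $a <=> $b } map { int rand 100 } 1 .. 4 ] } 1 .. 4;

    merge_pair( @chunk[ $_, $_ + 1 ] ) for 0 .. 2;            # A&B, B&C, C&D
    merge_pair( @chunk[ $_, $_ + 1 ] ) for reverse 0 .. 1;    # B&C, A&B
    merge_pair( @chunk[ 1, 2 ] );                              # final B&C

    print "@$_\n" for @chunk;    # concatenated, the chunks are now fully sorted

The forward sweep bubbles the largest values into the last chunk, the backward sweep bubbles the smallest into the first, and the single middle merge finishes the job: six pairwise merges, each fitting entirely in memory.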



Re^5: [OT] A measure of 'sortedness'?
by RonW (Parson) on Mar 19, 2015 at 19:45 UTC

    I understand that, but I still wonder if the performance boost afforded by simplifying the "shuffling around" could more than offset the extra overhead of more merges.

      No.

      If you sort A & B & C & D, then merge A+B & C+D; you still need to merge AB + CD. Better to have sorted AB, and CD, and do one merge. Ie. 4 sorts and 3 merges versus 2 sorts and 1 merge.

