in reply to Re^5: Can you improve upon my algorithm.
in thread Can you improve upon my algorithm.

The code has been running for 8 minutes 20 seconds and has processed 2.8GB; I'll let you do the math.

On the other hand, an N-way merge in C needs, on the hot path of its inner loop, only a couple of bookkeeping operations plus log2(N) key comparisons. That may sum to around 50 instructions per loop cycle (= per sorted record)!
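To make that concrete, here is a minimal sketch of such an inner loop (my own illustration, not salva's code), using a binary min-heap of source indices so each emitted record costs one sift of at most ~log2(N) comparisons; for a self-contained example the sources are in-memory arrays of single int64_t keys:

```c
#include <stddef.h>
#include <stdint.h>

/* One source per input file; here each source just walks an
 * already-sorted in-memory array of int64_t keys. */
typedef struct {
    const int64_t *buf;
    size_t pos, len;
} Source;

static int64_t key(const Source *src, int i)
{
    return src[i].buf[src[i].pos];
}

/* Binary min-heap of source indices, ordered by each source's
 * current key. One sift costs at most ~log2(N) comparisons. */
static void sift_down(const Source *src, int *heap, int n, int i)
{
    for (;;) {
        int l = 2 * i + 1, r = l + 1, min = i;
        if (l < n && key(src, heap[l]) < key(src, heap[min])) min = l;
        if (r < n && key(src, heap[r]) < key(src, heap[min])) min = r;
        if (min == i) return;
        int t = heap[i]; heap[i] = heap[min]; heap[min] = t;
        i = min;
    }
}

/* Merge n sorted sources into out; returns records written.
 * The while loop is the hot path: emit the root, advance that
 * source, drop it if drained, re-sift -- a handful of ops plus
 * log2(n) comparisons per sorted record. */
size_t nway_merge(Source *src, int n, int64_t *out)
{
    int heap[64];               /* sketch limit: at most 64 sources */
    size_t written = 0;

    for (int i = 0; i < n; i++) heap[i] = i;
    for (int i = n / 2 - 1; i >= 0; i--) sift_down(src, heap, n, i);

    while (n > 0) {
        int w = heap[0];
        out[written++] = key(src, w);
        if (++src[w].pos == src[w].len)   /* source drained */
            heap[0] = heap[--n];
        sift_down(src, heap, n, 0);
    }
    return written;
}
```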

Show me the code!


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked

Re^7: Can you improve upon my algorithm.
by salva (Canon) on Mar 13, 2015 at 11:31 UTC
    Show me the code!

    Be patient!

    I am still playing with your cookies problem!

      Be patient!

      I didn't say when :)

      FWIW: After killing the process I had left running, I made one change and re-ran it:

      my $nextRec = substr( $bufs[ $r ], -1024, 1024, '' ); # 1/64th iterations/buffer.

      And it completed in under 10 minutes:

      E:\junk>c:t-salvaTheory.pl
      0
      Took 492 seconds.
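That one-line change amortises the per-record call overhead by carving off 1KB (64 records) per substr() call instead of one record. A rough C analogue of the same chunked-consumption idea (names and structure are mine, not from the thread):

```c
#include <stddef.h>
#include <string.h>

#define RECSIZE 16     /* bytes per record (e.g. two int64_t keys) */
#define CHUNK 1024     /* refill granularity: 64 records per refill */

/* Stand-in reader: drains a big in-memory buffer CHUNK bytes at a
 * time, so the refill bookkeeping runs 1/64th as often -- the same
 * idea as grabbing 1024 bytes per substr() call. */
typedef struct {
    const unsigned char *src;   /* backing data (stand-in for a file) */
    size_t len, pos;            /* total bytes / bytes fetched so far */
    unsigned char chunk[CHUNK];
    size_t filled, used;        /* bytes in chunk / consumed from it */
} Reader;

static const unsigned char *next_record(Reader *rd)
{
    if (rd->used == rd->filled) {        /* refill: once per 64 records */
        size_t n = rd->len - rd->pos;
        if (n == 0) return NULL;         /* input exhausted */
        if (n > CHUNK) n = CHUNK;
        memcpy(rd->chunk, rd->src + rd->pos, n);
        rd->pos += n;
        rd->filled = n;
        rd->used = 0;
    }
    const unsigned char *rec = rd->chunk + rd->used;
    rd->used += RECSIZE;
    return rec;
}
```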

        Here we go!

        n-way-merge.c

        On my old home computer it merges at around 3GB/min, 100GB in half an hour.

        As it is just a proof of concept, it can only handle records made up of integers of type int64_t.

        The meanings of the constants are as follows:

        N = the maximum number of files that can be sorted
        CHUNKSIZE = the size of the per-file memory buffer
        KEYSIZE = the number of int64_t numbers per record
        The source files are passed as arguments on the command line and the sorted output is sent to STDOUT. Only tested on 64-bit Ubuntu.
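salva's n-way-merge.c itself is not reproduced here, but given the record format described (KEYSIZE int64_t numbers per record), the key comparison presumably looks something like this sketch (my guess, with hypothetical names; a byte-wise memcmp would give wrong ordering for int64_t on little-endian machines, so the values are compared element-wise):

```c
#include <stddef.h>
#include <stdint.h>

#define KEYSIZE 2   /* int64_t numbers per record, per the constants above */

/* Lexicographic comparison of two records of KEYSIZE int64_t values.
 * The fixed, small trip count makes this loop a prime candidate for
 * unrolling under gcc -O3. */
static int cmp_record(const int64_t *a, const int64_t *b)
{
    for (size_t i = 0; i < KEYSIZE; i++) {
        if (a[i] < b[i]) return -1;
        if (a[i] > b[i]) return 1;
    }
    return 0;
}
```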

        Update: It is difficult to count the number of instructions inside the loop on the hot path. It is certainly below 100, though I don't know whether I reached the 50 I predicted... I blame gcc -O3 for unrolling small loops such as the key comparison.