in reply to Re^8: external sort performance improved?
in thread external sort performance improved?

Ouch! Your machine must be slower; or your actual data more complex than my quick mockup. It only took about 20 minutes on my machine for approximately the same size file. Sorry for the bum steer.

Looking back at your OP, you might gain a little by changing your sort sub from:

my $sortscheme = sub {
    my @flds_a = split( /,,/, $Sort::External::a );
    my @flds_b = split( /,,/, $Sort::External::b );
    $flds_a[0] cmp $flds_b[0];
};

To:

my $sortscheme = sub {
    substr( $Sort::External::a, 0, 25 ) cmp substr( $Sort::External::b, 0, 25 );
};

But I wouldn't expect it to make a great difference.
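
For context, here is a minimal sketch of how that sortsub plugs into Sort::External. The file names and the mem_threshold value are illustrative assumptions, not taken from your script, so adjust them to your setup:

use strict;
use warnings;
use Sort::External;

# Compare only the first 25 characters of each record (assumed key width).
my $sortscheme = sub {
    substr( $Sort::External::a, 0, 25 ) cmp substr( $Sort::External::b, 0, 25 );
};

my $sortex = Sort::External->new(
    sortsub       => $sortscheme,
    mem_threshold => 2**24,        # flush to temp files after roughly 16MB
);

open my $in, '<', 'dataf' or die "open dataf: $!";
while ( my $line = <$in> ) {
    $sortex->feed($line);
}
close $in;

# Write the fully sorted records to disk.
$sortex->finish( outfile => 'dataf.sorted' );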


Another idea that might speed things up, though it will only work if your records are fixed length (as they appear to be in your samples).

The following is a single command and should be entered as a single line in your command prompt:

perl -nle"print substr( $_, 21 ) . substr($_, 0, 21 )" dataf | \windows\system32\sort /M 5242880 /+62 | perl -nle"print substr($_,62) . substr($_,0,-21)" > dataf.sorted

It should run at very nearly the same speed as the original windows sort version, but this time I work around the no-key-length limitation of that sort program by:

  1. using perl to take the significant portion of each record and swap it to the end;
  2. using the /+62 offset parameter to sort on only that portion of the record;
  3. using perl again to switch the two parts back around.

As the two perl processes are O(N) and run substantially in parallel with the O(NlogN) sort, the runtime should be little changed over the original windows sort version -- assuming that was actually quicker for you.
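
Spelled out as a small reusable filter script rather than one-liners, the pre- and post-processing steps look something like the sketch below. The 21-character key width and the file names are assumptions read off your sample records, so check them against your real layout before relying on it:

#!/usr/bin/perl
# keyswap.pl -- pre/post filter for sorting fixed-length records on a
# leading key with windows sort's column-offset (/+n) switch.
# Hypothetical usage (file names are placeholders):
#   perl keyswap.pl swap   dataf          > swapped
#   sort /M 5242880 /+62   swapped        > swapped.sorted
#   perl keyswap.pl unswap swapped.sorted > dataf.sorted
use strict;
use warnings;

use constant KEYLEN => 21;    # assumed width of the leading sort key

my $mode = shift @ARGV or die "usage: keyswap.pl swap|unswap [file]\n";

while ( my $line = <> ) {
    chomp $line;
    if ( $mode eq 'swap' ) {
        # move the key to the end so it starts at a fixed column
        print substr( $line, KEYLEN ), substr( $line, 0, KEYLEN ), "\n";
    }
    else {
        # move the last KEYLEN characters back to the front
        print substr( $line, -KEYLEN ), substr( $line, 0, -KEYLEN ), "\n";
    }
}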


Another approach -- assuming that 12GB machine of yours also has multiple processors and can see multiple disks -- would be to split the file into N chunks, one per processor; distribute those chunks onto different drives; and then run gnusort on each of the N chunks concurrently.

When the N files are sorted, you can use the gnusort -m switch to merge the partial sorts together into a final sorted file.

But having multiple drives is crucial for this to work; otherwise, IO contention will probably kill any gains from the parallelism.
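
To make the split / parallel sort / merge idea concrete, here is a rough sketch. The chunk file names are placeholders (ideally each would live on a different drive), the chunk count of 4 is an assumption, and it relies on perl's fork (emulated on Windows) and on GNU sort being on your PATH, so treat it as a starting point rather than a drop-in script:

use strict;
use warnings;

my $infile = 'dataf';                                   # input to sort
my @chunks = map { "chunk$_.txt" } 0 .. 3;              # ideally one per drive
my $n      = scalar @chunks;

# 1. Deal records out round-robin into N chunk files.
open my $in, '<', $infile or die "open $infile: $!";
my @out = map { open my $fh, '>', $_ or die "open $_: $!"; $fh } @chunks;
my $i = 0;
while ( my $line = <$in> ) {
    print { $out[ $i++ % $n ] } $line;
}
close $_ for $in, @out;

# 2. Sort every chunk in its own child process so they run concurrently.
my @kids;
for my $chunk (@chunks) {
    my $pid = fork;
    die "fork failed: $!" unless defined $pid;
    if ( $pid == 0 ) {                                  # child: run GNU sort
        exec 'sort', '-o', "$chunk.sorted", $chunk
            or die "exec sort: $!";
    }
    push @kids, $pid;                                   # parent: remember pid
}
waitpid $_, 0 for @kids;

# 3. Merge the already-sorted chunks into the final output (no re-sort).
system( 'sort', '-m', '-o', 'dataf.sorted', map { "$_.sorted" } @chunks ) == 0
    or die "merge failed: $?";

Because the merge step only interleaves already-sorted streams, it is cheap compared with the N concurrent sorts that precede it.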


Finally, assuming you are going to be doing this regularly -- this thread would be a waste of time if you are doing it only once :) -- then probably the biggest gain you could get would come from the purchase of an SSD.

The fastest of these are now up to several hundred (if not a thousand) times faster than hard drives, though the fastest are the PCIe-based devices, which tend to cost £1000+.

But even the much cheaper consumer-grade devices -- ~£100 for a 60GB SATA3 drive -- are a couple of hundred times faster and will make a considerable difference to your elapsed time.

Combine one of those with the above split / parallel sort / merge mechanism -- which will work well even with all the chunks on the same SSD, since SSDs have no heads and therefore no seek-time losses -- and you ought to be able to cut your runtime to close to 32 minutes divided by the number of processors.

I hope at least one of these ideas is useful to you.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

The start of some sanity?

Re^10: external sort performance improved?
by rkshyam (Acolyte) on Apr 24, 2012 at 11:19 UTC
    Thanks for the detailed ideas. My system is not slower; performance of the system is good. We don't have the option to buy new SSDs. However, I have tried the code change that you suggested, and there was an improvement of 8 minutes compared to the earlier code. I do not know how to use the other approach of splitting into multiple chunks. Thanks again for your time!!