perl_lover_always has asked for the wisdom of the Perl Monks concerning the following question:

I have a very large file to sort and I don't want to split it! I'm using Linux sort with the -d and -f (dictionary order and ignore case) options, but it takes too long! I'd like your suggestions: should I switch to the Perl File::Sort module, or some faster module, or would it end up the same? Thanks!

Replies are listed 'Best First'.
Re: Linux sort or FILE::SORT
by Utilitarian (Vicar) on Feb 08, 2011 at 10:42 UTC
    The sort utility in the shell is a C program optimised for sorting; it will always be faster than loading Perl and calling a module, regardless of the efficiency of the module (assuming someone doesn't come up with a spectacularly fast new sort algorithm ;)

    print "Good ",qw(night morning afternoon evening)[(localtime)[2]/6]," fellow monks."
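      For reference, the invocation being discussed would look something like the sketch below (the file name is hypothetical). One commonly cited speedup that doesn't require abandoning sort at all is forcing the C locale, which replaces locale-aware collation with plain byte comparison:

      ```shell
      # Hypothetical file name. -d = dictionary order, -f = fold case.
      # LC_ALL=C bypasses locale-aware collation, often the single
      # biggest speedup for GNU sort on large files -- but note that
      # the resulting order differs from a locale-collated sort.
      LC_ALL=C sort -d -f bigfile.txt > bigfile.sorted
      ```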

      It is actually quite easy to beat gnusort for performance.

      Firstly, it uses minuscule buffers relative to modern memory sizes, necessitating far more read/write cycles than could ever be optimal.

      Secondly, with most machines having multiple cores these days, it is silly not to utilise them.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        It does take the -S $BUFFER_SIZE flag.

        Are sort operations usually CPU bound?

        print "Good ",qw(night morning afternoon evening)[(localtime)[2]/6]," fellow monks."
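        Combining the two points raised in this subthread, both a larger buffer and multiple threads can be requested from GNU sort itself (the --parallel option requires coreutils 8.6 or later); the buffer size and thread count below are illustrative, not tuned values:

        ```shell
        # Illustrative values: 2G of in-memory buffer, 4 sort threads.
        # -S sets the main-memory buffer size; --parallel sets the
        # number of concurrent sorts (GNU coreutils >= 8.6).
        LC_ALL=C sort -S 2G --parallel=4 -d -f bigfile.txt > bigfile.sorted
        ```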
Re: Linux sort or FILE::SORT
by Anonymous Monk (Hermit) on Feb 08, 2011 at 14:29 UTC