in reply to sorting large data

If you are on Unix, POSIX, Windows + Cygwin (or some other package with Unix tools), VMS + whatever the name of their Unix tools package is, or some other decent system, you can speed up your program while bringing it down to one line:
system "sort in_file > out_file";
Unix sort is specially equipped to sort large files (large meaning big on disk compared to available RAM).
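If you want a little more robustness, the list form of system avoids handing the filenames to a shell, and sort's -o option takes the place of the redirection (the filenames here are just placeholders):

# A slightly more defensive variant. The list form of system bypasses
# the shell, and -o names the output file, so no redirection is needed.
# 'in_file' and 'out_file' are placeholder names.
system('sort', '-o', 'out_file', 'in_file') == 0
    or die "sort failed: exit status $?";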

Abigail

Re: Re: sorting large data
by simeon2000 (Monk) on Jul 23, 2002 at 16:24 UTC
    But how would you do this kind of sort if you don't have access to GNU sort? I have pondered this for a while and have not come up with an efficient solution. Certainly loading a 45MB text file into RAM is not the answer.

    "Falling in love with map, one block at a time." - simeon2000

      Well, if you don't have access to GNU sort, you can always try one of the many other implementations of Unix sort.... ;-).

      Anyway, you would do what Unix sort does: split the data into chunks you can swallow (how much that is varies from system to system), sort each chunk, and store it in a temporary file. Now you have a bunch of sorted files - and you have to merge them. You might even have to do this recursively.
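      To make that concrete, here is a minimal sketch of the split/sort/merge approach in Perl. The file names and the chunk size are made up for illustration; a real version would probably use File::Temp for the scratch files and a heap for the merge:

      use strict;
      use warnings;

      my $chunk_lines = 100_000;   # assumed tuning knob: lines per in-RAM chunk
      my @tmp_files;

      # Pass 1: read the input in chunks, sort each chunk in memory,
      # and write it to its own temporary file.
      open my $in, '<', 'in_file' or die "in_file: $!";
      until (eof $in) {
          my @chunk;
          while (@chunk < $chunk_lines and defined(my $line = <$in>)) {
              push @chunk, $line;
          }
          my $tmp = 'chunk_' . @tmp_files . '.tmp';
          open my $out, '>', $tmp or die "$tmp: $!";
          print $out sort @chunk;
          close $out;
          push @tmp_files, $tmp;
      }
      close $in;

      # Pass 2: merge the sorted chunks. A linear scan for the smallest
      # head line keeps the sketch short; use a heap for many chunks.
      my @fhs   = map { open my $fh, '<', $_ or die "$_: $!"; $fh } @tmp_files;
      my @heads = map { scalar readline $_ } @fhs;
      open my $merged, '>', 'out_file' or die "out_file: $!";
      while (grep defined, @heads) {
          my $min;
          for my $i (0 .. $#heads) {
              next unless defined $heads[$i];
              $min = $i if !defined $min or $heads[$i] lt $heads[$min];
          }
          print $merged $heads[$min];
          $heads[$min] = readline $fhs[$min];
      }
      close $merged;
      unlink @tmp_files;

      If even the merge step has too many chunk files to handle at once, merge them in groups and repeat - that is the recursive case mentioned above.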

      Read Knuth if you want to know everything about merge sort.

      Abigail

        Short of going out and buying books (with a limited cash supply), is there anywhere on the web with this sort of information?

      Any good algorithms book should cover sorting. See Knuth, volume 3 (Sorting and Searching), or Orwant et al., Mastering Algorithms with Perl.