in reply to Sorting a large file

First of all, take a look at Sorting data that don't fit in memory. The BerkeleyDB solution described there is known to work.
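
A minimal sketch of that approach, assuming the DB_File module (an interface to Berkeley DB) is available; file names are made up. A hash tied to a BTree keeps its keys sorted on disk, so iterating it yields the lines in order without holding the whole file in memory:

    use strict;
    use Fcntl;
    use DB_File;

    my %sorted;
    tie %sorted, 'DB_File', 'scratch.db', O_RDWR|O_CREAT, 0644, $DB_BTREE
        or die "tie: $!";

    open my $in, '<', 'big.log' or die "big.log: $!";
    $sorted{$_}++ while <$in>;     # the value counts duplicate lines
    close $in;

    open my $out, '>', 'big.sorted' or die "big.sorted: $!";
    while (my ($line, $count) = each %sorted) {   # keys come back in sorted order
        print {$out} $line x $count;
    }
    close $out;
    untie %sorted;
    unlink 'scratch.db';

Note that each() walks the BTree one key at a time, so the output pass stays cheap; keys() would pull the whole key list into memory at once.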

Be warned about the memory usage of arrays and hashes in Perl. I found that storing a single array element takes about 46 bytes of memory. That's a lot. If your arrays grow beyond, say, a million items, you might run into trouble.
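
If you want to check that figure on your own build (per-element overhead varies by Perl version and platform), the CPAN module Devel::Size can measure it; a quick sketch:

    use strict;
    use Devel::Size qw(total_size);

    my @array = (1 .. 100_000);
    printf "total: %d bytes, per element: %.1f bytes\n",
        total_size(\@array), total_size(\@array) / @array;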

You can also use an RDBMS like PostgreSQL or MySQL to manage the log. Efficient storage, sorting AND querying.
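
For the sorting part, that boils down to a single ORDER BY query. A minimal sketch with DBI, assuming DBD::mysql is installed and a table log with columns stamp and line already holds the data (all names here are made up):

    use strict;
    use DBI;

    my $dbh = DBI->connect('dbi:mysql:database=logs', 'user', 'password',
                           { RaiseError => 1 });

    my $sth = $dbh->prepare('SELECT line FROM log ORDER BY stamp');
    $sth->execute;
    while (my ($line) = $sth->fetchrow_array) {
        print $line, "\n";
    }
    $dbh->disconnect;

The database does the sorting on its side, so it no longer depends on how much RAM perl can grab.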

Maybe the memory problem lies in the fact that Perl's sort builds a second list holding your data, so you briefly need room for two copies. Do you have twice the memory available before the sort starts?
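
If not, one way out is an external merge sort: sort chunks that do fit in memory to temporary files, then merge those line by line. A rough sketch, assuming plain lexical line order (file names are made up):

    use strict;

    my $chunk_lines = 100_000;    # tune to the memory you can spare
    my @tmpfiles;

    # Pass 1: sort manageable chunks and write each to its own temp file.
    open my $in, '<', 'big.log' or die $!;
    while (1) {
        my @chunk;
        while (@chunk < $chunk_lines and defined(my $line = <$in>)) {
            push @chunk, $line;
        }
        last unless @chunk;
        my $tmp = 'chunk' . scalar(@tmpfiles) . '.tmp';
        push @tmpfiles, $tmp;
        open my $out, '>', $tmp or die $!;
        print {$out} sort @chunk;
        close $out;
    }
    close $in;

    # Pass 2: k-way merge - repeatedly emit the smallest pending line.
    my @fh   = map { open my $fh, '<', $_ or die $!; $fh } @tmpfiles;
    my @head = map { scalar <$_> } @fh;

    open my $out, '>', 'big.sorted' or die $!;
    while (grep { defined } @head) {
        my $min;
        for my $i (0 .. $#head) {
            next unless defined $head[$i];
            $min = $i if !defined($min) or $head[$i] lt $head[$min];
        }
        print {$out} $head[$min];
        $head[$min] = readline($fh[$min]);   # undef at EOF drops this file out
    }
    close $out;
    unlink @tmpfiles;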

Hope this helps,

Jeroen
"We are not alone"(FZ)

Re: Re: Sorting a large file
by c-era (Curate) on Feb 21, 2001 at 18:48 UTC
    One problem that I should have stated above: I'm on Windows NT, using ActivePerl (no BerkeleyDB). A database seems like overkill for sorting a log file, but it could be an option.
      You might try installing MySQL (it's free) on your NT box, write a script that monitors your log file for changes and imports new entries into the database, then use SQL to sort them. A rough sketch of the monitor-and-import loop follows.
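
      This assumes DBI with DBD::mysql and a one-column table log (all names are illustrative). It remembers the byte offset of the last import and only reads what was appended since:

          use strict;
          use DBI;

          my $logfile = 'access.log';
          my $offset  = 0;

          my $dbh = DBI->connect('dbi:mysql:database=logs', 'user', 'password',
                                 { RaiseError => 1 });
          my $ins = $dbh->prepare('INSERT INTO log (line) VALUES (?)');

          while (1) {
              if (-s $logfile > $offset) {       # the file has grown
                  open my $fh, '<', $logfile or die $!;
                  seek $fh, $offset, 0;          # skip what was already imported
                  while (my $line = <$fh>) {
                      chomp $line;
                      $ins->execute($line);
                  }
                  $offset = tell $fh;
                  close $fh;
              }
              sleep 60;                          # poll once a minute
          }

      (If the log ever gets rotated, the offset would have to be reset; that's left out here.)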

      Celebrate Intellectual Diversity