js1 has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I've written a Perl script which analyzes Postfix logs. However, with large logs the hashes I'm using quickly fill up memory and the script abends with the message 'Out of Memory'.

If I use MLDBM, I can write the hashes to disk, but I still don't have enough memory, so I wondered whether the hashes might still be held in memory as well as on disk? Here's how I've been using one of my hashes:

tie %maillist, 'MLDBM', "/home/u752359/maillist.db", O_CREAT|O_RDWR, 0640 or die $!;
$maillist{$hostqid}=$mail;
$msgid=${maillist{$hostqid}}->msgid;
delete $maillist{$hostqid};
$maillist{$hostqid}=$nmail;
foreach $hostqid (keys %maillist){

Does anything here look suspect? Thanks for any help.

JS.

update (broquaint): tidied up formatting

Replies are listed 'Best First'.
Re: hash on memory or disk?
by Abigail-II (Bishop) on Jan 13, 2004 at 10:37 UTC
    keys %maillist will generate a list of all keys in %maillist, so, if the hash is large, this list will be large as well.

    If you want to iterate over a (large) hash in a memory-efficient way, use each and a while loop.

    Abigail
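
A minimal sketch of the each/while loop suggested above. It assumes the core SDBM_File module as a stand-in for MLDBM (so it runs without extra installs) and uses a scratch database with made-up queue IDs; the loop itself should look the same for any tied hash.

```perl
use strict;
use warnings;
use Fcntl;                      # O_CREAT, O_RDWR
use SDBM_File;                  # core DBM, standing in for MLDBM here (assumption)
use File::Temp qw(tempdir);

# Hypothetical data: a scratch database holding a few queue IDs.
my $dir = tempdir(CLEANUP => 1);
tie my %maillist, 'SDBM_File', "$dir/maillist", O_CREAT|O_RDWR, 0640
    or die "Cannot tie: $!";
$maillist{"host:$_"} = "message $_" for 1 .. 5;

# each() hands back one key/value pair per call, so the full
# list of keys is never built in memory the way keys() builds it.
my $seen = 0;
while (my ($hostqid, $mail) = each %maillist) {
    $seen++;
    # process $hostqid and $mail here
}
print "visited $seen entries\n";

untie %maillist;
```

Compared with foreach over keys %maillist, only the current pair is held at any time, which is the point when the hash lives on disk.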

Re: hash on memory or disk?
by DrHyde (Prior) on Jan 13, 2004 at 10:55 UTC
    As well as what Abigail said, how are you reading your log file? If you're doing something like ...
    @log_entries = <LOGFILE>;
    then you'll be slurping the whole file into memory. Better to do ...
    while (<LOGFILE>) {
        # stuff to process this single log entry
    }
    This:
    $maillist{$hostqid}=$mail;
    $msgid=${maillist{$hostqid}}->msgid;
    delete $maillist{$hostqid};
    $maillist{$hostqid}=$nmail;
    also looks weird. You assign to $maillist{$hostqid}, then use it, then immediately delete the hash entry, then immediately assign a new value to it.
Re: hash on memory or disk?
by Anonymous Monk on Jan 13, 2004 at 11:22 UTC
    If you don't always want the value, just use FIRSTKEY/NEXTKEY, like
    {
        my $X   = tied %maillist;
        my $key = $X->FIRSTKEY;
        while (defined $key) {
            next unless $key =~ /^rubber/;
            ...
        }
        continue {
            # advance here so that 'next' still moves on to the following key
            $key = $X->NEXTKEY($key);
        }
    }
    See perltie for more.
Re: hash on memory or disk?
by js1 (Monk) on Jan 13, 2004 at 11:35 UTC
    Thanks for all your quick replies. I'll give them a try and let you know what happens.

    JS.

      Does this test for an empty hash use a lot of memory if the hash is big?
      $nmail->relay('<>',1) unless keys %$hash_ref;

        No, keys in scalar context just returns the number of elements in the hash, and the hash keeps that count in its internal data structure.

        This is of course assuming we are talking about a normal hash. If the hash is tied, then it will depend on the implementation.

        Abigail
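
For the tied case, one possible way to test for emptiness without depending on how the count is produced is to ask the tied object for its first key only. This is a sketch under assumptions: CountingHash is a hypothetical class built on the core Tie::StdHash, added purely to show that the iterator is not advanced, and whether FIRSTKEY is cheap for a real DBM depends on that module.

```perl
use strict;
use warnings;
use Tie::Hash;                  # provides Tie::StdHash (core)

# Hypothetical tied class that counts how often the iterator is advanced.
package CountingHash;
our @ISA = ('Tie::StdHash');
our $steps = 0;
sub NEXTKEY {
    my $self = shift;
    $steps++;
    return $self->SUPER::NEXTKEY(@_);
}

package main;

tie my %h, 'CountingHash';
$h{$_} = 1 for 1 .. 1000;

# Ask for a single key: the hash is empty exactly when there is none.
my $is_empty = !defined( (tied %h)->FIRSTKEY );
print "empty? ", ($is_empty ? "yes" : "no"),
      ", NEXTKEY calls: $CountingHash::steps\n";
```

The same check applied to the earlier one-liner would read: unless defined( (tied %$hash_ref)->FIRSTKEY ), which only makes sense when %$hash_ref really is tied.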