kingman has asked for the wisdom of the Perl Monks concerning the following question:

I have a question about memory usage when saving large hashes to disk via Storable.pm's lock_store and lock_retrieve functions. This is kind of an extreme example:

If I have a hash that looks like this and I've managed to save it to a file called 'books.data' with lock_store:

my $x = {
    'Moby Dick'             => "The entire text of Moby Dick...",
    'The Lord of the Rings' => "The entire text of Lord of the Rings...",
    'Great Expectations'    => "The entire text of Great Expectations...",
    'War and Peace'         => "The entire text of War and Peace...",
};
What happens when I do this:
my $books = lock_retrieve('books.data');
Am I reading all that data into memory at once? How hard is it on the machine if I iterate through the keys of %$books?
my $search_string = 'ect';
foreach my $title (keys %$books) {
    print $books->{$title} if $title =~ /$search_string/i;
}
I'd appreciate any comments on this. I'm kind of wondering how much data is too much. Thanks for your help!

Re: Saving Large Hashes to Disk via Storable.pm
by perrin (Chancellor) on Jan 24, 2002 at 03:09 UTC
    Storable loads and saves the entire hash at once. If you want to selectively load things, you should use a dbm file.
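
A minimal sketch of the dbm approach, assuming the DB_File module (Berkeley DB bindings) is available on your system; any other dbm module that supports tie would follow the same pattern. The file name 'books.db' is hypothetical. The hash is tied to the file, so a lookup fetches one record from disk instead of loading everything:

```perl
use strict;
use warnings;
use Fcntl;      # O_RDWR, O_CREAT
use DB_File;    # assumption: Berkeley DB bindings are installed

# Hypothetical file name; the dbm file is created on first use.
my $file = 'books.db';

# Tie a hash to the file. Records stay on disk until they are fetched.
tie my %books, 'DB_File', $file, O_RDWR | O_CREAT, 0644, $DB_HASH
    or die "Cannot open $file: $!";

$books{'Moby Dick'}          = "The entire text of Moby Dick...";
$books{'Great Expectations'} = "The entire text of Great Expectations...";

# A single lookup fetches only that record.
my $text = $books{'Moby Dick'};

# each() in scalar context walks the titles one at a time without
# fetching the values; keys() would build the full title list first.
my $search_string = 'ect';
my @matches;
while (defined(my $title = each %books)) {
    push @matches, $title if $title =~ /\Q$search_string\E/i;
}

# Fetch the text only for the titles that matched.
print "$_: $books{$_}\n" for sort @matches;

untie %books;
```

Note that values written through a plain tied dbm hash must be flat strings; nested structures would need to be serialized first.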
Re: Saving Large Hashes to Disk via Storable.pm
by Rhandom (Curate) on Jan 24, 2002 at 03:24 UTC
    Doesn't answer your question, but as an alternate method... store the filename of the book contents instead of the actual contents as in:

    $books = {
        "Moby Dick" => '/path/to/books/moby_dick.dat',
        "LoTR"      => '/path/to/books/lotr.dat',
    };
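
A rough sketch of how that might look in practice: the small title-to-path index goes through Storable as before, and a book's text is read from disk only when its title matches. The helper name book_text and the search string are made up for illustration, and the paths are the placeholder ones from above, so the loop only warns if those files do not exist:

```perl
use strict;
use warnings;
use Storable qw(lock_store lock_retrieve);

# Small index: title => path of the file holding the text.
my $books = {
    'Moby Dick' => '/path/to/books/moby_dick.dat',
    'LoTR'      => '/path/to/books/lotr.dat',
};

# The index is tiny, so loading all of it at once is cheap.
lock_store($books, 'books.data');
my $index = lock_retrieve('books.data');

# Hypothetical helper: read one book's text, or return undef
# (with a warning) if the file cannot be opened.
sub book_text {
    my ($path) = @_;
    open my $fh, '<', $path or do {
        warn "Cannot open $path: $!\n";
        return;
    };
    local $/;               # slurp mode: read the whole file
    my $text = <$fh>;
    close $fh;
    return $text;
}

my $search_string = 'moby';
foreach my $title (keys %$index) {
    next unless $title =~ /\Q$search_string\E/i;
    my $text = book_text($index->{$title});
    print $text if defined $text;
}
```

Only one book's text is held in memory at a time this way, at the cost of one file open per matching title.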

    my @a=qw(random brilliant braindead); print $a[rand(@a)];

Re: Saving Large Hashes to Disk via Storable.pm
by Ryszard (Priest) on Jan 24, 2002 at 09:38 UTC
    As perrin and crazyinsomniac mention, it sounds like you need a database.

    I personally like Postgres because it's free, transaction-based, and easy to set up and configure.

    IMHO you only have too much data when some of it becomes irrelevant. For example, if your catalogue holds a book that you will never reference, or one that has gone out of print or been stolen, perhaps you don't need that record.

    If you start to constrain your data to fit your iron, you need another method of accessing it.