in reply to Re: Constructing a hash - why isn't my regex matching anything
in thread Constructing a hash - why isn't my regex matching anything

Thanks a lot for the detailed explanation.

I have one more question. I am trying to run the above script on a file of 125000 lines and output the hash to a text file. I keep getting an "Out of memory!" message. Is there something that can be done about it?

Re^3: Constructing a hash - why isn't my regex matching anything
by Corion (Patriarch) on Dec 19, 2010 at 10:49 UTC

    Yes. Use less memory.

    Look over your program where you are needlessly wasting memory. Maybe you are reading a complete file into memory instead of processing it line by line. Maybe you are doing something else that wastes memory.

    Even still, 125000 lines is not much, so most likely you are doing something that wastes a lot of memory.
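
    To illustrate the difference between slurping and line-by-line processing mentioned above, here is a minimal sketch. The helper name count_matching_lines and its arguments are hypothetical and not taken from the original script; the point is only that reading the filehandle in scalar context keeps one line in memory at a time.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): count the lines that match a
# pattern while holding only one line of the file in memory at a time.
sub count_matching_lines {
    my ($path, $pattern) = @_;
    open my $fh, '<', $path or die "could not open '$path': $!";
    my $count = 0;
    while (my $line = <$fh>) {    # scalar context: one line per iteration
        $count++ if $line =~ $pattern;
    }
    close $fh;
    return $count;
}

# By contrast, list context reads every line into memory at once:
#   my @lines = <$fh>;

# Example use: pass a file name on the command line.
print count_matching_lines($ARGV[0], qr/;/), "\n" if @ARGV;
```

    If the loop already looks like this, as it does in the script posted in this thread, the file reading itself is unlikely to be the part that exhausts memory.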

      I did look at my code (below). I am hardly doing anything other than constructing the hash.

      #!/usr/bin/perl
      use strict;
      use warnings;
      use Data::Dumper;

      my %hash;
      open my $fh, '<', $ARGV[0] or die "could not open $ARGV[0]'' $!";
      while (my $line = <$fh>) {
          #print "line:$line\n";
          my ($key) = $line =~ /;([^;]+)\s-\s/;
          #print "KEY:$key\n";
          my ($value) = $line =~ /\.\\(.*)-\d+\;/;
          #print "VALUE:$value\n";
          if (!($hash{$key})) {
              $hash{$key}=$value;
          }
      }
      open my $hash, '>', "hash_flf.txt";
      print Dumper(\%hash);
      close $hash;

        How much memory do you have available? The following test case on my system uses 34 MB (according to top) after having filled the hash, and a total of 110 MB after having created the dump string.

        #!/usr/bin/perl -w
        use strict;
        use Data::Dumper;

        my $key_ = '//programfiles/documents/data/lookup/script_auth_pap.h';
        my $val_ = 'root\edit\perl\scripts\scripths\sec\inc\script_auth_pap.h';
        my $c = 0;
        my %hash;
        for (1..125000) {
            my $key = "$key_$c";
            my $val = "$val_$c";
            $c++;
            $hash{$key} = $val;
        }
        <>; # 34 MB
        my $dump = Dumper \%hash;
        <>; # 110 MB
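
        Since the test case attributes most of the growth to the dump string, one possible workaround is to write the pairs out one at a time instead of building a single string. This is only a sketch: the helper name write_hash_streaming and the "key => value" line format are assumptions, and the output is not in Data::Dumper format.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): write a hash to a file one
# pair at a time, so no string holding the whole dump is ever built.
sub write_hash_streaming {
    my ($hash_ref, $path) = @_;
    open my $out, '>', $path or die "could not open '$path': $!";
    # each() walks the hash in place without first copying all keys to a list.
    while (my ($key, $value) = each %$hash_ref) {
        $value = '' unless defined $value;    # a failed match leaves undef
        print {$out} "$key => $value\n";
    }
    close $out or die "could not close '$path': $!";
    return;
}

# Example use with made-up data: pass an output file name on the command line.
my %demo = (first => 'one', second => 'two');
write_hash_streaming(\%demo, $ARGV[0]) if @ARGV;
```

        The pairs come out in the hash's internal order. If a sorted file is needed, sorting the keys first would cost extra memory for the key list, though likely far less than a full dump string.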
Re^3: Constructing a hash - why isn't my regex matching anything
by ELISHEVA (Prior) on Dec 19, 2010 at 11:24 UTC

    The hash alone takes up between 12 and 13 megs (125,000 * 100 chars per key-value pair), but 13 megs isn't a great deal of memory on most machines these days. What sort of machine are you on? Are you by any chance running this script on a server or virtual machine with some sort of artificial per-process memory cap?

    Another possibility: How do you construct this file that you are extracting keys and values from? Earlier you posted a question about recursive extraction of file names. Is this part of the same script? Perhaps earlier or later in your script (above or below this loop) you have some left over code that slurped in a very large file all at once? Or perhaps your recursion rather than this loop is eating up all of the memory?

      I think it takes more than that :) The numbers are in Kbytes, and the memory usage nearly doubles due to Data::Dumper, from 78MB to 142MB.

        Well, I'll be....

        Any idea of where all that extra memory usage is coming from (beyond the 78M for Data::Dumper)? That's a lot of extra space for 13M of actual data. Based on a conversation in the CB, hash buckets only account for about half a meg extra, not 60M (or 20M as per another tester in a reply further up).

        Update: A quick check on my machine comes up with 26M for storing key-value pairs in an array, and 34M for storing them in a hash:

        key-value pair: 112 bytes
        total data for 125,000 key-value pairs: 13.25M
        virtual memory usage for array built via push @aData, $k, $v: 26M
        virtual memory usage for hash built via $hData{$k} = $v: 34M

        The test script is below