in reply to Re^2: Constructing a hash - why isn't my regex matching anything
in thread Constructing a hash - why isn't my regex matching anything

The hash alone takes up between 12 and 13 megs (125,000 pairs * 100 chars per key-value pair), but 13 megs isn't a great deal of memory on most machines these days. What sort of machine are you on? Are you by any chance running this script on a server or virtual machine with some sort of artificial per-process memory cap?

Another possibility: how do you construct the file that you are extracting keys and values from? Earlier you posted a question about recursive extraction of file names. Is this part of the same script? Perhaps somewhere else in your script (above or below this loop) you have some leftover code that slurps in a very large file all at once? Or perhaps your recursion, rather than this loop, is eating up all of the memory?

Re^4: Constructing a hash - why isn't my regex matching anything
by Anonymous Monk on Dec 19, 2010 at 11:40 UTC
    I think it takes more than that :) The numbers are in Kbytes and the memory usage doubles due to Data::Dumper, from 78MB to 142MB

      Well, I'll be....

      Any idea where all that extra memory usage is coming from (beyond the 78M for Data::Dumper)? That's a lot of extra space for 13M of actual data. Based on a conversation in the CB, hash buckets only account for about half a meg extra, not 60M (or 20M as per another tester in a reply further up).

      Update: A quick check on my machine comes up with 26M for storing key-value pairs in an array, and 34M for storing them in a hash:

      key-value pair: 112 bytes
      total data for 125,000 key-value pairs: 13.25M
      virtual memory usage for array built via push @aData, $k, $v: 26M
      virtual memory usage for hash built via $hData{$k} = $v: 34M

      The test script is below

        See illguts and add up the individual HEs, HEKs, SVs, etc. A simple scalar value (SV) already has 4 fields plus the actual data body. And even for the array data type, which is less complex than a hash, an empty entry adds up to around 80 bytes of administrative overhead.
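For anyone who wants to see that per-entry overhead on their own perl, here is a minimal sketch. It is not the test script mentioned earlier in the thread. It assumes the CPAN module Devel::Size is installed (it is not in core), and the keys and values are made up: a 12-char key plus a 100-char value, chosen only to match the 112 bytes per pair quoted above. Devel::Size reports the size of the data structure itself, which is not the same thing as the process's virtual memory usage, so the numbers will not line up exactly with the 26M and 34M figures.

```perl
use strict;
use warnings;
use Devel::Size qw(total_size);    # CPAN module, assumed to be installed

my $pairs = 125_000;               # pair count taken from the thread
my ( @aData, %hData );
my $raw_bytes = 0;

for my $i ( 1 .. $pairs ) {
    # Hypothetical data: 12-char key + 100-char value = 112 bytes per pair
    my $k = sprintf 'key%09d', $i;
    my $v = 'x' x 100;

    $raw_bytes += length($k) + length($v);

    push @aData, $k, $v;           # same pairs stored flat in an array
    $hData{$k} = $v;               # same pairs stored in a hash
}

# total_size() follows the reference and counts every SV it reaches
printf "raw data: %.2f MB\n", $raw_bytes / 2**20;
printf "array:    %.2f MB\n", total_size( \@aData ) / 2**20;
printf "hash:     %.2f MB\n", total_size( \%hData ) / 2**20;
```

Dividing the difference between each total and the raw data by 125,000 gives a rough per-pair overhead that can be compared against the structures drawn in illguts. The exact figures depend on perl version and on whether the build is 32-bit or 64-bit.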

        I have no idea :) Maybe it's because I have a 64-bit CPU running a 32-bit OS with 32-bit perl, with only 512MB of physical memory .... I think it's all the virtual memory manager's fault :) At least it's better than 5.6.1 :) mingw32/msvs6 doesn't make a difference.