in reply to Memory usage by perl application

See illguts. Basically, every scalar value (SV in internal-speak) in Perl takes up 16 bytes plus whatever payload (think "string length") is stored in the value. Hash keys are also refcounted and carry a similar per-item overhead. So, if you store very many relatively small items in your hash, as keys and values, your memory needs can be up to 16 or 32 times the size of the input file, at least if the keys are all distinct.
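A back-of-envelope sketch of that blow-up, using the approximate 16-byte per-item overhead mentioned above (the exact figure varies by perl version and build, and the item count and payload length here are made up for illustration):

```perl
use strict;
use warnings;

my $overhead = 16;         # assumed bookkeeping bytes per key/value
my $items    = 1_000_000;  # hypothetical number of key/value pairs
my $avg_len  = 4;          # hypothetical average payload length in bytes

# keys + values as raw bytes in the input file
my $input_size = $items * 2 * $avg_len;

# keys + values once stored in a Perl hash, each paying the overhead
my $in_memory  = $items * 2 * ($overhead + $avg_len);

printf "input: %d bytes, in memory: ~%d bytes (%.1fx)\n",
    $input_size, $in_memory, $in_memory / $input_size;
```

The shorter the items, the worse the ratio: with one-byte payloads the multiplier approaches 17x, matching the "up to 16 or 32 times" estimate above.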

You can easily move your hash to disk by using DB_File or one of the other tied hash implementations (SDBM_File, GDBM_File). This makes your hash accesses slower, but you are then limited only by disk space, not core memory.
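A minimal sketch of such a disk-backed hash, using SDBM_File since it ships with core Perl (DB_File and GDBM_File are tied the same way but may need to be installed separately; the database filename here is made up for illustration):

```perl
use strict;
use warnings;
use Fcntl;       # for the O_RDWR / O_CREAT open flags
use SDBM_File;

# Tie the hash to files on disk; lookups and stores now hit the disk,
# so only the entries you touch occupy memory.
tie my %on_disk, 'SDBM_File', '/tmp/demo_db', O_RDWR | O_CREAT, 0666
    or die "tie failed: $!";

$on_disk{apple} = 'fruit';      # written through to disk
print $on_disk{apple}, "\n";    # read back from disk

untie %on_disk;                 # flush and close the underlying files
```

The rest of the program keeps using the hash exactly as before; only the tie() and untie() calls change.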

Alternatively, maybe you can save memory simply by not reading the whole file into a hash at all, by switching to a different algorithm. But for that, we would need to see your data and code.
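A hypothetical sketch of what such a change could look like: process the file one line at a time and keep only the running result (here, a count of distinct first fields), so memory grows with the number of distinct keys rather than the file size. The sample data and field layout are made up for illustration; a real file would be read with open() instead of the in-memory filehandle used here:

```perl
use strict;
use warnings;

# Stand-in for a real input file, opened as an in-memory filehandle.
my $sample = "apple red\nbanana yellow\napple green\n";
open my $fh, '<', \$sample or die "open failed: $!";

my %count;
while ( my $line = <$fh> ) {
    chomp $line;
    my ($key) = split /\s+/, $line;   # keep only what the answer needs
    $count{$key}++;
}
close $fh;

print "$_: $count{$_}\n" for sort keys %count;
```

Only one line is in memory at a time, so even a 2 GB input stays cheap as long as the set of distinct keys is small.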

Replies are listed 'Best First'.
Re^2: Memory usage by perl application
by magarwal (Novice) on Dec 22, 2010 at 17:31 UTC
    Hi Corion,

    I am doing this just as an exercise to check memory usage by the system. Please find my code snippet below:
    open FILE, $input_file or die $!;
    my @data;
    while (<FILE>) {
        push @data, $_;
    }
    close FILE;
    The file I am reading is 2 GB in size.
    In this scenario, the total memory usage should be around 2 GB, or perhaps about 4 GB. My system shows about 3.8 GB of memory usage.
    Does the filehandle FILE also store the full file in memory, in addition to my array holding the full file?

    Is this acceptable? Let me know your inputs.

    Thanks,
    Manu

      In your first post, you talked about a hash. I see no hash in your code.

      See illguts, again. It talks about the underlying data structures and their memory needs.

      Depending on how large each line in $input_file is, Perl will, again, use up to 16 times the memory of the file itself (based on the worst case that each line is one character long and incurs an overhead of 16 bytes, disregarding the SvPV entry and the overhead of the array itself).

      This behaviour is acceptable to me. There are very few reasons to read a file completely into an array. If you really need to handle generic large data structures, most likely a database like Postgres or SQLite will suit your needs far better than storing the data through Perl can.
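      A hypothetical sketch of the SQLite route, assuming the CPAN modules DBI and DBD::SQLite are installed (the database path, table name, and sample rows are all made up for illustration):

```perl
use strict;
use warnings;
use DBI;

# SQLite keeps the data on disk; Perl only holds one row at a time.
my $dbh = DBI->connect( 'dbi:SQLite:dbname=/tmp/demo.sqlite', '', '',
    { RaiseError => 1, AutoCommit => 0 } );

$dbh->do('CREATE TABLE IF NOT EXISTS lines (k TEXT PRIMARY KEY, v TEXT)');
my $ins = $dbh->prepare('INSERT OR REPLACE INTO lines (k, v) VALUES (?, ?)');

# In the real program this loop would read from the 2 GB input file.
$ins->execute( @$_ ) for ( [ apple => 'red' ], [ banana => 'yellow' ] );
$dbh->commit;

my ($v) = $dbh->selectrow_array(
    'SELECT v FROM lines WHERE k = ?', undef, 'apple' );
print "apple => $v\n";
$dbh->disconnect;
```

      Batching inserts inside a transaction (AutoCommit => 0, then commit) is what keeps the load fast; committing each row individually would be much slower.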