magarwal has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I am reading a 2GB file into a Perl hash and then dumping the data to the screen. It appears to be using much more than 2GB of memory.
Is this expected behaviour, and how much memory should this file ideally take?

How can I best estimate memory usage when writing Perl code?

Please let me know your inputs.

Thanks,
Manu

Replies are listed 'Best First'.
Re: Memory usage by perl application
by Corion (Patriarch) on Dec 22, 2010 at 15:30 UTC

    See illguts. Basically, every scalar value (SV in internal-speak) in Perl takes up 16 bytes plus whatever payload (think "string length") is stored in the value. Hash keys are also refcounted and may themselves be SVs. So, if you store very many relatively small items in your hash, as keys and values, your memory needs might be up to 16 or 32 times the size of the input file, at least if the keys are all different.
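    If you would rather measure the footprint than guess at it, here is a rough sketch using the CPAN module Devel::Size (assuming it is installed on your system):

        use strict;
        use warnings;
        use Devel::Size qw(total_size);

        # 100_000 keys, each holding a one-character value
        my %h = map { $_ => "x" } 1 .. 100_000;

        printf "value payload alone: %d bytes\n", 100_000;          # one byte per value
        printf "total in memory:     %d bytes\n", total_size(\%h);  # walks the structure, counts SV and hash overhead too

    The gap between the two numbers is the per-SV and per-bucket overhead described above.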

    You can easily move your hash to disk by using DB_File or one of the other tied hash implementations (SDBM_File, GDBM_File). This means your hash access is slower, but you are limited only by disk space, not core memory.
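    A minimal sketch of such a tied hash with DB_File (the file name counts.db and the key are just examples):

        use strict;
        use warnings;
        use DB_File;
        use Fcntl;

        my %counts;
        # Lookups and stores now go to the Berkeley DB file on disk instead of core memory.
        tie %counts, 'DB_File', 'counts.db', O_RDWR|O_CREAT, 0644, $DB_HASH
            or die "Cannot tie hash to counts.db: $!";

        $counts{some_key} = 42;            # written to disk
        print $counts{some_key}, "\n";     # read back from disk

        untie %counts;                     # flush and close the database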

    Alternatively, maybe you can easily save memory by simply not reading the whole file into a hash, by changing to a different algorithm. But for that, we will need to see your data and code.

      Hi Corion,

      I am doing this just as an exercise to check memory usage by the system. Please find my code snippet below:
      open FILE, $input_file or die $!;
      my @data;
      while (<FILE>) {
          push @data, $_;
      }
      close FILE;
      The file I am reading is 2GB in size.
      In this scenario, I would expect the total memory usage to be around 2GB, or at most something like 4GB. My system shows about 3.8GB of memory usage.
      Does the filehandle FILE also store the full file in memory, in addition to my array holding the full file?

      Is this acceptable? Please let me know your inputs.

      Thanks,
      Manu

        In your first post, you talked about a hash. I see no hash in your code.

        See illguts, again. It talks about the underlying data structures and their memory needs.

        Depending on how large each line in $input_file is, Perl will, again, use up to 16 times the memory (based on the worst case where each line is one character long and carries an overhead of 16 bytes, disregarding the SvPV entry and the overhead of the array itself).

        This behaviour is acceptable to me. There are very few reasons to read a file completely into an array. If you really need to handle generic large data structures, most likely a database like Postgres or SQLite will suit your needs far better than storing the data in Perl structures will.
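        As a sketch of the line-by-line alternative (the file name below is just a stand-in for your $input_file):

            use strict;
            use warnings;

            my $input_file = 'big.log';    # stand-in for the real path

            open my $fh, '<', $input_file or die "Cannot open $input_file: $!";
            my $lines = 0;
            while (my $line = <$fh>) {
                # work on $line here; only one line lives in memory at a time,
                # so memory use stays flat no matter how large the file grows
                $lines++;
            }
            close $fh;

            print "processed $lines lines\n";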