in reply to memory use array vs ref to array
G'day dkhosla1,
Welcome to the Monastery.
Firstly, I was unable to repeat your exact tests because of issues with Memory::Usage. Had the author's tests been a little more stringent, this module would not install on many systems (including mine: Mac OS X). See "Bug #83323 for Memory-Usage: Mark certain OS as unsupported" (raised three and a half years ago) for more on this.
However, I was interested in what you reported, and so ran different tests using Devel::Size. I tested the array much like you:
my @data = <$fh>;
I tested the arrayref in two different ways:
my $data_ref; @$data_ref = <$fh>;
and
my $data_ref = [ <$fh> ];
In ~/local/dev/test_data, I have a series of files I use for volume testing. Each consists of records of exactly 100 bytes (99 'X' characters plus a newline). They range in size from 1,000 to 10,000,000,000 bytes. I used the following for testing (a thousand, a million and a billion bytes):
$ ls -lSr text_?_1
-rw-r--r--  1 ken  staff        1000  8 Feb  2013 text_K_1
-rw-r--r--  1 ken  staff     1000000  8 Feb  2013 text_M_1
-rw-r--r--  1 ken  staff  1000000000  8 Feb  2013 text_G_1
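For anyone wanting to reproduce this, files with the same layout can be generated with a few lines of core Perl. This is only a sketch: the helper name `make_test_file` is hypothetical, and it is not the script that produced the files listed above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use autodie;

# Hypothetical helper (not from the original post): write $bytes bytes
# of 100-byte records (99 'X' characters plus a newline) to $path.
sub make_test_file {
    my ($path, $bytes) = @_;
    die "Size must be a multiple of 100\n" if $bytes % 100;

    my $record = ('X' x 99) . "\n";
    open my $fh, '>', $path;
    print {$fh} $record for 1 .. $bytes / 100;
    close $fh;

    return $path;
}

# Usage: make_test_file.pl <path> <bytes>
make_test_file(@ARGV) if @ARGV == 2;
```

Run with, say, `text_M_1 1000000` as arguments to get a one-million-byte file; the billion-byte version takes noticeably longer to write.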
Here's the test code:
#!/usr/bin/env perl -l

use strict;
use warnings;
use autodie qw{:all};

use Devel::Size qw{size total_size};

{
    open my $fh, '<', $ARGV[0];
    my @data = <$fh>;
    print 'size(\@data): ', size(\@data);
    print 'total_size(\@data): ', total_size(\@data);
}

{
    open my $fh, '<', $ARGV[0];
    my $data_ref;
    @$data_ref = <$fh>;
    print 'size($data_ref): ', size($data_ref);
    print 'total_size($data_ref): ', total_size($data_ref);
}

{
    open my $fh, '<', $ARGV[0];
    my $data_ref = [ <$fh> ];
    print 'size($data_ref): ', size($data_ref);
    print 'total_size($data_ref): ', total_size($data_ref);
}
Here are the test results:
$ pm_1171361_mem_use_array.pl ~/local/dev/test_data/text_K_1
size(\@data): 144
total_size(\@data): 1494
size($data_ref): 144
total_size($data_ref): 1494
size($data_ref): 144
total_size($data_ref): 1494

$ pm_1171361_mem_use_array.pl ~/local/dev/test_data/text_M_1
size(\@data): 80064
total_size(\@data): 1420366
size($data_ref): 80064
total_size($data_ref): 1420366
size($data_ref): 80064
total_size($data_ref): 1420366

$ pm_1171361_mem_use_array.pl ~/local/dev/test_data/text_G_1
size(\@data): 80000064
total_size(\@data): 1420322314
size($data_ref): 80000064
total_size($data_ref): 1420322314
size($data_ref): 80000064
total_size($data_ref): 1420322314
As you can see, the sizes of the variables are identical regardless of whether arrays or arrayrefs were used.
Although the variables are only a little over 40% larger than the raw data (1,420,322,314 bytes for the 1 GB file), this doesn't take into account the memory used by the entire process (which is what you were measuring). The 1 kB and 1 MB tests finished almost instantaneously; the 1 GB tests took about 8 seconds each (measured very roughly by counting in my head) and total available system memory (determined very roughly by inspection) dropped from ~3.5 GB to ~0.5 GB for each run. Although a little smaller, this does appear to be at least of the same order of magnitude as you report.
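Devel::Size measures a data structure, not the process. One rough way to watch the process footprint without Memory::Usage is to ask `ps` for the resident set size before and after loading the data. The sketch below assumes a `ps` that accepts `-o rss=` and `-p` (which should be the case on Mac OS X and Linux, but treat that as an assumption); `rss_kb` is a hypothetical helper, and the in-memory records merely stand in for reading a file.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: resident set size of this process, in kB,
# as reported by ps. Returns undef if ps gives no usable output.
sub rss_kb {
    my $pid = $$;
    my $out = qx{ps -o rss= -p $pid};
    return undef unless defined $out;
    my ($kb) = $out =~ /(\d+)/;
    return $kb;
}

my $before = rss_kb();

# Stand-in for "my @data = <$fh>": 100,000 records of 100 bytes each.
my @data = map { ('X' x 99) . "\n" } 1 .. 100_000;

my $after = rss_kb();

printf "records: %d\n", scalar @data;
printf "RSS before: %s kB\n", $before // 'n/a';
printf "RSS after:  %s kB\n", $after  // 'n/a';
```

The difference between the two figures includes allocator overhead and anything else the process allocated in between, so it will normally be larger than what `total_size` reports for the array alone.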
I suggest you take my test code, run it with your "bigfile", and see what results you get. I recommend that you run it at least a few times to check that you're getting consistent results.
— Ken