Re^4: Create dynamically different array names based on counter incrementation

Re^5: Create dynamically different array names based on counter incrementation by BrowserUk (Patriarch) on Jan 21, 2011 at 15:34 UTC
Actually, it requires very close to twice as much memory. Try it for yourself:

    ## Create a test datafile of ~ 4MB
    perl -E"say 'x 'x10 for 1 .. 2e5" > junk.dat

    ## Then load it into an array of arrays using a while loop
    ## and check the memory consumed using the Task Manager or top
    perl -E"$n=0;$a[$n++]=[split] while $_=<>; <STDIN>" junk.dat
    ## On my system the process has used 214.8 MB

    ## Now do the same thing using map
    perl -E"@a=map[split],<>; <STDIN>" junk.dat
    ## On my system this process has used 345.1 MB.

With map, first a list of all the lines in the file is constructed; from that, a second (output) list of all the references to small arrays is constructed; finally, that list is assigned to the array.

Whilst the final AoA will consume the same amount of memory in both cases, the memory consumed by the intermediate lists considerably increases the overall memory required to construct the final array. And depending upon your OS, the time taken to build it can be considerably longer using map.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
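The difference between the two one-liners above can be sketched in a strict-safe script. This is only an illustration of the two construction paths, not a memory benchmark: the in-memory string `$data` is a hypothetical stand-in for junk.dat, and the names `@loop` and `@mapped` are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for junk.dat: 5 lines of 'x ' repeated 10 times.
my $data = join '', map { ( 'x ' x 10 ) . "\n" } 1 .. 5;

# Line-at-a-time: apart from @loop itself, only the current line is held.
open my $fh, '<', \$data or die "open: $!";
my @loop;
while ( my $line = <$fh> ) {
    push @loop, [ split ' ', $line ];
}
close $fh;

# map: <$fh> in list context reads every line first, map then builds a
# complete list of array refs, and only then is that list assigned.
open $fh, '<', \$data or die "open: $!";
my @mapped = map { [ split ' ', $_ ] } <$fh>;
close $fh;

printf "loop rows: %d, map rows: %d, fields per row: %d\n",
    scalar @loop, scalar @mapped, scalar @{ $loop[0] };
```

Both paths end with the same array of arrays; what differs is how much temporary list data exists while it is being built, which is what the Task Manager or top figures above reflect.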
Re^5: Create dynamically different array names based on counter incrementation by Anonyrnous Monk (Hermit) on Jan 21, 2011 at 15:44 UTC
Sorry, I should've been more precise. Perl constructs an intermediate list of arrays, which is held on Perl's stack (in its entirety) before it is assigned to the final array. This temporarily consumes a lot of memory, which is not returned to the OS (at least not with typical Unix builds of Perl). Of course, the memory is returned to Perl's own memory pool, so it may be reused later. But the peak memory usage of the process increases.

As soon as references are involved, the additional temporary data only involves the first-level elements, i.e. the array references in this case. The referenced data isn't being duplicated, of course. Still, if the ratio of the references to the payload data is bad (i.e. second-level arrays with only a few non-complex items), there can still be considerable overhead.

Just for comparison, for anyone interested:

    sub mem {
        print "$_[0]:\n";
        system "/bin/ps", "-osize,vsize", $$;
    }

    my @a;
    my %tests = (
        # immediate data - no references
        iter_flat => sub { my $n = shift; push @a, $_*42 for 1..$n; },
        func_flat => sub { my $n = shift; @a = map $_*42, 1..$n; },
        # indirect/referenced data
        iter_ref  => sub { my $n = shift; push @a, [ $_*42 ] for 1..$n; },
        func_ref  => sub { my $n = shift; @a = map [ $_*42 ], 1..$n; }
    );

    my $what = shift @ARGV;
    my $n = shift @ARGV || 10_000_000;

    mem("before");
    $tests{$what}->($n);
    mem("after");

    $ ./883539.pl iter_flat
    before:
         SZ     VSZ
        608   22032
    after:
         SZ     VSZ
     395196  416620

    $ ./883539.pl func_flat
    before:
         SZ     VSZ
        608   22032
    after:
         SZ     VSZ
    1390892 1412316    # map needs 3.5 times as much

    $ ./883539.pl iter_ref
    before:
         SZ     VSZ
        608   22032
    after:
         SZ     VSZ
    1547832 1569256

    $ ./883539.pl func_ref
    before:
         SZ     VSZ
        608   22032
    after:
         SZ     VSZ
    2571632 2593056    # map needs 1.7 times as much
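The point that only the first-level references are copied, not the arrays they point to, can be checked with a small sketch. The names `@rows` and `@copy` are made up for illustration, and a plain list assignment stands in here for the copy from the temporary list into the final array.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Build three small second-level arrays via map.
my @rows = map { [ $_, $_ * 42 ] } 1 .. 3;

# List assignment copies the top-level elements only, i.e. the references.
my @copy = @rows;

# Both top-level arrays point at the very same second-level array ...
print "same array: ",
    ( refaddr( $rows[0] ) == refaddr( $copy[0] ) ? "yes" : "no" ), "\n";

# ... so a change made through one is visible through the other.
$copy[0][1] = 'changed';
print "rows[0][1] is now: $rows[0][1]\n";
```

So the temporary overhead is one extra scalar (the reference) per row, which matters most when each row carries very little payload.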