in reply to using map and anonymous subroutines

A little note on efficiency.

First create a file (win32):

perl -wle"for ('00001'..'10000') { print join ' ', ($_) x 20; }" > temp.tmp

Then get @temp in various ways and look at peak memory consumption.

open my $fh, 'temp.tmp' or die $!;

# Alternative 1: 40 984 K
my @temp = map { "$_\n" } map { split ' ' } <$fh>;

# Alternative 2: 29 076 K
my @temp = map { split ' ' } <$fh>;
$_ .= "\n" for @temp;

# Alternative 3: 14 816 K
local $_;
my @temp;
push @temp => map "$_\n", split ' ' while <$fh>;

Alternative 2 is less memory hungry (as well as faster) because Perl doesn't have to build up an extra list. Alternative 3 is even less memory hungry because it doesn't slurp the file before building up @temp. Of course, there could've been a fourth alternative using alternative 2's for loop, but I think the point is made already.
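For illustration, here is one guess at what that fourth alternative might look like: read line by line as in alternative 3, then append the newlines in place with alternative 2's for loop. The helper name read_fields and the in-memory stand-in for temp.tmp are assumptions made for this sketch, and no memory figure was measured for it.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): collect the
# whitespace-separated fields line by line, then append "\n" to each
# element in place, so no second list of modified copies is built.
sub read_fields {
    my ($fh) = @_;
    my @temp;
    while (my $line = <$fh>) {
        push @temp, split ' ', $line;   # one line at a time, no slurp
    }
    $_ .= "\n" for @temp;               # modify the elements in place
    return \@temp;
}

# Small in-memory stand-in for temp.tmp (an assumption for this sketch).
my $data = "00001 00001\n00002 00002\n";
open my $fh, '<', \$data or die $!;
my $temp = read_fields($fh);
close $fh;

print scalar(@$temp), " elements\n";
```

To try it against the real file, open temp.tmp instead of the in-memory string and watch peak memory the same way as for the other alternatives.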

(I use ActivePerl v5.8.0 built for MSWin32-x86-multi-thread, build 806.)

Hope this helps,
ihb

Re: Re: using map and anonymous subroutines
by flyingmoose (Priest) on Apr 20, 2004 at 22:04 UTC
    We all have to have uses for our overly powerful Pentium toaster units, don't we? :)

    No, seriously, thanks for the feedback. I would have hoped that case 1 and case 2 were essentially equivalent, since map is going to create an array either way. Are we saying that assigning to an array is slower than a push, or are we instead saying that the blocks { } are the slowdown? Either way, it seems we could get more granularity out of this test and learn more about the internals.
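    One way to get that granularity, at least for speed, might be the core Benchmark module. The sketch below is only an assumption about how such a test could be set up: the sub names are made up, the lines are held in an array rather than read from temp.tmp, and it measures time only, not memory.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# In-memory stand-in for the test file: 1000 lines of 20 fields each.
my @lines = map { join(' ', ($_) x 20) . "\n" } '00001' .. '01000';

# Two chained maps, written with blocks (as in alternative 1).
sub map_block {
    my @t = map { "$_\n" } map { split ' ', $_ } @lines;
    return \@t;
}

# The same two maps, written in expression form instead of blocks.
sub map_expr {
    my @t = map("$_\n", map(split(' ', $_), @lines));
    return \@t;
}

# Push per line (as in alternative 3, minus the file read).
sub push_loop {
    my @t;
    push @t, map("$_\n", split(' ', $_)) for @lines;
    return \@t;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    map_block => \&map_block,
    map_expr  => \&map_expr,
    push_loop => \&push_loop,
});
```

    Comparing map_block with map_expr isolates the effect of the blocks, and comparing either with push_loop isolates list assignment versus push.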

    Anyhow, how much all this matters depends on the size of the input. Usually memory is cheap and in surplus, so you just want speed ... yet not always...

    As a last note, the bestest fastest is not always the best way, given that readability and maintainability are important. You might have to add blocks later, or maybe blocks add clarity, and the push definitely appears hackish, if not indicative of a bug somewhere. I wouldn't leave that final push in production code.

Re: Re: using map and anonymous subroutines
by TomDLux (Vicar) on Apr 21, 2004 at 03:32 UTC

    This feels very wrong to me.

    When you have data in an array, generally there should be no "\n" in the data.

    Once you are ready to output your data to the screen or to a file, then you might introduce newlines, either by using join, by temporarily setting $", or by other similar means.
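    As a minimal sketch of those two approaches (the sample data and variable names are made up for illustration), the array stays free of newlines and they are added only when the output string is built:

```perl
use strict;
use warnings;

# Sample data, kept free of newlines while it lives in the array.
my @temp = qw(00001 00002 00003);

# Option 1: join the elements with newlines at output time.
my $joined = join("\n", @temp) . "\n";

# Option 2: temporarily set $" (the separator used when an array is
# interpolated into a double-quoted string); local restores the old
# value when the do block ends.
my $interpolated = do { local $" = "\n"; "@temp\n" };

print $joined;
print $interpolated;
```

    Both strings end up identical, and @temp itself is never modified.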

    --
    TTTATCGGTCGTTATATAGATGTTTGCA