
Page 18 of the "Camel" book has the following example (my question is at the bottom):

#!/usr/bin/perl

open(GRADES, "grades") or die "Can't open grades: $!\n";
while ($line = <GRADES>) {
    ($student, $grade) = split(" ", $line);
    $grades{$student} .= $grade . " ";
}

foreach $student (sort keys %grades) {
    $scores = 0;
    $total = 0;
    @grades = split(" ", $grades{$student});
    foreach $grade (@grades) {
        $total += $grade;
        $scores++;
    }
    $average = $total / $scores;
    print "$student: $grades{$student}\tAverage: $average\n";
}

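I benchmarked this example against a variant that differs only in the inner loop: instead of assigning the split result to @grades and looping over that array, the foreach iterates directly over the list returned by split, as shown in the following code: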

#!/usr/bin/perl

open(GRADES, "grades") or die "Can't open grades: $!\n";
while ($line = <GRADES>) {
    ($student, $grade) = split(" ", $line);
    $grades{$student} .= $grade . " ";
}

foreach $student (sort keys %grades) {
    $scores = 0;
    $total = 0;
    foreach $grade (split(" ", $grades{$student})) {
        $total += $grade;
        $scores++;
    }
    $average = $total / $scores;
    print "$student: $grades{$student}\tAverage: $average\n";
}

As you can see, the only change is dropping the @grades assignment and looping over the split result directly. Below is the benchmark I ran (100,000 iterations) on a file containing 9 grade entries. Correct me if I am wrong, but I assume the 7% speedup comes from avoiding the array assignment. However, I watched the memory usage with top -d1 (btw: is there a module to measure memory usage?), and the second routine uses 8-20K more memory (1508-1524K for the first vs. 1528-1532K for the second).

Benchmark: timing 100000 iterations of from_book, from_book2...
 from_book: 60 wallclock secs (41.60 usr +  4.34 sys = 45.94 CPU) @ 2176.75/s (n=100000)
from_book2: 55 wallclock secs (39.09 usr +  3.81 sys = 42.90 CPU) @ 2331.00/s (n=100000)
             Rate  from_book from_book2
from_book  2177/s         --        -7%
from_book2 2331/s         7%         --
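A comparison like the one above can be set up with the core Benchmark module's cmpthese function. The sketch below is an assumption about the harness, not the script actually used: the labels from_book and from_book2 are taken from the output, the %grades data is a made-up in-memory stand-in for the grades file, and file reading and printing are left out, so the absolute rates will not match the figures above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample data standing in for the parsed "grades" file.
my %grades = (
    alice => "90 85 77 ",
    bob   => "60 70 80 ",
);

# Variant 1: copy the split result into a named array, then loop over it.
sub from_book {
    my %average;
    foreach my $student (sort keys %grades) {
        my ($scores, $total) = (0, 0);
        my @grades = split(" ", $grades{$student});
        foreach my $grade (@grades) {
            $total += $grade;
            $scores++;
        }
        $average{$student} = $total / $scores;
    }
    return \%average;
}

# Variant 2: loop directly over the list returned by split.
sub from_book2 {
    my %average;
    foreach my $student (sort keys %grades) {
        my ($scores, $total) = (0, 0);
        foreach my $grade (split(" ", $grades{$student})) {
            $total += $grade;
            $scores++;
        }
        $average{$student} = $total / $scores;
    }
    return \%average;
}

# A negative count asks Benchmark to run each variant for at least
# that many CPU seconds instead of a fixed number of iterations.
cmpthese(-1, {
    from_book  => \&from_book,
    from_book2 => \&from_book2,
});
```

Returning the averages instead of printing them keeps terminal output out of the timing, which is a deliberate departure from the book's example.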

So I copied the grades flat file 1000 times into a grades.big flat file; it contains the same 9 entries used above, 1000 times each. Running 1000 iterations against the big file (100,000 was taking too long) gives the following benchmark. The memory difference here is about 36K (1856K vs. 1892K).

Benchmark: timing 1000 iterations of from_book, from_book2...
 from_book: 278 wallclock secs (211.20 usr +  0.68 sys = 211.88 CPU) @  4.72/s (n=1000)
from_book2: 261 wallclock secs (200.35 usr +  0.56 sys = 200.91 CPU) @  4.98/s (n=1000)
             Rate  from_book from_book2
from_book  4.72/s         --        -5%
from_book2 4.98/s         5%         --
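For anyone who wants to repeat the larger run, a grades.big file like the one described above could be produced with something along these lines. This is only a sketch: replicate_file is a hypothetical helper name, and it assumes the input file is small enough to read into memory in one go.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: write $times copies of $src's contents into $dst.
# Returns the number of lines written.
sub replicate_file {
    my ($src, $dst, $times) = @_;

    open(my $in, '<', $src) or die "Can't open $src: $!\n";
    my @lines = <$in>;
    close($in);

    open(my $out, '>', $dst) or die "Can't write $dst: $!\n";
    print {$out} @lines for 1 .. $times;
    close($out) or die "Can't close $dst: $!\n";

    return scalar(@lines) * $times;
}

# Build grades.big only if a grades file exists in the current directory.
replicate_file("grades", "grades.big", 1000) if -e "grades";
```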

The question I would like to ask the PerlMonks is: why is memory usage higher when the split is done implicitly in the foreach loop?

Thank you much,
Casey

Edit Masem 2002-01-23 - Changed title from "Page 18 of the Camel"

In reply to Inefficient code in Camel book? by cmilfo
