On page 18 of the "Camel" book is the following example (Question at bottom):
#!/usr/bin/perl

open(GRADES, "grades") or die "Can't open grades: $!\n";
while ($line = <GRADES>) {
    ($student, $grade) = split(" ", $line);
    $grades{$student} .= $grade . " ";
}

foreach $student (sort keys %grades) {
    $scores = 0;
    $total = 0;
    @grades = split(" ", $grades{$student});
    foreach $grade (@grades) {
        $total += $grade;
        $scores++;
    }
    $average = $total / $scores;
    print "$student: $grades{$student}\tAverage: $average\n";
}

I benchmarked this example against a variant that differs only in the @grades assignment line and the foreach $grade line, as shown in the following code:
#!/usr/bin/perl

open(GRADES, "grades") or die "Can't open grades: $!\n";
while ($line = <GRADES>) {
    ($student, $grade) = split(" ", $line);
    $grades{$student} .= $grade . " ";
}

foreach $student (sort keys %grades) {
    $scores = 0;
    $total = 0;
    foreach $grade (split(" ", $grades{$student})) {
        $total += $grade;
        $scores++;
    }
    $average = $total / $scores;
    print "$student: $grades{$student}\tAverage: $average\n";
}

As you can see, the only change is that the @grades assignment is gone and the foreach loop iterates directly over the list returned by split. Below is the benchmark I performed (100,000 iterations) on a file containing 9 grade entries. Correct me if I am wrong, but I believe the 7% performance increase comes from avoiding the array assignment. However, I watched the memory usage via a top -d1 command (btw: is there a module to model memory usage?), and the second routine uses 8-20K more memory (1508-1524K vs. 1528-1532K).

Benchmark: timing 100000 iterations of from_book, from_book2...
 from_book: 60 wallclock secs (41.60 usr +  4.34 sys = 45.94 CPU) @ 2176.75/s (n=100000)
from_book2: 55 wallclock secs (39.09 usr +  3.81 sys = 42.90 CPU) @ 2331.00/s (n=100000)
             Rate  from_book from_book2
from_book  2177/s         --        -7%
from_book2 2331/s         7%         --
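
For reference, a comparison of this kind can be set up with the core Benchmark module and its cmpthese function. What follows is a minimal sketch, not the script that produced the numbers above. It makes several assumptions: the sample data is made up (the contents of the grades file are not shown), the lines are held in memory instead of being read from the file on every iteration, the subs build a report string instead of printing, and lexical variables replace the globals of the book example. Only the sub names from_book and from_book2 are taken from the benchmark output; load_grades is a hypothetical helper.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample data standing in for the 9-entry "grades" file.
my @lines = (
    "alice 90\n", "bob 72\n", "carol 85\n",
    "alice 81\n", "bob 64\n", "carol 93\n",
    "alice 77\n", "bob 88\n", "carol 70\n",
);

# Collect each student's grades into a space-separated string,
# as the while loop in the book example does.
sub load_grades {
    my %grades;
    for my $line (@_) {
        my ($student, $grade) = split(" ", $line);
        $grades{$student} .= $grade . " ";
    }
    return \%grades;
}

# Variant 1: copy the split result into an array, then loop over it.
sub from_book {
    my $grades = load_grades(@lines);
    my $report = "";
    for my $student (sort keys %$grades) {
        my ($scores, $total) = (0, 0);
        my @grades = split(" ", $grades->{$student});
        for my $grade (@grades) {
            $total += $grade;
            $scores++;
        }
        my $average = $total / $scores;
        $report .= "$student: $grades->{$student}\tAverage: $average\n";
    }
    return $report;
}

# Variant 2: loop directly over the list returned by split.
sub from_book2 {
    my $grades = load_grades(@lines);
    my $report = "";
    for my $student (sort keys %$grades) {
        my ($scores, $total) = (0, 0);
        for my $grade (split(" ", $grades->{$student})) {
            $total += $grade;
            $scores++;
        }
        my $average = $total / $scores;
        $report .= "$student: $grades->{$student}\tAverage: $average\n";
    }
    return $report;
}

# Run each variant a fixed number of times and print a comparison table.
cmpthese(20_000, {
    from_book  => \&from_book,
    from_book2 => \&from_book2,
});
```

Because the file I/O and the printing are left out, absolute rates from this sketch will not match the figures above; only the relative difference between the two loops is being compared.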


So, I copied the grades flat file 1000 times into a grades.big flat file, so it contains each of the same 9 entries used above 1000 times. Running 1000 iterations on the big file (100,000 was taking a long time) gives the following benchmark. The memory difference here is about 36K (1856K vs. 1892K).

Benchmark: timing 1000 iterations of from_book, from_book2...
 from_book: 278 wallclock secs (211.20 usr +  0.68 sys = 211.88 CPU) @  4.72/s (n=1000)
from_book2: 261 wallclock secs (200.35 usr +  0.56 sys = 200.91 CPU) @  4.98/s (n=1000)
             Rate  from_book from_book2
from_book  4.72/s         --        -5%
from_book2 4.98/s         5%         --
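
An enlarged input file like grades.big can be produced with a few lines of Perl. This is a minimal sketch under stated assumptions: the post does not say how the copy was actually made, and repeat_file is a hypothetical helper name introduced here for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Write $times back-to-back copies of the file $src into $dst.
# Returns the number of lines written.
sub repeat_file {
    my ($src, $dst, $times) = @_;

    open(my $in, '<', $src) or die "Can't open $src: $!\n";
    my @lines = <$in>;
    close($in);

    open(my $out, '>', $dst) or die "Can't open $dst: $!\n";
    print {$out} @lines for 1 .. $times;
    close($out) or die "Can't close $dst: $!\n";

    return scalar(@lines) * $times;
}

# Only runs when a "grades" file exists in the current directory.
repeat_file("grades", "grades.big", 1000) if -e "grades";
```

With a 9-line grades file, the call above would write 9000 lines to grades.big.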


The question I would like to ask the PerlMonks is: why is there more memory usage when the split is placed directly in the foreach loop?

Thank you much,
Casey

Edit Masem 2002-01-23 - Changed title from "Page 18 of the Camel"


In reply to Inefficient code in Camel book? by cmilfo
