wanna_code_perl has asked for the wisdom of the Perl Monks concerning the following question:

Here's a simplified version of a pattern I come across every so often that bugs me: you're reading from a stream of unknown length, you need to do "something" at every Nth iteration, and N is pretty much never an even divisor, so there are always leftovers, which makes for a clumsy re-use of the "something", like so:

use 5.014;
use warnings;

my ($sum, $count) = (0,0);
while (<DATA>) {
    $sum += $_;
    $count++;
    if ($count == 5) {
        printf "Sum: %4d, count: %2d, mean: %5.1f\n", $sum, $count, $sum/$count;
        $sum = 0;
        $count = 0;
    }
}
printf "Sum: %4d, count: %2d, mean: %5.1f\n", $sum, $count, $sum/$count;

__DATA__
61
23
30
444
368
438
467
44
812
430
992
469

This outputs:

Sum:  926, count:  5, mean: 185.2
Sum: 2191, count:  5, mean: 438.2
Sum: 1461, count:  2, mean: 730.5

Often, the "something" (the printf in my example, but could be a few statements in length) doesn't warrant putting it into a sub (and paying the function call overhead every Nth time through the main loop), but neither do I like having to copy/paste it just to catch the leftovers after the loop exit.

Is there a good way to refactor this so I don't have to repeat the printf, without sticking it in a sub?

Replies are listed 'Best First'.
Re: Refactor this to eliminate the duplicated printf?
by davido (Cardinal) on Jul 14, 2014 at 04:03 UTC

    Use eof to determine whether the next read will fail because of end of file. Here's an example. It is a demonstration rather than a refactoring of your code, but the technique adapts to your code in a straightforward way.

    while( my $line = <DATA> ) {
        print $line if $. % 5 == 0 || eof(\*DATA);
    }

    __DATA__
    line one
    line two
    line three
    line four
    line five
    line six
    line seven
    line eight
    line nine
    line ten
    line eleven

    Output:

    line five
    line ten
    line eleven

    Update: Here is a revision of your original code that handles end-of-file correctly without duplicating your printf statements:

    my ($sum, $count) = (0,0);
    while (<DATA>) {
        $sum += $_;
        $count++;
        if ($count == 5 || eof(\*DATA) ) {
            printf "Sum: %4d, count: %2d, mean: %5.1f\n", $sum, $count, $sum/$count;
            $sum = 0;
            $count = 0;
        }
    }

    __DATA__
    61
    23
    30
    444
    368
    438
    467
    44
    812
    430
    992
    469

    I provided this to show that a solution that eliminates code repetition needn't be complicated or illegible. I changed only two things: first, I removed the printf outside the loop; second, I added || eof(\*DATA) to your if conditional.
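The same eof test also works with a lexical filehandle instead of DATA. The following is a minimal sketch, not code from the thread: the helper name summarize and the in-memory filehandle are assumptions made so the example is self-contained, and the loop is wrapped in a sub only to make it easy to call. The report statement itself still appears exactly once, inline.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): summarize a stream in groups
# of $n lines, flushing the last partial group when eof($fh) reports that
# the next read would hit end of file.
sub summarize {
    my ($fh, $n) = @_;
    my ($sum, $count) = (0, 0);
    my @report;
    while (my $line = <$fh>) {
        chomp $line;
        $sum += $line;
        $count++;
        if ($count == $n || eof($fh)) {
            # The "something" lives in one place only.
            push @report, sprintf "Sum: %4d, count: %2d, mean: %5.1f",
                $sum, $count, $sum / $count;
            ($sum, $count) = (0, 0);
        }
    }
    return @report;
}

# An in-memory filehandle stands in for the real stream (an assumption).
my $data = join "\n", qw(61 23 30 444 368 438 467 44 812 430 992 469);
open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";
print "$_\n" for summarize($fh, 5);
close $fh;
```

Per perldoc -f eof, eof without an argument tests the last file read, while eof() with empty parentheses refers to the files read through <>; passing the handle explicitly, as here, avoids that ambiguity. On a pipe or socket, eof may have to wait for more input before it can answer, so the report for a full group could be delayed until the next line arrives.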

    Philosophically your goal is worthwhile. Today, perhaps it's just a printf. Tomorrow it could grow to a few lines. And the next day you might alter the format string and parameters of the printf. Suddenly you find yourself needing to make the change in two places instead of one, and possibly get one of those changes wrong. It just makes sense; if you can avoid repetition without making the code illegible or grossly inefficient, seek to do so.


    Dave

      Thank you davido. I had a chance to try this with the full code and, no surprise, it indeed does the trick! I'll be adding this little idiom to my bag of tricks.

      A minor nitpick: the control flow is not exactly the same in the OP and the revised version. They differ on empty input: the OP code still reaches the final printf, where $sum/$count divides by zero, whereas the revised version prints nothing.
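The edge cases can be checked directly. This is a small sketch under stated assumptions: group_sizes is a hypothetical helper, it uses in-memory filehandles in place of a real stream, and it records only the size of each group rather than printing sums. It runs the eof-based loop on empty input, on input whose length is an exact multiple of the group size, and on input with leftovers.

```perl
use strict;
use warnings;

# Hypothetical helper: return the size of each group the eof-based loop
# would report for the given input text, using groups of $n lines.
sub group_sizes {
    my ($text, $n) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    my $count = 0;
    my @sizes;
    while (my $line = <$fh>) {
        $count++;
        if ($count == $n || eof($fh)) {
            push @sizes, $count;    # the "something" would go here
            $count = 0;
        }
    }
    close $fh;
    # With the original layout, a trailing report at this point would
    # divide by zero whenever $count is 0, so it would need a guard.
    return @sizes;
}

for my $case (
    [ 'empty',          '' ],
    [ 'exact multiple', "1\n2\n3\n4\n" ],
    [ 'leftover',       "1\n2\n3\n4\n5\n" ],
) {
    my ($label, $text) = @$case;
    my @sizes = group_sizes($text, 2);
    printf "%-15s groups: %s\n", $label, @sizes ? "@sizes" : '(none)';
}
```

With the eof-based loop, empty input yields no groups at all. The exact-multiple case is worth noting too: in the original layout, $count is 0 after the loop whenever the input length is a multiple of the group size, so the trailing printf would divide by zero there as well.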

Re: Refactor this to eliminate the duplicated printf?
by wjw (Priest) on Jul 14, 2014 at 04:18 UTC

    I have not tried this, but I have run into the same thing. I took a look at List::MoreUtils and noticed natatime EXPR, LIST, which seems like it might do exactly what you want, assuming you read your __DATA__ into an array. You did not specify whether the use of modules was OK for you, so I just assumed it was.

    From the docs:

    Example:

    my @x = ('a' .. 'g');
    my $it = natatime 3, @x;
    while (my @vals = $it->()) {
        print "@vals\n";
    }

    This prints

    a b c
    d e f
    g
    Does that suit your requirements?

    ...the majority is always wrong, and always the last to know about it...

    Insanity: Doing the same thing over and over again and expecting different results...

    A solution is nothing more than a clearly stated problem...otherwise, the problem is not a problem, it is a fact

      Unfortunately, my input is streamed, and larger than available RAM. For certain smaller data sets, this would be a decent solution.
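For streamed input, the natatime idea can be approximated with a closure that reads lazily from the filehandle, so that at most N lines are held in memory at once. This is only a sketch: batch_reader is a hypothetical helper, not part of List::MoreUtils, and the in-memory filehandle is an assumption standing in for the real stream.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of List::MoreUtils): return an iterator
# that yields up to $n chomped lines per call, reading lazily from $fh.
# It returns an empty list once the stream is exhausted.
sub batch_reader {
    my ($fh, $n) = @_;
    return sub {
        my @batch;
        while (@batch < $n) {
            my $line = <$fh>;
            last unless defined $line;
            chomp $line;
            push @batch, $line;
        }
        return @batch;
    };
}

# An in-memory filehandle stands in for the real stream (an assumption).
my $data = join "\n", qw(61 23 30 444 368 438 467 44 812 430 992 469);
open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";

my $next = batch_reader($fh, 5);
while (my @vals = $next->()) {
    my $count = @vals;
    my $sum   = 0;
    $sum += $_ for @vals;
    printf "Sum: %4d, count: %2d, mean: %5.1f\n", $sum, $count, $sum / $count;
}
close $fh;
```

The report appears once and never sees an empty batch, so there is no division by zero. The trade-off is one closure call per batch, which may or may not matter given the concern about call overhead raised in the question.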

Re: Refactor this to eliminate the duplicated printf?
by redbull2012 (Sexton) on Jul 14, 2014 at 06:42 UTC

    Hello there,

    davido has already provided you the answer. Why not use it?

    use 5.014;
    use warnings;
    use strict;

    my ($sum, $count) = (0,0);
    while (<DATA>) {
        $sum += $_;
        $count++;
        if ($count == 5 || eof(\*DATA)) {
            printf "Sum: %4d, count: %2d, mean: %5.1f\n", $sum, $count, $sum/$count;
            $sum = 0;
            $count = 0;
        }
    }

    __DATA__
    61
    23
    30
    444
    368
    438
    467
    44
    812
    430
    992
    469
    123

    PS: I don't see u thank him so i thought you didn't understand his answer :))

      PS: I don't see u thank him so i thought you didn't understand his answer :))

      That, or I waited until I'd tried out his suggestion, so I could thank him properly. Welcome to PerlMonks. :-)
