mobiusinversion has asked for the wisdom of the Perl Monks concerning the following question:

Monks,

I recently read the article Good Perl code is the best form of evangelism from perlbuzz.com:

    http://perlbuzz.com/2008/03/good-perl-code-is-the-best-form-of-evangelism.html

The article regards the second event in the Microsoft Scriptnet 2008 Winter Games, namely:

"to read in a text file of comma-separated values of ice skating scores, drop the highest and lowest scores from each skater, and show the top three skaters by the average of their remaining scores"

For example, the first two lines from the actual file is:

    Janko Cajhen,84,80,61,81,71,62,76
    Ryan Calafato,59,93,93,80,67,73,95

Their average scores would be:

    Janko Cajhen => 74
    Ryan Calafato => 81.2

The article provides one solution, and also references Jan Dubois' (ActiveState) solution as exemplary perl. I agree that both solutions are cleaner than Microsoft's.

I would be grateful for the communities feedback on the following solution. Is it good Perl? Why or why not?
use strict; sub sum {my $x; $x += $_ for @_; $x } sub avg {sum(@_)/@_} undef $/; open(my $F, 'skaters.txt'); my @x = map{[split/,/]}grep{/\S/}split/\n+/,<$F>; for(@x){ @$_ = ($_->[0],avg(@{[sort{$a<=>$b}@{$_}[1..$#$_]]}[1..$#$_-2])) } @x = sort{$b->[1] <=> $a->[1]}@x; printf "%-5s %-20s %-20s\n",$_+1,@{$x[$_]} for(0..2);
I have posted skaters.txt to the web:

    skaters.txt

Note that the text file does not require any fancy escapes for its comma separated values as it might if the ice skaters names were given in the last name, first name format. Or in something even more interesting using double quoted escapes like:

"Wojciech, ""the man"" Czupta",89,53,96,81,63,65,85

which would require some fancier unrolling like:
our $csvs = qr/(?:"(?:[^"]|"")*"|[^",])+/; sub csv_file_to_AoA { map{[csv($_)]}grep{/\S/}split/\n/,file_contents($_[0]) } sub csv { my @x = $_[0] =~ /(${csvs})(?:,|$)/g; for(@x){ /^"(.*?)"$/s ? $_ = $1 : undef; $_ =~ s/""/"/g; } @x } sub file_contents { local $/ = undef; open(my$F,$_[0])||die"cant open : $_[0]\n$!\n"; my $x = <$F> }
Do you think Microsoft did that on purpose so as not to expose just how bad their solution would be if the data were in this format? Or perhaps how much better Perl would be than VB Script or Powershell at parsing such data? Ultimately, what would be the most minimalistic and or elegant way to solve the problem given such data? That of the Cookbook?

Dave

Replies are listed 'Best First'.
Re: winter games, perl, ice and bling
by kyle (Abbot) on Mar 26, 2008 at 21:48 UTC

    You've reimplemented List::Util::sum.

    You open without checking whether it succeeded.

    open my $input_fh, '<', 'skaters.txt' or die "Can't read skaters.txt: $!";

    I see no reason to undef $/ and then split on /\n+/

    my @skater_records = map { chomp; [ split /,/ ] } grep { /\S/ } <$input_fh>;

    I'm not going to unravel the the rest. The whole thing reads like an obfuscation or attempt at golfing. This is a simple problem, and the solution is not difficult, but you've made it hard to read. You may have found that entertaining (and I might have as well), but it.doesn't have much more to recommend it.

    Update with a solution.

    It's not super efficient, but it doesn't read every record into memory (it just retains the top three), so it can handle any number of records without exhausting RAM. Nothing too tricky, I don't think.

Re: winter games, perl, ice and bling
by perrin (Chancellor) on Mar 26, 2008 at 21:52 UTC
    People's tolerance for this sort of thing varies, but it looks very obfuscated to me. Some general advice:
    • Never set $/ without local-izing it inside of a block. It's very dangerous.
    • Get some whitespace in there! Even feeding your existing code to perltidy improves the readability because it adds whitespace.
    • Take it easy on $_ and @_ and $x. In most cases, a meaningful variable name is better. Unless your program is about geometry, it should probably never contain a $x.
    • Unpack those map/grep/split pileups. Use multiple lines. Write it differently if necessary to pull them apart. Doing too much on one line makes your code unreadable.
    • Explicit "return" statements make your intent clearer. Use them.
Re: winter games, perl, ice and bling
by mr_mischief (Monsignor) on Mar 26, 2008 at 22:13 UTC
    Since you're asking for opinions, instead of just giving a value judgment on your example as a whole I'll point out what I think about specific parts of your code. That way you can see not just the decision but the reasoning.

    use strict; # strict is good ++ # Where's warnings? -- sub sum {my $x; $x += $_ for @_; $x } # poorly formatted -- sub avg {sum(@_)/@_} # poorly formatted -- # probably should use scalar @_ instead of just @_ here undef $/; # no need to slurp to do an average for a line -- open(my $F, 'skaters.txt'); # lexical filehandle ++ # please use three-argument open -- # no error check on a system call -- my @x = map{[split/,/]}grep{/\S/}split/\n+/,<$F>; # very poorly formatted -- # uses split to achieve what while ( <$F> ) would have done without th +e above slurp-- # makes me think well of those who call Perl line noise # combines two loops and two searches on one line += 0 # (good display of power, but could have built up # the data structure over time more clearly) for(@x){ @$_ = ($_->[0],avg(@{[sort{$a<=>$b}@{$_}[1..$#$_]]}[1..$#$_-2])) # nearly incomprehensible -- # partly due to unnecessary terseness and formatting -- # partly due to painful twists of syntax -- # (The hash solution on perlbuzz was simpler and clearer.) } @x = sort{$b->[1] <=> $a->[1]}@x; # poor formatting again -- printf "%-5s %-20s %-20s\n",$_+1,@{$x[$_]} for(0..2); # poor formatting again --

    I'm sorry if the formatting comments bother you, but part of making a program easy to read and maintain is making it easy to distinguish the syntax tokens.

    Another is to use as clear of syntax as you can in the first place. You're using lots of Perl features and smashing them together quite closely. You're doing math against punctuation variables in places where there are clearer ways. You're even assembling punctuation variables from one another ($#$_).

    What you have is somewhat clever Perl code, but I wouldn't call it idiomatic, clear, or a preferred practice. Overall, I'd say it looks more like Perl than the Microsoft example cribbed from VB. I don't think it serves the purpose of evangelism through code, though, except maybe as a warning against Perl.

    With warnings, some more whitespace, three-argument open, and some adjustments to the syntax in your for loop, it'd be a decent example of TIMTOWTDI. Right now, I'd call it a study in Perl syntax and maybe use it as a code maintenance question on a quiz. I think it's not quite ready to represent Perl as a clean, clear example of a powerful yet approachable tool for quickly developing applications.

    If the point was to show simply the power of multiple ways to express the same process, you'd have a winner. Your example is certainly different. Since impressing people with the power and clarity of Perl with a program written in it is the cornerstone of the story with the original examples, I'm afraid I'd have to say your implementation might be counterproductive in that regard.

Re: winter games, perl, ice and bling
by mobiusinversion (Beadle) on Mar 27, 2008 at 00:01 UTC
    Thanks for the comments.

    I found the comments about shielding $/ to be interesting. Presumably, the code is made to be standalone, or else all of the solutions should have wrapped and closed all of their variables in namespaces and functions less the first few lines of the PerlBuzz solution open the (eek) dangers of global variables!
    use warnings; use strict; my %averages;
    On another note, I feel bad about including the obfuse looking bits near the top (I assure you they were not intentionally so), since the latter half seems to have gone unnoticed!

    That is, what about *challenging* parsing. Surely the people at scriptnet seem to have omitted any challening problems...

    Here's a question: are there any really challenging parsing problems out there?

      are there any really challenging parsing problems out there?

      Yes.

      Surely the people at scriptnet seem to have omitted any challening problems...

      In an earlier discussion, I think the monks agreed that the games would not be any challenge for professional programmers. The target audience seems to be students.

      Global variables are nowhere near as dangerous as modifying $/. If you used any module from this code, and it opened a file but didn't set $/, all hell would break loose. I'm telling you this from personal experience, not from theory.

        Then raise a bug or a patch against the module(s).

        Unless the modules explicitly documents that it uses whatever value you assign to $/ or $\, eg. Tie::File, then no module author has the right to either rely upon some default value, or modify them except for local use. That right belongs exclusively to that author of main::main.

        Even if you scope and localise modifications to these variables, if you also need to call a module's functions or methods within that scope, you will break them if they make assumptions.

        Module authors should never make assumptions about or permanently change the environment they will inherit. That privilege remains the exclusive preserve of the author of main.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: winter games, perl, ice and bling
by jdporter (Paladin) on Mar 27, 2008 at 03:29 UTC

    Just for fun...

    use List::Util qw( sum min max ); use strict; use warnings; @ARGV = ('skaters.txt') unless @ARGV; my %skater_avg = map { chomp; my( $skater, @scores ) = split /,/; { $skater => ( sum(@scores)-min(@scores)-max(@scores) ) / (@scores +-2) } } <>; print "$_: $skater_avg{$_}\n" for ( sort { $skater_avg{$b} <=> $skater_avg{$a} } keys %skater_avg )[0..2];

    What's wrong with my code?

    A word spoken in Mind will reach its own level, in the objective world, by its own weight
      I have a style note. I'd like to see a consistent placement of the opening curly bracket for an anonymous block as an argument to a function when the block is going to span multiple lines. You place the bracket for map on the next line, but on the same line for sort. Either would make sense despite personal tastes, but they should probably be the same.

      Not something wrong with your code, but perhaps worth mentioning is the magic <>. It is Perlish enough in a short code sample like this, but I did mark the OP down for not checking the return on open. Since Perl automatically emits an error message when it can't open a specified file using this idiom, the only trade-off for the extra power is not being able to emit your own custom error message. That's usually not worth much compared to processing all of your command-line arguments in order, but in the case of internationalization it might pose an issue.

        Thank you for your comments.

        they should probably be the same.

        I am of the "do what looks right in each individual situation" school of thought, which I believe is very consistent with the Perl philosophy.

        perhaps worth mentioning is the magic <>.

        Um... I think not. It's Perl 101.

        in the case of internationalization it might pose an issue.

        Really? Is perl's default error message library not sufficiently internationalized?

        A word spoken in Mind will reach its own level, in the objective world, by its own weight
Re: winter games, perl, ice and bling
by ysth (Canon) on Mar 27, 2008 at 03:32 UTC
    What is the reason for the grep{/\S/}?

    warnings is even more important than strict. Ditch the file slurp and split and just use list context <$F>.

    I don't think the people who ran the contest had any motives in particular.