Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Any suggestions as to how to simplify/condense this code snippet ?

my %Y; while (<>) { ... my ($y) = ... ; next unless $y; if(@ARGV) { foreach (@ARGV) { $Y{$y}++,next if $y eq $_; } } else { $Y{$y}++ } }

Replies are listed 'Best First'.
Re: Code review
by stevieb (Canon) on Mar 06, 2024 at 17:17 UTC
    Any suggestions as to how to simplify/condense this code snippet ?

    As opposed to condensing it, if I came across this code I'd rather it be expanded:

    my %Y; while (<>) { ... my ($y) = ...; next unless $y; if (@ARGV) { for (@ARGV) { $Y{$y}++; next if $y eq $_; } } else { $Y{$y}++; } }

    As far as changes, I agree with the implementation of more descriptive variable names. I'm also picky about unless for some weird, unconfirmed reason (probably because 'unless' isn't seen in other languages and/or the '!' stands out loudly), so I'd write that as next if ! $y, but I digress.

    One simplification that can condense things a bit is changing this:

    if (@ARGV) { for (@ARGV) { $Y{$y}++; next if $y eq $_; } }

    ... to this:

    for (@ARGV) { $Y{$y}++; next if $y eq $_; }

    In other words, unless you explicitly need to, there's no need to check whether the @ARGV variable is true before you attempt to iterate over it. The for()/foreach() implicitly does this for you.

Re: Code review
by ikegami (Patriarch) on Mar 06, 2024 at 16:15 UTC
    That indented line could be written as follows:
    ++$Y{$y} if grep { $y eq $_ } @ARGV;

    It does loose the short-circuiting aspect, but that's not likely to be relevant. But what if we take it a step further.

    my $re; if ( @ARGV ) { my $pat = join "|", map quotemeta, @ARGV; $re = qr/^(?:$pat)\z/; } my %Y; while ( <> ) { my ( $y ) = ...; ++$Y{ $y } if $y && ( !$re || $y =~ $re ); }

    Update: Hash simpler than regex.

    my $re; my %valid; ++$valid{ $_ } for @ARGV; my %Y; while ( <> ) { my ( $y ) = ...; ++$Y{ $y } if $y && ( !%valid || $valid{ $_ } ); }

      The first version does not allow for empty @ARGV; both versions do not allow for the possibility that @ARGV may have repeated elements.

      I think a short form that preserves all the current behaviour could look like this:

      ... next unless $y; $Y{$y} += @ARGV ? (grep $y eq $_, @ARGV) : 1;

      It would, though, be preferable to have some clue about what the code is intended to achieve: it is always possible that the behaviour on repeated elements is unintentional or irrelevant.

        The first version does not allow for empty @ARGV

        That's not true. If you remove the if, sure, but noone said to do that.

Re: Code review
by Marshall (Canon) on Mar 07, 2024 at 05:30 UTC
    This is some weird looking code and appears to contain at least one error. To have a clear, concise algorithm, I would go up one level in problem statement and start with the intent of the program in general words.

    my %Y; while (<>) { ... my ($y) = ... ; next unless $y; # Add the scalar number of elements in @ARGV to the hash # Always add at least 1 to the hash even if @ARGV contains # no elements # the value $y is meaningless and plays no role if(@ARGV) { # next is always executed regardless of $y eq $_ foreach (@ARGV) { $Y{$y}++,next if $y eq $_; } } else { $Y{$y}++ } }
    So, this single statement is possible:

    scalar @ARGV ? $Y{$y} += @ARGV : $Y{$y}++;
    1. The comma operator serves no purpose in the above code. Prefer ; instead.
    2. If you want the test to short circuit, then it should go first, not last!:
    foreach (@ARGV) { next if $y eq $_; # don't count any occurrences of # $y $Y{$y}++, }
    Or, this is a possibly of what was meant:

    scalar @ARGV ? $Y{$y} += (grep $y ne $_, @ARGV) : $Y{$y}++; Without some more English language explanation, I have no idea what the code is supposed to do.

      I would characterize the effect of the code as:

      For each input line, process it and extract an id. When run with no arguments, count in %Y the number of times each id appears; when run with a list of ids as arguments, count only how often the specified ids appear (with multiplicity).

      I suspect you have been misled by a misinterpretation of:

      foreach (@ARGV) { ++$Y{$y}, next if $y eq $_; }

      which is equivalent to:

      foreach (@ARGV) { if ($y eq $_) { ++$Y{$y}; next; } }

      (I assume the useless next is left over from cutting down the original code.)

      This is an idiom I actually use quite commonly for compactness when parsing arguments:

      while (@ARGV && $ARGV[0] =~ /^-/) { my $arg = shift @ARGV; last if $arg eq '--'; ($max = $arg || shift(@ARGV)), next if $arg =~ s{^-x}{}; $inject = 1, next if $arg eq '-i'; ++$debug, next if $arg eq '-d'; die "Unknown option '$arg'\n"; }
Re: Code review
by choroba (Cardinal) on Mar 06, 2024 at 16:25 UTC
    Code review: Use meaningful variable names.

    Update: The following is wrong. Ignore it.

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
      $Y{$y} += (grep $y eq $_, @ARGV) || 1;
      In the OP, it looks like if @ARGV is true but no $y matches $_ then $Y{$y} should not be incremented. But the above code will increment it.
        You're right, my bad.

        map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re: Code review
by cavac (Prior) on Mar 07, 2024 at 08:54 UTC
Re: Code review
by BillKSmith (Monsignor) on Mar 07, 2024 at 19:46 UTC
    I propose replacing the three lines of the OP (shown in sub OP) with the three lines (shown in sub TEST). The test code around them demonstrates that they produce the same result in each of the three cases of interest. The new code is about the same length and probably less efficient, but I find it much easier to understand.
    use strict; use warnings; use Test::More; my $y = 5; my %Y; my @cases = ( [[], 'no @ARGs'], # 1 [[1,2,3], 'no matches'], # undef [[5,2,5], '2 matches'], # 2 ); plan tests=>3; foreach my $case (@cases) { @ARGV = @{$case->[0]}; %Y = (); my $Y_OP = OP(); @ARGV = @{$case->[0]}; %Y = (); my $Y_TEST = TEST(); is($Y_TEST, $Y_OP, $case->[1]); } sub OP { if(@ARGV) { foreach (@ARGV) { $Y{$y}++,next if $y eq $_; } } else { $Y{$y}++ } return $Y{$y}; } sub TEST{ my $number_of_matches = grep { $_ eq $y } @ARGV; $number_of_matches = undef if $number_of_matches == 0; $Y{$y} = (@ARGV != 0) ? $number_of_matches : 1; return $Y{$y}; }

    OUTPUT:

    1..3 ok 1 - no @ARGs ok 2 - no matches ok 3 - 2 matches
    Bill
Re: Code review
by perlfan (Parson) on Mar 08, 2024 at 20:01 UTC
    Superficially, I'd kick this back in code review because the else is "cuddled" and you're not using a LABEL: with the next. You're also don't have use strict; use warnings;. More deeply, you don't need the if and the $Y{$y}++; looks to happen regardless.

    Untested reduction, removed implicit $_ for clarity:

    use strict; use warnings; my %Y = (); MAINLOOP: while (<>) { ... my ($y) = ... ; next MAINLOOP unless $y; $Y{$y}++; INNERLOOP: foreach my $arg (@ARGV) { next INNERLOOP if $y eq $arg; } }