eyepopslikeamosquito has asked for the wisdom of the Perl Monks concerning the following question:

At work we have a simple rule for all our Perl code: it must pass our automated in-house static analysis check -- which includes a Perl::Critic check. No ifs, no buts, it must pass before you are allowed to check it in.

To allow for special cases, and Perl::Critic bugs, we do allow you to switch off specific Perl::Critic warnings, but only in the smallest scope possible.

Though this simple blanket rule has worked very well IMHO over the past few years, today a workmate and good friend of mine, Misha, was agitated because his lovingly-crafted function was rejected by our automated Perl::Critic check.

After I showed him how to silence the Perl::Critic check (as shown below) and get his code checked in, I wondered what the Perl Monks think of his coding style. Do you -- like Perl::Critic -- object to Misha's modification of the list items in the grep block below? How would you write the trim_nonempties function below?

use strict; use warnings; use Data::Dumper; # Returns non-empty strings with leading and trailing spaces removed. sub trim_nonempties { ## no critic (ProhibitMutatingListFunctions) grep { !/^\s*$/ && s/^\s*|\s*$//g } @_; ## use critic } my @x = ( ' hello ', "\thello again\t", '', " ", " \t", 'jock', "\t ab +c", "def \t " ); print Dumper(\@x); my @y = trim_nonempties(@x); print Dumper(\@y);
To clarify the intent of this little function, the output produced by running the above program is shown below:
$VAR1 = [ ' hello ', ' hello again ', '', ' ', ' ', 'jock', ' abc', 'def ' ]; $VAR1 = [ 'hello', 'hello again', 'jock', 'abc', 'def' ];
As you can see, it has trimmed leading and trailing whitepace from non-empty items in the list, while removing any white-space-only items.

Replies are listed 'Best First'.
Re: Code review: function to trim whitespace from all items in a list
by choroba (Cardinal) on Nov 21, 2014 at 09:42 UTC
    Are the lists really that huge that running
    use Syntax::Construct qw{ /r }; # <-- updated #... grep /./, map s/^\s+|\s+$//gr, @_
    would be much slower? I don't like grep modifying a list, it's like a map in a void context :-)
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: Code review: function to trim whitespace from all items in a list
by McA (Priest) on Nov 21, 2014 at 09:41 UTC

    Hi,

    to emphasis the two aspects of the function you could do a grep (removing totally empty list elements) with a map (removing leading and trailing whitespace).

    UPDATE: Taste is hardly debatable, but performance often not:

    use strict; use warnings; use Data::Dumper; use Benchmark qw(cmpthese); # Returns non-empty strings with leading and trailing spaces removed. sub mytrim_nonempties { grep { $_ ne '' } map { s/^\s+//; s/\s+$//; $_ } @_; } sub trim_nonempties { ## no critic (ProhibitMutatingListFunctions) grep { !/^\s*$/ && s/^\s*|\s*$//g } @_; ## use critic } my @x = ( ' hello ', "\thello again\t", '', " ", " \t", 'jock', "\t ab +c", "def \t " ); cmpthese(1000000, { 'myfunc' => sub { my @y = mytrim_nonempties(@x); }, 'yourfunc' => sub { my @y = trim_nonempties(@x); }, });

    McA

Re: Code review: function to trim whitespace from all items in a list (side effect)
by LanX (Saint) on Nov 21, 2014 at 09:49 UTC
    The name of the rule (ProhibitMutatingListFunctions) says it already, and your use of this trim function even hides the fact that @_ is "mutated".

    That means that @x will be changed after the call trim_nonempties(@x); creating a potentially buggy and hard to track side effect.

    Either copy the content of @_ to a @temp array or copy the content of alias $_ within grep to a $temp scalar.

    Though not sure if Critic will see that the problem was solved with the latter approach.

    update

    Interestingly the two aforementioned suggestions from McA and Choroba should still mutate and alarm Critic!

    Please note that @x and @y aren't identical after the call, cause @x preserves empty elements.

    A dual use function... Very perlish ;)

    Cheers Rolf

    (addicted to the Perl Programming Language and ☆☆☆☆ :)

      My solution doesn't change the list, as it uses the /r modifier.
      #!/usr/bin/perl use warnings; use strict; use Test::More; sub trim_nonempties { ## no critic (ProhibitMutatingListFunctions) grep { !/^\s*$/ && s/^\s*|\s*$//g } @_; ## use critic } sub choroba { grep !/^$/, map s/^\s+|\s+$//gr, @_ } my @X = ( ' hello ', "\thello again\t", '', " ", " \t", 'jock', "\t ab +c", "def \t " ); my @changed = @X; my @y = trim_nonempties(@changed); my @stay = @X; my @w = choroba(@stay); is_deeply(\@y, \@w, 'same'); is_deeply(\@X, \@stay, 'no mutation'); done_testing();
      لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

        What about undef? It can be even faster:

        use 5.16.2; #use warnings; use Test::More; use Benchmark qw( cmpthese ); # Returns non-empty strings with leading and trailing spaces removed. sub mca # McA { grep { $_ ne "" } map { s/^\s+//; s/\s+$//; $_ } @_; } sub tux # Tux { map { defined $_ ? m/(\S(?:.*\S)?)/ : () } @_; } sub choroba # choroba { grep !m/^$/ => map s/^\s+|\s+$//gr => @_ } sub eplam # eyepopslikeamosquito { grep { !m/^\s*$/ && s/^\s*|\s*$//g } @_; } my @x = (" hello ", "\thello again\t", "", " ", undef, " \t", "jock", "\t abc", "def \t "); my @e = ("hello", "hello again", "jock", "abc", "def"); is_deeply ([ mca (@x) ], \@e, "mca"); is_deeply ([ tux (@x) ], \@e, "tux"); is_deeply ([ choroba (@x) ], \@e, "choroba"); is_deeply ([ eplam (@x) ], \@e, "eplam"); cmpthese (1000000, { "mca" => sub { my @y = mca (@x); }, "tux" => sub { my @y = tux (@x); }, "choroba" => sub { my @y = choroba (@x); }, "eplam" => sub { my @y = eplam (@x); }, }); done_testing; __END__ ok 1 - mca ok 2 - tux ok 3 - choroba ok 4 - eplam Rate eplam choroba mca tux eplam 100908/s -- -40% -57% -66% choroba 169205/s 68% -- -29% -43% mca 236967/s 135% 40% -- -20% tux 297619/s 195% 76% 26% -- 1..4

        Enjoy, Have FUN! H.Merijn
        True my deepest apologies° , I oversaw the /r !

        Partially also bc I haven't got used to this new feature yet. :)

        >= 5.18 ?

        Cheers Rolf

        (addicted to the Perl Programming Language and ☆☆☆☆ :)

        °) Kowtow ;)

Re: Code review: function to trim whitespace from all items in a list
by Eily (Monsignor) on Nov 21, 2014 at 10:29 UTC

    The first problem with this code is indeed the fact that grep's purpose is selection, not modification, so it does not help readability. But the problem mentionned by LanX is the worst I guess, that @x is modified under the hood, without it being explicit in the function description is a serious issue.

    If modifying @x is OK, then there's no need for another array and s/^\s*|\s*$/g for @x would avoid the copy of the array.

    Or, since map can actually do both modification and selection (it can yield a list of one, zero or more values for each input value) here's another way to do the same thing: @y = map { /(\S.*\S)/ ? $1 : () } @x; or if using map to remove elements isn't clear enough: @y = grep defined, map /(\S.*\S)/, @x;

    Edit: Whoops, doesn't work for strings that contain only one non-space char. /(\S.*?)\s*$/ would work but is not as easy to read.

Re: Code review: function to trim whitespace from all items in a list
by Anonymous Monk on Nov 21, 2014 at 12:58 UTC

    I personally don't have a problem with modification of the input; my own trim functions in various incarnations do just that, most of the time in-situ & returning nothing.

    If your code expects the input given not to be fiddled with, then the trim function should make a copy of the input to work on. In either case, I would expect a note about input modifications in pod, saving me from a code dive.

    Devoid of pod, the function needs to work on a copy of the input & not touch input any further. Thus working with P::C.

      > I personally don't have a problem with modification of the input;

      I also see some applications for this...

      ...but then it would be better to force void context by checking wantarray to avoid confusion.

      Not sure about the best naming convention here.

      Maybe trim(@x) for mutator and @y=trimmed(@x) for non-mutator???

      Cheers Rolf

      (addicted to the Perl Programming Language and ☆☆☆☆ :)