MrSnrub has asked for the wisdom of the Perl Monks concerning the following question:

Suppose I have the following code:
my ($var1, $var2, $var3, $var4, $var5) = map { $_ ||= ""; Trim($_); $_; } split(/\|/, $str);

sub Trim {
    $_[0] =~ /\s+$//g;
    $_[0] =~ /^\s+//g;
}
As I understand it, that will set any undefined strings produced by the split to the empty string and trim each of them. However, that depends on how many pipe symbols there are in $str. So it is very possible that $var2 through $var5 will still be undefined after this code is executed. What is a better way of writing this code so that I'm sure all the variables in the list are at least defined or set to the empty string, even if there aren't enough pipe symbols in $str? Also, am I correct in thinking that the $_ ||= "" in the map is redundant, since the parts of the string that are split will always be defined?

Replies are listed 'Best First'.
Re: How to clean an array of strings after a split
by kcott (Archbishop) on Aug 29, 2013 at 11:36 UTC

    G'day MrSnrub,

    There's a couple of problems here: substitution is s/PATTERN/REPLACEMENT/ (note the s///); the 'g' modifier is for handling multiple matches (you're only dealing with a single match here). See perlretut for the basics and perlre for the details.

    You don't need to do two separate substitutions. This regex will suffice:

    s/^\s*(.*?)\s*$/$1/

    Note that $_ ||= "" will convert zeros (being FALSE values) into empty strings! Did you want to do that?
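    A minimal sketch of the difference, using hypothetical sample values rather than anything from the original post: `||=` replaces every false value, while `//=` (Perl 5.10+) replaces only undef.

```perl
use strict;
use warnings;

# Hypothetical sample fields: a zero, an empty string, undef, and a letter.
my @with_or  = ('0', '', undef, 'x');
my @with_dor = ('0', '', undef, 'x');

$_ ||= '' for @with_or;    # replaces every false value, including '0'
$_ //= '' for @with_dor;   # replaces only undef (needs Perl 5.10+)

print join(',', map { "[$_]" } @with_or),  "\n";   # [],[],[],[x]
print join(',', map { "[$_]" } @with_dor), "\n";   # [0],[],[],[x]
```

    The zero survives only in the second list, which is usually what you want for data fields.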

    Also, unless you're making multiple calls to Trim(), you can just do the substitution within the map. Here's my test (where I've also attempted to make the split a little clearer):

    $ perl -Mwarnings -Mstrict -E '
        my $str = " a | b|c |d|0| ";
        my @fields = map { $_ ||= ""; s/^\s*(.*?)\s*$/$1/; $_; } split /[|]/ => $str;
        say ">>>$_<<<" for @fields;
    '
    >>>a<<<
    >>>b<<<
    >>>c<<<
    >>>d<<<
    >>><<<
    >>><<<

    -- Ken

Re: How to clean an array of strings after a split
by choroba (Cardinal) on Aug 29, 2013 at 11:17 UTC
    What is a better way of writing this code so that I'm sure all the variables in the array are at least defined or set to the empty string
    What about padding the end of the string with extra pipes, as in split /\|/, "$str|||||_"? The trailing _ keeps split from discarding the empty trailing fields.
    $_ ||= "" in the map is redundant since the parts of the string that are split will always be defined
    The || operator is different from //. It tests for truth, not definedness, so ||= will replace single zeros, too.
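    The padding idea can be sketched as follows. The input string is a hypothetical example, and the fields are left untrimmed to keep the focus on the padding.

```perl
use strict;
use warnings;

my $str = 'a| b';   # hypothetical input with fewer than five fields

# Without padding, split yields only two fields.
my @short = split /\|/, $str;

# Appending five pipes guarantees at least five fields; the trailing "_"
# stops split from discarding the empty trailing fields, and it lands in
# a field past the fifth, so the list assignment simply ignores it.
my ($var1, $var2, $var3, $var4, $var5) = split /\|/, "$str|||||_";

printf "%d field(s) without padding\n", scalar @short;
print join(',', map { "[$_]" } $var1, $var2, $var3, $var4, $var5), "\n";
# prints: [a],[ b],[],[],[]
```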
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: How to clean an array of strings after a split
by AnomalousMonk (Archbishop) on Aug 29, 2013 at 13:45 UTC

    Another approach depends on realizing that an expression like  my ($var1, $var2, ...) = ...; yields a list that can be processed with a Perl-ish for-loop.

    >perl -wMstrict -le
    "my $str = 'a| b||c | 0 ';
     ;;
     $_ //= '' for
         my ($var1, $var2, $var3, $var4, $var5, $var6) =
         map Trim($_), split m{\|}xms, $str;
     ;;
     printf qq{'$_' } for $var1, $var2, $var3, $var4, $var5, $var6;
     ;;
     sub Trim { $_[0] =~ s{ \A \s+ | \s+ \z }''xmsg; return $_[0]; }
    "
    'a' 'b' '' 'c' '0' ''

    Use defined or $_ = '' in place of $_ //= '' if you do not have the //= operator (available in Perl 5.10+). Also, 5.14+ offers the /r modifier for s/// substitutions, which allows a slight simplification of the Trim() function to
        sub Trim { return $_[0] =~ s{ \A \s+ | \s+ \z }''xmsgr; }
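    For comparison, here is a small sketch of the /r version next to a copy-then-substitute equivalent for older Perls. The names trim_r and trim_copy are made up for this illustration and do not appear in the thread.

```perl
use strict;
use warnings;

# Non-destructive trim for Perl 5.14+: /r returns the modified copy
# and leaves the argument untouched.
sub trim_r { return $_[0] =~ s{ \A \s+ | \s+ \z }{}xmsgr }

# Equivalent for older Perls: copy first, then substitute on the copy.
sub trim_copy { my $s = $_[0]; $s =~ s{ \A \s+ | \s+ \z }{}xmsg; return $s }

my $orig = '  padded  ';
printf "'%s' '%s' '%s'\n", trim_r($orig), trim_copy($orig), $orig;
# prints: 'padded' 'padded' '  padded  '
```

    Unlike the original Trim(), neither version modifies its argument through the @_ alias.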

    Also:
            ... am I correct in thinking that ... the parts of the string that are split will always be defined?
    Yes.

      Thanks. I think I like this approach best because I don't need to concern myself with ensuring there are enough pipe symbols to suffice.

        Actually, if you used kcott's suggestion here and split into a named array, e.g.
            my @fields = map { ... } split ..., $str;
        you wouldn't have to worry about how many pipes there are and 'uninitialized' variables: you get what you get, it's all defined, and it's easy to test how much you've gotten by taking the size of the array and to iterate over the array. Rather than writing  $varn you write  $fields[n] instead, but 0-based.
        But I assume you have your reasons for preferring individual named variables...
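        A minimal sketch of the array approach, with a hypothetical input string. The trim works on a copy of each piece, and the field count comes straight from the array.

```perl
use strict;
use warnings;

my $str = ' a | b|c ';   # hypothetical input

# Trim a copy of each piece, so the split results are not modified in place.
my @fields = map { my $f = $_; $f =~ s/^\s+|\s+$//g; $f } split /\|/, $str;

printf "got %d field(s)\n", scalar @fields;
printf "field %d: '%s'\n", $_, $fields[$_] for 0 .. $#fields;
```

        Every element of @fields is defined, however many pipes $str contains.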

Re: How to clean an array of strings after a split
by hdb (Monsignor) on Aug 29, 2013 at 12:40 UTC

    You could also search the pieces of interest with a single regex, and fill the array with sufficiently many empty strings like this:

    my ($var1, $var2, $var3, $var4, $var5) = ( $str =~ /(?:^|\|)\s*(.*?)\s*(?=$|\|)/g, ("")x5 );
      Thanks, but please explain what this does: (?:^|\|) and this: (?=$|\|)
        (?:     non-capturing parentheses
          ^     start of string
          |     or
          \|    pipe character
        )

        (?=     look-ahead, not part of match
          $     end of string
          |     or
          \|    pipe character
        )

        So I am looking for something between the start of string or a pipe and a pipe or the end of the string, ignoring whitespace, matching non-greedily. In order to make it work, I need to use each pipe twice, once for the string before the pipe, and then for the string following it. Thus the look-ahead assertion. HTH.
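        As a rough illustration of how the padding works with this regex, here is a sketch on a hypothetical two-field input:

```perl
use strict;
use warnings;

my $str = 'a | b';   # hypothetical input with only two fields

# The /g match in list context returns every capture; the five trailing
# empty strings fill whichever variables the captures do not reach.
my ($var1, $var2, $var3, $var4, $var5) =
    ( $str =~ /(?:^|\|)\s*(.*?)\s*(?=$|\|)/g, ('') x 5 );

print join(',', map { "[$_]" } $var1, $var2, $var3, $var4, $var5), "\n";
# prints: [a],[b],[],[],[]
```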