in reply to Re: Broken regexp
in thread Broken regexp

$line=join('|',map(defined($_) ? $_ : '\N', split('|',$line)));

Empty fields will be defined, but empty. In other words: they will be strings with a length of 0 (also: "without length"). Besides that, $_ is the default variable. I prefer to see people take advantage of that :)
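
A minimal sketch of the difference, assuming a made-up sample record (the variable names are only for illustration):

```perl
use strict;
use warnings;

# Hypothetical sample record with an empty middle field.
my $record = 'a||c';
my @fields = split /\|/, $record;

# Every field is defined, so a defined() test never substitutes anything.
my @by_defined = map { defined($_) ? $_ : '\N' } @fields;

# Testing the length does catch the empty field.
my @by_length = map { length($_) ? $_ : '\N' } @fields;

print join('|', @by_defined), "\n";   # a||c
print join('|', @by_length),  "\n";   # a|\N|c
```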

Your code also won't work, because it splits on "nothing or nothing": the unescaped | is regex alternation between two empty patterns. Quoting the pattern with '' instead of // doesn't make split treat its first argument as a plain string. split always sees it as a regex, unless it is ' ' (a single chr(32) space). The | needs to be escaped.
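
Roughly, this is what happens with each form. The sample data is made up:

```perl
use strict;
use warnings;

my $record = 'a|b';   # made-up sample data

# The string '|' is compiled as the regex /|/, i.e. "empty or empty".
# It matches between every pair of characters.
my @wrong = split '|', $record;

# Escaping the pipe makes it a literal separator.
my @right = split /\|/, $record;

print scalar(@wrong), " pieces: @wrong\n";   # 3 pieces: a | b
print scalar(@right), " pieces: @right\n";   # 2 pieces: a b
```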

Oh, and I dislike parens. Without parens, I think things are much easier to read.

$line = join '|', map { length() ? $_ : '\N' } split /\|/, $line, -1;

Update 1 - added negative third argument for split per Abigail's suggestion.
Update 2 - added some parens per Abigail's second suggestion.
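
A small sketch of why the negative third argument from Update 1 matters. Without it, split silently drops trailing empty fields. The sample line is made up:

```perl
use strict;
use warnings;

my $line = 'a||c||';   # made-up record ending in two empty fields

# No limit: trailing empty fields are discarded before map sees them.
my $default  = join '|', map { length($_) ? $_ : '\N' } split /\|/, $line;

# Limit of -1: every field is kept, including the trailing empty ones.
my $keep_all = join '|', map { length($_) ? $_ : '\N' } split /\|/, $line, -1;

print "$default\n";    # a|\N|c
print "$keep_all\n";   # a|\N|c|\N|\N
```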

Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }

Re: Broken regexp
by Abigail-II (Bishop) on Mar 09, 2004 at 13:44 UTC
    Without parens, I think things are much easier to read.
    And I like parsable code. Without parens, Perl guesses wrong:
    Warning: Use of "length" without parentheses is ambiguous at /tmp/j line 8.
    Search pattern not terminated at /tmp/j line 8.

    Abigail

Re: Re: Re: Broken regexp
by liz (Monsignor) on Mar 09, 2004 at 14:31 UTC
    Besides that, $_ is the default variable. I prefer to see people take advantage of that...

    Please note that the advantage in these cases is only in the number of characters that you need to type. And perhaps in readability (which some people might consider a disadvantage in these cases). The generated opcode tree is identical, so there is no difference in execution. For example:

    $ perl -MO=Deparse -e 'length'
    length $_;
    Personally, I prefer to use an explicit $_ if I think it improves the readability of my code.
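
    The same comparison can be sketched from inside a script, assuming B::Deparse's coderef2text method is used on two throwaway anonymous subs (the subs and variable names are only for illustration):

    ```perl
    use strict;
    use warnings;
    use B::Deparse;

    my $deparse = B::Deparse->new;

    # Deparse the implicit and the explicit form of the same call.
    my $implicit = $deparse->coderef2text(sub { length });
    my $explicit = $deparse->coderef2text(sub { length $_ });

    print "implicit: $implicit\n";
    print "explicit: $explicit\n";
    print $implicit eq $explicit ? "same source\n" : "different source\n";
    ```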

    Liz

      Please note that the advantage in these cases is only in the number of characters that you need to type.

      Reducing typing has always been its purpose and that is what it does well. I hate that you have to use parens here and would in practice probably indeed use length $_ instead of length(). (I find the latter rather unreadable the more I think about it. It looks as if I'm trying to explicitly pass NO arguments.)

      I prefer to leave out $_ when it can be implied, because that is what I think improves readability. I even like to abuse for for topicalization.
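
      A quick sketch of what using for as a topicalizer can look like. The hash and its contents are made up for illustration:

      ```perl
      use strict;
      use warnings;

      # Hypothetical record with untrimmed whitespace.
      my %record = (name => '  example  ');

      # for aliases $_ to the element, so the substitutions edit it in place.
      for ($record{name}) {
          s/^\s+//;
          s/\s+$//;
      }

      print "[$record{name}]\n";   # [example]
      ```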

      Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }

      This is true of most ops (e.g. "length($_)" compiles exactly the same as "length()").

      The exceptions are m//, s///, and tr///, where leaving out the explicit "$_ =~" will produce a leaner opcode tree that will be only slightly faster.
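
      One way to look at this, sketched with B::Concise. The helper op_count is hypothetical, and counting output lines that carry an op-class marker is only a rough stand-in for inspecting the tree by hand:

      ```perl
      use strict;
      use warnings;
      use B::Concise ();

      # Hypothetical helper: render a sub's ops in execution order
      # and count the lines that describe an op.
      sub op_count {
          my ($code) = @_;
          my $buf = '';
          B::Concise::walk_output(\$buf);            # capture output in $buf
          B::Concise::compile('-exec', $code)->();   # render the op listing
          return scalar grep { /<.>/ } split /\n/, $buf;
      }

      my $implicit_ops = op_count(sub { /x/ });
      my $explicit_ops = op_count(sub { $_ =~ /x/ });

      print "implicit match: $implicit_ops ops\n";
      print "explicit match: $explicit_ops ops\n";
      ```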