tuxz0r has asked for the wisdom of the Perl Monks concerning the following question:

So, I've done some searching (Preserving empty fields in split) and I understand the various options to split() to include trailing empty fields. However, we have some scripts that have a number of perl one-liners to do some quick record scrubbing, and they use -F to split on commas. I cannot, however, find a way to use -F and have it keep the empty trailing fields (without appending some nonsense character to the end of the record and then having to discard it afterwards).

As an example, this loses all the trailing, empty fields:

head -1 file.txt | perl -F, -lane 'print join(",",@F);'

This keeps them, but I have to do extra work to keep the fields:

head -1 file.txt | sed -e 's/$/x/' | perl -F, -lane 'print join(",",@F[0..$#F-1]);'

Is there a modifier to this option that will let it do the same thing split does with the extra argument? I'm pretty sure this could be cleaner and clearer.

---
s;;:<).>|\;\;_>?\\^0<|=!]=,|{\$/.'>|<?.|/"&?=#!>%\$|#/\$%{};;y;,'} -/:-@[-`{-};,'}`-{/" -;;s;;$_;see;
Warning: Any code posted by tuxz0r is untested, unless otherwise stated, and is used at your own risk.

Replies are listed 'Best First'.
Re: -F and trailing empty fields
by kyle (Abbot) on Apr 30, 2008 at 16:52 UTC

    I would suggest that you discard the -F option (because it doesn't do what you want) and instead add "@F=split /,/, $_, -1;" to the beginning of the one-liner (because it does what you want).

      Well, I can always do that, as I did reference the post about split options and preserving empty trailing fields, but my question was specifically geared toward "is there a way to do that same thing with the -F option on the command line." Otherwise, yes, I can just break all this out and write a number of perl scripts using split directly. Just trying to keep this short and on a single line (as I'm doing other stuff I don't show in the example in the original post).


Re: -F and trailing empty fields
by tachyon-II (Chaplain) on Apr 30, 2008 at 17:18 UTC

    Split uses $_ as its default argument and delivers its data into @_ unless assigned so:

    C:\>type test.txt
    a,b,c,d,e,f
    a,,c,,e,
    a,b,c,,,

    C:\>perl -pe "split/,/;$_=join'!',@_" test.txt
    a!b!c!d!e!f
    a!!c!!e!
    a!b!c!!!

    Yours is longer, but to be fair, if I golf it down it works out to exactly the same number of characters. This does what you want, though, except that @F is @_. It looks like dropping the empty fields is an apparently undocumented feature.

      Maybe this is a quoting thing but on my linux box:

      ~/src> perl -pe "split',';$_=join',',@_" smoke.txt
      Can't modify concatenation (.) or string in scalar assignment at -e line 1, at EOF
      Execution of -e aborted due to compilation errors.
      --
      I used to drive a Heisenbergmobile, but every time I looked at the speedometer, I got lost.

        It is a quoting thing. On *nix s/'/tmp/g; s/"/'/g; s/tmp/"/g :-)

        perl -pe 'split/,/;$_=join",",@_' smoke.txt
Re: -F and trailing empty fields
by tuxz0r (Pilgrim) on Apr 30, 2008 at 17:58 UTC
    Ok, after looking at this I don't think it's the '-F' option. It seems that my use of the '-l' option in conjunction with -F/-a is causing the trailing fields to drop off. For example, when I do:

    perl -F, -lane 'print join("!",@F)' test.txt

    I lose the trailing empty fields. But removing the '-l' option returns them:

    perl -F, -ane 'print join("!",@F)' test.txt

    Does anyone understand why this is happening? According to the command line help and perldoc, -l without a parameter enables line-ending processing: it chomps the input line and sets $\ (output record separator) equal to $/ (input record separator). Why would that cause the -F splitting to drop the empty trailing fields?


      Is it that the "newline" after the last comma (essentially empty) is seen by split (-F) as a value in that field, and hence it retains that field and any preceding empty fields? Because if the input is a line such as a,,,, I only get the 'a' on output (using -l with -F/-a), but if it's ,,,,,a I get all the fields. Is it normal behavior for split to treat the newline at the end of a character-separated record as part of the value for that field (even if $/ is '\n')?


Re: -F and trailing empty fields
by oko1 (Deacon) on Apr 30, 2008 at 20:25 UTC

    Related bit of behavior:

    ben@Tyr:~$ printf "a,b,c,,,"|perl -F, -ane 'print 0+@F, "\n"'
    3
    ben@Tyr:~$ printf "a,b,c,,,\n"|perl -F, -ane 'print 0+@F, "\n"'
    6

    It does seem as though the autosplit takes that '\n' as some sort of a fencepost and counts the fields up to it if it exists, and chops them off otherwise.

    
    -- 
    Human history becomes more and more a race between education and catastrophe. -- HG Wells
    
Re: -F and trailing empty fields
by cdarke (Prior) on Apr 30, 2008 at 18:39 UTC
    -F only works with autosplit, which requires the -a option (omitted from your code). See perlrun.