japhy has asked for the wisdom of the Perl Monks concerning the following question:
Much to my dismay, using the -F switch to perl does not set a variable. When I write code to quick-fix a file, for instance, I want to be able to do:
perl -F"\t" -lane 'munge(@F); print join($MAGIC_F_VAR, @F)' file
But I have to hard-code the "\t" in the join() function. Thoughts?
Re: Does Perl 5 (or 6?) need another built-in variable for the -F switch? (re)
by tye (Sage) on Nov 02, 2006 at 16:34 UTC
I believe I mentioned this before somewhere (probably just in the CB, actually):
BEGIN { $sep= '' }
if( '' eq $sep && 1 < @F ) {
( $sep )= /\Q$F[0]\E(.*?)\Q$F[1]\E/;
}
Sadly, although join(undef,@F) needn't emit a warning when @F contains fewer than 2 items, it does.
Yes, it still isn't perfect (but I hope it helps some). I'm sure you know where to apply the patch. $^A doesn't appear to be used (for "auto-split separator"), last I checked.
Update: I was trying to think how to wrap this nicely into a module, and also realized that the separator needn't be the same at each point. I decided the way to handle both would be to have the module provide a routine that substitutes the new values for the old @F values in $_, in place.
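my $pos= 0;
for my $f ( 0..$#F ) {
    # Resume searching just past the previous replacement.
    pos($_)= $pos;
    die if ! m/\Q$F[$f]\E/g;
    # Where to resume once $G[$f] (the new value) has replaced $F[$f].
    $pos= pos($_) + length($G[$f]) - length($F[$f]);
    # Replace the matched span in place, using @- and @+ rather than $&.
    substr( $_, $-[0], $+[0]-$-[0], $G[$f] );
}
pos($_)= 0;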
And I'd also like a module that ties up this trick for avoiding using $& (and capturing) that is also more useful (and faster) than either so I don't have to look up the formula each time. (:
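For reference, a minimal sketch of the $&-free formula used above: after a successful match, $-[0] and $+[0] hold the start and end offsets of the whole match, so substr can recover the matched text. The sample string and variable names are only for illustration and are not from the original post.

```perl
use strict;
use warnings;

my $text = "one two three";   # illustrative input, not from the thread
my $matched;

if ( $text =~ /t\w+/ ) {
    # $-[0] is the offset where the match starts, $+[0] where it ends,
    # so this is the same text $& would hold, without using $&.
    $matched = substr( $text, $-[0], $+[0] - $-[0] );
}

print "$matched\n";
```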
Re: Does Perl 5 (or 6?) need another built-in variable for the -F switch?
by duff (Parson) on Nov 02, 2006 at 16:34 UTC
The argument to -F is a pattern, not a string. You could use parentheses in your pattern and take every other item in @F for munging and then put them back when you're done. Though for such a short pattern, you might as well just repeat it.
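A minimal sketch of the capturing-parentheses idea, done with an explicit split rather than -F so it can be run as a script. The helper munge_field and the sample line are hypothetical stand-ins, not part of the original question.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the munge() in the original question.
sub munge_field { my ($field) = @_; return uc $field }

my $line = "alpha\tbeta:gamma";   # illustrative input with mixed separators

# Capturing parentheses make split return the separators as well,
# so fields sit at even indices and separators at odd ones.
my @parts = split /([\t:])/, $line;

for my $i ( grep { $_ % 2 == 0 } 0 .. $#parts ) {
    $parts[$i] = munge_field( $parts[$i] );
}

# The separators are still in the list, so join on the empty string.
my $rebuilt = join '', @parts;
print "$rebuilt\n";
```

Because each original separator is put back where it was, this also copes with a pattern that matches different strings at different points in the line.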
Re: Does Perl 5 (or 6?) need another built-in variable for the -F switch?
by davido (Cardinal) on Nov 02, 2006 at 16:46 UTC
As long as we're talking about patches, perhaps $& is where the patch should be applied so that split-type matches fall within its realm.
The problem with that would be if someone, somewhere, in an existing codebase is performing a pattern match, then splitting something, and then relying on $& still pertaining to the pre-split match. That may be a rare case, but it does exist. So I guess a new variable ought to be used instead of adding functionality to $& (even if it seems to make sense to do so). Just to make it easier to remember, how about calling this new variable $^&, since its behavior would be quite similar to $&.
I'm actually kind of surprised that Perl doesn't present a special variable (maybe a special hash) that contains all of the command line switches used to invoke perl. ...not the switches passed on to the script being executed, but the switches passed on to perl itself. If the hash were called %^C, it might contain data like this:
%^C = (
F => "\t",
a => undef,
n => undef,
e => '# The actual one-liner script',
);
|
I like davido's suggestion of a %^C hash with the command line parameters that are used to invoke perl. This allows the programmer to use the value of F (or anything else defined on the command line) for a join IF he/she knows that it is safe to do so, i.e. if it is not a pattern but a simple string; see duff's post in this thread.
-- Hofmator
Code written by Hofmator and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.
|
On Linux with a mounted /proc file system, you can find the command line parameters from /proc/$$/cmdline.
But this (and the proposed %^C) would be tricky. Command line parameters may be clustered. Searching for just '-F' won't do: the parameter may be passed as '-wF:', which sets the split pattern to :. However, a parameter '-iF:' won't, because -i takes the rest of the cluster as its backup extension.
And then there is PERL5OPT.
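A minimal sketch of reading the command line from /proc, assuming a Linux-style /proc where the arguments in cmdline are separated by NUL bytes. The function names are made up for illustration. It only splits the arguments; it does not untangle clustered switches, and it does not see anything supplied through PERL5OPT.

```perl
use strict;
use warnings;

# Split the raw contents of /proc/<pid>/cmdline, where arguments are
# separated by NUL bytes, into a list of arguments.
sub split_cmdline {
    my ($raw) = @_;
    return split /\0/, $raw;
}

# Read this process's own command line; returns an empty list when
# /proc is not available (assumption: Linux-style /proc).
sub own_cmdline {
    my $path = "/proc/" . $$ . "/cmdline";
    open my $fh, '<', $path or return;
    local $/;                      # slurp the whole file
    my $raw = <$fh>;
    close $fh;
    return defined $raw ? split_cmdline($raw) : ();
}

# A synthetic example of what the raw data looks like.
my @sample = split_cmdline("perl\0-wF:\0-lane\0print\0");
print join( ' | ', @sample ), "\n";

# The real thing, for whatever invoked this script.
print join( ' | ', own_cmdline() ), "\n";
```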
Re: Does Perl 5 (or 6?) need another built-in variable for the -F switch?
by jmcnamara (Monsignor) on Nov 02, 2006 at 16:40 UTC
Awk has the FS variable for this:
$ awk -Ffoo 'BEGIN{print FS}'
foo
--
John.
Re: Does Perl 5 (or 6?) need another built-in variable for the -F switch?
by jbert (Priest) on Nov 03, 2006 at 08:45 UTC
If it is a one-liner then your perl is probably simple enough to go into " rather than ' (just backslash any $vars or double quotes), so:
export F="\t"
perl -F$F -line "munge(@F); print join($F, @F);" file
The join($F, @F) looks a bit evil, but hey.
Sadly, this doesn't work:
F="\t" perl -F$F -line "munge(@F); print join($F, @F);" file
Presumably because the prefixed env var doesn't apply to the shell's expansion of the line.
Does anyone know a trick to make this (shorter) approach work?
|
You didn't try that, did you? It's not going to work. See, if $F is interpolated by the shell, you end up with:
join(t, @F)
This is wrong on several counts. First, inside the one-liner, you need to place quotes around $F; otherwise, at best you end up with barewords, but more likely you end up with something that cannot be compiled (try F=":", for instance). Second, the shell doesn't know "\t" as something special. It just sees an escaped t, and hence F="\t" is equivalent to "t".
Also, the command line switches -line won't do what you expect. They mean "chomp off newlines, add newlines to print, and modify the input file, leaving a backup with extension 'ne'".
You probably want (untested):
F=' ' # Enter a tab here.
perl -F"$F" -i -wlne "munge(@F); print join('$F',@F);" file
But you can do better: it can also work if you want to use single quotes for the one-liner. The shell's quoting mechanisms are superior to Perl's in some respects. Exploit them:
perl -F"$F" -i -wlpe 'munge(@F); $_ = join("'$F'",@F);'
|
I tried it with a : separator, but you're right, I didn't use the \t. Mea culpa.