Either invocation should work:

    myscript some.data > output
    # or
    grep -h foo *.data | sort -u | myscript > output

But what if I need to apply binmode() on the input file handle (e.g. because the data needs to be read as utf8)? For stdin-stdout usage, that's no problem -- just:

    binmode STDIN, ":utf8";

But what about for files in @ARGV? I know these get opened via the "magical" ARGV file handle, and I know (from having just tried it) that this does not DWIM:

    #!/usr/bin/perl
    binmode STDIN, ":utf8";   # covers pipe input
    binmode ARGV, ":utf8";    # does not work -- handle isn't open yet
    while (<>) {
        do_whatever( $_ )
    }
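The binmode ARGV call has nothing to act on because perl opens each file named in @ARGV lazily, on the first read through <>. One possible way around this is to declare the layer lexically instead of calling binmode. The sketch below assumes that the default layers set by the open pragma also apply to the implicit opens done by <>, which is worth verifying on the perl version in use. The helper name read_decoded_lines is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: the "open" pragma sets the default layer for every open in this
# lexical scope, including the implicit opens performed by <>. The :std tag
# applies the same layer to STDIN, STDOUT and STDERR.
use open qw(:std :utf8);

# Hypothetical helper: read everything through <> and return the lines
# as character strings, with the trailing newline removed.
sub read_decoded_lines {
    local @ARGV = @_;    # files to read; an empty list makes <> use STDIN
    my @lines;
    while ( my $line = <> ) {
        chomp $line;
        push @lines, $line;
    }
    return @lines;
}
```

A script would call it as read_decoded_lines(@ARGV), so the same code path serves both file arguments and pipeline input.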
I've tried using the '-C' option on the shebang line, but it turns out that -C has problems when there are other option flags on the shebang line. In fact there's a thread at perlbug (ticket #34087, for those keeping score) that indicates this flag is known to be broken and apparently will be phased out. (I first realized the problems when a script using -C, which worked in 5.8.7, failed to work in 5.8.8.)
Note that use encoding "utf8"; only affects STDIN and STDOUT -- no effect on ARGV. I know I can do something like this:

    #!/usr/bin/perl
    use strict;

    my @files;
    if ( @ARGV ) {
        @files = @ARGV;
    }
    elsif ( -t ) {
        # STDIN is a terminal, so there is no pipeline input either
        die "I want file names to open, or else pipeline input";
    }
    else {
        @files = "stdin";
    }
    for my $file ( @files ) {
        my $fh;
        if ( @ARGV ) {
            open $fh, "<:utf8", $file or die "Can't open $file: $!";
        }
        else {
            binmode STDIN, ":utf8";
            $fh = \*STDIN;
        }
        while (<$fh>) {
            do_whatever( $_ )
        }
    }

But that sucks. I could also give up the convenience of "dual usage" -- e.g. just write scripts to read from STDIN only, and never use the "magical" ARGV file handle -- but that would be sad. Using environment or locale settings would be fairly impractical as well (consecutive command lines might need to use different encodings).
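Another route that keeps the plain while (<>) loop is to leave the handle in byte mode and decode each line by hand with the core Encode module. This is only a sketch: the helper name decode_argv_lines and its $encoding parameter are hypothetical, and the approach assumes an encoding in which the newline is a single "\n" byte (true for UTF-8, not for UTF-16).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: read raw bytes through <> and decode line by line.
# $encoding is an illustrative parameter, e.g. taken from a command-line
# option, so consecutive runs can use different encodings.
sub decode_argv_lines {
    my ( $encoding, @files ) = @_;
    local @ARGV = @files;    # an empty list makes <> fall back to STDIN
    my @lines;
    while ( my $bytes = <> ) {
        # Lines are split on the raw "\n" byte before decoding.
        push @lines, decode( $encoding, $bytes );
    }
    return @lines;
}
```

Because the encoding is an ordinary argument here, it does not depend on environment or locale settings.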
Can someone point out a better way to do this? Or maybe the powers that be could be talked into fixing and keeping the -C option? (I suppose this will be a non-issue when Perl 6 becomes the tool of choice...)
(update: added declaration for $fh in last code snippet, to make it grammatical)
In reply to Using binmode on ARGV filehandle? by graff