pengyou_ah has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to build a Perl script to extract data from a ton of text files.
From the command line, I'd like to run the following:
ls *.txt | myscript.pl
where myscript.pl looks something like this:
#!/usr/bin/perl -w
my $filename = "";
my $inputline = "";
my $siteid = "";
my $sitename = "";
foreach $filename (@ARGV){
    open(FILE,$filename) or die "can't open file: $!\n";
    print "processing $filename\n";
    while ($inputline = <FILE>){
        if ($inputline =~ m/^Station/){
            print "$inputline\n";
        }
    }
    close FILE;
}
I've seen other code snippets using a pipe in the open(), but I'd rather do it as shown if possible. Any help appreciated. If I must use a pipe in open(), how can I iterate through the results of ls *.txt?

update (broquaint): added <code> tags and dropped extraneous <br> tags

Replies are listed 'Best First'.
Re: command line pipe
by blokhead (Monsignor) on Jul 23, 2003 at 04:01 UTC
    You seem to be confused about the difference between command-line arguments and piping. If you say ls *.txt | ./myscript.pl, the STDOUT output of ls is going to STDIN of your script. Yet you are trying to get the names of those .txt files from @ARGV.

    To actually get the filenames from the piped input, do something like:

    chomp( my @filenames = <STDIN> );
    for (@filenames) {
        ... # open, read, etc
    }
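A fuller sketch of the same idea, for readers who want the loop body filled in. It assumes one filename per line on STDIN and the "Station" pattern from the original question; the helper names station_lines and process_names_from are made up for illustration and are not from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the lines of $filename that start with "Station".
# (station_lines is a hypothetical helper name.)
sub station_lines {
    my ($filename) = @_;
    # Three-argument open treats $filename as a literal path.
    open(my $fh, '<', $filename) or die "can't open $filename: $!\n";
    my @matches = grep { /^Station/ } <$fh>;
    close $fh;
    return @matches;
}

# Read one filename per line from the given handle and print the matches.
sub process_names_from {
    my ($in) = @_;
    chomp(my @filenames = <$in>);
    for my $filename (@filenames) {
        print "processing $filename\n";
        print for station_lines($filename);
    }
}

# Only read STDIN when something is piped in (skip when it is a terminal).
process_names_from(\*STDIN) unless -t STDIN;
```

Run it as ls *.txt | ./myscript.pl. Because each filename is opened explicitly, the filehandle is available for whatever further processing is needed.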
    A better solution for your particular example, though, is Zaxo's advice elsewhere in this thread. Use the -n option, or use the magic while (<>) loop: both automatically treat the members of @ARGV as filenames and open them in sequence for you, reducing the code you've written to a one-liner.
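As a rough sketch of the magic while (<>) loop in script form: the diamond operator reads the files named in @ARGV one after another, and $ARGV holds the name of the file currently being read. The sub name station_lines_via_diamond is hypothetical, and the filename prefix on each line is an addition for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collect "Station" lines from the given files using the diamond operator.
# (station_lines_via_diamond is a hypothetical name.)
sub station_lines_via_diamond {
    local @ARGV = @_;    # <> opens the files named in @ARGV, in order
    my @matches;
    while (my $line = <>) {
        # $ARGV is the name of the file <> is currently reading.
        push @matches, "$ARGV: $line" if $line =~ /^Station/;
    }
    return @matches;
}

# Only run when filenames were given; with an empty @ARGV, <> would read STDIN.
print station_lines_via_diamond(@ARGV) if @ARGV;
```

Invoked as ./myscript.pl *.txt, the shell expands the glob and the script never needs ls or a pipe.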

    Alternatively, look at xargs to do something like this:

    $ ls *.txt | xargs ./myscript.pl
    ... which takes the name of each .txt file (as given by ls) and passes it as a command-line argument (i.e., an element of @ARGV) to myscript.pl (plus or minus quoting/escaping issues).

    blokhead

      Thanks for the help, everyone. The reason I didn't use the one-liner is that I need to open
      the files for further processing. blokhead's solution is ultimately what I needed.
      You're right, I was confused about pipes and args, but I've got it working now.
Re: command line pipe
by Zaxo (Archbishop) on Jul 23, 2003 at 03:33 UTC

    From the command line:

    perl -n -e'/^Station/ and print' *.txt

    See perlrun for how the -n option produces the loop you want.

    That's no pipe, just a glob of input file names.

    After Compline,
    Zaxo

      perl -n -e'/^Station/ and print' *.txt

      That's probably how I would do it. However, those who use this common and powerful feature should be aware that there is a very small security risk in using the -p or -n switches with a glob on the command line. Don't do it in directories that can hold filenames originating from an untrusted source. Here's a short demo of why that could be bad...

      $ touch '| echo uh oh;#.txt' && perl -n -e0 *.txt && rm '| echo uh oh;#.txt'
      uh oh

      I once promised tye that I would try to do my part in spreading the word about this. It isn't at all likely to pose a real threat, but you should be aware of it just the same. For an in-depth discussion on the subject, see Dangerous diamonds!
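One way to sidestep the problem, sketched here under the assumption that the files only ever need to be read: loop over the names yourself and use three-argument open, which treats a name such as '| echo uh oh;#.txt' as a literal path instead of a command. The sub name safe_station_lines is hypothetical. Perl 5.22 and later also provide the <<>> operator, which reads @ARGV with the same literal-filename behaviour.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collect "Station" lines from the named files without magic open.
# (safe_station_lines is a hypothetical helper name.)
sub safe_station_lines {
    my @matches;
    for my $filename (@_) {
        # The explicit '<' mode means a leading '|' in the name is just a character.
        open(my $fh, '<', $filename) or do {
            warn "can't open $filename: $!\n";
            next;
        };
        while (my $line = <$fh>) {
            push @matches, $line if $line =~ /^Station/;
        }
        close $fh;
    }
    return @matches;
}

print safe_station_lines(@ARGV);
```

This trades the brevity of -n for an explicit loop, which is usually only worth it when the directory contents are not under your control.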

      -sauoq
      "My two cents aren't worth a dime.";