in reply to Re: Piping many individual files into a single perl script
in thread Piping many individual files into a single perl script

So I tried the code you provided, and while it did pipe all the files in, it listed them all in one line in the results file.
A sample of what I received:
File#   A    B   C
1       5    6   3

What I wanted:
File#   A    B   C
1       2    1   0
2       2    4   2
3       1    1   1
This is my code:
#!C:\Perl
use strict;
BEGIN{ @ARGV=map glob, @ARGV; }
open(RES, ">>results.txt");
print RES "File Number A A% B B% Null Null%\n";
my $A=0;    #these three lines set my initial counts at zero
my $B=0;
my $null=0;
my $filenum=0;
while (<>){
    chomp($_);
    if ($_ eq "stringa"){ $A++;}
    elsif ($_ eq "stringb"){ $B++;}
    else { $null++; }
}
my $popa=$A/1000;    #these lines determine what percent of the population the strings represent
my $popa=sprintf('%.2f',$popa);    #cut the percentages to two decimal places
my $popb=$B/1000;
my $popb=sprintf('%.2f',$popb);
my $popnull=$null/1000;
my $popnull=sprintf('%.2f',$popnull);
my $filenum++;    #Add one to my filenumber
print RES "$filenum $A $popa $B $popb $null $popnull\n";    #print the results out to the "results" file
What am I doing wrong? Edit: Thanks for the help so far!

Re^3: Piping many individual files into a single perl script
by BrowserUk (Patriarch) on Sep 28, 2008 at 14:27 UTC

    You need to detect the end of each individual file, print your results for that file and reset the counts. See the explanation of eof(ARGV) in perlfunc:

    #!C:\Perl
    use strict;
    BEGIN{ @ARGV=map glob, @ARGV; }
    open(RES, ">>results.txt");
    print RES "File Number A A% B B% Null Null%\n";
    my $A = 0;    #these three lines set my initial counts at zero
    my $B = 0;
    my $null = 0;
    my $filenum = 0;
    while( <> ){
        chomp($_);
        if ($_ eq "stringa"){ $A++;}
        elsif ($_ eq "stringb"){ $B++;}
        else { $null++; }

        if( eof( ARGV ) ) {    ## true after the end of each individual file
            my $popa = sprintf( '%.2f', $A / 1000 );
            my $popb = sprintf( '%.2f', $B / 1000 );
            my $popnull = sprintf( '%.2f', $null / 1000 );
            $filenum++;    #Add one to my filenumber (no "my" here: that would create a fresh variable on every pass)
            print RES "$filenum $A $popa $B $popb $null $popnull\n";
            $A = $B = $null = 0;    ## Reset counts for the next file
        }
    }

    As I mentioned above, if OS/X is a *nix-like system, you probably don't need the @ARGV = map glob, @ARGV as the shell will take care of that for you. (Though it probably won't do any harm.)

    Also, in your code you have several places where you do:

    ...
    my $var = ....;
    my $var = sprintf ... $var;
    ...

    If you are running with strict and warnings, you should be getting messages of the form: "my" variable $var masks earlier declaration in same scope at ... Don't ignore them; they are there for a purpose.
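
    A minimal sketch of the declare-once pattern, in case it helps: the variable gets a single "my" and is then reassigned. The count of 250 is a made-up value for illustration, and the divisor of 1000 is simply the one used in the code above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $A = 250;                       # made-up count, for illustration only

my $popa = $A / 1000;              # declared once with "my"
$popa = sprintf( '%.2f', $popa );  # reassigned: no second "my", so no masking warning

print "$popa\n";                   # prints 0.25
```

    Folding both steps into one statement, as in my $popa = sprintf( '%.2f', $A / 1000 );, avoids the issue entirely.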


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      I was about to post a reply with similar suggestions... Now the only new things I can add are
      • on a Mac the
        BEGIN{ @ARGV=map glob, @ARGV; }
        is definitely not required.
      • I also noticed that the OP is using #!C:\Perl as the shebang line. On a Mac this should be #!/usr/bin/perl for an Apple-supplied Perl, but will need a different path if it's a user-installed Perl (e.g., from MacPorts, or elsewhere).
        I also noticed that the OP is using #!C:\Perl as the shebang line.

        I missed that++. That's a very strange choice for someone using a Mac (which doesn't have drive letters?), and wouldn't work on any *nix variant I'm aware of.


Re^3: Piping many individual files into a single perl script
by graff (Chancellor) on Sep 28, 2008 at 15:22 UTC
    I'm a little confused. You said you are running on macosx, but your code starts with:
    #!C:\Perl
    That makes no sense, and it entails that you can only run the script with a command line like this:
    perl path/file_name_of_script arg1 ...
    (where the "path/" part is only needed if the script is not in your shell's current working directory). I would use this as the initial "shebang" line:
    #!/usr/bin/perl
    because macosx is really a unix OS, and in unix, perl is typically found in the /usr/bin/ directory; macosx definitely has perl in that location. With that as the shebang line, and doing the shell command "chmod +x file_name_of_script", the script becomes available as a shell command:
    path/file_name_of_script arg1 ...
    where the "path/" part is only needed if your shell PATH variable does not include the directory where the script is stored.

    As for your question about iterating over a list of file names, a method that I find useful goes like this: the perl script expects as input a list of file names, loads those into an array, and then iterates over the array. At each iteration, if there's a problem with the file or its contents, issue a warning and skip on to the next file in the list; e.g.:

    #!/usr/bin/perl
    use strict;
    use Getopt::Long;

    my $Usage = "Usage: $0 [-p path] filename.list\n or: ls [path] | $0 [-p path]\n";
    my $path = '.';
    die $Usage unless ( GetOptions( 'p=s' => \$path ) and -d $path );
    die $Usage if (( @ARGV and !-f $ARGV[0] ) or ( @ARGV==0 and -t ));
    # need file name args or pipeline input

    my @file_list = <>;  # read all input as a list of file names
    chomp @file_list;    # get rid of line-feeds

    for my $name ( @file_list ) {
        my $file = "$path/$name";
        if ( ! -f $file ) {
            warn "input value '$file' does not seem to be a data file; skipped\n";
            next;
        }
        if ( ! open( I, "<", $file )) {
            warn "open failed for input file '$file'; skipped\n";
            next;
        }
        ...
    }
    There are already very good shell command tools for creating a list of file names ("ls", "find"), and for filtering lists ("grep"), so I'm inclined not to rewrite those things in a perl script that is supposed to process a list of file names.

    The exception to that rule is when the script is really intended for a specific task that always involves a specific location and/or filter for getting its list of file names to work on, because in that case, I'd rather not have to repeat the selection process on the command line every time I run the script.

Re^3: Piping many individual files into a single perl script
by apl (Monsignor) on Sep 28, 2008 at 11:52 UTC
    $A and $B are the running totals for all of the files. You either need to make them arrays (indexed by file), or you need to print the totals when you reach the end of a file (after which, you would reset the variables to zero).
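
    The first option (counts indexed by file) might look roughly like the sketch below. It is only an illustration under some assumptions: the hash %count, the array @order and the two temporary sample files are invented here so the script runs on its own; in real use the file names would come from the command line. $ARGV holds the name of the file currently being read through <>.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Build two small sample files so the sketch is runnable as-is.
# Their contents are made up for illustration.
my @samples = (
    [qw(stringa stringa stringb other)],
    [qw(stringb other other)],
);
my @files;
for my $lines (@samples) {
    my ( $fh, $name ) = tempfile( UNLINK => 1 );
    print {$fh} "$_\n" for @$lines;
    close $fh or die "close failed for '$name': $!";
    push @files, $name;
}
@ARGV = @files;    # stand-in for file names given on the command line

my %count;         # $count{file name}{A|B|null}
my @order;         # file names in the order they were first seen

while ( my $line = <> ) {
    chomp $line;
    push @order, $ARGV unless exists $count{$ARGV};
    my $key = $line eq 'stringa' ? 'A'
            : $line eq 'stringb' ? 'B'
            :                      'null';
    $count{$ARGV}{$key}++;
}

# One output row per file, printed after all input has been read.
my $filenum = 0;
for my $file (@order) {
    $filenum++;
    my ( $A, $B, $null ) = map { $count{$file}{$_} || 0 } qw(A B null);
    print "$filenum\t$A\t$B\t$null\n";
}
```

    With the sample data this prints one row per file (2, 1, 1 for the first and 0, 1, 2 for the second). Note that a completely empty input file never shows up in %count, since no line is ever read from it.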
Re^3: Piping many individual files into a single perl script
by blazar (Canon) on Sep 29, 2008 at 14:45 UTC

    I like BrowserUk's solution below, except that I'd probably rewrite it (I mostly didn't like the chained if-elsif's) in a manner similar to this (untested):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use 5.010;

    BEGIN{ @ARGV=map glob, @ARGV }

    say "File Number A A% B B% Null Null%";

    my $default = '';  # set to something sensible, the empty string seems good.
    my @allowed = (qw/stringa stringb/, $default);
    my (%count, $filenum);

    while(<>) {
        chomp;
        $count{$_ ~~ @allowed ? $_ : $default}++;
        if (eof) {
            $filenum++;
            say "$filenum ", join ' ' =>
                map { my $x=$count{$_} // 0; $x, sprintf('%.2f', $x/1000) } @allowed;
            @count{@allowed}=(0) x @allowed;
        }
    }
    __END__

    I threw in some 5.10-isms in the course of doing so, but it wouldn't be terribly different with pre-5.10 exists.
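
    For the record, a pre-5.10 take on the membership test could use a lookup hash and exists in place of ~~. This is only a sketch: %is_allowed and bucket_for are names made up here, and the input lines are an in-memory sample rather than real files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $default = '';
my @allowed = ( qw/stringa stringb/, $default );

# Lookup hash: one key per allowed string, so membership is a plain exists test.
my %is_allowed = map { $_ => 1 } @allowed;

# Hypothetical helper: return the line itself if it is allowed,
# otherwise the default bucket.
sub bucket_for {
    my ($line) = @_;
    return exists $is_allowed{$line} ? $line : $default;
}

# In-memory sample input, made up for illustration.
my %count;
$count{ bucket_for($_) }++ for qw(stringa stringb stringa something_else);

print join( ' ', map { $count{$_} || 0 } @allowed ), "\n";    # prints 2 1 1
```

    Inside the while(<>) loop the only change would be $count{ bucket_for($_) }++ in place of the smart-match expression.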

    --
    If you can't understand the incipit, then please check the IPB Campaign.