MadraghRua has asked for the wisdom of the Perl Monks concerning the following question:

Hello PerlMonks,
I would like to open a file and read it line by line. I've created a subroutine to do this which takes the file path as an input variable. A file handle is then opened for this file.

As the file is read, I would like to count the number of occurrences of the word 'Algorithm' at the beginning of a line. Each time it occurs, I want to increment a counter.

If the words 'Experiment Name' occur at the beginning of a line, the loop exits and the subroutine returns the value of the counter.

sub getExpNumber {
    my ($filePath) = @_;
    my ($n) = 0;

    open (INPUT, $filePath) or die "Can't open $filePath: $!\n";
    while (<INPUT>) {
        chomp;
        if (<INPUT> =~ /^Experiment Name/) {
            print "There are $n experiments in $filePath: $!\n";
        } elsif (<INPUT> =~ /^Algorithm/) {
            $n++;
            next;
        } else {
            next;
        }
        close(INPUT) or die "Can't close $filePath.$!\n";
    }
    return $n;
}
My problem is that it only counts the first occurrence of 'Algorithm'; the rest are ignored. Does anyone have a suggestion on what to change?

I am running with use strict and I am using the -w on the #! line

Thanks

MadraghRua
yet another biologist hacking perl....

Replies are listed 'Best First'.
Re: Counting occurances of a pattern in a file
by btrott (Parson) on Aug 31, 2000 at 02:51 UTC
    Your flow is off. Make finding "Experiment Name" the exception rather than the norm; don't use next to say when you want to repeat the loop, use last to break out of it. And put the close outside of the loop, as well.

    Your other problem is that you're reading the line multiple times in the same loop, then discarding some of the results. Each time you read from the file using the diamond ops (<>), you're reading a line. A *new* line.

    Try this:

    sub getExpNumber {
        my $filePath = shift;
        my $n = 0;

        open (INPUT, $filePath) or die "Can't open $filePath: $!\n";
        # Read from INPUT, the handle opened above; each line lands in $_.
        while (<INPUT>) {
            if (/^Experiment Name/) {
                last;
            } elsif (/^Algorithm/) {
                $n++;
            }
        }
        close(INPUT) or die "Can't close $filePath: $!\n";
        print "There are $n experiments in $filePath\n";
        return $n;
    }
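To see the point about every <INPUT> reading a *new* line, here is a minimal sketch. It assumes Perl 5.8 or later so that it can read from an in-memory filehandle; the sample text and the variable names are made up for illustration and are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical sample data: six lines, read through an in-memory filehandle.
my $text = "line 1\nline 2\nline 3\nline 4\nline 5\nline 6\n";
open my $fh, '<', \$text or die "Can't open in-memory file: $!";

my @seen;
while (my $first = <$fh>) {
    # A second read inside the loop body consumes the *next* line,
    # so the loop condition never gets to see it.
    my $second = <$fh>;
    chomp $first;
    chomp $second if defined $second;
    push @seen, [ $first, $second ];
}
close $fh;

# Six lines in the data, but the loop body ran only three times.
print "loop ran ", scalar(@seen), " times\n";
print "pair: $_->[0] / $_->[1]\n" for @seen;
```

With two reads per pass the loop only ever tests half the lines in its condition; the original code had up to three reads per pass, which is why most 'Algorithm' lines slipped by.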
Re: Counting occurances of a pattern in a file
by fundflow (Chaplain) on Aug 31, 2000 at 02:52 UTC
    You are reading a new line each time <INPUT> appears. Change it to something like:
    while(<INPUT>) { $n+=/^Algorithm/; last if /^Experiment/ }
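The `$n += /^Algorithm/` trick relies on a match in scalar context returning 1 on success and a false value that counts as 0 on failure, so it adds nothing on non-matching lines and does not warn under -w. A small sketch, with a hypothetical helper name and made-up sample lines:

```perl
use strict;
use warnings;

# count_algorithms is a hypothetical helper, named here only for illustration.
sub count_algorithms {
    my @lines = @_;
    my $n = 0;
    for (@lines) {
        $n += /^Algorithm/;       # adds 1 on a match, 0 otherwise
        last if /^Experiment/;    # stop at the first Experiment line
    }
    return $n;
}

my @sample = (
    "Algorithm A\n",
    "Something else\n",
    "Algorithm B\n",
    "Experiment Name X\n",
    "Algorithm C\n",              # never reached
);
print count_algorithms(@sample), "\n";   # prints 2
```

It is compact, though `$n++ if /^Algorithm/` says the same thing more plainly.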
RE: Counting occurances of a pattern in a file
by BlaisePascal (Monk) on Aug 31, 2000 at 02:53 UTC
    You are definitely overusing <INPUT>. Each time you do that, it reads another line... Your loop only checks every 3rd line for /^Algorithm/. It chomps the first line, checks the second one for /^Experiment Name/, and checks the third for /^Algorithm/.

    I'd write that subroutine as:

    sub getExpNumber {
        my $filePath = shift;
        my $n = 0;    # start at 0 so the print below never sees undef
        open (INPUT, "<$filePath") or die "Can't open $filePath: $!\n";
        while (<INPUT>) {
            chomp;
            last if /^Experiment Name/;
            $n++ if /^Algorithm/;
        }
        print "There are $n experiments in $filePath\n";
        close(INPUT) or die "Can't close $filePath: $!\n";
        return $n;
    }
    /pattern/ works on $_ if nothing is bound to it, and while(<FILE>) {...} puts the line read into $_. So the two statements that have if /pattern/ at the end match the line read.
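As a rough illustration of that last point, the two subs below should be equivalent: one leans on $_, the other names the line variable and binds it with =~. The sub names and the sample text are hypothetical, and the in-memory filehandle assumes Perl 5.8 or later.

```perl
use strict;
use warnings;

my $text = "Algorithm one\nnoise\nAlgorithm two\nExperiment Name foo\nAlgorithm three\n";

# Implicit form: while (<$fh>) puts the line in $_, and a bare /pattern/ tests $_.
sub count_implicit {
    my ($data) = @_;
    open my $fh, '<', \$data or die "Can't open in-memory file: $!";
    my $n = 0;
    local $_;    # keep the caller's $_ intact
    while (<$fh>) {
        last if /^Experiment Name/;
        $n++ if /^Algorithm/;
    }
    close $fh;
    return $n;
}

# Explicit form: the same logic with a named variable bound via =~.
sub count_explicit {
    my ($data) = @_;
    open my $fh, '<', \$data or die "Can't open in-memory file: $!";
    my $n = 0;
    while (my $line = <$fh>) {
        last if $line =~ /^Experiment Name/;
        $n++ if $line =~ /^Algorithm/;
    }
    close $fh;
    return $n;
}

print count_implicit($text), " ", count_explicit($text), "\n";   # prints "2 2"
```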
Re: Counting occurances of a pattern in a file
by gnat (Beadle) on Aug 31, 2000 at 02:54 UTC
    There are a lot of problems. You read multiple lines from the file, you close() inside the while() loop, .... Here's what you want:
    while (<INPUT>) { $n++ if /^Algorithm/; last if /^Experiment Name/; } close INPUT;
    Cheers;

    Nat

RE: Counting occurances of a pattern in a file
by Adam (Vicar) on Aug 31, 2000 at 02:58 UTC
    Untested, and should be treated as pseudocode:

    use strict; # Always
    use IO::File; # I like to localize filehandles.

    sub CountPatternInFile {
        my ( $filename, $pattern, $stop_pattern ) = @_;
        my $fh = IO::File->new( $filename ) or die $!;
        my $count = 0;
        while( <$fh> ) {
            # s///g returns the number of substitutions made on this line
            $count += s/($pattern)/$1/g;
            last if m/$stop_pattern/;
        }
        close $fh or die $!;
        return $count;
    }
      Don't use IO::File.

      Why not?

      1. Twice as slow as native handles.
      2. You don't need it.
      3. I have tripped over bugs when using IO::* that I didn't with regular IO. Don't remember the combination of things that hit it, but procedural IO worked and the other did not.
      If you want lexical filehandles use the technique that I did at RE (tilly) 1: Merging files. Or with 5.6 don't worry about it because handles autovivify for you. (Just part of the ongoing improvements. :-)
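For the 5.6 route mentioned above, a minimal sketch of the original subroutine with an autovivified lexical handle and three-argument open might look like this. The temporary file and its contents are made up purely to exercise the sub; File::Temp is used here as an assumption about what is available on your install.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

sub getExpNumber {
    my ($filePath) = @_;
    # open() creates the handle in $fh; it is visible only inside this sub.
    open(my $fh, '<', $filePath) or die "Can't open $filePath: $!\n";
    my $n = 0;
    while (my $line = <$fh>) {
        last if $line =~ /^Experiment Name/;
        $n++ if $line =~ /^Algorithm/;
    }
    close($fh) or die "Can't close $filePath: $!\n";
    return $n;
}

# Hypothetical sample file, written only so the sub has something to read.
my ($tmp, $tmpName) = tempfile(UNLINK => 1);
print {$tmp} "Algorithm: a\nAlgorithm: b\nExperiment Name: x\nAlgorithm: c\n";
close($tmp) or die "Can't close $tmpName: $!\n";

print getExpNumber($tmpName), "\n";   # prints 2
```

No IO::File and no global INPUT handle, so two calls to the sub cannot trip over each other's filehandle.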
      Or without a loop (also untested):
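      use strict; # Always
      use IO::File; # I like to localize filehandles.

      sub CountPatternInFile {
          local $/; # slurp mode: read the whole file in one go
          my ( $filename, $pattern, $stop_pattern ) = @_;
          my $fh = IO::File->new( $filename ) or die $!;
          local $_ = <$fh>; # localized so the caller's $_ is not clobbered
          close $fh or warn $!;
          ($_) = split /$stop_pattern/; # keep only the text before the stop pattern
          # With the whole file in $_, a ^-anchored pattern needs /m to match at each line start.
          return s/($pattern)/$1/g;
      }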
Re: Counting occurances of a pattern in a file
by MadraghRua (Vicar) on Aug 31, 2000 at 03:23 UTC
    Thanks folks! I appreciate the help.

    MadraghRua
    yet another biologist hacking perl....

Re: Counting occurances of a pattern in a file
by pschoonveld (Pilgrim) on Aug 31, 2000 at 14:06 UTC
    Ok, amongst the other errors in your code, you could have easily found this elsewhere on the site. In the future, take the time to look around. Although, I think you would have ended up posting anyway judging from the file errors.
RE: Counting occurances of a pattern in a file
by Anonymous Monk on Aug 31, 2000 at 18:00 UTC
    When you use while (<INPUT>), doesn't each line get put in $_? Try replacing if (<INPUT> =~ /regexp/) with if (/regexp/). If that doesn't work for some reason, here's a cheesier workaround that I think will work:

    my @array = <INPUT>;
    foreach my $line (@array) {
        if ( $line =~ /regexp/ ) {
            # do stuff
        } elsif ( $line =~ /other regexp/ ) {
            # do other stuff
        }
    }

    As a totally unrelated side note, I believe your "next" commands and your final "else" are totally unnecessary. Hope this helps.