in reply to Perl Matching Question

BrowserUK's solution implies that you want to do something with a file if it contains any one of the words 'PASS', 'sweeps', 'Final', but from your post I take it that a file should contain all three. The following code snippet should do that.

    open(FILE, '<', "$some_dir/$file") or die("can't open $file: $!");
    my ($PASSmatch, $sweepsmatch, $Finalmatch);
    while (<FILE>) {
        # independent tests, so several words on one line all count
        $PASSmatch   = 1 if /\bPASS\b/;
        $sweepsmatch = 1 if /\bsweeps\b/;
        $Finalmatch  = 1 if /\bFinal\b/;
    }
    close(FILE);
    if ($PASSmatch && $sweepsmatch && $Finalmatch) {
        # do whatever with this file.
    }

This is not going to win a prize for elegance or generality, but it should be along the lines you want.

Hope this helps, -gjb-

Update: graff makes two good points about the code above: 1) one shouldn't die in a directory scan just because one file can't be read, and 2) it would be more efficient to last out of the while loop as soon as all three words have been found.
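
For what it's worth, here is a minimal sketch of how both points might be folded into one helper. The sub name file_has_all_words and its return convention are my own assumptions for illustration, not something from the thread: it warns and returns undef when a file can't be opened (so the caller can skip it instead of dying), and it stops reading as soon as every word has been seen.

```perl
use strict;
use warnings;

# Hypothetical helper (name and interface are assumptions):
# returns 1 if every word in @words occurs somewhere in $path, 0 if not,
# and undef (after a warning) if the file cannot be opened.
sub file_has_all_words {
    my ($path, @words) = @_;

    open(my $fh, '<', $path) or do {
        warn "can't open $path: $!\n";   # point 1: warn, don't die
        return;
    };

    my %missing = map { $_ => 1 } @words;
    while (my $line = <$fh>) {
        for my $word (keys %missing) {
            delete $missing{$word} if $line =~ /\b\Q$word\E\b/;
        }
        last unless %missing;            # point 2: stop once all words are seen
    }
    close($fh);

    return %missing ? 0 : 1;
}
```

In a directory scan one would then write something like `next unless file_has_all_words("$some_dir/$file", qw(PASS sweeps Final));`, which skips both unreadable files and files missing a word.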

Re: Re: Perl Matching Question
by graff (Chancellor) on Sep 11, 2003 at 06:19 UTC
    I agree with your reading of the post (I wonder if biggin777 does too...), and with your approach (well, erm, if you're scanning through a directory, then it seems a bit harsh to die because open fails on a given file).

    Anyway, since the files are big, it would be nice to exit the while loop ASAP. Granting that all three conditions need to be met to trigger further processing, there's no point in keeping track of them separately:

        my @keepers;
        opendir(DIR, $some_dir) || die "can't opendir $some_dir: $!";
        while (defined(my $file = readdir(DIR))) {
            my $path = "$some_dir/$file";   # readdir returns bare file names
            if ($file =~ /Something/ and -f $path and open(FILE, $path)) {
                my $pass = 0;
                while (<FILE>) {
                    $pass++ if (/\b(?:PASS|sweeps|Final)\b/);
                    last if ( $pass == 3 );
                }
                close FILE;
                push @keepers, $file if ( $pass == 3 );
            }
        }
        closedir DIR;
        # now do whatever needs to be done with @keepers.

    (or maybe something needs to be done with @keepers in that same while loop? but that might complicate things a lot; perhaps there'll be another question from biggin777 about that in a little while...)

    update: Thanks to AM's very astute reply below, I see where it might be important to keep track of each different condition separately. To keep it brief, I would just set a different bit for each condition:

        ...
        my $pass = 0;
        while (<FILE>) {
            $pass |= 1 if ( /\bPASS\b/ );
            $pass |= 2 if ( /\bsweeps\b/ );
            $pass |= 4 if ( /\bFinal\b/ );
            if ( $pass == 7 ) {
                push @keepers, $file;
                last;
            }
        }
        ...
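
To make the difference between the counter and the bit mask concrete, here is a small sketch that runs both over in-memory lines instead of files. The sub names count_matches and mask_matches and the sample data are assumptions made up for the illustration; the loop bodies follow the two versions in this reply.

```perl
use strict;
use warnings;

# Counter version: counts matching *lines*, not distinct words.
sub count_matches {
    my $pass = 0;
    for (@_) {
        $pass++ if /\b(?:PASS|sweeps|Final)\b/;
        last if $pass == 3;
    }
    return $pass == 3 ? 1 : 0;
}

# Bit-mask version: one bit per word, so each word is tracked separately.
sub mask_matches {
    my $pass = 0;
    for (@_) {
        $pass |= 1 if /\bPASS\b/;
        $pass |= 2 if /\bsweeps\b/;
        $pass |= 4 if /\bFinal\b/;
        last if $pass == 7;
    }
    return $pass == 7 ? 1 : 0;
}

my @one_line  = ("PASS sweeps Final\n");        # all three words on one line
my @only_pass = ("PASS\n", "PASS\n", "PASS\n"); # one word, three times

printf "one line:  counter=%d mask=%d\n",
    count_matches(@one_line), mask_matches(@one_line);    # counter=0 mask=1
printf "only PASS: counter=%d mask=%d\n",
    count_matches(@only_pass), mask_matches(@only_pass);  # counter=1 mask=0
```

The counter gets both cases wrong (it rejects the first and accepts the second), while the mask only reaches 7 once each of the three words has actually been seen.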

      With the $pass++ code, a file that has PASS sweeps Final all on the same line fails. Also, you do not record which words you've seen, so a file that just says PASS on three lines would be accepted. Too bad the files are so big; otherwise you could have just one line inside your while ($file ... loop:

      push @keepers, $file if $file=~/Something/ and -f $file and 3 == @{ [ do { my @a= (my $temp = do{local(*ARGV,$/)=[$file];<>}) =~ /\b(PASS|sweeps|Final)\b/g;my %b;undef @b{@a};keys %b } ] };

      Yeah, no error checking or anything. :)

      Anonymously yours
      Anonymous Monk