in reply to search pattern with digits

I gather that both the search phrase and the position of the number relative to it (before or after the phrase) will be determined at run time.

How is your program going to be notified of the search phrase? Is there some way it can also be notified of the positioning of the numerical data?

If you have any control over the earlier stage where the search phrase is identified, ask to have the search phrase specified as a regular expression complete with capturing parentheses.

    use Carp;

    sub capture_numbers {
        my ( $search_phrase ) = @_;
        croak "Search phrase $search_phrase lacks capturing parentheses"
            if $search_phrase !~ /\(/;
        my @results;
        while ( <LOG_FILE> ) {
            # No /o here, so a different phrase on a later call is honored.
            my ( $desired_number ) = /$search_phrase/;
            push @results, $desired_number if defined $desired_number;
        }
        return @results;
    }

This sub accepts either of your two example search phrases with their differing syntax:
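The two example phrases are not reproduced in this post. As a sketch only, assuming they resemble the number-after and number-before forms in the sample log posted further down the thread, they could be written as patterns with capturing parentheses. The sub name capture_from_lines and the in-memory @lines are hypothetical stand-ins for the filehandle loop, used here so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: same idea as capture_numbers, but it reads from a
# list of lines instead of a filehandle so it can run standalone.
sub capture_from_lines {
    my ( $search_phrase, @lines ) = @_;
    my @results;
    for my $line (@lines) {
        # The pattern's own parentheses decide where the number sits.
        next unless $line =~ /$search_phrase/;
        push @results, $1;
    }
    return @results;
}

# Assumed sample lines, modelled on the log file posted later in the thread.
my @lines = (
    "total rejected rows: 100\n",
    "90 rows rejected for sub\n",
);

# Number after the phrase.
my @after  = capture_from_lines( 'total rejected rows:\s*(\d+)', @lines );
# Number before the phrase.
my @before = capture_from_lines( '(\d+)\s+rows rejected for sub', @lines );

print "after: @after, before: @before\n";   # after: 100, before: 90
```

Because the caller supplies the parentheses, the sub itself never needs to know on which side of the phrase the number appears.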

Re^2: search pattern with digits
by mercuryshipz (Acolyte) on Feb 14, 2008 at 22:31 UTC
    I'm posting the code to give a clearer idea...
    #!/usr/bin/perl
    # use strict;
    #use warnings;
    use List::Util q{first};

    sub search_phrase{
        my @array;
        my ( $inFile, @phrases ) = @_;
        my $lastPhrase = $phrases[ -1 ];
        open my $inFH, q{<}, $inFile or die qq{open: $inFile: $!\n};
        my @lines = <$inFH>;
        close $inFH or die qq{close: $!\n};

        foreach my $phrase ( @phrases ) {
            my $rxPhrase = qr{\Q$phrase\E};
            my $lineNo = first { $lines[ $_ ] =~ $rxPhrase } 0 .. $#lines;
            unless ( defined $lineNo ) {
                next;
            }
            print "" if ($lines[ $lineNo ] =~ m{\Q$lastPhrase\E\s*(\d*)});
            push (@array,$1);
            $lineNo ++;
            splice @lines, 0, $lineNo;
        }
        return (@array);
    }

    my $file_n = "test.txt";
    my $phrase1 = "total rows rejected:";
    my $phrase2 = "total rejected recors:";
    my $phrase3 = "rows rejected for sub";
    my $phrase4 = "total rejected rows:";
    my @newarray=search_phrase($file_n, $phrase1, $phrase2, $phrase3, $phrase4);
    my $count=($#newarray);
    $newarray[$count]=~ s/\s+//g;
    if ((($#newarray+1)>=1) && ($newarray[$count]gt 0)) {
        print "$newarray[$count]\n";
    }
    else {
        print "-1\n";
    }

    The log file is searched for the phrases in the order they are given, and the value for the last phrase is returned. If the last phrase is not present in the log file, or not in that sequence, it returns -1. My problem: if you look at the log file, when "rows rejected for sub" is given as the last phrase, the number comes before the phrase, and that number must be returned. This program works only when the number follows the last phrase, not when it precedes it. Also, the phrases given for the search, and their sequence, vary each time depending on the log file.

    log file
    This file is to check the number of occurences of the word reject
    total rows rejected: 80
    this file just contains the phrase reject.
    reject:3
    reject 3
    reject 4
    total rejected rows: 100
    total rejected rows: 90
    total rejected rows: 60
    total rejected rows:40
    total rejected rows:40
    90 rows rejected for sub
    999 rows rejected for sub
    100 rows rejected for sub
    Reject_Ao
    total rejected recors: 60
    total rejected rows:49
    reject:1
    390 rows rejected for sub

    thanks.

      Thanks for supplying the code and the sample log file. I saved the log file as 'test.txt' and ran your code. The subroutine returns an array of three undefined values, one for each of the first three search phrases, after which the text file is exhausted, so nothing (not even an undef array entry) is returned for the final search phrase.

      The code is way too busy. You don't need to read the file into an array: you can just iterate one line at a time with while (<$inFH>). You almost never need to use an array index.

      Finally, if your problem is to extract the number from the matching line wherever the number may be, why don't you just use /(\d+)/ to extract the number after you have matched the search phrase?

      #!/usr/bin/perl
      use strict;
      use warnings;

      sub search_phrase {
          my ( $inFile, @phrases ) = @_;
          open my $inFH, q{<}, $inFile or die qq{open: $inFile: $!\n};
          my $line;
          PHRASE:
          foreach my $phrase ( @phrases ) {
              my $rxPhrase = qr{\Q$phrase\E};
              # keep reading down the file
              while ($line = <$inFH>) {
                  # when one phrase matches, jump to the next
                  next PHRASE if $line =~ /$rxPhrase/;
              }
              # end of file, and we haven't matched the last phrase
              return;
          }
          # We have just matched the last phrase.
          # The number we want is somewhere in $line.
          my ($number) = $line =~ /(\d+)/;
          return $number;
      }

      my $file_n = "test.txt";
      my $phrase1 = "total rows rejected:";
      my $phrase2 = "total rejected recors:";
      my $phrase3 = "rows rejected for sub";
      my $result = search_phrase($file_n, $phrase1, $phrase2, $phrase3);
      if (defined $result) {
          print "search_phrase subroutine found $result\n"
      }
      else {
          print "search_phrase subroutine didn't find a number."
      }

      With your sample data, this returns

      search_phrase subroutine found 390
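Note that /(\d+)/ picks up the first run of digits anywhere on the matching line. If a line could carry other digits as well, one option is to require the number to sit directly next to the phrase, on either side. A minimal sketch of that idea follows; the sub name number_near_phrase is hypothetical and not part of the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the number that directly follows or precedes
# $phrase in $line, or undef if there is none.
sub number_near_phrase {
    my ( $line, $phrase ) = @_;
    # Try "phrase 123" first, then "123 phrase".
    return $1 if $line =~ /\Q$phrase\E\s*(\d+)/;
    return $1 if $line =~ /(\d+)\s*\Q$phrase\E/;
    return;
}

print number_near_phrase( "total rejected recors: 60", "total rejected recors:" ), "\n";  # 60
print number_near_phrase( "390 rows rejected for sub", "rows rejected for sub" ),  "\n";  # 390
```

The \Q...\E quoting keeps punctuation in the phrase, such as the colon, from being treated as regex syntax.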

        That's awesome, guys... thanks a lot.


        What if the log file contains a line like this: "Inserted rows - Requested: 9642 Applied: 9642 Rejected: 0"? Is it possible to parse a log file like this?

        #!/usr/bin/perl
        use strict;
        use warnings;

        sub search_phrase {
            my ( $inFile, @phrases ) = @_;
            open my $inFH, q{<}, $inFile or die qq{open: $inFile: $!\n};
            my $line;
            PHRASE:
            foreach my $phrase ( @phrases ) {
                my $rxPhrase = qr{\Q$phrase\E};
                # keep reading down the file
                while ($line = <$inFH>) {
                    # when one phrase matches, jump to the next
                    next PHRASE if $line =~ /$rxPhrase/;
                }
                # end of file, and we haven't matched the last phrase
                return;
            }
            # We have just matched the last phrase.
            # The number we want is somewhere in $line.
            my ($number) = $line =~ /(\d+)/;
            return $number;
        }

        my $file_n = "test.txt";
        my $phrase1 = "Inserted rows";
        my $phrase2 = "Requested:";
        my $phrase3 = "Applied:";
        my $phrase4 = "Rejected:";
        my $result = search_phrase($file_n, $phrase1, $phrase2, $phrase3,$phrase4);
        if (defined $result) {
            print "$result\n"
        }
        else {
            print "-1\n"
        }

        This file is to check the number of occurences of the word reject
        total rows rejected: 80
        Inserted rows - Requested: 9642 Applied: 9642 Rejected: 0
        this file just contains the phrase reject.
        reject:3
        reject 3
        reject 4
        total rejected rows: 100
        total rejected rows: 90
        total rejected rows: 60
        total rejected rows:40
        total rejected rows:40
        90 rows rejected for sub
        999 rows rejected for sub
        100 rows rejected for sub
        Reject_Ao
        total rejected recors: 60
        total rejected rows:49
        reject:1
        390 rows rejected for sub

        thanks.
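For a line like the one quoted above, where several labels share a single line, the phrase-by-phrase loop moves on to the following line after each match, so "Requested:" is never looked for on the line that matched "Inserted rows". One possible approach is to locate the line once and then pull out every "Label: number" pair on it. The sketch below assumes labels are single words followed by a colon; the sub name parse_counts and the returned hash layout are illustrative choices, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: collect every "Label: number" pair found on one line.
sub parse_counts {
    my ($line) = @_;
    my %count;
    # /g in a while loop walks through successive matches on the same line.
    while ( $line =~ /(\w+):\s*(\d+)/g ) {
        $count{$1} = $2;
    }
    return %count;
}

my $line  = "Inserted rows - Requested: 9642 Applied: 9642 Rejected: 0";
my %count = parse_counts($line);

# Test with exists rather than truth: a count of 0 is a valid result.
print exists $count{Rejected} ? "$count{Rejected}\n" : "-1\n";   # prints 0
```

Checking with exists matters here because a test such as "gt 0" would treat a legitimate "Rejected: 0" as a failure and print -1.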