kanikilu has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I'm pretty new to Perl, and I was wondering if I could do this, and if so, if I could get some input from you guys.

What I would like to do is have a script that searches all the .log files in the current directory (where the script is located). It should then look for a certain word in each file. If it finds the word, it should write out a text file that includes the line it was found on, and the two previous lines. It should do this for each instance of the word in each file of that directory.

So if for instance the word was "hello", the text file would look something like this after one instance:

-----------------------------------------
(1) From file nameoffile.log:

.............previous line.............
.............previous line.............
...current line........hello...........
-----------------------------------------

I'm sure this can be done, but I'm just not sure where to start. If it helps, I'm on Windows 2000, using ActiveState's ActivePerl 5.6.1.626

Thanks in advance for any help!

Replies are listed 'Best First'.
Re: Newbie Text Parsing Question
by Abigail (Deacon) on Jul 06, 2001 at 01:18 UTC
    Here's a slight variation. This one prints the lines before and after the line(s) that match (the number of lines is controlled by the variable $range). It uses a circular buffer, but all the functionality of dealing with circularity is hidden inside a tie mechanism.
    #!/opt/perl/bin/perl
    use strict;
    use warnings;

    my $file  = "/usr/dict/words";
    my $word  = "perl";
    my $range = 1;                  # -$range .. $range
    my $size  = 2 * $range + 1;

    sub TIEARRAY  {bless [("") x $_ [1]] => $_ [0]}
    sub STORE     {${$_ [0]} [$_ [1] % @{$_ [0]}] = $_ [2]}
    sub FETCH     {${$_ [0]} [$_ [1] % @{$_ [0]}]}
    sub FETCHSIZE {scalar @{$_[0]}}
    sub STORESIZE {die}

    tie my @buffer => 'main', $size;

    open my $fh => $file or die "Failed to open $file: $!";
    while (<$fh>) {
        $buffer [$.] = $_;
        if ($buffer [$. - $range] =~ /$word/) {
            print @buffer [$. - $size + 1 .. $.];
        }
    }

    # Borderline, matches at the end:
    for my $line ($. - $range + 1 .. $.) {
        print @buffer [$line - $range .. $.] if $buffer [$line] =~ /$word/;
    }
    __END__

    -- Abigail

Re: Newbie Text Parsing Question
by VSarkiss (Monsignor) on Jul 06, 2001 at 00:05 UTC
    Yes and yes.

    If the files are small, the easiest way to "back up" is to read the whole thing into memory. Something like this:

    # Loop over all files with a .log suffix
    foreach my $fn (<*.log>) {
        # Open the file, if possible, and read it all into @f
        open I, $fn or warn("Couldn't open $fn: $!"), next;
        my @f = <I>;
        close I;

        # Go through it a line at a time
        for (my $i = 0; $i < @f; $i++) {
            # If you find "hello" anywhere in the line,
            # back up two lines and print if possible
            if ($f[$i] =~ /hello/) {
                print $f[$i-2] if $i > 1;
                print $f[$i-1] if $i > 0;
                print $f[$i];
            }
            # Note: the conditions matter. Without them a negative
            # index like $f[-2] wraps around to the end of @f, so you'd
            # print lines from the bottom of the file instead.
        }
    }
    If the files are large, this could eat up lots of memory. In that case, you'll have to play games with backing up inside the file, which is trickier. (An exercise for the reader. ;-)
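    If you ever do need that, the trick is to remember where the last couple of lines started with tell() and seek() back to re-read them when a match shows up. A rough, untested sketch (the filename is just a stand-in):

    use strict;
    use warnings;

    my $file = 'big.log';                 # stand-in name, use your own
    open my $fh, '<', $file or die "Couldn't open $file: $!";

    my @starts;                           # offsets of the current + two previous lines
    until (eof $fh) {
        push @starts, tell $fh;           # offset of the line about to be read
        shift @starts while @starts > 3;  # never keep more than three offsets
        my $line = <$fh>;
        next unless $line =~ /hello/;

        my $resume = tell $fh;                # remember where to pick up again
        seek $fh, $starts[0], 0;              # jump back (at most) two lines
        print scalar <$fh> for 1 .. @starts;  # re-read and print the context
        seek $fh, $resume, 0;                 # carry on after the matching line
    }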

    HTH

      Thanks for the reply! The files should be between about 5 and 45 KB. Is this too big? I'll try what you suggested and reply back...
        Thanks! It worked perfectly. I added a couple lines to output it to a file and make the output a little "prettier", but it suits my purposes just fine. And there doesn't seem to be any memory 'issues'... Thanks again.
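        For the curious, the extra bits amounted to a counter and a filehandle, roughly like this (the output filename is just what I picked):

        use strict;

        # the same loop as above, printing to a file instead of the screen
        my $count = 0;
        open OUT, '>', 'results.txt' or die "Couldn't open results.txt: $!";

        foreach my $fn (<*.log>) {
            open I, $fn or warn("Couldn't open $fn: $!"), next;
            my @f = <I>;
            close I;

            for (my $i = 0; $i < @f; $i++) {
                next unless $f[$i] =~ /hello/;
                $count++;
                print OUT '-' x 41, "\n";
                print OUT "($count) From file $fn:\n\n";
                print OUT $f[$i-2] if $i > 1;
                print OUT $f[$i-1] if $i > 0;
                print OUT $f[$i];
                print OUT '-' x 41, "\n";
            }
        }
        close OUT;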
Re: Newbie Text Parsing Question
by Albannach (Monsignor) on Jul 06, 2001 at 00:29 UTC
    If file size is a concern, just keep the recent lines in an array instead of slurping up everything. I just threw this thing together but it seems to do the trick:
    use strict;

    my $fileglob  = shift || '*.pl';
    my $pattern   = shift || 'hello';
    my $keeplines = shift || 3;

    for my $file (<${fileglob}>) {
        unless (open(IN, $file)) {
            warn "Can't read from $file: $!";
            next;
        }
        my @lines;
        while (<IN>) {
            push @lines, $_;
            shift @lines if (@lines > $keeplines);
            if (/$pattern/i) {
                print "--- From $file:---\n@lines\n";
            }
        }
    }

    --
    I'd like to be able to assign to an luser

Re: Newbie Text Parsing Question
by tachyon (Chancellor) on Jul 06, 2001 at 00:31 UTC

    This does the trick by reading line by line and appending each hit to an output file - lets you do really big files without a really big memory :-)

    #!/usr/bin/perl -w
    use strict;

    my $logfile = "/path/to/logfile";
    my $outfile = "/path/to/outfile";
    my $find    = "hello";

    my $second  = '';
    my $first   = '';
    my $line    = 0;

    # allow regex unfriendly chars in $find
    $find = quotemeta $find;

    open (FILE, "<$logfile") or die "Oops perl says $!";
    while (<FILE>) {
        chomp;
        $line++;
        &print_found if /$find/;
        $second = $first;
        $first  = $_;
    }

    sub print_found {
        open (OUT, ">>$outfile") or die "Oops perl says $!";
        print OUT "Line: $line\n";
        print OUT "$second\n$first\n$_\n\n";
        close OUT;
    }

    Hope this helps

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Re: Newbie Text Parsing Question
by Brovnik (Hermit) on Jul 06, 2001 at 00:49 UTC
    My version, edited from the first reply.
    foreach my $fn (<*.log>) {
        # start with 3 entries to ensure 3 lines
        my @fifo = ('', '', '');
        open I, $fn or warn("Couldn't open $fn: $!"), next;
        while (<I>) {
            # Add current line on one end and remove the first entry
            push(@fifo, $_);
            shift(@fifo);
            if (/monk/) {
                print '-' x 40, $/;
                print "From file [$fn]:\n\n";
                print @fifo;
                print '-' x 40, $/;
            }
        }
        close I;
    }

    --
    Brovnik
Re: Newbie Text Parsing Question
by particle (Vicar) on Jul 06, 2001 at 01:37 UTC
    this ought to let you look at files of any size, and i think the output format is just what you wanted.
    #!/usr/bin/perl -w
    use strict;

    my $ext = '.log';       # extension to look for in filenames
    my $pat = 'bob';        # adjust to value to search for
    my $match;              # track how many matches

    opendir(DH, '.') or die("CANT! $!");            # open directory

    foreach my $diritem ( readdir(DH) ) {           # read directory
        next unless ( -f $diritem &&                # is it a file?
                      $diritem =~ m/$ext$/ );       # does it end in $ext?

        open(FH, '<', $diritem) or die("CANT! $!"); # open the file for reading

        my @buffer;                                 # buffer for previous lines
        push @buffer, scalar <FH> for 1 .. 3;       # create three line buffer

        while(<FH>) {                   # read the file line by line
            push @buffer, $_;           # add line to end of buffer
            shift @buffer;              # remove line from front of buffer

            if( /$pat/ ) {              # did i find the search pattern?
                $match++;               # increment my match count

                # print fancy output, with separator line,
                # match counter, filename,
                # two previous lines and matching line
                print "-----------------------------------------\n";
                print "($match) From file $diritem\n";
                print $buffer[0], $buffer[1], $buffer[2];
            } # if
        } # while

        close FH;                       # close the file
    } # foreach
    aah, that was a fun diversion! i searched for 'bob'. you might like to search for something more useful....

    ~Particle

(Follow-Up)Re: Newbie Text Parsing Question
by Hofmator (Curate) on Jul 06, 2001 at 14:08 UTC

    Reading this question I thought at once of the grep family. Something along the lines of % grep -B 2 pattern *.log > outfile should do the trick. Then I saw Windows 2000, so I thought the PPT or the utilities 'find' or 'findstr' could help out. But sadly none of them supports any of the context options (like -2 or -B 2 or -C) which I considered standard for this kind of utility.
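
    In the meantime, a one-liner does a passable imitation of -B 2 (just a sketch - the pattern and filename are placeholders, and on Windows you have to name the files yourself since cmd.exe won't expand *.log for you):

    perl -ne "push @b, $_; shift @b if @b > 3; print @b if /pattern/" some.log > outfile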

    Can somebody tell me why these were not included in the PPT - to be more precise in tcgrep??

    -- Hofmator