Re: Newbie Text Parsing Question
by Abigail (Deacon) on Jul 06, 2001 at 01:18 UTC
Here's a slight variation. This one prints the lines before and after
each matching line (the number of context lines on each side is
controlled by the variable $range). It uses a circular buffer, but all
the work of dealing with circularity is hidden behind a tie mechanism.
#!/opt/perl/bin/perl
use strict;
use warnings;
my $file  = "/usr/dict/words";
my $word  = "perl";
my $range = 1;                # -$range .. $range
my $size  = 2 * $range + 1;
sub TIEARRAY  {bless [("") x $_ [1]] => $_ [0]}
sub STORE     {${$_ [0]} [$_ [1] % @{$_ [0]}] = $_ [2]}
sub FETCH     {${$_ [0]} [$_ [1] % @{$_ [0]}]}
sub FETCHSIZE {scalar @{$_[0]}}
sub STORESIZE {die}
tie my @buffer => 'main', $size;
open my $fh => $file or die "Failed to open $file: $!";
while (<$fh>) {
    $buffer [$.] = $_;
    if ($buffer [$. - $range] =~ /$word/) {
        print @buffer [$. - $size + 1 .. $.];
    }
}
# Borderline, matches at the end:
for my $line ($. - $range + 1 .. $.) {
    print @buffer [$line - $range .. $.] if $buffer [$line] =~ /$word/;
}
__END__
-- Abigail
Re: Newbie Text Parsing Question
by VSarkiss (Monsignor) on Jul 06, 2001 at 00:05 UTC
Yes and yes.
If the files are small, the easiest way to "back up" is to read the whole thing into memory. Something like this:
# Loop over all files with a .log suffix
foreach my $fn (<*.log>)
{
    # Open the file, if possible, and read it all into @f
    open I, $fn or warn("Couldn't open $fn: $!"), next;
    my @f = <I>;
    close I;
    # Go through it a line at a time
    for (my $i = 0; $i < @f; $i++)
    {
        # If you find "hello" anywhere in the line,
        # back up two lines and print if possible
        if ($f[$i] =~ /hello/)
        {
            print $f[$i-2] if $i > 1;
            print $f[$i-1] if $i > 0;
            print $f[$i];
        }
        # Note: keep the conditions. Without them a match on
        # the first or second line would index $f[-1] or $f[-2],
        # which count from the end of the array.
    }
}
If the files are large, this could eat up lots of memory. In that case, you'll have to play games with backing up inside the file, which is trickier. (An exercise for the reader. ;-)
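One way to read "backing up inside the file" is to remember the byte offsets where recent lines start and seek back to them on a match. The following is only a minimal sketch under that assumption, not the poster's code: the helper name matches_with_backup and the in-memory sample data are made up for illustration, and a real log would be opened by name instead.

```perl
use strict;
use warnings;

# Hypothetical helper: returns each matching line together with up to two
# lines before it, re-read by seeking backwards rather than kept in memory.
sub matches_with_backup {
    my ($fh, $pattern) = @_;
    my @starts;    # byte offsets where the most recent lines start
    my @found;
    while (1) {
        my $pos  = tell $fh;               # offset of the line about to be read
        my $line = <$fh>;
        last unless defined $line;
        push @starts, $pos;
        shift @starts if @starts > 3;      # current line plus two before it
        next unless $line =~ /$pattern/;
        my $resume = tell $fh;             # where to continue afterwards
        seek $fh, $starts[0], 0 or die "seek failed: $!";
        my $context = '';
        $context .= <$fh> for 1 .. @starts;   # re-read the remembered lines
        push @found, $context;
        seek $fh, $resume, 0 or die "seek failed: $!";
    }
    return @found;
}

# Demo on an in-memory file (sample data is invented for the example).
my $data = "one\ntwo\nhello three\nfour\nhello five\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
print "$_---\n" for matches_with_backup($fh, qr/hello/);
```

Only three offsets are held at any time, so memory use does not grow with file size; the cost is an extra seek and re-read for every match.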
HTH
Thanks for the reply! The files should be between about 5 and 45 KB. Is this too big? I'll try what you suggested and reply back...
Thanks! It worked perfectly. I added a couple lines to output it to a file and make the output a little "prettier", but it suits my purposes just fine.
And there doesn't seem to be any memory 'issues'...
Thanks again.
Re: Newbie Text Parsing Question
by Albannach (Monsignor) on Jul 06, 2001 at 00:29 UTC
If file size is a concern, just keep the recent lines in
an array instead of slurping up everything. I just
threw this thing together but it seems to do the trick:
use strict;
my $fileglob = shift || '*.pl';
my $pattern = shift || 'hello';
my $keeplines = shift || 3;
for my $file (<${fileglob}>) {
    unless(open(IN, $file)) { warn "Can't read from $file: $!"; next; }
    my @lines;
    while(<IN>) {
        push @lines, $_;
        shift @lines if(@lines > $keeplines);
        if(/$pattern/i) {
            # Print as a list: interpolating "@lines" would put a
            # space in front of every line after the first.
            print "--- From $file:---\n", @lines, "\n";
        }
    }
    close IN;
}
--
I'd like to be able to assign to an luser
Re: Newbie Text Parsing Question
by tachyon (Chancellor) on Jul 06, 2001 at 00:31 UTC
This does the trick by reading one line at a time and keeping only the previous two, which lets you do really big files without a really big memory :-)
#!/usr/bin/perl -w
use strict;
my $logfile = "/path/to/logfile";
my $outfile = "/path/to/outfile";
my $find = "hello";
my $second = '';
my $first = '';
my $line = 0;
# allow regex unfriendly chars in $find
$find = quotemeta $find;
open (FILE, "<$logfile") or die "Oops perl says $!";
while (<FILE>) {
    chomp;
    $line++;
    &print_found if /$find/;
    $second = $first;
    $first = $_;
}
sub print_found {
    open (OUT, ">>$outfile") or die "Oops perl says $!";
    print OUT "Line: $line\n";
    print OUT "$second\n$first\n$_\n\n";
    close OUT;
}
Hope this helps
cheers
tachyon
s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
Re: Newbie Text Parsing Question
by Brovnik (Hermit) on Jul 06, 2001 at 00:49 UTC
My version, edited from the first reply.
foreach my $fn (<*.log>)
{
    # start with 3 entries to ensure 3 lines
    my @fifo = ('','','');
    open I, $fn or warn("Couldn't open $fn: $!"), next;
    while (<I>)
    {
        # Add current line on one end and remove the first entry
        push(@fifo,$_);
        shift(@fifo);
        if (/monk/)
        {
            print '-'x40 ,$/;
            print "From file [$fn]:\n\n";
            print @fifo;
            print '-'x40 , $/;
        }
    }
    close I;
}
-- Brovnik
Re: Newbie Text Parsing Question
by particle (Vicar) on Jul 06, 2001 at 01:37 UTC
this ought to let you look at files of any size, and i think the output format is just what you wanted.
#!/usr/bin/perl -w
use strict;
my $ext = '.log';      # extension to look for in filenames
my $pat = 'bob';       # adjust to value to search for
my $match;             # track how many matches
opendir(DH, '.') or die("CANT! $!");              # open directory
foreach my $diritem ( readdir(DH) ) {             # read directory
    next unless ( -f $diritem &&                  # is it a file?
                  $diritem =~ m/\Q$ext\E$/ );     # does it end in $ext?
    open(FH, '<', $diritem) or die("CANT! $!");   # open the file for reading
    # start with an empty three line buffer, so that the
    # first three lines of the file are tested as well
    my @buffer = ('') x 3;
    while(<FH>) {                                 # read the file line by line
        push @buffer, $_;                         # add line to end of buffer
        shift @buffer;                            # remove line from front of buffer
        if( /$pat/ ) {                            # did i find the search pattern?
            $match++;                             # increment my match count
            # print fancy output, with
            # separator line,
            # match counter, filename,
            # two previous lines and matching line
            print "-----------------------------------------\n";
            print "($match) From file $diritem\n";
            print $buffer[0], $buffer[1], $buffer[2];
        } # if
    } # while
    close FH;                                     # close the file
} # foreach
aah, that was a fun diversion! i searched for 'bob'. you might like to search for something more useful....
~Particle
(Follow-Up)Re: Newbie Text Parsing Question
by Hofmator (Curate) on Jul 06, 2001 at 14:08 UTC
Reading this question I thought at once of the grep family. Something
along the lines of
% grep -B 2 pattern *.log > outfile
should do the trick.
Then I saw Windows 2000, so I thought the PPT
or the utilities 'find' or 'findstr' could help out. But sadly
none of them support any of the context options (like
-2 or -B 2 or -C) which I considered standard for this
kind of utility.
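Where no grep with context options is at hand, the effect of -B 2 can be approximated in a few lines of Perl. This is only a rough sketch and not taken from tcgrep or any other tool: the helper name grep_before and the sample data are invented for the example, and unlike GNU grep it prints no "--" separator between groups of lines.

```perl
use strict;
use warnings;

# Hypothetical helper: return the matching lines from $fh, each preceded
# by up to $before lines of context, never returning a line twice.
sub grep_before {
    my ($fh, $pattern, $before) = @_;
    my (@pending, @out);    # @pending: recent lines not yet returned
    while (my $line = <$fh>) {
        if ($line =~ /$pattern/) {
            push @out, @pending, $line;   # context first, then the match
            @pending = ();
        }
        else {
            push @pending, $line;
            shift @pending if @pending > $before;
        }
    }
    return @out;
}

# Demo on an in-memory file (sample data is invented for the example).
my $log = "alpha\nbeta\ngamma\nhello 1\nhello 2\ndelta\n";
open my $fh, '<', \$log or die "Cannot open in-memory file: $!";
print grep_before($fh, qr/hello/, 2);
```

Clearing the pending lines after each match is what keeps adjacent matches from repeating their shared context.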
Can somebody tell me why these were not included in the
PPT - to be more
precise in tcgrep??
-- Hofmator