Re: grab 'n' lines from a file above and below a /match/

Obligatory Tie::File solution. I haven't done any benchmarks, but I would guess it's as fast as the other Perl solutions while being less memory-intensive.

As others have said, /bin/grep is the way to go here.

#!/usr/bin/perl 

use strict;
use warnings;
use Tie::File;
use Fcntl 'O_RDONLY';

my $DEBUG = 0;
my $text = qr/c9391b56-b174-441b-921c-7d63/;
my $file = 'GWSvc.log';
my $context = 3;

sub dprint { print @_ if $DEBUG };

my @lines;
tie @lines, 'Tie::File', $file, mode => O_RDONLY 
    or die "tie failed: $!";

for (my $i = 0; $i <= $#lines; $i++) {
    dprint "SCAN: line $i\n";
    if ($lines[$i] =~ /$text/) {
        dprint "MATCH at line $i\n";
        my $start = $i - $context;
        if ($start < 0) {
            $start = 0;
        };
        my $end = $i + $context;
        $end = $#lines if $end > $#lines;   # don't run past the last line of the file
        for my $j ($start .. $end) {
            dprint "$j: ";
            print "$lines[$j]\n";
        };
        print "\n";
        $i += $context;
    };
};

Re^2: grab 'n' lines from a file above and below a /match/ by Aristotle (Chancellor) on Sep 17, 2004 at 06:23 UTC
It's actually slower and more memory intensive than any of the other solutions. Tie::File internally keeps a list of byte offsets for all the lines, and it needs a lot of additional overhead that is supposed to optimize writes, which you never make any use of.

Your code also doesn't get the edge cases right: if there's a match within less than `$context` lines of the previous, it will be missed.

You gave me an idea with regards to memory consumption, though:

#!/usr/bin/perl
use strict;
use warnings;

use Fcntl qw( :seek );

my $rx = qr/c9391b56-b174-441b-921c-7d63/;

my $to_print = 0;
my $context = 10;
my @offs = ( 0 ) x ( 1 + $context );

while(<>) {
    my $context_start = shift @offs;
    my $here = tell ARGV;
    push @offs, $here;
    if( /$rx/ ) {
        if( not $to_print ) {
            my $length = $here - $context_start;
            seek ARGV, $context_start, SEEK_SET;
            read ARGV, $_, $length;
        }
        $to_print = 1 + $context;
    }
    --$to_print, print if $to_print;
}

This only needs to keep `$context` offsets in memory.

Update: fixed bugs. It was `( 0 ) x $context` which gave one too few lines of before-context and `$here - $context_start + length` which of course ate too much input, but that wasn't obvious with my test data. Oopsie.

Makeshifts last the longest.
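For comparison, here is a minimal sketch of a different technique from the offset-based code in this reply: a sliding window that holds the last `$context` lines themselves rather than their offsets. It is not Aristotle's method. It trades a little more memory for a single pass with no seeking, so it should also work on pipes. The helper name `grep_context` and the sample data are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns every line that matches $rx plus up to
# $context lines before and after it, each line at most once.
sub grep_context {
    my ($fh, $rx, $context) = @_;
    my @before;       # sliding window of the last $context unprinted lines
    my $after = 0;    # lines of trailing context still owed
    my @out;
    while (my $line = <$fh>) {
        if ($line =~ $rx) {
            push @out, @before, $line;    # flush leading context, then the hit
            @before = ();
            $after  = $context;
        }
        elsif ($after > 0) {
            push @out, $line;             # trailing context of an earlier hit
            $after--;
        }
        else {
            push @before, $line;
            shift @before if @before > $context;
        }
    }
    return @out;
}

# Usage: an in-memory file handle stands in for the log file.
my $data = join '', map { "line $_\n" } 1 .. 12;
open my $fh, '<', \$data or die "open failed: $!";
print grep_context($fh, qr/line (?:5|7)$/, 1);
```

Because a match inside the trailing context of an earlier match simply resets the counter, nearby hits share one block of output and no line is printed twice.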
Re^3: grab 'n' lines from a file above and below a /match/ by mrpeabody (Friar) on Sep 20, 2004 at 03:07 UTC
> It's actually slower and more memory intensive than any of the other solutions. Tie::File internally keeps a list of byte offsets for all the lines, and it needs a lot of additional overhead that is supposed to optimize writes, which you never make any use of.

Oops. Guessed wrong, then.

> Your code also doesn't get the edge cases right: if there's a match within less than $context lines of the previous, it will be missed.

That was intentional, and it depends on your definition of "missed". That hit will be printed with the context of the previous hit. Changing the behavior would just require removing the line: `$i += $context;`
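Removing the skip gives each hit its own block, but nearby hits then print their shared lines twice. One possible middle ground, sketched here under assumptions and not taken from either post, is to collect the match indexes first and merge overlapping windows before printing. The helper `context_ranges` and the sample array are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn match indexes into merged [start, end] ranges,
# clamped to 0 .. $last, so every hit gets full context without duplicates.
sub context_ranges {
    my ($matches, $context, $last) = @_;
    my @ranges;
    for my $i (sort { $a <=> $b } @$matches) {
        my $start = $i - $context;
        $start = 0 if $start < 0;
        my $end = $i + $context;
        $end = $last if $end > $last;
        if (@ranges && $start <= $ranges[-1][1] + 1) {
            # overlaps or touches the previous range: extend it
            $ranges[-1][1] = $end if $end > $ranges[-1][1];
        }
        else {
            push @ranges, [ $start, $end ];
        }
    }
    return @ranges;
}

# Usage: a plain array stands in for the tied @lines.
my @lines   = map { "line $_" } 0 .. 19;
my @matches = grep { $lines[$_] =~ /line (?:2|4|15)$/ } 0 .. $#lines;
for my $r (context_ranges(\@matches, 3, $#lines)) {
    print "$lines[$_]\n" for $r->[0] .. $r->[1];
    print "\n";
}
```

With a context of 3, the hits at indexes 2 and 4 collapse into one block covering 0 through 7, and the hit at 15 gets its own block covering 12 through 18.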

