in reply to Grab 3 lines before and 2 after each regex hit
I can think of two approaches off the top of my head; other monks will likely have more ways to do it (TIMTOWTDI):
1. Read multiple lines at a time (if your files aren't too big, just read the whole file), then write a regex that matches your pattern and also captures the lines before and after it. Writing a regex that matches across multiple lines is not too difficult once you've read about the following topics in perlre: the /m and /s modifiers, and the exact meaning of ^, $, ., \s and \n. Also see the perlfaq6 entry "I'm having trouble matching over more than one line. What's wrong?"
Here's a somewhat inelegant regex that captures the lines before and after a match:
    my $input = "line1\nline2\nline3 foo\nline4\nline5";
    my ($before,$match,$after) = $input=~/^
        (?:(.*)\n)?
        (.*foo.*)
        (?:\n
            (?:(.*)\n?)?
        )?
    /xm;
    print "before=<$before>, match=<$match>, after=<$after>\n";
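The same idea can be stretched to the three-before, two-after case from the original question by putting counted quantifiers on the context groups. This is only a rough sketch: first_hit_with_context is a name made up for illustration, it returns just the first hit, and it assumes the whole input is already in one string.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the raw text before the
# first hit, the matching line, and the raw text after it, or an empty list
# when nothing matches.
sub first_hit_with_context {
    my ($text, $re) = @_;
    my ($before, $match, $after) = $text =~ /
        ^ ( (?: .* \n ){0,3} )    # up to 3 complete lines before the hit
        ( .* $re .* )             # the line containing the hit
        ( (?: \n .* ){0,2} )      # up to 2 lines after the hit
    /xm or return;
    return ($before, $match, $after);
}

my $input = "line1\nline2\nline3\nline4\nline5 foo\nline6\nline7\nline8";
my ($before, $match, $after) = first_hit_with_context($input, qr/foo/);
print "before=<$before>\nmatch=<$match>\nafter=<$after>\n";
```

Note that the leading context keeps its trailing newline and the trailing context starts with one, so you would probably want to split both on \n before using them.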
2. Keep a buffer of lines, i.e. an array that always contains the most recent N lines. Such an array can be managed with push and shift. In other words, a sliding window of sorts. This approach would probably be considered less "perlish" than the first, but might be more efficient on large files. Actually, Tie::File may be better than managing the array yourself.
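For what it's worth, here is a minimal sketch of the push/shift window from approach 2, reading line by line so the whole file never has to be in memory. The sub name context_hits and its arguments are my own invention for illustration, and the in-memory filehandle just stands in for a real file.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns one array ref per hit,
# holding up to $n_before lines of leading context, the matching line, and
# up to $n_after lines of trailing context.
sub context_hits {
    my ($fh, $re, $n_before, $n_after) = @_;
    my @window;     # sliding window: the most recent $n_before lines
    my @pending;    # hits still collecting trailing lines: [ $block, $left ]
    my @hits;
    while (my $line = <$fh>) {
        chomp $line;
        # Hand this line to every earlier hit that still wants context.
        for my $p (@pending) {
            push @{ $p->[0] }, $line;
            $p->[1]--;
        }
        @pending = grep { $_->[1] > 0 } @pending;
        if ($line =~ $re) {
            my $block = [ @window, $line ];
            push @hits, $block;
            push @pending, [ $block, $n_after ] if $n_after > 0;
        }
        # Slide the window: add the newest line, drop the oldest ones.
        push @window, $line;
        shift @window while @window > $n_before;
    }
    return @hits;
}

my $text = "line1\nline2\nline3\nline4\nline5 foo\nline6\nline7\nline8\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
print join("\n", @$_), "\n--\n" for context_hits($fh, qr/foo/, 3, 2);
```

Each hit gets its own block, so hits that sit close together will repeat the lines they share. Merging overlapping blocks is left out to keep the sketch short.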
InfiniteSilence just posted an answer that uses the array approach.
Re^2: Grab 3 lines before and 2 after each regex hit
by HarryPutnam (Novice) on Apr 25, 2014 at 21:33 UTC