Re: Grab 3 lines before and 2 after each regex hit

Replies are listed 'Best First'.
Re^2: Grab 3 lines before and 2 after each regex hit (sliding window) by LanX (Saint) on Apr 25, 2014 at 20:31 UTC
> This is a fairly primitive way to do it:

Using a sliding window (safer with huge streams):

```
use strict;
use warnings;
use Data::Dump;

my @window;
push @window, scalar <DATA> for 1..5;    # init

while (my $line = <DATA>) {
    push @window, $line;
    chomp @window;
    if( $window[3] =~ m/[^\d]+\d+/ ){
        dd \@window;
    }
    shift @window;
}

__END__
alpha
beta
something
a07607
b-alpha
b-beta
b-something
b-something else
c-alpha
c-beta
c-somethin
a9706
d-alpha
d-beta
d-something
d-something else
```

Output:

```
["alpha", "beta", "something", "a07607", "b-alpha", "b-beta"]
["c-alpha", "c-beta", "c-somethin", "a9706", "d-alpha", "d-beta"]
```

Cheers Rolf

(addicted to the Perl Programming Language)

Update: maybe more elegant:

```
use strict;
use warnings;
use Data::Dump;

my @window;

while (my $line = <DATA>) {
    push @window, $line;
    next if @window < 6;    # init
    if( $window[3] =~ m/[^\d]+\d+/ ){
        dd \@window;
    }
    shift @window;
}
```

Update: Oh, the latter (more elegant) approach has a clear advantage: if you want to avoid overlapping results, you just need to empty the window after a match and it gets automatically refilled. :)
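The non-overlapping variant mentioned in the last update is not spelled out in the post. A minimal sketch of how it might look follows; the helper name `windows_without_overlap`, the in-memory file handle, and the shortened sample data are assumptions made for illustration, not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: returns one array ref per hit, each holding
# 3 lines before the hit, the hit itself, and 2 lines after it.
sub windows_without_overlap {
    my ($fh, $re) = @_;
    my (@window, @hits);
    while (my $line = <$fh>) {
        chomp $line;
        push @window, $line;
        next if @window < 6;       # still filling (or refilling) the window
        if ($window[3] =~ $re) {
            push @hits, [@window]; # store a copy of the six lines
            @window = ();          # empty the window so the next hit cannot overlap
            next;
        }
        shift @window;             # no hit: slide forward by one line
    }
    return @hits;
}

# Usage on an in-memory file handle (sample data adapted from the post)
my $text = join "\n", qw(alpha beta something a07607 b-alpha b-beta
                         b-something c-alpha c-beta c-somethin a9706
                         d-alpha d-beta), '';
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
my @hits = windows_without_overlap($fh, qr/[^\d]+\d+/);
print join(' ', @$_), "\n" for @hits;
```

One consequence of emptying the window is that the first three lines read after a reset are never tested as hits themselves, just as the first three lines of the file are not. Whether that is acceptable depends on the data.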
Re^3: Grab 3 lines before and 2 after each regex hit (sliding window) by HarryPutnam (Novice) on Apr 25, 2014 at 21:48 UTC
The sliding window sounds like another great suggestion. Thank you.
Re^2: Grab 3 lines before and 2 after each regex hit by HarryPutnam (Novice) on Apr 25, 2014 at 19:32 UTC
Your technique answers the need nicely... thank you.

`for(1..$#lines) { if($lines[$_]=~m/[^\d]+\d+/){ print qq~ ....... ... ~;`

I guess that `qq~` operates something like a here document? Can you explain a bit?
Re^3: Grab 3 lines before and 2 after each regex hit by choroba (Cardinal) on Apr 25, 2014 at 19:36 UTC
> Can you explain a bit?

See Quote-Like Operators in perlop - Perl operators and precedence.

لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
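As a rough illustration of what that perlop section describes: `qq` behaves like double quotes with a delimiter of your choosing, so `qq~...~` is simply an interpolating string delimited by tildes. The variable names below are made up for the example.

```perl
use strict;
use warnings;

my $name = 'a07607';

my $double = "hit: $name\n";      # ordinary double quotes
my $tilde  = qq~hit: $name\n~;    # qq with ~ as the delimiter: same string
my $braces = qq{hit: $name\n};    # qq with paired braces: same string again
my $single = q~hit: $name\n~;     # q is the single-quote form: nothing is interpolated

# A here document for comparison; it also interpolates when the tag is in double quotes
my $heredoc = <<"END";
hit: $name
END

print $tilde;
print "all interpolating forms agree\n"
    if $double eq $tilde && $tilde eq $braces && $braces eq $heredoc;
print $single, "\n";
```

The practical appeal of `qq~...~` is that the text between the tildes may span several lines and contain double quotes without escaping, which is why it can look like a here document.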
Re^4: Grab 3 lines before and 2 after each regex hit by HarryPutnam (Novice) on Apr 25, 2014 at 20:26 UTC
http://perldoc.perl.org/perlop.html#Quote-Like-Operators

Egad... I understood about 1/10 of a percent of that. I see the authors have made a fairly extensive effort to make the explanations readable, but it still seems aimed at an audience several good steps above me. Or possibly just a few layers less of `thickskulledness'. I finally resorted to skipping thru and reading every place qq appears. However, I came away mostly with my poor pea brain swimming. I never really recognized your usage in those pages. One small thing that did stay with me: qq means what follows is interpolated... beyond that, and even that itself, sails right over my head.
Re^5: Grab 3 lines before and 2 after each regex hit by choroba (Cardinal) on Apr 25, 2014 at 20:42 UTC
Re^6: Grab 3 lines before and 2 after each regex hit by HarryPutnam (Novice) on Apr 25, 2014 at 21:23 UTC
Re^2: Grab 3 lines before and 2 after each regex hit by HarryPutnam (Novice) on Apr 25, 2014 at 21:00 UTC
Can we go a little deeper into the intended usage of the techniques mentioned in this thread? I haven't understood everything that has been presented, but enough to use some of the information posted and complete a working script for my purpose soon.

There was some talk of slurping sections or even whole files. On that topic, let me explain very briefly what the intended usage is. The code will be used to search and extract thru some fairly massive piles of files at times. Once File::Find is added into the script, it will likely be expected to recurse thru usenet style hierarchies (hierarchies of my own creation, so smaller than real ones) that might consist of as many as 45000-55000 messages in total (not per group).

So, with that scale of usage in mind, would slurping of whole files still be a wise way to go? Or would that be so labor intensive as to make it worthwhile to do it a different way?
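For orientation, here is one possible sketch of combining File::Find with a line-by-line sliding window, so that no more than six lines of any file are held at once. Memory use then depends on the size of each individual file rather than on how many files the hierarchy holds. The helper names `scan_file` and `scan_tree` and the command-line handling are assumptions for illustration only.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: scan one file line by line with a six-line window
sub scan_file {
    my ($path, $re) = @_;
    open my $fh, '<', $path
        or do { warn "Skipping $path: $!\n"; return };
    my (@window, @hits);
    while (my $line = <$fh>) {
        chomp $line;
        push @window, $line;
        next if @window < 6;                         # window not full yet
        push @hits, [@window] if $window[3] =~ $re;  # 3 before, hit, 2 after
        shift @window;
    }
    close $fh;
    return @hits;
}

# Hypothetical helper: walk a directory tree, collecting hits per file
sub scan_tree {
    my ($top, $re) = @_;
    my %hits_for;
    find({
        no_chdir => 1,    # keep the working directory; full paths stay valid
        wanted   => sub {
            return unless -f $File::Find::name;      # plain files only
            my @hits = scan_file($File::Find::name, $re);
            $hits_for{$File::Find::name} = \@hits if @hits;
        },
    }, $top);
    return \%hits_for;
}

# Usage: perl scan.pl /path/to/hierarchy
if (my $top = shift @ARGV) {
    my $found = scan_tree($top, qr/[^\d]+\d+/);
    for my $file (sort keys %$found) {
        print "$file: ", join(' | ', @$_), "\n" for @{ $found->{$file} };
    }
}
```

Because each file is opened, scanned, and closed before the next one is visited, the total number of messages mainly affects run time. Slurping individual message-sized files would likely work as well, since only one file is in memory at a time.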
Re^2: Grab 3 lines before and 2 after each regex hit by locked_user sundialsvc4 (Abbot) on Apr 24, 2014 at 18:25 UTC
Marvelously elegant, if the file-size is not too big ... as these days it is unlikely to be. ++
Re^3: Grab 3 lines before and 2 after each regex hit by Anonymous Monk on Apr 27, 2014 at 06:34 UTC
This is the usual sycophantic crap you post after you've been called out on a series of junk posts filled with lies and bad advice. Anyone reading your post history will be familiar with this pattern.
