how to extract text between 2 strings on separate lines

ozosan has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to write a script that extracts the text between two strings that sit on separate lines. I have a command that does this from the command line, but I am not sure how to do the same in a script so that the results go into a file. Any idea how to do it? Here is the one-liner that works fine:

perl -ne "BEGIN { @ARGV = map glob, @ARGV }; print if /^start\b$/ .. /^end\b$/ " input/*

So far I have this routine, which prints the entire content of the files, which is not what I want :(

use strict;
use warnings;
my $record = "";
opendir (DIR, "C:/Users/input/") or die "$!";
my @files = readdir DIR;
close DIR;
splice (@files,0,2);

open(MYOUTFILE, ">>output/output.txt");
foreach my $file (@files) {
open (CHECKBOOK, "binput/$file")|| die "$!";
while ($record = <CHECKBOOK>) {

     if ($record=~ /^start\b$/ .. /^end\b$/)  {
     print MYOUTFILE "$file;$record\n";

    }
}
   close(CHECKBOOK);

}
close(MYOUTFILE);

Replies are listed 'Best First'.
Re: how to extract text between 2 strings on separate lines by NetWallah (Canon) on Nov 13, 2013 at 16:44 UTC
A couple of other nits: You do not need the "\b" in the regex; /^start$/ provides sufficient bounding. The flip-flop operator does not reset between files. See articles in PM and SO.

Update: Fixed "flop" typo (Thanks, LanX)

When in doubt, mumble; when in trouble, delegate; when in charge, ponder. -- James H. Boren
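To illustrate the point about the flip-flop not resetting between files, here is a minimal sketch of one common workaround: adding an end-of-file test to the right-hand side of the range. The helper name `extract_blocks` and the in-memory sample data are hypothetical and only serve the demonstration; they are not taken from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the lines from "start" through "end" read from $fh.
sub extract_blocks {
    my ($fh) = @_;
    my @kept;
    while (my $line = <$fh>) {
        # eof($fh) becomes true once the last line has been read, which closes
        # any range still open so it cannot leak into the next file.
        if ($line =~ /^start$/ .. ($line =~ /^end$/ || eof($fh))) {
            push @kept, $line;
        }
    }
    return @kept;
}

# Two in-memory "files" (assumed sample data); the first never closes its block.
my $first  = "a\nstart\nb\n";
my $second = "c\nstart\nd\nend\ne\n";

my @out;
for my $content ($first, $second) {
    open my $fh, '<', \$content or die "Cannot open in-memory file: $!";
    push @out, extract_blocks($fh);
    close $fh;
}
print @out;
```

Without the `eof($fh)` test, the range opened in the first sample would still be open when the second one is read, so its leading "c" line would be kept as well.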
Re^2: how to extract text between 2 strings on separate lines by ozosan (Initiate) on Nov 14, 2013 at 09:36 UTC
Thank you very much for your help. Indeed, there was a precedence issue, so $record =~ /^start\b$/ .. $record =~ /^end\b$/ fixed it. Many thanks indeed. As for the "\b" in the regex, yes, you are right, it's not needed. Thank you.
Re: how to extract text between 2 strings on separate lines by Eily (Monsignor) on Nov 13, 2013 at 16:27 UTC
Look at the precedence of operators: `=~` has higher precedence than `..`. So `$record=~ /^start\b$/ .. /^end\b$/` is interpreted as `(scalar $record =~ /^start\b$/) .. (scalar /^end\b$/)` (in this case the scalar keyword doesn't change the result, so you can pretend it is not there to make things clearer). Because match operations work on $_ by default, you actually have:

`(scalar $record =~ /^start\b$/) .. (scalar $_ =~ /^end\b$/)`

This should work better:

`while (<CHECKBOOK>) { if (/^start\b$/ .. /^end\b$/) { print MYOUTFILE "$file;$_\n"; } }`

Or use `$record =~ /^start\b$/ .. $record =~ /^end\b$/` if you want to keep $record, but that looks noisier.
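The precedence difference can be seen in a small sketch that runs both forms over the same lines. The sample data is made up for illustration, and `$_` is set to an empty string as a stand-in for whatever it happens to hold in the real script (there it would typically be undefined and trigger warnings).

```perl
use strict;
use warnings;

# Assumed sample input: one block delimited by "start" and "end".
my @lines = ("a\n", "start\n", "b\n", "end\n", "c\n");

my (@buggy, @fixed);
for my $record (@lines) {
    # Stand-in value for $_; it never matches /^end$/.
    local $_ = '';

    # Right-hand match is applied to $_, not $record, so the range never closes.
    push @buggy, $record if $record =~ /^start$/ .. /^end$/;

    # Both matches are bound to $record, so the range closes on "end".
    push @fixed, $record if $record =~ /^start$/ .. $record =~ /^end$/;
}

print "buggy kept ", scalar(@buggy), " lines\n";   # start, b, end, c
print "fixed kept ", scalar(@fixed), " lines\n";   # start, b, end
```

In the first form the range stays open after "start" and keeps every following line, which is consistent with the symptom of too much output being printed.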