how to get context between two flag

cxfcxf has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: how to get context between two flag by ELISHEVA (Prior) on Aug 07, 2009 at 01:18 UTC
If your ending tag is a constant string, e.g. `'end'`, then the easiest way is to set your end-of-record marker to that value using the Perl variable `$/`. You can learn more about `$/` in perlvar. That way you are sure to read the entire run from `'start'` to `'end'` in a single gulp, and the newlines won't cause you any trouble if `'start'` is on one line and `'end'` is on another. For example:

    use strict;
    use warnings;

    local $/='end';
    my @aFound;

    while (my $line=<DATA>) {
      # normally we would chomp to get rid of 'end'
      # but this may not be a good idea if the file ends in
      # junk outside of start ... end.
      # chomp $line;

      # make sure we really have a match just in case there is
      # an 'end' without a preceding "start"!
      # also use the s modifier at the end of your regex so that
      # . matches "\n"
      # see http://perldoc.perl.org/perlre.html#Modifiers
      # for further information
      next unless ($line =~ /\s+start\s+(.*)end\z/s);

      # store extracted string for later use
      push @aFound, $1;
    }

    # do something with the text between start...end
    # you'll want to change this:
    # your post looks like you would like to further separate
    # each word on a separate line, but for now, lets just
    # print out the run of characters between start...end.
    print join("\n", @aFound), "\n";

    __DATA__
    asdasd start asdasd asdasdasd asdasdas end asdasdas

    adasdas start as asdas dasdasdad asdasddas end qweqwe

    asdasd start asdsadsdasddasds sdasdas asdasdasdasd asdasdsa
    asdasd asdasdasd end

    here is some trailing garbage

which outputs `asdasd asdasdasd asdasdas as asdas dasdasdad asdasddas asdsadsdasddasds sdasdas asdasdasdasd asdasdsa asdasd asdasdasd`

Best, beth

Update: Fixed bug (missing s modifier on regex).
Re^2: how to get context between two flag by Marshall (Canon) on Aug 07, 2009 at 07:47 UTC
I liked this `$/='end';` idea. I've been working on another approach using the `/ /.../ /` operator. I will admit that I have not mastered this technique, but it appears to be designed for processing multi-line records. The code below produces the correct result, but my "gut feeling" is that it is overly complex. I hope some other Monk can show a better way with the "..." operator.

This operator is weird in that it sometimes returns values in exponential format, like 3E0 instead of just 3. I haven't figured out how to use this info in the most efficient way yet. Below, this info is not actually used; turn on the print statement to see what it does, it is interesting. Anyway, here is yet another approach for the OP to experiment with!

Update: my brain is working slowly today, but Perl DBI folks will be familiar with 0E0. This is the Perl way to return a "TRUE" value for numeric zero. I'm not sure how this xE0 stuff can be used here...

Update: I guess this is tangential to this discussion, but if you ever wondered how to return a "true" value meaning that the function worked while also saying that zero results were produced, returning the string '0E0' will do the trick.

    #!/usr/bin/perl -w
    use strict;

    my $line=();
    while (<DATA>)
    {
       next if /^\s*$/;
       # $flag is not necessary here, it is there to
       # show the return value of this triple dot operator
       # for /start/.../end/
       if ( my $flag = ( /start /.../end/) )
       {
          s/end.*/end/s;
          s/.*?start/start/;
          s/\n//;
          $line .= "$_";
          # print "$flag\n";  #interesting 1, 2, 3E0 etc....
          if ( $_=~ m/end$/ )
          {
             print "OUT:$line\n";
             $line =();
          }
       }
    }
    #Prints:
    #OUT:start 1asdasd asdasdasd asdasdas end
    #OUT:start 2as asdas dasdasdad asdasddas end
    #OUT:start 3asdsadsdasddasds sdasdas asdasdasdasd asdasdsa asdasd asdasdasd and this is an evenlonger way to stop a line with )&)9867 some end
    #OUT:start 4another line end
    __DATA__
    asdasd start 1asdasd asdasdasd asdasdas end asdasdas

    adasdas start 2as asdas dasdasdad asdasddas end qweqwe

    asdasd start 3asdsadsdasddasds sdasdas asdasdasdasd asdasdsa asdasd asdasdasd and this is an even
    longer way to stop a line with )&)9867 some end garbage
    start 4another line end abc
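On the question of what the `E0` suffix is good for: perlop documents that the range operator in scalar context returns a sequence number for each line inside the range, and that the final number of each range has the string "E0" appended. That gives you something to test for on the closing line. The following is only a minimal sketch under stated assumptions, not code from any reply in this thread: `collect_runs` and the sample lines are made up for illustration, and it uses the two-dot form so that a record with `start` and `end` on the same line closes immediately.

```perl
use strict;
use warnings;

# collect_runs is a hypothetical helper, not code from this thread.
# It takes a list of lines and returns the text of each start..end run.
sub collect_runs {
    my (@runs, @parts);
    for my $line (@_) {
        (my $text = $line) =~ s/\s+\z//;      # drop newline and trailing blanks

        # Two-dot form: the end test is also tried on the line that
        # matched the start test, so one-line records close at once.
        my $seq = ($text =~ /\bstart\b/) .. ($text =~ /\bend\b/);
        next unless $seq;                     # outside any range

        my $is_last = $seq =~ /E0\z/;         # "E0" marks the closing line
        $text =~ s/^.*?\bstart\b\s*// if $seq == 1;   # first line of a range
        $text =~ s/\s*\bend\b.*\z//s  if $is_last;    # closing line of a range
        push @parts, $text if length $text;

        if ($is_last) {
            push @runs, join ' ', @parts;
            @parts = ();
        }
    }
    return @runs;    # a run that never reaches "end" is dropped
}

# Made-up sample input, modelled on the data in this thread.
my @sample = (
    "asdasd start 1asdasd asdasdasd end asdasdas\n",
    "\n",
    "asdasd start 3asdsad sdasdas\n",
    "asdasd asdasdasd end garbage\n",
);
print "OUT:$_\n" for collect_runs(@sample);
```

One caveat with this design: each range operator keeps its own state between calls to the sub that contains it, so input that stops in the middle of a range would leave it switched on for the next call.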
Re: how to get context between two flag by Marshall (Canon) on Aug 07, 2009 at 00:44 UTC
Use of the `/ /../ /` operator is a good idea! But here I just show one simple approach that is easy to debug. There are lots of ways to do what you need.

Update: I just assumed that the last line sequence was all one line, as you had blank lines before the other examples. Maybe a bad assumption. bichonfrise74's code looks good to me also. There are many ways to Rome here.

    #!/usr/bin/perl -w
    use strict;

    while (<DATA>)
    {
       next if /^\s*$/;    #skip blank lines
       chomp;              #optional as \n would get deleted anyway
       s/^.*?start\s+//;   #remove start and all before
       s/end.*//;          #remove end and all after
       print "$_\n";
    }
    #prints:
    # asdasd asdasdasd asdasdas
    # as asdas dasdasdad asdasddas
    # asdsadsdasddasds sdasdas asdasdasdasd asdasdsa asdasd asdasdasd
    __DATA__
    asdasd start asdasd asdasdasd asdasdas end asdasdas

    adasdas start as asdas dasdasdad asdasddas end qweqwe

    asdasd start asdsadsdasddasds sdasdas asdasdasdasd asdasdsa asdasd asdasdasd end

Update: sorry for the goof, brain isn't working at full speed today! `s/^.*?start/start/; s/end.*/end/;` preserves the start and end tokens.
Re: how to get context between two flag by bichonfrise74 (Vicar) on Aug 07, 2009 at 00:23 UTC
Are you looking for something like this?

    #!/usr/bin/perl
    use strict;

    while(<DATA>) {
        my ($line) = $_ =~ /\b(start\s.*end)\b/;
        print "$line\n" if ( $line );
    }
    __DATA__
    asdasd start asdasd asdasdasd asdasdas end asdasdas

    adasdas start as asdas dasdasdad asdasddas end qweqwe

    asdasd start asdsadsdasddasds sdasdas asdasdasdasd asdasdsa asdasd asdasdasd end
Re^2: how to get context between two flag by bichonfrise74 (Vicar) on Aug 07, 2009 at 00:38 UTC
Oops, I thought your 3rd line was just one big line... I didn't see that it was broken across the 3rd and 4th lines. Anyway, I modified the code, so this should do the trick. I'm not sure how to update my existing comment, which is why I had to create a new one.

    #!/usr/bin/perl
    use strict;

    local $/ = "\n\n";
    while( <DATA>) {
        my ($line) = $_ =~ /\b(start\s.*\n?.*end)\b/;
        $line =~ s/\n/ /g if ( $line );
        print "$line\n" if ( $line );
    }
    __DATA__
    asdasd start asdasd asdasdasd asdasdas end asdasdas

    adasdas start as asdas dasdasdad asdasddas end qweqwe

    asdasd start asdsadsdasddasds sdasdas asdasdasdasd asdasdsa
    asdasd asdasdasd end

    ds start asda end
Re^2: how to get context between two flag by cxfcxf (Novice) on Aug 07, 2009 at 00:51 UTC
The result is the same as mine... The last record of the file is `asdasd start asdsadsdasddasds sdasdas asdasdasdasd asdasdsa asdasd asdasdasd end`, and there is a \n at the end of its first line.
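For a record whose `start` and `end` sit on different lines, one option that does not depend on how many line breaks fall in between is to read the whole input into a single string and match across newlines. What follows is a rough sketch, assuming the input is small enough to hold in memory and that `start` and `end` appear as whole words; `between_flags` is a made-up name, and the position of the line break in the sample is a guess.

```perl
use strict;
use warnings;

# between_flags is a hypothetical helper, not code from this thread.
# It returns the text between every start/end pair found in $text.
sub between_flags {
    my ($text) = @_;
    my @found;
    # /s lets . match "\n"; the non-greedy .*? stops at the nearest "end";
    # /g in a while loop walks through every pair in turn.
    while ($text =~ /\bstart\b\s*(.*?)\s*\bend\b/sg) {
        (my $run = $1) =~ s/\s+/ /g;   # fold newlines and repeated blanks into one space
        push @found, $run;
    }
    return @found;
}

# Sample modelled on the data in this thread; where the last
# record breaks across lines is an assumption.
my $sample = <<'END_SAMPLE';
asdasd start asdasd asdasdasd asdasdas end asdasdas

adasdas start as asdas dasdasdad asdasddas end qweqwe

asdasd start asdsadsdasddasds sdasdas asdasdasdasd asdasdsa
asdasd asdasdasd end
END_SAMPLE

print "$_\n" for between_flags($sample);
```

With a real file, the string could come from reading the handle with `$/` set to `undef` (slurp mode), which perlvar describes.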