in reply to regular expression - grabbing everything problem
If you wish to process the file line by line, you can use the flip-flop operator. This would make it unnecessary to use explicit control flags. Here's an example:
    while ( <DATA> ) {
        print if /^Header2:/ .. eof;
    }

    __DATA__
    <lsmothers@example.com>
    SMTP
    0<001501c4db9b$db8b2680$2d01a8c0@ryand9v889t9uc>
    .X-Intermail-Unknown-MIME-Type=unparsedmessage
    Header2: <headertwo@example.com
    Received: from server.cluster1.example.com ([10.20.201.160])
    line 12
Updated to use eof as the RHS of the flip-flop, as suggested in a followup to this post. This is especially nice if the script is altered to read from <>, as that followup also suggests.
It may seem strange to use the flip-flop when you're only concerned with the initial flip, but it works nicely. And if you're processing more than one file, the eof on the right-hand side catches the end of each file and resets the search for the next one. The flip-flop operator is discussed in the "Range Operators" section of perlop, since it is the same '..' operator, used in scalar context.
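To make the multi-file behaviour concrete, here is a minimal sketch that reads through <> and relies on bare eof to close the range at the end of each file. The helper name lines_from_header2 is made up for illustration and is not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: for each file named in @files, return every line
# from the first "Header2:" line through the end of that file.
sub lines_from_header2 {
    my @files = @_;
    my @kept;
    local @ARGV = @files;    # let <> iterate over these files
    while ( my $line = <> ) {
        # Bare eof (no parentheses) is true at the end of each file,
        # so the range closes there and re-arms for the next file.
        push @kept, $line if $line =~ /^Header2:/ .. eof;
    }
    return @kept;
}

# Only runs when file names are given on the command line.
print lines_from_header2(@ARGV) if @ARGV;
```

Note the difference between eof and eof(): with empty parentheses it would only be true at the end of the very last file, so the range would stay open across file boundaries.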
If you prefer to slurp the file into a string and process accordingly, you can do it like this:
    my $input = do { local $/ = undef; <DATA> };
    if ( $input =~ /^(Header2:.+)/ms ) {
        print $1;
    }
Or even...
    my $input;
    {
        local $/ = undef;
        $input = <DATA>;
    }
    # LIMIT counts fields, not captured separators:
    # 2 fields + 1 captured "Header2:" = 3 elements
    print join '', ( split /^(Header2:)/m, $input, 2 )[ 1, 2 ];
The split method could be altered to avoid capturing by using a lookahead assertion as the split point, like this:
print join '', ( split /^(?=Header2:)/m, $input, 2)[1];
This method creates only two elements: the one we don't want and the one we do. The other split method creates three elements: the one we don't want, the trigger text, and the rest of what we want to keep, so there we have to ask for both elements 1 and 2.
One-liner versions of each of the above:
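A quick way to check those element counts is to run both splits side by side. This is only a sketch: the sample string is invented for illustration and deliberately contains a second Header2: line. Because LIMIT counts fields rather than captured separators, the capturing form uses a limit of 2 here; that still yields three elements, and it keeps the later Header2: line from causing a further split.

```perl
use strict;
use warnings;

# Hypothetical sample input with a second Header2: line
my $input = "junk\nHeader2: first\nbody\nHeader2: again\ntail\n";

# Capturing split: LIMIT 2 gives two fields plus the captured separator
my @captured = split /^(Header2:)/m, $input, 2;

# Lookahead split: nothing is consumed, so only the two fields come back
my @lookahead = split /^(?=Header2:)/m, $input, 2;

printf "capturing: %d elements\n", scalar @captured;
printf "lookahead: %d elements\n", scalar @lookahead;

# Both forms rebuild the same text, starting at the first Header2: line
print "same result\n"
    if join( '', @captured[ 1, 2 ] ) eq $lookahead[1];
```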
    perl -ne 'print if /^Header2:/ .. eof' testdata.txt
    perl -0777 -ne '/^(Header2:.+)/ms and print $1' testdata.txt
    perl -0777 -pe '$_=join q//,(split /^(Header2:)/m,$_,2)[1,2]' testdata.txt
    perl -0777 -pe '$_=join q//,(split /^(?=Header2:)/m,$_,2)[1]' testdata.txt
Dave
Re^2: regular expression - grabbing everything problem
by jwkrahn (Abbot) on Aug 09, 2011 at 00:42 UTC
by davido (Cardinal) on Aug 09, 2011 at 00:53 UTC