#!/usr/bin/perl
use Modern::Perl;
# 934221
# find a string, then print the next ten chars
my @content = qw/abcdABCD1234567890xyz abcd12345ABCD0ABCD ABCD1234ABC
                 qwertyABCD1234567890/;
for my $content(@content) {
$content =~ /.+?ABCD(.{10}).*/;
say "Current array element is: $content";
if ($1) {
say "\t Next 10 char after the match: $1";
next;
}
}
And the result of executing this script is:
Current array element is: abcdABCD1234567890xyz
Next 10 char after the match: 1234567890
Current array element is: abcd12345ABCD0ABCD
Current array element is: ABCD1234ABC
Current array element is: qwertyABCD1234567890
Next 10 char after the match: 1234567890
The second and third array elements don't satisfy the regex because there are NOT 10 chars after any instance of the sequence ABCD.
SOLVED (see below). Now a question for wiser heads: immediately after creating the array, add another element to @content, namely $content[4], with this line: push @content, 'ABCD 123 456 789';. Run the code. This is the output from Perl 5.012 on a Win32 box:
Current array element is: abcdABCD1234567890xyz
Next 10 char after the match: 1234567890
Current array element is: abcd12345ABCD0ABCD
Current array element is: ABCD1234ABC
Current array element is: qwertyABCD1234567890
Next 10 char after the match: 1234567890
Current array element is: ABCD 123 456 789
Why doesn't the regex see a match in $content[4]?
Update: And WTH does this minor revision,
my @content = qw/abcdABCD1234567890xyz abcd12345ABCD0ABCD ABCD1234ABC
                 qwertyABCD1234567890/;
push @content, 'ABCD 123 456 789';
say "===> \$content[4]: $content[4] \n\n";
for my $content(@content) {
$content =~ /.+?ABCD(.{10}).*/;
say "Current array element is: $content";
if ($1) {
say "\t Next 10 char after the match: $1";
}else{
say "No match on $content";
}
}
...produce this:
===> $content[4]: ABCD 123 456 789
Current array element is: abcdABCD1234567890xyz
Next 10 char after the match: 1234567890
Current array element is: abcd12345ABCD0ABCD
Next 10 char after the match: 1234567890
Current array element is: ABCD1234ABC
Next 10 char after the match: 1234567890
Current array element is: qwertyABCD1234567890
Next 10 char after the match: 1234567890
Current array element is: ABCD 123 456 789
Next 10 char after the match: 1234567890
Duh! The answer to the stricken question is that $1 remains unchanged unless a new match is found... so its content is unchanged from the initial (successful) match when the regex fails on $content[1] and $content[2], then gets replaced (with the exact same thing) in $content[3] and remains unchanged when the regex fails on $content[4].
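The behaviour described above can be reproduced outside the loop. What follows is a minimal sketch, reusing two of the node's sample strings with a simplified pattern; the variable names are illustrative and not from the original code:

```perl
use strict;
use warnings;

# Simplified pattern (an assumption for brevity): ABCD followed by
# ten captured characters.
my $re = qr/ABCD(.{10})/;

# First match succeeds, so $1 is set.
'abcdABCD1234567890xyz' =~ $re;
my $after_success = $1;

# Second match fails (only 7 characters follow ABCD); $1 is NOT cleared.
my $matched = 'ABCD1234ABC' =~ $re;
my $after_failure = $1;

print "after success: $after_success\n";
print "second match:  ", ($matched ? 'yes' : 'no'), "\n";
print "after failure: $after_failure\n";   # stale value from the first match
```

Testing the return value of the match itself, rather than the truth of $1, is what tells the two cases apart.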
Update2: For a discussion of the "defensive programming" required to avoid the dumb coding in the stricken material, see "What happens with empty $1 in regular expressions? (was: Regular Expression Question)". The following code uses that practice, as best I understand it with regard to numbered captures:
for my $content(@content) {
# my $match = '';   # explicit but verbose
# my $match;        # still explicit and only slightly less verbose; same effect
my ($match) = $content =~ /.+?ABCD(.{10}).*/;   # less code; same effect
say "Current array element is: $content";
if ($match) {
say "\t Next 10 char after the match: $match";
}else{
say "No match on $content";
}
}
:-) (...and apologies to all the electrons inconvenienced by the posting of the stricken part of this node)
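The list-assignment idiom used in Update2 works because a match in list context returns its captures on success and an empty list on failure, so the lexical is left undefined when nothing matches. A small sketch, assuming a simplified pattern and a hypothetical helper named next_ten (neither is part of the original script):

```perl
use strict;
use warnings;

# Hypothetical helper: returns the ten characters after ABCD, or undef.
sub next_ten {
    my ($string) = @_;
    # List context: the capture on success, an empty list on failure,
    # so $match stays undef when the pattern does not match.
    my ($match) = $string =~ /ABCD(.{10})/;
    return $match;
}

for my $string ('abcdABCD1234567890xyz', 'ABCD1234ABC') {
    my $match = next_ten($string);
    print defined($match) ? "$string: $match\n" : "$string: no match\n";
}
```

Using defined here is a slightly stricter test than plain truth; for a ten-character capture the two agree, but defined also copes with captures such as '0' or the empty string.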
Update 3 (10/30): Moritz pointed out that the initial code (the version I first tested) failed because $1 is not reset unless there is a new match. His kind comments led me to discover that I had already solved that (as in Update 2, above, on or about 10/27), which got me off that track. Lo and behold, curing the tunnel vision led to a more open-minded review of the regex. Aha! (The explanation appears in the note 'SOLVED!' below.)
my @content = qw/abcdABCD1234567890xyz
abcd12345ABCD0ABCD
ABCD1234ABC
qwertyABCD12diff7890/;
# push @content, 'ABCD 123 456 789'; # See note "SOLVED!" below
push @content, 'x ABCD 123 456 789'; # afterthought addition
for my $content(@content) {
my ($match) = $content =~ /.+?ABCD(.{10}).*/;   # avoid probs w/non-reset of $1
say "Current array element is: $content";
if ( $match ) {
say "\t MATCH! Next 10 char after the match: $match";
} else {
say "\t No match on $content";
}
}
=pod
# SOLVED! why last array element failed to match: it initially began with 'ABCD...'
# BUT the regex required something ( '.+?' ) before 'ABCD(.{10})'
# And a better fix would be to write the regex as:
# '/.*?ABCD(.{10}).*/'
# or as: '/ABCD(.{10}).*/'
C:\>934221.pl
Current array element is: abcdABCD1234567890xyz
MATCH! Next 10 char after the match: 1234567890
Current array element is: abcd12345ABCD0ABCD
No match on abcd12345ABCD0ABCD
Current array element is: ABCD1234ABC
No match on ABCD1234ABC
Current array element is: qwertyABCD12diff7890
MATCH! Next 10 char after the match: 12diff7890
Current array element is: x ABCD 123 456 789
MATCH! Next 10 char after the match: 123 456 7
C:\>
=cut
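The note's point about the leading '.+?' can be checked directly against the string that originally failed. This is only a sketch: the variable names are made up for illustration, and the third pattern drops the trailing '.*' since it does not affect the capture:

```perl
use strict;
use warnings;

my $string = 'ABCD 123 456 789';   # begins with ABCD, as in the original push

my ($with_plus) = $string =~ /.+?ABCD(.{10}).*/;  # needs >= 1 char before ABCD
my ($with_star) = $string =~ /.*?ABCD(.{10}).*/;  # allows zero chars before ABCD
my ($bare)      = $string =~ /ABCD(.{10})/;       # no prefix requirement at all

for my $result (['.+?', $with_plus], ['.*?', $with_star], ['bare', $bare]) {
    my ($label, $capture) = @$result;
    # Brackets make the leading space in the capture visible.
    printf "%-5s %s\n", $label, defined($capture) ? "[$capture]" : 'no match';
}
```

Only the '.+?' form fails, because nothing precedes ABCD in this string; the other two capture the ten characters that follow it, leading space included.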