You need to put capture groups in your $expression; otherwise $1, $2, etc. will be undef. For example, if you enter (\w\w\w) instead of \w\w\w, your program prints "$1 is sil".
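Here is a minimal sketch of that difference. It assumes the test string is the same one used in the examples further down the thread (the original program isn't shown here), and the %captured hash is just a hypothetical name for illustration:

```perl
use strict;
use warnings;

# Assumption: same test string as in the later examples in this thread.
my $string = '1:A silly sentence (495,a), silly but useful.(3)';
my %captured;    # hypothetical helper: expression => value of $1

for my $expression ('\w\w\w', '(\w\w\w)') {
    if ($string =~ /$expression/) {
        # Without parentheses there is no group 1, so $1 stays undef
        # even though the match itself succeeded.
        $captured{$expression} = $1;
        printf "%-9s \$1 is %s\n", $expression,
            defined $1 ? "'$1'" : 'undef';
    }
}
```

Both expressions match at the same place; only the parenthesized one should leave anything in $1 ('sil').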
This is regarding regex backreference.
I'm not sure you're using the right terminology, but here's an example in which a regex with a backreference is entered, matches and captures:
>perl -wMstrict -le
"$_ = '1:A silly sentence (495,a), silly but useful.(3)';
;;
print 'Enter a regular expression:';
my $expression = <STDIN>;
chomp($expression);
print qq{expression is '$expression'};
;;
if (/$expression/) {
print 'The expression matches the string';
print qq{\$1 is '$1'} if defined $1;
print qq{\$2 is '$2'} if defined $2;
print qq{\$3 is '$3'} if defined $3;
}
else {
print 'The expression does not match';
}
"
Enter a regular expression:
(\w+).*(\1)
expression is '(\w+).*(\1)'
The expression matches the string
$1 is 'silly'
$2 is 'silly'
See the discussion of backreferences under "Capture groups" in perlre.
BTW: Here's a version that handles any number of capture groups.
Question: Why does capture group 2 ($2) in the (\w+).*(\d{2,}).*(\1) example only capture '95'? Shouldn't (\d{2,}) match and capture "the maximum of 2 or more decimal digits", i.e., '495', as it did in the third example?
>perl -wMstrict -le
"$_ = '1:A silly sentence (495,a), silly but useful.(3)';
;;
EXPRESSION: {
print qq{\n};
print 'Enter a regular expression:';
my $expression = <STDIN>;
last EXPRESSION unless $expression =~ m{ \S }xms;
chomp($expression);
print qq{Expression is '$expression'};
;;
if (! defined($expression = eval qq{qr/$expression/})) {
print qq{Regex error: $@};
redo EXPRESSION;
}
;;
if ($_ !~ $expression) {
print 'The expression does not match the string';
redo EXPRESSION;
}
print 'The expression matches the string';
;;
if ($#- < 1) {
print qq{No capture groups};
redo EXPRESSION;
}
;;
for my $cg (1 .. $#-) {
printf qq{capture group \$$cg is '%s' starting at offset %d \n},
substr($_, $-[$cg], $+[$cg]-$-[$cg]), $-[$cg];
}
redo EXPRESSION;
}
;;
print 'done';
"
Enter a regular expression:
foo
Expression is 'foo'
The expression does not match the string
Enter a regular expression:
\d{2,}
Expression is '\d{2,}'
The expression matches the string
No capture groups
Enter a regular expression:
(\d{2,})
Expression is '(\d{2,})'
The expression matches the string
capture group $1 is '495' starting at offset 20
Enter a regular expression:
(\w+).*(\1)
Expression is '(\w+).*(\1)'
The expression matches the string
capture group $1 is 'silly' starting at offset 4
capture group $2 is 'silly' starting at offset 28
Enter a regular expression:
(\w+).*(\d{2,}).*(\1)
Expression is '(\w+).*(\d{2,}).*(\1)'
The expression matches the string
capture group $1 is 'silly' starting at offset 4
capture group $2 is '95' starting at offset 21
capture group $3 is 'silly' starting at offset 28
Enter a regular expression:
\d***
Expression is '\d***'
Regex error: Nested quantifiers in regex; marked by <-- HERE
in m/\d** <-- HERE */ at (eval 6) line 1, <STDIN> line 6.
Enter a regular expression:
silly
Expression is 'silly'
The expression matches the string
No capture groups
Enter a regular expression:
done
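One way to explore the question about $2 is to compare the greedy pattern with a variant whose first .* is made non-greedy. The sketch below is only an illustration under that assumption (the .*? variant and the @greedy/@lazy names are mine, not from the session above). My reading is that the greedy .* first runs to the end of the string and then backs off one character at a time, so (\d{2,}) is tried at the rightmost position where it can succeed, which leaves it only '95'; in the third example nothing precedes the group, so it starts at the '4'.

```perl
use strict;
use warnings;

my $string = '1:A silly sentence (495,a), silly but useful.(3)';

# In list context a successful match returns ($1, $2, $3).
# Greedy: the first .* gives back as little as possible, so the
# digit group only gets the last two digits of '495'.
my @greedy = $string =~ /(\w+).*(\d{2,}).*(\1)/;

# Non-greedy (assumed variant): .*? grows one character at a time,
# so the digit group is first tried at the '4' and takes all three.
my @lazy = $string =~ /(\w+).*?(\d{2,}).*(\1)/;

print "greedy \$2 is '$greedy[1]'\n";
print "lazy   \$2 is '$lazy[1]'\n";
```

With the test string above, the greedy form should report '95' and the non-greedy form '495', while $1 and $3 stay 'silly' in both cases.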