vyeddula has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks

This is regarding regex backreference. I am not able to retrieve the matched string with the code below. As usual, I need some enlightenment here.

#!/usr/bin/perl -w
use strict;

$_='1:A silly sentence (495,a) *BUT* one which will be useful.(3)';
print "Enter a regular expression:";
my $expression=<STDIN>;
chomp($expression);
if(/$expression/)
{
    print "The expression matches the pattern\n";
    print "\$1 is $1\n" if defined $1;
    print "\$2 is $2\n" if defined $2;
    print "\$3 is $3\n" if defined $3;
}
else
{
    print "The pattern is not found \n";
}

Output

Enter a regular expression:\w\w\w
The expression matches the pattern
root@#

This is not printing what is stored in $1, $2, etc., if they exist.

Replies are listed 'Best First'.
Re: Regex backreference
by hdb (Monsignor) on Jun 12, 2013 at 17:19 UTC

    You need to put capture groups into your $expression; otherwise $1, $2, etc. will be undef. For example, if you enter (\w\w\w) instead of \w\w\w, your program prints "$1 is sil".
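To make the difference concrete, here is a minimal sketch that runs both patterns against the string from the question. The helper name describe_match is hypothetical and not part of the original script; it only reports whether $1 was set.

```perl
use strict;
use warnings;

# Same test string as in the original question.
my $text = '1:A silly sentence (495,a) *BUT* one which will be useful.(3)';

# describe_match() is a hypothetical helper added for illustration only.
sub describe_match {
    my ($string, $pattern) = @_;
    return 'no match' unless $string =~ /$pattern/;
    # $1 is defined only if the pattern has a first capture group
    # that took part in the match.
    return defined $1 ? "\$1 is '$1'" : 'matched, but $1 is undef';
}

# Without parentheses the pattern matches but captures nothing;
# with parentheses the same three characters land in $1.
print "$_ => ", describe_match($text, $_), "\n" for '\w\w\w', '(\w\w\w)';
```

Both patterns match the same three characters ("sil"); only the parenthesized one stores them in $1.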

Re: Regex backreference
by AnomalousMonk (Archbishop) on Jun 12, 2013 at 18:26 UTC
    This is regarding regex backreference.

    I'm not sure you're using the right terminology, but here's an example in which a regex with a backreference is entered, matches and captures:

    >perl -wMstrict -le
    "$_ = '1:A silly sentence (495,a), silly but useful.(3)';
     ;;
     print 'Enter a regular expression:';
     my $expression = <STDIN>;
     chomp($expression);
     print qq{expression is '$expression'};
     ;;
     if (/$expression/) {
       print 'The expression matches the string';
       print qq{\$1 is '$1'} if defined $1;
       print qq{\$2 is '$2'} if defined $2;
       print qq{\$3 is '$3'} if defined $3;
       }
     else {
       print 'The expression does not match';
       }
    "
    Enter a regular expression:
    (\w+).*(\1)
    expression is '(\w+).*(\1)'
    The expression matches the string
    $1 is 'silly'
    $2 is 'silly'

    See discussion of backreferences in Capture groups in perlre.
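The terminology distinction can be sketched as follows: \1 is a backreference used inside the pattern, while $1 is the capture variable read after the match. The helper name repeated_word is hypothetical, and the pattern is only one possible way to look for a word that occurs twice.

```perl
use strict;
use warnings;

my $text = '1:A silly sentence (495,a), silly but useful.(3)';

# repeated_word() is a hypothetical helper added for illustration only.
sub repeated_word {
    my ($string) = @_;
    # (\w+) captures a whole word; \1 inside the same pattern demands
    # that the identical text appears again later in the string.
    return $string =~ /\b(\w+)\b.*\b\1\b/ ? $1 : undef;
}

# After a successful match, $1 holds what group 1 captured.
my $word = repeated_word($text);
print defined $word ? "repeated word: $word\n" : "no repeated word\n";
```

Without the parentheses there would be nothing for \1 to refer to, and nothing for $1 to hold afterwards.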

      I copied and executed the same script, but it does not print what is stored in $1 or $2 even though the pattern matches.

        BTW: Here's a version that handles any number of capture groups.
        Question: Why does capture group 2 ($2) in the (\w+).*(\d{2,}).*(\1) example only capture '95'? Shouldn't (\d{2,}) match and capture "the maximum of 2 or more decimal digits", i.e., '495', as it did in the third example?

        >perl -wMstrict -le
        "$_ = '1:A silly sentence (495,a), silly but useful.(3)';
         ;;
         EXPRESSION: {
           print qq{\n};
           print 'Enter a regular expression:';
           my $expression = <STDIN>;
           last EXPRESSION unless $expression =~ m{ \S }xms;
           chomp($expression);
           print qq{Expression is '$expression'};
           ;;
           if (! defined($expression = eval qq{qr/$expression/})) {
             print qq{Regex error: $@};
             redo EXPRESSION;
             }
           ;;
           if ($_ !~ $expression) {
             print 'The expression does not match the string';
             redo EXPRESSION;
             }
           print 'The expression matches the string';
           ;;
           if ($#- < 1) {
             print qq{No capture groups};
             redo EXPRESSION;
             }
           ;;
           for my $cg (1 .. $#-) {
             printf qq{capture group \$$cg is '%s' starting at offset %d \n},
               substr($_, $-[$cg], $+[$cg]-$-[$cg]), $-[$cg];
             }
           redo EXPRESSION;
           }
         ;;
         print 'done';
        "

        Enter a regular expression:
        foo
        Expression is 'foo'
        The expression does not match the string

        Enter a regular expression:
        \d{2,}
        Expression is '\d{2,}'
        The expression matches the string
        No capture groups

        Enter a regular expression:
        (\d{2,})
        Expression is '(\d{2,})'
        The expression matches the string
        capture group $1 is '495' starting at offset 20

        Enter a regular expression:
        (\w+).*(\1)
        Expression is '(\w+).*(\1)'
        The expression matches the string
        capture group $1 is 'silly' starting at offset 4
        capture group $2 is 'silly' starting at offset 28

        Enter a regular expression:
        (\w+).*(\d{2,}).*(\1)
        Expression is '(\w+).*(\d{2,}).*(\1)'
        The expression matches the string
        capture group $1 is 'silly' starting at offset 4
        capture group $2 is '95' starting at offset 21
        capture group $3 is 'silly' starting at offset 28

        Enter a regular expression:
        \d***
        Expression is '\d***'
        Regex error: Nested quantifiers in regex; marked by <-- HERE in m/\d** <-- HERE */ at (eval 6) line 1, <STDIN> line 6.

        Enter a regular expression:
        silly
        Expression is 'silly'
        The expression matches the string
        No capture groups

        Enter a regular expression:

        done
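The '95' result above is consistent with how greedy quantifiers backtrack: the .* in front of (\d{2,}) first consumes as much as it can and then gives characters back only until \d{2,} can match at all, which leaves just '95' for the group. The sketch below compares the greedy pattern with a non-greedy variant using .*?; the helper name second_group is hypothetical and not from the thread.

```perl
use strict;
use warnings;

my $text = '1:A silly sentence (495,a), silly but useful.(3)';

# second_group() is a hypothetical helper added for illustration only.
# It returns what capture group 2 holds, or undef if there is no match.
sub second_group {
    my ($pattern) = @_;
    return $text =~ /$pattern/ ? $2 : undef;
}

# Greedy .* backs off from the end of the string and stops at the first
# position where two digits fit, so group 2 sees only '95'.
print "greedy:     ", second_group('(\w+).*(\d{2,}).*(\1)'),  "\n";

# Non-greedy .*? expands from the left and stops at the first digit run,
# so group 2 can take all of '495'.
print "non-greedy: ", second_group('(\w+).*?(\d{2,}).*(\1)'), "\n";
```

In the third example of the session, (\d{2,}) had no greedy .* in front of it, so it could start at the first digit and capture '495'.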

        What pattern did you enter?