OCTweak has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

In the following code, could someone explain why, in the first match, the \1 reference did not keep the double quote from becoming part of $2?

#!/usr/bin/perl
use strict;
use warnings;

my $line;

print "\n";
$line = '<a href="http://www.mysite.com/" target="_blank">';

$line =~ /\shref=(["'])?([^\s\>\1]+)/; # FIRT MATCH
if (defined $1) {print "\$1= $1\n";} else {print "1) not defined\n";}
if (defined $2) {print "\$2= $2\n";} else {print "2) not defined\n";}

print "\n";

$line =~ /\shref=(["'])?([^\s\>"']+)/; # SECOND MATCH
if (defined $1) {print "\$1= $1\n";} else {print "1) not defined\n";}
if (defined $2) {print "\$2= $2\n";} else {print "2) not defined\n";}

==Program Output
$1= "
$2= http://www.mysite.com/"

$1= "
$2= http://www.mysite.com/

Re: Backreferences
by Abigail-II (Bishop) on Jul 23, 2003 at 15:00 UTC
    You can't use \1 inside a character class. It doesn't have a special meaning.

    Abigail
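A minimal sketch of what that means in practice, assuming a reasonably recent Perl 5: inside a bracketed character class, \1 is read as an octal escape (the character with code 1) rather than as a reference to the first capture group. The variable names are only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Inside [...] the sequence \1 is an octal escape for chr(1),
# not a reference to capture group 1.
my $ctrl_a = chr(1);
my $class_matches_ctrl_a = ($ctrl_a =~ /^[\1]$/) ? 1 : 0;

# So the negated class from the question excludes whitespace, '>' and
# chr(1), but it still accepts a double quote.
my $class_accepts_quote = ('"' =~ /^[^\s\>\1]$/) ? 1 : 0;

print "[\\1] matches chr(1): $class_matches_ctrl_a\n";
print "[^\\s\\>\\1] accepts a double quote: $class_accepts_quote\n";
```

That is why the closing quote ends up in $2 in the first match: nothing in the class tells the engine to stop at it.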

Re: Backreferences
by snax (Hermit) on Jul 23, 2003 at 15:07 UTC
    You want:
    $line =~ /\shref=(["'])?([^\s\>]+)\1/;
    The first set of parentheses captures the opening quote character, but, as noted above, you can't use the backreference to it inside a character class. Once you have the opening quote, though, you can use \1 after the class, as shown in this regex, to terminate your match.

    (edited for clarity)

      You want:
      $line =~ /\shref=(["'])?([^\s\>]+)\1/;
      No, that's not quite right, because he wants to find hrefs that are (probably) mistakes, stopping at the first space or > character. Something like this is closer, although I'm sure the regex gurus can improve on it:
      $line =~ /\shref=(["'])?(.+?)(?=[\s>]|\1|$)/;
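To compare the two suggestions side by side, here is a small sketch that runs both patterns over three inputs. Only the "quoted" line comes from the original post; the "unquoted" and "unclosed" lines are hypothetical samples made up for illustration, as are the hash and variable names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $closing_quote = qr/\shref=(["'])?([^\s\>]+)\1/;         # \1 after the class
my $lookahead     = qr/\shref=(["'])?(.+?)(?=[\s>]|\1|$)/;  # lookahead variant

my %sample = (
    quoted   => '<a href="http://www.mysite.com/" target="_blank">',  # from the post
    unquoted => '<a href=http://www.mysite.com/ target="_blank">',    # hypothetical
    unclosed => '<a href="http://www.mysite.com/ target="_blank">',   # hypothetical
);

my %url;
for my $name (sort keys %sample) {
    for my $pair ([closing_quote => $closing_quote], [lookahead => $lookahead]) {
        my ($label, $re) = @$pair;
        # In list context the match returns ($1, $2); only $2 is of interest.
        my (undef, $found) = $sample{$name} =~ $re;
        $url{$label}{$name} = $found;
        printf "%-13s %-8s %s\n", $label, $name,
            defined $found ? $found : '(no match)';
    }
}
```

Both patterns should pull the same URL out of the quoted line. The difference shows up on the other two: a backreference to a group that did not take part in the match fails in Perl, so the first pattern finds nothing when the quote is missing or never closed, while the lookahead version stops at the first whitespace or > instead.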
Re: Backreferences
by artist (Parson) on Jul 23, 2003 at 15:06 UTC
    \1 as an item in character class doesn't mean anything.
Re: Backreferences
by Anonymous Monk on Jan 06, 2018 at 07:02 UTC

    It seems your else-statements have syntax errors.

      "...syntax errors."
      #!/usr/bin/env perl
      use strict;
      use warnings;

      my $line;

      print "\n";
      $line = '<a href="http://www.mysite.com/" target="_blank">';

      $line =~ /\shref=(["'])?([^\s\>\1]+)/; # FIRT MATCH
      if ( defined $1 ) { print "\$1= $1\n"; } else { print "1) not defined\n"; }
      if ( defined $2 ) { print "\$2= $2\n"; } else { print "2) not defined\n"; }

      print "\n";

      $line =~ /\shref=(["'])?([^\s\>"']+)/; # SECOND MATCH
      if ( defined $1 ) { print "\$1= $1\n"; } else { print "1) not defined\n"; }
      if ( defined $2 ) { print "\$2= $2\n"; } else { print "2) not defined\n"; }

      __END__

      karls-mac-mini:playground karl$ perl -c testomato.pl
      testomato.pl syntax OK

      Apparently not. How did you jump to this conclusion?

      «The Crux of the Biscuit is the Apostrophe»

      perl -MCrypt::CBC -E 'say Crypt::CBC->new(-key=>'kgb',-cipher=>"Blowfish")->decrypt_hex($ENV{KARL});'Help

        Apparently the AM is referring to the misspelling of "first" (FIRT) in the comment just before the first if.

        Ah, yes. See also Firt.

        Thanks and best regards, Karl
