samizdat has asked for the wisdom of the Perl Monks concerning the following question:

Hello, wise and compassionate Monks:

I have a situation in my code where I'm identifying data using patterns, and sometimes the patterns are not matched. What is happening to me is that when a pattern does not match, a previous value is being kept and used. According to perldoc perlre and Programming Perl, these values are not reset until the end of the current context block or the next successful pattern match.

I've managed to fix the problem using an artificial block, like this example, but I'm wondering if there's a better way.

if (1) # extra if resets $1 backreference
{
    $filename =~ /^([^_]+)_([^_]+)/;
    $rs[0] = $1;
    $rs[1] = $2;
} # end of extra if (1)

open (TESTF, "<./$filename") or die "open\n";

# now process lines which may or may not have matches
while (my $line = <TESTF>)
{
    chomp $line;
    if ($line =~ /^\x7CTARGET/)
    {
        my ( $kw, $dtarget, $kw2, $dorient ) = split( /\x7C/, $line );
        $dorient =~ /^([^_]+)_/;
        $rs[2] = $1;
        if (!defined($rs[2]))
        {
            $rs[2] = 'nocount';
        }
    }
    elsif (...)
    {
        ...
    }
}
# test @rs values

My problem was that if $dorient didn't contain SOMETHING_, $rs[2] was set to the previous value of $1. In this particular case, I could sometimes get away with using /^([^_]+)/ as the pattern, but I'm looking for the more general solution here.

UPDATE: made example more clear

Don Wilde
"There's more than one level to any answer."

Re: resetting Perl RegEx backreferences
by davido (Cardinal) on Mar 08, 2006 at 19:58 UTC

    I believe the error in your ways is not checking to see if the match succeeded before using the match capture variable values. Instead of this:

    $dorient =~ /^([^_]+)_/;
    $rs[2] = $1;

    You should be checking to ensure that the match succeeded:

    if ( $dorient =~ /^([^_]+)_/ ) {
        $rs[2] = $1;
    }

    This applies almost any time you rely on $1: first verify that the corresponding regex match succeeded, because only then is $1 meaningful.

    The behavior is described in perlop wherein it states: "NOTE: failed matches in Perl do not reset the match variables...."
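
A minimal sketch of the checked-match approach, wrapped in a helper so the fallback lives in one place. The name prefix_or_default and the sample strings are hypothetical; only the pattern and the 'nocount' default come from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text before the first underscore,
# or 'nocount' when the pattern does not match.
sub prefix_or_default {
    my ($text) = @_;
    if ( $text =~ /^([^_]+)_/ ) {
        return $1;    # safe: $1 was set by the match that just succeeded
    }
    return 'nocount'; # $1 is never read on the failure path
}

# An earlier successful match, standing in for the $filename match.
"first_second" =~ /^([^_]+)_/;

for my $dorient ( 'north_east', 'nounderscore' ) {
    print "$dorient => ", prefix_or_default($dorient), "\n";
}
```

Because $1 is only read inside the branch where the match succeeded, a leftover value from an earlier match cannot leak into the result.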


    Dave

      Ah, very good. Much cleaner! Thanks, Dave! <g>

      Don Wilde
      "There's more than one level to any answer."
Re: resetting Perl RegEx backreferences
by pKai (Priest) on Mar 08, 2006 at 21:07 UTC

    Alternatively you can assign to your variable directly in list context:

    ($rs[2]) = $dorient =~ /^([^_]+)_/;

    $rs[2] will be undefined if the match fails, because a failed match returns an empty list in list context.
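
A short sketch of the list-context form combined with the 'nocount' default from the original post. The helper name capture_prefix and the sample strings are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: capture via list-context assignment, then apply a default.
sub capture_prefix {
    my ($text) = @_;

    # In list context a successful match returns its captures;
    # a failed match returns the empty list, leaving $prefix undef.
    my ($prefix) = $text =~ /^([^_]+)_/;

    return defined $prefix ? $prefix : 'nocount';
}

print capture_prefix('north_east'),   "\n";
print capture_prefix('nounderscore'), "\n";
```

This form never reads $1 at all, so it is unaffected by whatever an earlier match left behind.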

Re: resetting Perl RegEx backreferences
by hv (Prior) on Mar 08, 2006 at 20:11 UTC

    As davido says, checking that the match succeeded is almost always the right solution. If for some reason you don't want to do that, the easiest way is to throw a cheap guaranteed match in before the real match:

    "" =~ /(?=)/;               # reset match vars
    $dorient =~ /^([^_]+)_/;
    $rs[2] = $1;                # or undef if no match

    Hugo
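
A sketch of the guaranteed-match reset in a self-contained form. The helper name prefix_after_reset and the sample strings are hypothetical. The idea is that /(?=)/ is an empty lookahead, which always succeeds and contains no capture groups, so $1 is undefined afterwards.

```perl
use strict;
use warnings;

# Hypothetical helper: clear the capture variables, then read $1 unconditionally.
sub prefix_after_reset {
    my ($text) = @_;

    "" =~ /(?=)/;               # always matches, no groups: $1 is now undef
    $text =~ /^([^_]+)_/;       # on failure, $1 stays undef

    my $captured = $1;          # copy before leaving the sub
    return $captured;
}

# An earlier successful match, standing in for the $filename match.
"first_second" =~ /^([^_]+)_/;

for my $dorient ( 'north_east', 'nounderscore' ) {
    my $prefix = prefix_after_reset($dorient);
    print "$dorient => ", ( defined $prefix ? $prefix : 'undef' ), "\n";
}
```

Note that the reset pattern must not be the empty pattern //, which Perl treats specially by reusing the last successfully matched pattern.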