Your method of enclosing the match operation in a sub only *appears* to work (in the sense of leaving $1 unmodified after the sub call and returning undef on failure), and not for the reasons you give. 'strict' has nothing to do with producing the undef value, nor with how variables are scoped. Had any successful match occurred in the outer scope before your sub calls, $1 (which is global) would have been set there, and that value would be returned by any of your sub calls that failed to match. Check this minor variation on your example:

    use strict;

    "blah" =~ /(a)/;    # now we've set $1 at the global scope

    my $i = "mmmm9";
    my $a = match_rtn( $i );
    print $a, "\n";

    $i = "mmm";
    $a = match_rtn( $i );
    print $a, "\n";     # ook! this isn't undefined!

    sub match_rtn {
        my $str = shift;
        $str =~ m/(\d+)/;
        return $1;
    }

Match variables ($1, $2, etc.) are global variables. When match variables are set (due to a successful match operation), they are always localized to the enclosing block. So, they retain their value until another successful pattern match, or the end of the current block. Witness:

    {
        $_ = 'blah';
        /(a)/;
        print "$1\n";    # prints: a
    }
    print "$1\n";        # uninitialized warning

Now try this longer example and you'll see that the match variables are implicitly localized (i.e., in the sense of local()):

    $_ = 'blah';
    /(\w)/;
    print "$1\n";        # prints: b
    /(\d)/;
    print "$1\n";        # still prints: b
    {
        /(a)/;
        print "$1\n";    # prints: a
    }
    print "$1\n";        # prints: b

The proper way to protect yourself from using unintended old values in $1 and friends is to program defensively and check if a pattern match succeeded before trying to use captured subexpressions (as previous messages in this thread have shown).
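As a minimal sketch of that defensive style (the sub names first_number and first_number_list are made up for illustration and do not come from the thread), the earlier match_rtn could be written so that $1 is only read after a successful match, or never read at all:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first run of digits in $str,
# or undef when the pattern does not match.
sub first_number {
    my ($str) = @_;

    # Only read $1 when this particular match succeeded.
    if ( $str =~ m/(\d+)/ ) {
        return $1;
    }
    return undef;
}

# Alternative: a match in list context returns the captures themselves
# (or an empty list on failure), so $1 is never consulted.
sub first_number_list {
    my ($str) = @_;
    my ($num) = $str =~ m/(\d+)/;
    return $num;
}

"blah" =~ /(a)/;    # sets $1 in the calling scope, as in the earlier example

for my $input ( 'mmmm9', 'mmm' ) {
    my $got = first_number($input);
    print "$input: ", ( defined $got ? $got : 'undef' ), "\n";
}
```

With either version, a stale $1 left over from the outer scope should no longer leak out as the return value when the pattern fails.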

I looked up exactly what 'strict' is supposed to do, and the Camel book says it's supposed to disallow "unsafe" code. My question to anyone else is what is considered "unsafe"?

Please see the documentation for strict and tye's review of strict.pm for starters.
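For a rough idea of what those documents cover, here is a small sketch of the three restrictions strict imposes (refs, vars and subs). The variable and bareword names are invented for the demonstration, and each violation is wrapped in eval so the script keeps running and can report the error. None of the three has any bearing on $1 or on scoping.

```perl
use strict;
use warnings;

my %error;

# "strict refs": a plain string may not be used as a reference.
my $name = 'counter';
eval { my $v = ${$name}; 1 } or $error{refs} = $@;

# "strict vars": using an undeclared variable is a compile-time error,
# so the offending code is compiled through a string eval.
eval q{ $undeclared = 1; 1 } or $error{vars} = $@;

# "strict subs": a bareword that is not a known sub name is rejected.
eval q{ my $x = some_bareword; 1 } or $error{subs} = $@;

print "$_: $error{$_}" for sort keys %error;
```

Each eval fails and the corresponding message is stored in %error, which gives a quick way to see what strict actually objects to.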


In reply to Re: Re: Regular Expression Question by danger
in thread What happens with empty $1 in regular expressions? (was: Regular Expression Question) by marcblecher
