in reply to Regex Capturing: Is this a bug or a feature?

$1 and friends are neither global nor lexical variables. They're Weird Magical Things, and they're tied to the optree, not to any scratchpad or symbol table.

The optree is the parsed and processed form of your program that the perl interpreter actually executes. When you have a regex, perl reserves space for the potential match variables and hangs a pointer to them off the optree (sort of a mini-scratchpad. Sort of). The rest of the code in the lexical scope, and at least sometimes any inner scopes that follow, reaches the match variables through that pointer. (The compiler handles the visibility-to-inner-lexical-scopes thing; there are still no scratchpads involved.)
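To make the scope visibility concrete, here's a minimal sketch. The sub name scope_demo and the sample strings are made up for illustration, and this only shows the behavior you can observe from Perl code, not the optree internals themselves:

```perl
use strict;
use warnings;

# Hypothetical demo sub: records what $1 holds at each point.
sub scope_demo {
    my @seen;

    'outer=1' =~ /^(\w+)=/;     # successful match in the outer scope
    push @seen, $1;             # 'outer'

    {
        push @seen, $1;         # inner scope still sees 'outer'
        'inner=2' =~ /^(\w+)=/; # successful match inside the inner block
        push @seen, $1;         # 'inner'
    }

    push @seen, $1;             # back outside: 'outer' again
    return @seen;
}

print join(', ', scope_demo()), "\n";
```

The inner block can read the outer match, but a successful match inside the block doesn't leak back out once the block ends.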

Because this is just some odd, regex-engine-private memory, the match variables don't really behave the way other variables do. They don't get explicitly reset; they're only set when a match succeeds. So if a match fails or gets skipped, they retain their old values.
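A small sketch of the stale-value behavior, plus the usual way around it (only trust $1 when the match itself returned true). The sub name stale_capture_demo and its inputs are invented for the example:

```perl
use strict;
use warnings;

# Hypothetical demo sub: $good is expected to match, $bad is not.
sub stale_capture_demo {
    my ($good, $bad) = @_;

    $good =~ /^(\w+)=/;         # succeeds, sets $1
    my $after_good = $1;

    $bad =~ /^(\w+)=/;          # fails, $1 is left alone
    my $after_bad = $1;         # stale value from the previous match

    # Safer: use $1 only if this particular match succeeded.
    my $checked = $bad =~ /^(\w+)=/ ? $1 : undef;

    return ($after_good, $after_bad, $checked);
}

my ($first, $stale, $checked) = stale_capture_demo('name=alice', 'garbage');
print "after good match:  $first\n";
print "after failed match: $stale\n";
print "checked result:     ", (defined $checked ? $checked : 'undef'), "\n";
```

With these inputs the second value is still 'name', left over from the first match, while the checked version comes back undef.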

This semi-sorta-global behavior also makes for fun with threads: you must mutex-protect any regex with capturing parens in a threaded program when using 5.005-style threads, or when using ithreads with a 5.6.x perl. (5.8.0 unshares them, so it's safe.) This includes ActivePerl's fork emulation, though since that doesn't actually expose mutexes, protecting the regex there is kind of tricky. (Recent ActivePerl releases might have fixed this; check the release notes.)

Re: Re: Regex Capturing: Is this a bug or a feature?
by shotgunefx (Parson) on Sep 30, 2002 at 05:03 UTC
    Thanks for revealing what's under the hood. I don't have a problem with them working differently; I just didn't expect it, though perhaps I should have :)

    Still think perlre could be clearer though.

    -Lee

    "To be civilized is to deny one's nature."
      perlre could definitely be clearer. Part of the problem is very few people understand it properly, so the documentation's often done by people who don't quite get it. (As the people who do understand the regex engine well enough to notice the problems are generally too heavily medicated to do anything... :)
        Whose suggestion@box should I put this in? :)
        Regexes are certainly my weakest area in Perl. I haven't had to get as up close and personal with them as the other areas. I use them often but nothing that requires huge contortions. I'm very surprised it took me this many years to get bitten by this. I posted another message in this thread that surprised me even more. Any thoughts?

        -Lee

        "To be civilized is to deny one's nature."