Does this mean that a regexp that captures $1 implies $&, which in turn implies the performance hit for maintaining $`, $&, and $'?
Yes, but the hit is paid only for that regex. The $DIGIT variables are supported like this:
  1. The string being matched against is copied (via savepvn()) to rx->subbeg.
  2. The offsets of the $DIGIT vars are stored in the two arrays rx->startp and rx->endp.
  3. When you access $2, Perl does magic:
    1. It takes the beginning and ending offsets, rx->startp[2] and rx->endp[2], and takes a substring of rx->subbeg.
    2. It savepvn()s (copies) that substring to a scalar and returns it.
However, this only happens in a regex that has capturing parentheses! If a regex does NOT have capturing parentheses, Perl does not need to copy the string.
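The steps above can be mimicked from plain Perl, as a rough sketch rather than what the internals literally do. It assumes the @- and @+ arrays (Perl 5.6 and later) are an acceptable stand-in for rx->startp and rx->endp, and it takes the substring from the original string instead of the rx->subbeg copy. The variable names are mine.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $string = "capture";
my ($via_magic, $via_offsets, $whole_match);

if ($string =~ /(.t)./) {
    # What you normally write: let Perl's magic hand you $1.
    $via_magic = $1;

    # The same thing by hand: @- and @+ hold the start and end offsets
    # of each group (index 0 is the whole match, index 1 is $1).
    $via_offsets = substr($string, $-[1], $+[1] - $-[1]);

    # The whole match (what $& would give) comes from index 0.
    $whole_match = substr($string, $-[0], $+[0] - $-[0]);
}

print "<$via_magic> <$via_offsets> <$whole_match>\n";   # <pt> <pt> <ptu>
```

Both routes give the same substring, which is the point: $1 is just a pair of offsets into a saved copy of the string.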

The $DIGIT vars are like tiny instances of $& that only appear when you need them. $&, by contrast, is maintained for every match once you use it anywhere in the program. Here's an example showing that a regex with capturing parentheses gives you the ability to use $& and the like. These are three separate programs. I'm using a string eval so that Perl hasn't seen $& at the time the regexes are executed.

    #!/usr/bin/perl
    "simple" =~ /im/ and eval q{ print "<$`><$&><$'>\n" };
    ###
    #!/usr/bin/perl
    "complex" =~ /.p/ and eval q{ print "<$`><$&><$'>\n" };
    #<co><mp><lex>
    ###
    #!/usr/bin/perl
    "capture" =~ /(.t)./ and eval q{ print "<$`><$&><$'>:<$1>\n" };
    #<ca><ptu><re>:<pt>

Does that make sense? In order to have $1, you have to have the string that is also used for $&. From perlre:
WARNING: Once Perl sees that you need one of $&, $`, or $' anywhere in the program, it has to provide them for every pattern match. This may substantially slow your program. Perl uses the same mechanism to produce $1, $2, etc, so you also pay a price for each pattern that contains capturing parentheses. (To avoid this cost while retaining the grouping behaviour, use the extended regular expression (?: ... ) instead.) But if you never use $&, $` or $', then patterns without capturing parentheses will not be penalized. So avoid $&, $', and $` if you can, but if you can't (and some algorithms really appreciate them), once you've used them once, use them at will, because you've already paid the price. As of 5.005, $& is not so costly as the other two.
Some of that will be rewritten with the advent of this pragma, though. It's nice to "rewrite the books".
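
The (?: ... ) advice in the perlre passage is easy to check for yourself. This is only a small sketch with a made-up string and variable names; it uses $#+ (documented in perlvar) to count the capture groups in the last successful match.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "foobarbar";
my ($groups_capturing, $groups_non_capturing);

# Plain parentheses group AND capture, so Perl must keep offsets for $1.
if ($text =~ /(bar)+$/) {
    $groups_capturing = $#+;
}

# (?: ... ) groups without capturing, so there is no $1 to maintain.
if ($text =~ /(?:bar)+$/) {
    $groups_non_capturing = $#+;
}

print "capturing: $groups_capturing group(s), ",
      "non-capturing: $groups_non_capturing group(s)\n";
# capturing: 1 group(s), non-capturing: 0 group(s)
```

Both patterns match the same text; only the first one asks Perl to remember anything for $1.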

_____________________________________________________
Jeff[japhy]Pinyan: Perl, regex, and perl hacker.
s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;


In reply to Re: Re: Finally, a $& compromise! by japhy
in thread Finally, a $& compromise! by japhy
