re::ampersand - Perl pragma to alter $& support in regular expressions
"Perl" =~ /../ and print "<$&>";  # <Pe>
"Perl" =~ /er/ and print "<$&>";  # <er>

{
    # disable $& support
    no re::ampersand;
    "Perl" =~ /../ and print "<$&>";  # <>
    "Perl" =~ /er/ and print "<$&>";  # <>
}

{
    # disable $& support for simple regexes
    no re::ampersand 'simple';
    "Perl" =~ /../ and print "<$&>";  # <Pe>
    "Perl" =~ /er/ and print "<$&>";  # <>
}

{
    # disable $& support for complex regexes
    no re::ampersand 'complex';
    "Perl" =~ /../ and print "<$&>";  # <>
    "Perl" =~ /er/ and print "<$&>";  # <er>
}
When Perl sees you using $`, $&, or $', it has to prepare these variables after every successful pattern match. This can slow a program down because these variables are "prepared" by copying the string you matched against to an internal location. This copying is also how $DIGIT variables are made accessible, but that only occurs on a per-regex basis: if a regex has capturing parentheses, the string will be copied, otherwise it will not be.
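For comparison, here is a minimal sketch of one way to recover the same three pieces of text without mentioning $`, $&, or $' at all, using the match offsets in @- and @+. The helper name match_parts is made up for this illustration; it is not part of the pragma.

```perl
use strict;
use warnings;

# Hypothetical helper: return (prematch, match, postmatch) for the first
# match of $re in $text, built from the offsets in @- and @+ instead of
# the $`, $&, and $' variables.
sub match_parts {
    my ($text, $re) = @_;
    return unless $text =~ $re;

    my ($start, $end) = ($-[0], $+[0]);   # offsets of the whole match
    return (
        substr($text, 0, $start),               # what $` would hold
        substr($text, $start, $end - $start),   # what $& would hold
        substr($text, $end),                    # what $' would hold
    );
}

my ($pre, $match, $post) = match_parts("Perl", qr/er/);
print "<$pre><$match><$post>\n";   # <P><er><l>
```

Because the offsets are plain numbers, nothing here asks Perl to keep a copy of the matched string around.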
Some regexes are simple enough to be matched via the Boyer-Moore substring matching algorithm, which is a fast way to find a substring in a string. Regexes that rely only on constant text and anchors can be matched this way. (These regexes cannot have capturing parentheses.) Because of this, they are not run through the standard regex engine, and end up not preparing $& and its friends -- there is no copying of the string that was matched.
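To make "simple" concrete: a constant-text regex asks the same question as a plain substring search. The sketch below compares the two using the built-in index function. It does not show Perl's internal Boyer-Moore code, and both helper names are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: plain substring search, no regex involved.
sub has_substring {
    my ($text, $needle) = @_;
    return index($text, $needle) >= 0 ? 1 : 0;
}

# Hypothetical helper: the same question asked with a constant-text regex.
# \Q...\E quotes the needle so it is treated as literal text.
sub has_substring_re {
    my ($text, $needle) = @_;
    return $text =~ /\Q$needle\E/ ? 1 : 0;
}

for my $needle (qw(er Pe xyz)) {
    printf "%-3s index=%d regex=%d\n", $needle,
        has_substring("Perl", $needle),
        has_substring_re("Perl", $needle);
}
```

Both helpers agree on every input, which is why a pattern like /er/ can be answered by a substring search alone.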
However, if Perl has seen you using $&, it decides that the simple regex has to go through the engine so it can prepare $&. This means that there is a two-fold slow-down: first, the simple regex has to go through both the Boyer-Moore algorithm and the rest of the regex engine, and second, it has to copy the string that was being matched against.
The re::ampersand pragma allows you to ignore the fact that $& (or its friends) has been used in your program. This produces a speed-up in portions of your code that do not need support for $&. This pragma is lexically scoped, which means it takes effect only from the point where you call it to the end of the enclosing block.
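Lexical scoping behaves the same way for every pragma, so it can be seen with one that ships with Perl. This sketch uses the core integer pragma as a stand-in, since it changes a result you can print; the sub names are just labels for the example.

```perl
use strict;
use warnings;

sub before_block { return 10 / 4 }   # compiled without the pragma

sub inside_block {
    use integer;                     # in effect until the closing brace
    return 10 / 4;                   # integer division
}

sub after_block { return 10 / 4 }    # the pragma has not leaked out

print before_block(), " ", inside_block(), " ", after_block(), "\n";
# prints: 2.5 2 2.5
```

A "no re::ampersand" inside a block would be confined in the same way: code before and after the block keeps the default behavior.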
This module does not turn off capturing support -- if a regex has capturing parentheses in it, you will inadvertently get support for $&, because it is based on the copied string that $1, $2, ... are based on.
Your program will run the same way it did before if you do not use this pragma. Default behavior has not been changed.
You can turn off support for $& and friends with no re::ampersand, which turns off support for all regexes (unless they have capturing parentheses). If you only want to turn off support for simple regexes, send it the argument 'simple'. If you only want to turn off support for complex regexes, send it the argument 'complex'.
Turn on support for $& with use re::ampersand, which turns on support for all regexes. To only supply support to simple regexes, send it the argument 'simple'. To only supply support to complex regexes, send it the argument 'complex'. Again, any regex with capturing parentheses will always have support for $& because of the mechanism that provides $DIGIT variables.
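As a rough illustration of the use/no interface only, here is a sketch of how a lexically scoped pragma could record the 'simple' and 'complex' settings in the compile-time hints hash %^H, following the approach described in perlpragma. The package My::AmpersandHint and its enabled function are hypothetical. The sketch records and reports the settings and nothing more; actually changing how the regex engine treats $& would presumably need work inside perl itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package My::AmpersandHint;   # hypothetical name; this is not re::ampersand

# Store 1 (on) or 0 (off) in the hints hash for each requested kind.
sub _set {
    my ($value, @kinds) = @_;
    @kinds = qw(simple complex) unless @kinds;   # no argument: both kinds
    $^H{"My::AmpersandHint/$_"} = $value for @kinds;
    return;
}

sub import   { my $class = shift; _set(1, @_) }  # use My::AmpersandHint ...
sub unimport { my $class = shift; _set(0, @_) }  # no  My::AmpersandHint ...

# Report the setting in force at the place enabled() was called from.
sub enabled {
    my ($kind) = @_;
    my $hints = (caller(0))[10] || {};
    my $value = $hints->{"My::AmpersandHint/$kind"};
    return defined $value ? $value : 1;          # default: support is on
}

package main;

# Mark the package as already loaded so use/no will not search for a file.
BEGIN { $INC{'My/AmpersandHint.pm'} = __FILE__ }

print "default:     simple=", My::AmpersandHint::enabled('simple'),
      " complex=", My::AmpersandHint::enabled('complex'), "\n";
{
    no My::AmpersandHint 'simple';
    print "no 'simple': simple=", My::AmpersandHint::enabled('simple'),
          " complex=", My::AmpersandHint::enabled('complex'), "\n";
}
{
    no My::AmpersandHint;
    print "no (all):    simple=", My::AmpersandHint::enabled('simple'),
          " complex=", My::AmpersandHint::enabled('complex'), "\n";
}
```

Each block sees only its own settings, and the default returns as soon as the block ends.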
#!/usr/bin/perl -w

no re::ampersand;

# simple regex is not weighed down by $&
"Perl" =~ /..$/ and print "<$&>\n";  # <>

{
    use re::ampersand;
    "Perl" =~ /^../ and print "<$&>\n";  # <Pe>
}

"Perl" =~ /..(?=.$)/ and print "<$&>\n";  # <>
#!/usr/bin/perl -w

# regexes set $&
"Perl" =~ /(?<=.)./ and print "<$&>\n";  # <e>

{
    no re::ampersand;
    # matching on a string you'd rather not have copied!
    $huge_string =~ /a+bc+/ and print "<$&>\n";  # <>
}

# regexes set $&
"Perl" =~ /.(?!..)/ and print "<$&>\n";  # <r>
Jeff japhy Pinyan, japhy@pobox.com.
_____________________________________________________
Jeff[japhy]Pinyan: Perl, regex, and perl hacker.
s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;