QM has asked for the wisdom of the Perl Monks concerning the following question:

I came across the term "parameterized regex" today. Apparently it's been around (and misused) since at least 2009.

Here's a "parameterized regex":

m/^The quick brown (?<first>\w+) jumps over the lazy (?<second>\w+)\.$/
(This is called a named capture.)

What is the prevalent name for this feature, beyond my small Perl worldview?

-QM
--
Quantum Mechanics: The dreams stuff is made of

Re: Parameterized Regex (updated)
by haukex (Archbishop) on Mar 02, 2017 at 18:23 UTC

    The thing I've mainly used named captures for so far is named access to the capture groups via %+. This can be very useful e.g. in cases like long regexes and/or if you've got one qr// regex inside another qr// regex inside another... keeping track of numbered capture groups quickly becomes impossible.

    my $one = qr{ one (?<one> \d+ ) }xi;
    my $two = qr{ $one two (?<two> \d+ ) }xi;
    "One9Two8Three7" =~ m{ $two three (?<three> \d+ ) }xi;
    use Data::Dump;
    dd \%+;
    __END__
    {
      # tied Tie::Hash::NamedCapture
      one => 9,
      three => 7,
      two => 8,
    }

    One bigger example that comes to my mind is this one: if you happen to have a 4th edition Camel (not sure if it's in the earlier ones), have a look at Chapter 5, Section "Fancy Patterns", subsection "Grammatical Patterns". You can basically build entire grammars out of them, for example using (?&NAME) you can recurse to a named subpattern (perlre; Update: a shorter example can be found under the documentation of (DEFINE)). Whether you want to build a full grammar using Perl's regex engine instead of one of the many well-established modules is of course another question :-)
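As a rough illustration of recursing into a named subpattern, here is a minimal sketch that checks for balanced parentheses. The subpattern name GROUP and the helper is_balanced are made up for this example and are not taken from the Camel book or perlre; it assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# GROUP is a hypothetical subpattern name chosen for this sketch.
# The (DEFINE) block only declares it; (?&GROUP) calls it, including
# from inside its own definition, which is where the recursion happens.
my $balanced = qr{
    \A (?&GROUP) \z
    (?(DEFINE)
        (?<GROUP> \( (?: [^()]++ | (?&GROUP) )* \) )
    )
}x;

# Hypothetical helper: true if the whole string is one balanced group.
sub is_balanced {
    my ($text) = @_;
    return $text =~ $balanced ? 1 : 0;
}

for my $candidate ('(a(b)c)', '(a(b)c', '()') {
    printf "%-10s %s\n", $candidate,
        is_balanced($candidate) ? 'balanced' : 'not balanced';
}
```

Putting the definition inside (DEFINE) keeps the "grammar" part separate from the part that actually consumes the string, which is the style the perlre documentation uses for its own example.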

    Update 2: It seems I misread the question "What is the prevalent name for this feature" as "What is the prevalent use for this feature".

    Update: Used /x and brackets to make the regexes more readable, eliminating a typo in the process. Changed the order of the paragraphs. Updated wording a little bit.

      Thank you for this reply.

      I was looking more at the (ab)use of the name "parameterized regex", since it's not parameterized. davido's reply gives the example I would have expected, based on the name.

      -QM
      --
      Quantum Mechanics: The dreams stuff is made of

Re: Parameterized Regex
by davido (Cardinal) on Mar 03, 2017 at 05:11 UTC

    What you demonstrated is called named captures, in Perl. In Python, it's called named groups. I think that parameterized regexes are different.

    sub match {
        my $args = shift;
        my $entity = $args->{entity};
        my $quantity = $args->{quantity};
        return $args->{target} =~ m/^(?:$entity){$quantity}/;
    }

    match ({entity => 'a', quantity => 5, target => 'aaaaa'}) && print "Match\n";

    This is a little verbose, and wholly contrived, but the idea here is that you can pass into the regex things like metacharacters and quantifier totals, thus treating the regexp as a general template and letting the variables that get interpolated into it act as specializations of the template. In this way, the regexp is accepting parameters, or is parameterized.
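The same template idea can also be sketched as a small factory that returns a compiled qr// object. This is only one possible variant, not the code from the post: the name make_repeat_re is hypothetical, and unlike the example in this reply it treats the entity as literal text (via quotemeta), validates the count, and anchors both ends of the target.

```perl
use strict;
use warnings;

# Hypothetical helper: build a compiled regex from two parameters.
sub make_repeat_re {
    my ($literal, $count) = @_;

    # Refuse anything that is not a plain non-negative integer, so the
    # quantifier cannot be used to smuggle in other regex syntax.
    die "count must be a non-negative integer\n"
        unless defined $count && $count =~ /\A\d+\z/;

    # quotemeta escapes metacharacters, so '.' means a literal dot here.
    my $quoted = quotemeta $literal;

    return qr/\A(?:$quoted){$count}\z/;
}

my $five_a  = make_repeat_re('a', 5);
my $two_dot = make_repeat_re('.', 2);

print "'aaaaa' matches\n" if 'aaaaa' =~ $five_a;
print "'..' matches\n"    if '..'    =~ $two_dot;
print "'ab' rejected\n"   if 'ab'    !~ $two_dot;
```

Whether to quote the parameter depends on intent: if callers are supposed to pass metacharacters, as in the original example, drop the quotemeta step and interpolate the value as-is.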

    This use of parameterized seems consistent with other examples where this term, or the term parametric, is used. Parametric templates in C++, for example, allow generic functionality to be specialized by binding parameters to a template.

    I did some searching online on the name parameterized regex. The search results found a few hits that support my notion, and then one hit that presented a research paper on parameterized regexes. That paper was completely different from this explanation, diving into set theory and NFA and DFA state machines. So it's possible that casual use has rendered the term overloaded.


    Dave

      If you use a named capture, then you can use that capture for a subsequent match, perhaps a new regex. BTW, this gets you closer to a possible grammar. Maybe using a named capture this way in a new regex is what is meant by a parametrized regex.
        If you use a named capture, then you can use that capture for a subsequent match, perhaps a new regex. ... using a named capture this way in a new regex ...

        Could you give an example of what you mean? AFAIU, the  %+ hash is re-initialized when a new regex match is entered.

        c:\@Work\Perl\monks>perl -wMstrict -le
        "my $s = 'one two three four';
         ;;
         $s =~ m{ \A (?<save> \w+) \s+ (?<this> \w+) \s+ (?<for> \w+) \s+ (?<later> \w+) \s* \z }xms;
         print join '-', @+{ qw(save this for later) };
         ;;
         $s = 'fee fie foe fum';
         ;;
         $s =~ m{ (?{ print join '==', @+{ qw(save this for later) } })
                  \A (?<save> \w+) \s+ (?<this> \w+) \s+ (?<for> \w+) \s+ (?<later> \w+) \s* \z
                }xms;
         print join '+++', @+{ qw(save this for later) };
        "
        one-two-three-four
        Use of uninitialized value in join or string at (re_eval 1) line 1.
        Use of uninitialized value in join or string at (re_eval 1) line 1.
        Use of uninitialized value in join or string at (re_eval 1) line 1.
        Use of uninitialized value in join or string at (re_eval 1) line 1.
        ======
        fee+++fie+++foe+++fum

        To do something like you suggest, you would have to "capture the capture" in a separate set of variables before the new match was entered.
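A minimal sketch of "capturing the capture" might look like the following. The names %saved, $second and $found are invented for this illustration; the approach simply copies %+ into an ordinary hash before the next match runs, then interpolates a saved value into a new pattern.

```perl
use strict;
use warnings;

my %saved;

# First match: copy the named captures out of %+ right away.
# %saved is an ordinary hash, so a later match cannot reset it.
if ('one two' =~ m{ \A (?<first> \w+ ) \s+ (?<second> \w+ ) \z }x) {
    %saved = %+;
}

# Second match: interpolate a saved value into a new pattern.
# \Q...\E quotes any regex metacharacters in the saved text.
my $second = $saved{second};
my $found  = 'three two one' =~ m{ \b \Q$second\E \b }x;

print "saved: $saved{first}, $saved{second}\n";
print $found ? "second match used the saved capture\n" : "no match\n";
```

After the second match, %+ reflects only that match (which has no named groups here), while %saved still holds the values from the first one.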


        Give a man a fish:  <%-{-{-{-<

      Thanks. That's exactly what I would expect, based on the name "parameterized regex". I have used that approach many times.

      Since my OP, the purveyor of the poorly chosen phrase has admitted the name was chosen in haste, and he may change it. Besides the normal clutter of imprecise language from native speakers, the primary developers on the software in question are not native speakers. I always try to give non-native speakers more slack. (But marketing types and others, whose job it is to find the right words, get the sharp edge of the dictionary.)

      -QM
      --
      Quantum Mechanics: The dreams stuff is made of

Re: Parameterized Regex (named)
by Anonymous Monk on Mar 03, 2017 at 00:02 UTC

    I call it a named pattern or named capture

    There are no parameters to be seen and nothing is parameterized -- simple English is simple