John M. Dlugosz has asked for the wisdom of the Perl Monks concerning the following question:

Too bad regexes don't take @var to mean "any of those". So I wrote:
my $goodnames= join ('|', map { quotemeta} (@names)); # do that once
# later...
s/stuff (?:$goodnames) stuff/o # each time I need that
To improve upon this, what's a good idiom for doing the equivalent of:
my $goodnames= join ('|', map { quotemeta} (@names));
$goodnames= qr/$goodnames/;
in one statement?

Replies are listed 'Best First'.
Re: Good Idiom for Matching List?
by Abigail-II (Bishop) on Dec 11, 2002 at 17:36 UTC
    Eh, why bother with the $goodnames = qr /$goodnames/ line? But, if you really wanted to, something like:
    my $goodnames = do {local $" = "|"; qr {@{[map {"\Q$_"} @names]}}};

    Abigail

      Abigail is (as usual) correct. However, if I were doing this, I'd code that as:
      my $goodnames = qr/@{[join '|', map "\Q$_", sort @names]}/;
      Or, if I were golfing:
      my $goodnames = qr/@{[join'|',map"\Q$_",sort@names]}/;
      ;-)

      And, if I were going to be interpolating arrays into regexes on a regular basis, I'd probably modularize the process thus:

      use Interpolation OR => sub { join '|', map "\Q$_", sort @{$_[0]} };
      # and later...
      my $goodnames = qr/$OR{\@names}/;

      BTW, arrays will interpolate disjunctively in Perl 6, so eventually you'll be able to just write:

      # Perl 6 code
      my $goodnames = /@names/;

      Yet another feature for Abigail to not be impressed by, I guess. ;-)

        I think my problem with this construct, that is, why I think it should be natural and simple, is that what I'm writing is basically a composition of two functions.
        my $x= f(); $x= g($x);
        That is, the first result is the argument to the second line, and is not used for anything else.

        So I would naturally write it as my $x=g(f()); instead, as one expression without a named temporary.

        Here's the rub: qr// does not use the normal function syntax. It uses the quoting syntax, which doesn't naturally handle arbitrary nesting. So, use the @{[]} hack (or worse, if the list context would be a problem) or use Interpolation to work around it.

        But what I really want is a callable function that takes a string and returns the corresponding compiled regex. We have quotemeta and lc that let us access special syntax features in a normal functional way; I want that for everything.

        sub mk_qr ($) { eval "qr/$_[0]/"; }
        —John
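        A minimal sketch of such a callable function without the string eval, assuming the argument is already a valid pattern string. The name make_regex and the sample @names are hypothetical, chosen only for illustration. Since qr// interpolates its argument, the composition can be written as one expression:

```perl
use strict;
use warnings;

# Hypothetical helper: compile a pattern string into a regex object.
# qr// interpolates its argument, so no string eval is needed, and a
# "/" inside the string cannot end the pattern early.
sub make_regex {
    my ($pattern) = @_;
    return qr/$pattern/;
}

# Sample data (an assumption, not from the thread).
my @names = ('car', 'a/b', 'c.d');

# Composition in one expression, g(f()) style, with no named temporary.
my $goodnames = make_regex(join '|', map { quotemeta } @names);

print "matched\n" if 'stuff a/b stuff' =~ /stuff $goodnames stuff/;
```

        This keeps the quoting syntax confined to one small sub, so callers only see an ordinary function call.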
        # Perl 6 code
        my $goodnames = /@names/;
        No qr? How can it tell that I didn't mean to match against $_ now and set $goodnames to the success condition?

        Update: Downvoters: This is a serious question, not a slam on Damian. I saw something "different", from someone who knows about it, and wanted more details. You can see from the reply that there is indeed something interesting and different happening here. In fact, I can imagine a section in the docs will discuss this very issue!

      Why bother? Because it means I don't need grouping around it or the /o modifier. If I interpolate the (?: ... ) into $goodnames, I might as well turn it into a qr instead.
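      A small sketch of the grouping point, using made-up sample names. It compares interpolating the joined string directly with interpolating the compiled qr object; the variable names are assumptions for illustration only:

```perl
use strict;
use warnings;

my @names = qw(car pet);   # sample data, an assumption

# Plain string: the alternation leaks into the surrounding pattern.
my $as_string = join '|', map { quotemeta } @names;

# Compiled regex: interpolates as a self-contained group.
my $as_qr = qr/$as_string/;

# With the string, the pattern reads /^car|pet$/,
# i.e. "starts with car" OR "ends with pet".
my $string_hit = 'carxyz' =~ /^$as_string$/ ? 1 : 0;

# With the qr object, the whole alternation sits between ^ and $.
my $qr_hit = 'carxyz' =~ /^$as_qr$/ ? 1 : 0;

print "string: $string_hit, qr: $qr_hit\n";   # prints "string: 1, qr: 0"
```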
(tye)Re: Good Idiom for Matching List?
by tye (Sage) on Dec 11, 2002 at 19:53 UTC

    That's a good point. I've never used array interpolation in a regex and, when I've run into it, it was usually accidental (and broke things). I've occasionally wondered why it was implemented in the first place.

    We already do special processing of interpolation in a regex and \Q already treats interpolation specially. So it'd be really cool if /(@x|\Q@y\E)/ defaulted to being the same as
        /(${\join"|",@x,map(quotemeta,@y)})/
    instead of the rather silly
        /(${\join($",@x)}${\quotemeta(join($",@y))})/
    That is,

        my @x= qw( : ; - );
        my @y= qw( . | ? );
        /(@x|\Q@y\E)/

    should be the same as the useful
        /(:|;|-|\.|\||\?)/
    not the silly
        /(: ; -|\.\ \|\ \?)/
    that is, use "|" in place of $" and have \Q not escape it.

            - tye
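    The difference described above can be checked with a short sketch. It builds the regex Perl produces today and, by hand, the alternation the post wishes for; the names $actual and $wanted and the test inputs are assumptions for illustration:

```perl
use strict;
use warnings;

my @x = qw( : ; - );
my @y = qw( . | ? );

# Current behavior: each array is joined with $" (a space by default),
# and \Q escapes the joined string, spaces included.
my $actual = qr/(@x|\Q@y\E)/;

# What the post wishes for, spelled out by hand.
my $wanted = do {
    my $alternatives = join '|', @x, map { quotemeta } @y;
    qr/($alternatives)/;
};

# Show which inputs each regex accepts as a whole-string match.
for my $input (':', '?', ': ; -', '. | ?') {
    printf "%-7s actual=%d wanted=%d\n", "'$input'",
        ($input =~ /^$actual$/ ? 1 : 0),
        ($input =~ /^$wanted$/ ? 1 : 0);
}
```

    The single characters match only the hand-built version, while the space-separated strings match only the regex that Perl builds by default.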
Re: Good Idiom for Matching List?
by particle (Vicar) on Dec 11, 2002 at 18:32 UTC

    i'm off to a meeting, but you'll want to make sure you sort @names by length, longest first. that way, carpet will match before car or pet.

    ~Particle *accelerates*
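    A quick sketch of why the order matters, using the car/pet/carpet words from the post. Perl tries alternatives left to right and takes the first one that succeeds, so a shorter name listed first wins over a longer one that starts the same way. The variable names are assumptions for illustration:

```perl
use strict;
use warnings;

my @names = qw(car pet carpet);   # words from the post, in an arbitrary order

# Alternation in the original order.
my $unsorted = join '|', map { quotemeta } @names;

# Alternation with the longest names first.
my $longest_first = join '|', map { quotemeta }
    sort { length($b) <=> length($a) } @names;

my ($short) = 'carpet' =~ /($unsorted)/;
my ($long)  = 'carpet' =~ /($longest_first)/;

print "unsorted: $short, longest first: $long\n";
# prints "unsorted: car, longest first: carpet"
```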

Re: Good Idiom for Matching List?
by particle (Vicar) on Dec 11, 2002 at 19:39 UTC

    i'm back.

    what about using the (??{}) ("postponed" regular subexpression) construct?

    try (untested)

    my @names = qw( car carpet pet);
    my $goodnames = sub {
        join( '|',
            map  { quotemeta $_->[0] }
            sort { $b->[1] <=> $a->[1] }   # longest first
            map  { [$_, length $_] } @_
        );
    };
    # later...
    s/stuff (??{ $goodnames->(@names) }) stuff//;

    ~Particle *accelerates*

Re: Good Idiom for Matching List?
by princepawn (Parson) on Dec 11, 2002 at 18:17 UTC
    would Quantum::Superpositions help any?

    Carter's compass: I know I'm on the right track when by deleting something, I'm adding functionality

      Conceptually, any(@namelist) eq $candidate will be what I want, but how do you get that into a regex, where $candidate is actually the next however many chars from the input?
Re: Good Idiom for Matching List?
by Molt (Chaplain) on Dec 12, 2002 at 10:39 UTC

    Have you thought of having a look at Jarkko's Regex::PreSuf module? Its main use is to create a regular expression from wordlists using prefix and suffix analysis, which should create a much faster regex than just joining them together.

    Not sure how well it works, found it whilst trying to find something else recently so not used it myself.

      My list only has three items in it, so it's not critical. My goal is to isolate the names into a simple maintainable definition at the top of the script. It's not speed critical.
Re: Good Idiom for Matching List?
by Aristotle (Chancellor) on Dec 15, 2002 at 00:05 UTC
    Though I'd probably go with Abigail's suggestion, you can abuse map for things like this one:
    my ($goodnames) = map "(?:$_)", join "|", map quotemeta, @names;

    Makeshifts last the longest.