in reply to Re: Re: qr// with /e?
in thread qr// with /e?

To avoid the extra variable hanging around, can't you just do:

my $days_re = join('|', @days);
$days_re = qr/$days_re/;

Of course, I'd like some way to aggregate regular expressions and nicely apply operations like "concatenate" or "alternative" to them without having to drop back into string representations - that is, I'd like to be able to do:

my @days = qw( Sun Mon Tue Wed Thu Fri Sat );
my $days_re = re_alternative(map {qr/\Q$_\E/} @days);

Right now you have to do this, which is ok I guess, but I wonder if there are efficiency issues with the regular expression constructed this way:

my @days = qw( Sun Mon Tue Wed Thu Fri Sat );
my $days_re = join("|", map {qr/\Q$_\E/} @days);
$days_re = qr($days_re);
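
For what it's worth, the interface I have in mind could be faked today with a thin wrapper. This is only a sketch: re_alternative is a hypothetical name, not an existing core or CPAN function, and it still goes through the string form internally.

```perl
use strict;
use warnings;

# Hypothetical helper (not a core or CPAN function): joins patterns with "|".
# Each qr// object is stringified when joined, so the result is still
# compiled from a string; the wrapper only hides that step from the caller.
sub re_alternative {
    my @parts  = @_;
    my $joined = join '|', @parts;
    return qr/$joined/;
}

my @days    = qw( Sun Mon Tue Wed Thu Fri Sat );
my $days_re = re_alternative(map { qr/\Q$_\E/ } @days);

print "matched $1\n" if "See you on Tue" =~ /($days_re)/;
```

That gives the calling code the shape I want, but it does nothing about the efficiency question.
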
-- @/=map{[/./g]}qw/.h_nJ Xapou cets krht ele_ r_ra/ map{y/X_/\n /;print}map{pop@$_}@/for@/

Re: Re: Re: Re: qr// with /e?
by tkil (Monk) on Apr 24, 2004 at 05:30 UTC

    *slaps forehead*

    Yes, re-using the same variable works fine. Grr. Nothing like a bad case of the blindingly obvious, is there?

    In the particular cases you bring up, using quotemeta might be a better option than mapping it to individual protected qr// objects. Not sure it matters all that much.

    As for efficiency, I would hope that perl's regex engine would be smart enough to discard unnecessary non-capturing groups (which is the main side-effect of building up regexes with a bunch of smaller qr//s...)

      Non-capturing parens ((?:...)) and flags ((?x-sim:...)) are used only in the parsing/compilation of the regular expression - they don't contribute nodes of their own to the resulting regexp program, but they may change what nodes something else compiles to.

      For example, qr{.} gives the node 'REG_ANY' (any character except newline), while qr{.}s gives 'SANY' (any character at all).
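
      One way to check which nodes a pattern compiles to is the debug mode of the re pragma, which dumps the compiled program to STDERR. The sketch below runs a child perl for each pattern and picks out the node names; regex_nodes is a made-up helper name, and the line format it parses is an assumption about the dump layout rather than a documented interface.

      ```perl
      use strict;
      use warnings;
      use IPC::Open3;

      # Run a child perl with -Mre=debug and return the node names listed in
      # the compiled program (assumes dump lines look like "   1: NAME (2)").
      sub regex_nodes {
          my ($code) = @_;
          my ($in, $out);
          # An undef error handle sends the child's STDERR to $out as well.
          my $pid = open3($in, $out, undef, $^X, '-Mre=debug', '-e', $code);
          close $in;
          my @nodes;
          while (my $line = <$out>) {
              push @nodes, $1 if $line =~ /^\s*\d+:\s+([A-Z_]+)/;
          }
          waitpid $pid, 0;
          return @nodes;
      }

      print "qr{.}  => @{[ regex_nodes('qr{.}') ]}\n";
      print "qr{.}s => @{[ regex_nodes('qr{.}s') ]}\n";
      ```

      Running perl -Mre=debug by hand on a one-liner shows the same dump without the parsing.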

      However the compiler cannot stitch together subpatterns in compiled form, so whenever you build up a pattern from two previously qr'd patterns, or a qr'd pattern and some new text, perl constructs the stringified form of the new pattern and then compiles this new string from scratch.

      Because of this, it is usually simpler for this sort of code to construct the string representing the complete pattern (with the added benefit that your diagnostics become more readable), so I'd tend to use:

      my @days = qw( Sun Mon Tue Wed Thu Fri Sat );
      my $days_re = join '|', map "\Q$_\E", @days;
      $days_re = qr{$days_re};
      rather than putting an additional qr{} in the map.
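
      To see the difference in what perl is asked to compile, you can print the stringified form of each variant. A small sketch, using a shortened day list of my own for brevity:

      ```perl
      use strict;
      use warnings;

      my @days = qw( Sun Mon Tue );

      # Variant 1: one qr// per alternative, then join the stringified objects.
      my $nested = join '|', map { qr/\Q$_\E/ } @days;
      $nested = qr/$nested/;

      # Variant 2: quote each string, join, and compile once.
      my $flat = join '|', map "\Q$_\E", @days;
      $flat = qr/$flat/;

      # Stringifying shows the source perl compiled. The group syntax depends
      # on the perl version: (?^:...) on 5.14 and later, (?-xism:...) before.
      print "nested: $nested\n";
      print "flat:   $flat\n";
      ```

      The nested variant should show a non-capturing wrapper around every day name, the flat one only a single outer wrapper. Both match the same strings; the flat form is just easier to read in diagnostics.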

      Hugo

        However the compiler cannot stitch together subpatterns in compiled form, so whenever you build up a pattern from two previously qr'd patterns, or a qr'd pattern and some new text, perl constructs the stringified form of the new pattern and then compiles this new string from scratch.

        Interesting! I wonder if there is a need for a low-level regex-manipulating package that would allow us to do manipulations like this on the actual regex objects. (Or if such an interface already exists!)

        One comment / question on your preferred solution:

        my $days_re = join '|', map "\Q$_\E", @days;

        I've seen this used a few times now, and I wonder if there is a meme out there people are copying from. Is there any reason to prefer this over:

        my $days_re = join '|', map quotemeta, @days;
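
        As far as I can tell the two spellings produce identical strings, so the choice looks purely stylistic. A quick check, with sample strings made up for illustration:

        ```perl
        use strict;
        use warnings;

        # Sample strings containing regex metacharacters (illustrative only).
        my @samples = ('Mon', 'a.b', 'x|y', '1+1=2', '(paren)');

        # \Q...\E inside a double-quoted string versus an explicit quotemeta.
        my $via_escape    = join '|', map "\Q$_\E", @samples;
        my $via_quotemeta = join '|', map quotemeta, @samples;

        print $via_escape eq $via_quotemeta ? "same\n" : "different\n";
        ```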