in reply to Defining your own regex character class

Further to LanX's reply and haukex's reply: Note that in addition to being used to compose more complex regexes, a qr// object can be quantified in the same way as other regex atoms, as discussed here (with one small exception, described below).

The quantifier exception is for the case of a counting quantifier on a regex object that looks "too much" like a hash element. The problem is rare (albeit potentially completely silent if it is present!) and easily fixed:

c:\@Work\Perl\monks>perl -wMstrict -le
"my %rx = ( 2 => 'Oops...' );
 my $rx = qr{ \b foo \b }xms;
 ;;
 my $n = 2;
 my $ry = qr{ $rx{2} X $rx{$n} Y (?:$rx){$n} }xms;
 print $ry;
 "
(?msx-i: (?msx-i: \b foo \b ){2} X Oops... Y (?:(?msx-i: \b foo \b )){2} )
(Update: Changed this code example to make it shorter, hopefully clearer.)
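For anyone who prefers a script to a one-liner, here is a minimal sketch of the same pitfall and the same fix. The pattern qr{ ab }, the key 2 and the names $silent and $grouped are my own stand-ins for illustration, not anything from the posts above. It checks match behaviour rather than printing the regex object, because the stringified form of a qr// object differs between Perl versions.

```perl
use strict;
use warnings;

# Stand-in names: a hash and a scalar that happen to share the name "rx".
my %rx = ( 2 => 'Oops' );
my $rx = qr{ ab }xms;
my $n  = 2;

# Pitfall: "$rx{$n}" does not look like a quantifier, so it is interpolated
# as the element of %rx whose key is $n. No warning is issued.
my $silent = qr{ $rx{$n} }xms;

# Fix: wrap the regex object in a non-capturing group, then quantify the group.
my $grouped = qr{ (?:$rx){$n} }xms;

print 'silent  matches Oops: ', ( 'Oops' =~ m{ \A $silent  \z }xms ? 'yes' : 'no' ), "\n";
print 'silent  matches abab: ', ( 'abab' =~ m{ \A $silent  \z }xms ? 'yes' : 'no' ), "\n";
print 'grouped matches abab: ', ( 'abab' =~ m{ \A $grouped \z }xms ? 'yes' : 'no' ), "\n";
```

Under strict, the pitfall only stays silent when a hash of the same name is actually in scope; otherwise compilation fails with a complaint about the undeclared hash, which is the lucky case.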

(BTW: Note also that $RE{net}{IPv4} from Regexp::Common::net is by design not delimited, so it can match inside a longer string in certain undesired or surprising cases:

c:\@Work\Perl\monks>perl -wMstrict -le
"use Regexp::Common qw(net);
 ;;
 my $ipv4_A = qr{ $RE{net}{IPv4} }xms;
 my $ipv4_B = qr{ \b $RE{net}{IPv4} \b }xms;
 ;;
 print 'match A' if '99999.9.9.99999' =~ $ipv4_A;
 print 'match B' if '99999.9.9.99999' =~ $ipv4_B;
 "
match A
Caveat Programmor. :)

Update: Here's a fun (for some definition of "fun") little problem. A decimal (i.e., base-10) IPv4 address regex could be neatly defined as follows:

my $octet = qr{ \d+ }xms;
my $ipv4 = qr{ \b $octet (?: [.] $octet){3} \b }xms;
Unfortunately, this matches an IP address with octets like 256 or 99999. How would you define $octet as a pure (i.e., no (?{ code }) or (??{ code }) constructs) regex so that only decimal octets in the range 0 .. 255 are matched? (Please, no experienced regex wranglers need reply!)
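Without giving the game away, here is a rough sketch of a harness for checking an attempt at the puzzle. The helper name octet_failures and the test range 0 .. 999 are my own assumptions; the harness starts from the naive $octet definition above, which you would replace with your own.

```perl
use strict;
use warnings;

# The naive definition from the post; swap in your own attempt here.
my $octet = qr{ \d+ }xms;

# Hypothetical helper: returns the integers in 0 .. 999 on which the given
# regex, anchored at both ends, disagrees with the numeric check 0 .. 255.
sub octet_failures {
    my ($rx) = @_;
    return grep {
        my $matched = ( $_ =~ m{ \A $rx \z }xms ) ? 1 : 0;
        my $wanted  = ( $_ <= 255 )               ? 1 : 0;
        $matched != $wanted;
    } 0 .. 999;
}

my @bad = octet_failures($octet);
printf "%d disagreements, first: %s\n",
    scalar @bad, ( @bad ? $bad[0] : 'none' );
```

With the naive definition this reports 744 disagreements, the first being 256; a correct $octet should report none. The harness only feeds in plain integers, so whether forms with leading zeros such as 007 should be accepted is left for you to decide and test separately.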


Give a man a fish:  <%-{-{-{-<

Re^2: Defining your own regex character class
by Laurent_R (Canon) on Dec 18, 2017 at 18:47 UTC
    I was going to suggest better definitions of $octet (one using pure regexes and one using a code assertion), but I'll refrain from that after having read your last paragraph. ;-)

    And, BTW, to the OP: what you're looking for is not called a character class (but this has been pointed out already).

      ... definitions of $octet ... your last paragraph.

      Yeah, I had that in mind as something for a regex novice to play around with, especially in light of the topic of the OP (partial hint, hint).


      Give a man a fish:  <%-{-{-{-<