in reply to Re^2: regular expressions
in thread regular expressions

I agree that doubly-negated character classes can be very tricky, but with care, they can be managed to good effect.

I think of it this way: Start with [^\W], which is the same as [\w] (or just \w). As you point out, this includes digits and _ (underscore) as well as alphas. "Subtract", as it were, the digits with [^\W\d] and the underscore with [^\W\d_], and you're left with all alpha characters. Then subtract your chosen vowels with [^\W\d_aeiouyAEIUOY] and you're done!

    c:\@Work\Perl\monks>perl -wMstrict -le
    "my $s = '123 annn xyzzy wwwewww xxx9xxx vvv_vvv eieio p pp ppp 2015 vwxz vwxzpdq';
     ;;
     my $consonant = qr{ [^\W\d_aeiouyAEIUOY] }xms;
     ;;
     printf qq{'$_' } for $s =~ m{ $consonant{4,} }xmsg;
    "
    'vwxz' 'vwxzpdq'
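The "subtraction" described above can be watched one stage at a time. This is only a sketch: the helper kept_chars() and the sample string are made up for illustration and are not part of the one-liner.

```perl
use strict;
use warnings;

# Hypothetical helper: return the characters of $s that $rx accepts.
sub kept_chars {
    my ($rx, $s) = @_;
    return join '', $s =~ m{ $rx }xmsg;
}

# Illustrative sample: letters, digits, underscore and spaces.
my $s = 'a1_b E9 xyz';

# Each stage removes one more group from what \w would accept.
my @stages = (
    [ 'word chars',       qr{ [^\W] }xms                ],  # a1_bE9xyz
    [ 'minus digits',     qr{ [^\W\d] }xms              ],  # a_bExyz
    [ 'minus underscore', qr{ [^\W\d_] }xms             ],  # abExyz
    [ 'minus vowels',     qr{ [^\W\d_aeiouyAEIUOY] }xms ],  # bxz
);

for my $stage (@stages) {
    my ($label, $rx) = @$stage;
    printf "%-17s %s\n", $label, kept_chars($rx, $s);
}
```

Note that y is treated as a vowel here, as in the class above, so 'xyz' contributes only x and z.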

All this is easier to manage, IMHO, with POSIX character classes or Unicode properties (if you're brave enough to venture out onto the thin, slippery ice of Unicode); both the following definitions work the same in the code above:
    my $consonant = qr{ [^[:^alpha:]aeiouyAEIUOY] }xms;
    my $consonant = qr{ [^\P{PosixAlpha}aeiouyAEIUOY] }xms;
YMMV. See perlrecharclass, perluniprops.
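One way to convince yourself that the definitions agree is to compare the set of characters each one accepts. The sketch below checks the ASCII range only (code points 0 to 127); it says nothing about non-ASCII input, where \w and [:alpha:] can behave differently depending on the Unicode rules in effect. The hash keys and the accepted() helper are names invented for this example.

```perl
use strict;
use warnings;

# The three definitions from the post, under made-up labels.
my %consonant = (
    negated_w   => qr{ [^\W\d_aeiouyAEIUOY] }xms,
    posix_class => qr{ [^[:^alpha:]aeiouyAEIUOY] }xms,
    uni_prop    => qr{ [^\P{PosixAlpha}aeiouyAEIUOY] }xms,
);

# Hypothetical helper: every ASCII character that $rx matches on its own.
sub accepted {
    my ($rx) = @_;
    return join '', grep { m{ \A $rx \z }xms } map { chr $_ } 0 .. 127;
}

for my $name (sort keys %consonant) {
    printf "%-12s %s\n", $name, accepted($consonant{$name});
}
```

If the three printed strings are identical, the definitions are interchangeable for ASCII text.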

(See also the experimental Extended Bracketed Character Classes of version 5.18+; I can't give any examples using these ATM.)
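For what it's worth, here is a minimal sketch of how the consonant class might look with the (?[ ... ]) syntax, which has an explicit subtraction operator. It assumes a perl that supports the construct (5.18 or later); the warnings line silences the "experimental" warning on versions that still emit one. The consonant_runs() helper and the sample string are invented for this example.

```perl
use strict;
use warnings;
no warnings 'experimental::regex_sets';  # construct was flagged experimental when introduced

# ASCII alphabetics minus the chosen vowels, written as a set difference.
my $consonant = qr{ (?[ \p{PosixAlpha} - [aeiouyAEIOUY] ]) }xms;

# Hypothetical helper: all runs of four or more consonants in $s.
sub consonant_runs {
    my ($s) = @_;
    return $s =~ m{ (?: $consonant ){4,} }xmsg;
}

print "'$_' " for consonant_runs('rhythm strength eieio vwxz');
print "\n";
# prints: 'ngth' 'vwxz'
```

The set-difference form states the intent directly ("alphas minus vowels") instead of relying on a double negation.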


Give a man a fish:  <%-(-(-(-<

Re^4: regular expressions
by Laurent_R (Canon) on Jun 07, 2015 at 18:43 UTC
    I agree with you: doubly-negated character classes can be tricky, but they can also be very useful. I was really reacting to the patterns proposed by Anonymous Monk and by toolic, which were just not quite right.

      If a digit is not a vowel - is it a consonant? Unclear spec - hehehe. Easy fix:

      [^\WaeiouyAEIOUY0-9_]

      It's a valid solution if it passes the test cases. What? There were no test cases? Never mind :)
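      Since there were no test cases, here is a sketch of what a few might look like, using Test::More. The expectations are assumptions, not part of any stated spec: ASCII letters other than aeiouy count as consonants, while digits, underscore and punctuation do not. The is_consonant() helper is a name made up for this example.

```perl
use strict;
use warnings;
use Test::More;

# The character class proposed in the reply above.
my $consonant = qr{ [^\WaeiouyAEIOUY0-9_] }xms;

# Hypothetical helper: 1 if the single character is a consonant, else 0.
sub is_consonant {
    my ($char) = @_;
    return $char =~ m{ \A $consonant \z }xms ? 1 : 0;
}

# Assumed expectations (see the note above the code).
my %expect = (
    'b' => 1, 'Z' => 1,            # ordinary consonants
    'a' => 0, 'Y' => 0,            # vowels, with y counted as one
    '7' => 0, '_' => 0, '-' => 0,  # digit, underscore, punctuation
);

for my $char (sort keys %expect) {
    is is_consonant($char), $expect{$char}, "'$char' classified as expected";
}

done_testing;
```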