in reply to Re^2: regular expressions
in thread regular expressions
I agree that doubly-negated character classes can be very tricky, but with care, they can be managed to good effect.
I think of it this way: Start with [^\W] which is the same as [\w] (or just \w). As you point out, this includes digits and _ (underscore) as well as alphas. "Subtract", as it were, the digits with [^\W\d] and underscore with [^\W\d_] and you're left with all alpha characters. Then subtract your chosen vowels [^\W\d_aeiouyAEIUOY] and you're done!
c:\@Work\Perl\monks>perl -wMstrict -le
"my $s = '123 annn xyzzy wwwewww xxx9xxx vvv_vvv eieio p pp ppp 2015 vwxz vwxzpdq';
 ;;
 my $consonant = qr{ [^\W\d_aeiouyAEIUOY] }xms;
 ;;
 printf qq{'$_' } for $s =~ m{ $consonant{4,} }xmsg;
"
'vwxz' 'vwxzpdq'
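If it helps to watch the "subtraction" happen, here is a rough sketch as a standalone script rather than a one-liner. The probe string, the labels and the survivors() helper are made up for illustration; only the four character classes come from the discussion above.

```perl
use strict;
use warnings;

# Each step adds one more group to the negated class, which removes
# that group from the set of characters the class can match.
my @steps = (
    [ 'word characters'  => qr{ [^\W] }xms ],
    [ 'minus digits'     => qr{ [^\W\d] }xms ],
    [ 'minus underscore' => qr{ [^\W\d_] }xms ],
    [ 'minus vowels'     => qr{ [^\W\d_aeiouyAEIUOY] }xms ],
);

# Hypothetical helper: return the characters of $string that $class matches.
sub survivors {
    my ($class, $string) = @_;
    return join '', $string =~ m{ $class }xmsg;
}

my $probe = 'aB3_e z!';    # made-up sample: letters, digit, underscore, space, punctuation

for my $step (@steps) {
    my ($label, $class) = @$step;
    printf "%-18s %s\n", $label, survivors($class, $probe);
}
```

Each line of output should be a little shorter than the one before it, ending with only the consonants of the probe string.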
All this is easier to manage, IMHO, with POSIX character classes or Unicode properties (if you're brave enough to venture out onto the thin, slippery ice of Unicode); both the following definitions work the same in the code above:
my $consonant = qr{ [^[:^alpha:]aeiouyAEIUOY] }xms;
my $consonant = qr{ [^\P{PosixAlpha}aeiouyAEIUOY] }xms;
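A quick way to convince yourself that the three definitions agree, at least on ASCII, is to run each one against every code point from 0 to 127 and compare what matches. This is only a sketch: the hash keys and the matched_ascii() helper are names I made up, and it says nothing about behaviour beyond ASCII.

```perl
use strict;
use warnings;

# The three consonant definitions from this post, keyed by made-up labels.
my %consonant = (
    negated_word => qr{ [^\W\d_aeiouyAEIUOY] }xms,
    posix_class  => qr{ [^[:^alpha:]aeiouyAEIUOY] }xms,
    unicode_prop => qr{ [^\P{PosixAlpha}aeiouyAEIUOY] }xms,
);

# Hypothetical helper: every ASCII character that the class matches, in order.
sub matched_ascii {
    my ($class) = @_;
    return join '', grep { m{ \A $class \z }xms } map { chr } 0 .. 127;
}

my %matched = map { $_ => matched_ascii($consonant{$_}) } keys %consonant;

print "$_: $matched{$_}\n" for sort keys %matched;
```

All three lines should list the same 40 letters. Outside ASCII the definitions can diverge: \p{PosixAlpha} is ASCII-only by definition, while \w and [:alpha:] may match non-ASCII letters when Unicode rules are in effect.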
YMMV. See perlrecharclass, perluniprops.
(See also the experimental Extended Bracketed Character Classes of version 5.18+; I can't give any examples using these ATM.)
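For what it's worth, here is a tentative sketch of the same consonant class written with the extended syntax, where the subtraction is spelled out with a set-difference operator instead of a double negation. It assumes Perl 5.18 or later; the variable names are mine, and the no warnings line is there because the feature warns as experimental on older versions. Treat it as an untested-in-anger illustration and check perlrecharclass for your version.

```perl
use strict;
use warnings;
no warnings 'experimental::regex_sets';   # assumption: Perl 5.18+, where this category exists

# ASCII letters minus the chosen vowels, written as a set difference.
my $consonant_set = qr{ (?[ \p{PosixAlpha} - [aeiouyAEIUOY] ]) }xms;

my $s = '123 annn xyzzy wwwewww xxx9xxx vvv_vvv eieio p pp ppp 2015 vwxz vwxzpdq';

# Runs of four or more consonants; the group keeps the quantifier
# clearly separate from the interpolated variable.
my @runs = $s =~ m{ (?: $consonant_set ){4,} }xmsg;

printf "'%s' ", $_ for @runs;
print "\n";
```

With the same test string as above, this should find the same two runs as the doubly-negated version.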
Give a man a fish: <%-(-(-(-<